3. How genotype and phenotype are measured and researched

We have just introduced the biological mechanisms linking genotype and phenotype. Next, we will discuss the details of how this connection is studied, including how data about DNA and RNA is captured, organised and stored, and how this data is used in computational biology research.

This chapter begins with a short description of popular sequencing technologies, as these are relevant to both DNA and RNA.

Then, in the second section, we will retrace the steps we took in the previous chapter, looking again at DNA, RNA, proteins, and phenotypes in turn, but this time considering the data gathered about each of these entities and about the connections between them. Throughout the chapter, as they become relevant, I describe specific examples of bioinformatics and computational biology resources and tools that are relevant to this thesis.

Two types of tools and resources, however, have their own sections. The first is biological ontologies, which are efforts to unify some of the information gained from the experiments described earlier in this chapter. The second is predictive computational biology methods, together with the ecosystem of competitions often used to validate them, which are described in section 3.4. In that section, I also explain my contribution to the update to the SUPERFAMILY resource[3].

I then describe some of the potential sources of bias in the data and tools used throughout this thesis, followed by my contribution to a project designed to counter some of these issues, the Proteome Quality Index (PQI)[2].

Finally, I summarise the data we currently have (and don’t have) on the link between genotype and phenotype.

Contributions in this Chapter

This chapter primarily summarises the work of others, but it also contains my contributions to the following collaborative projects:

  • 2014 SUPERFAMILY update paper[3]

    • Added some cyanobacteria genomes to the resource

    • Contributed to paper-writing/editing

  • The Proteome Quality Index paper[2]

    • Contributed to development of metrics for measuring proteome quality

    • Contributed to paper-writing/editing