Phenotype from Genotype

3. How the link between genotype and phenotype is researched

We have just introduced the biological mechanisms linking genotype and phenotype. Next, we will discuss the details of how this connection is studied, including how data about DNA and RNA is captured, organised and stored, and how this data is used in computational biology research.

This chapter begins with a short description of popular sequencing technologies, as these are relevant to both DNA and RNA.

Then, in the second section, we will retrace the steps we took in the previous chapter, looking again at DNA, RNA, proteins, and phenotypes in turn, but this time considering the data gathered about each of these entities and about the connections between them. Throughout the chapter, as they become relevant, I describe specific examples of bioinformatics and computational biology resources and tools that are relevant to this thesis.

Two types of tools and resources, however, have their own sections. The first is biological ontologies (section 3.3), which are efforts to unify some of the information gained in the experiments described earlier in the chapter. The second is predictive computational biology methods, which are described in section 3.4 along with the ecosystem of competitions that are often used to validate them. In that section, I also explain my contribution to the update to the SUPERFAMILY resource[3].

I then describe some of the potential sources of bias in the data and tools used throughout this thesis, followed by my contribution to a project designed to counter some of these issues, the Proteome Quality Index (PQI)[2].

Finally, I summarise the data we currently have (and don’t have) on the link between genotype and phenotype.

Contributions in this Chapter

This chapter primarily summarises the work of others, but it also contains my contributions to the following collaborative projects:

  • 2014 SUPERFAMILY update paper[3]

    • Added some cyanobacteria genomes to the resource

    • Contributed to paper-writing/editing

  • The Proteome Quality Index paper[2]

    • Contributed to development of metrics for measuring proteome quality

    • Contributed to paper-writing/editing


By Natalie Zelenka
© Copyright 2020.