2.5. Phenotype

2.5.1. What is phenotype?

Phenotypes are observable traits, which can range from neutral (like height, skin colour, or eye colour) to disabling (e.g. chronic fatigue syndrome) or life-threatening (e.g. cancers), and include very specific measurements (e.g. the level of calcium in blood). Since phenotypes can have various levels of specificity, they can also be hierarchical: an individual could display “abnormal muscle morphology”, or more specifically “facial muscle atrophy”, which means we have to decide at what level to record phenotypes. Human phenotype information is private, and some phenotypes are not easy to measure, so information about human phenotypes is not always easy to access.
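As a minimal sketch of what "deciding at what level to record phenotypes" could look like in practice, the hierarchy can be stored as a child-to-parent map and a specific term generalised by walking upwards. Only the two quoted terms come from the text above; the intermediate and root terms, and the function names `lineage` and `record_at_level`, are hypothetical and are not taken from any real ontology.

```python
# Toy child -> parent map. "muscle atrophy" and "phenotypic abnormality"
# are hypothetical placeholder terms added for illustration only.
PARENT = {
    "facial muscle atrophy": "muscle atrophy",
    "muscle atrophy": "abnormal muscle morphology",
    "abnormal muscle morphology": "phenotypic abnormality",
}


def lineage(term):
    """Return the term followed by its ancestors, most specific first."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def record_at_level(term, steps_up):
    """Generalise a term by moving up at most `steps_up` levels."""
    path = lineage(term)
    # Stop at the root if asked to climb further than the hierarchy allows.
    return path[min(steps_up, len(path) - 1)]


print(lineage("facial muscle atrophy"))
print(record_at_level("facial muscle atrophy", 2))
```

Recording the most specific term loses nothing, since broader terms can always be recovered from it, whereas a term recorded at a broad level cannot be made more specific afterwards.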

2.5.2. How do proteins influence phenotype?

The easiest phenotypes to understand genetically are Mendelian. In Mendelian phenotypes, a single mutation is responsible for the phenotype, and we can assume that the mutation changes, reduces, or entirely stops the functionality of a protein, and that this protein is the main actor involved in the trait. An example of this in humans is the OPN1MW gene, which encodes the green-light-absorbing pigment necessary to create green-light-absorbing cones in the eye: an allele that makes OPN1MW non-functional therefore causes red-green colourblindness.

The way in which Mendelian genetics affects a phenotype can vary. In humans, for a SNP with two alleles, there are three possible calls: homozygous wild type (two copies of the most common allele), heterozygous (one copy of the most common allele and one copy of the rarer allele), and homozygous mutant (two copies of the rarer allele). Sometimes having one copy of the rarer allele is enough to cause a phenotype, but sometimes two copies are required. Not all SNPs are disease-causing: many have no combination of alleles that causes disease.
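The three calls and the "one copy or two" distinction can be sketched in a few lines of Python. This is a deliberately simplified illustration that assumes a biallelic SNP and a phenotype that always appears once enough copies of the rarer allele are present; the names `Call`, `genotype_call`, `shows_phenotype`, and `copies_required` are hypothetical, and `copies_required=None` is an assumed convention for a SNP with no disease-causing combination of alleles.

```python
from enum import Enum


class Call(Enum):
    HOM_WILD_TYPE = "homozygous wild type"
    HETEROZYGOUS = "heterozygous"
    HOM_MUTANT = "homozygous mutant"


def count_rare_copies(allele_a, allele_b, common_allele):
    """Count how many of the two alleles differ from the most common allele."""
    return sum(allele != common_allele for allele in (allele_a, allele_b))


def genotype_call(allele_a, allele_b, common_allele):
    """Classify a biallelic SNP genotype into one of the three calls."""
    calls = [Call.HOM_WILD_TYPE, Call.HETEROZYGOUS, Call.HOM_MUTANT]
    return calls[count_rare_copies(allele_a, allele_b, common_allele)]


def shows_phenotype(allele_a, allele_b, common_allele, copies_required):
    """Simplified Mendelian model (assumption: no environmental effect).

    copies_required=1: one copy of the rarer allele is enough.
    copies_required=2: two copies are needed.
    copies_required=None: the SNP is not disease-causing.
    """
    if copies_required is None:
        return False
    return count_rare_copies(allele_a, allele_b, common_allele) >= copies_required


print(genotype_call("A", "G", common_allele="A").value)
print(shows_phenotype("A", "G", common_allele="A", copies_required=2))
```

In classical terms, `copies_required=1` corresponds to a dominant rarer allele and `copies_required=2` to a recessive one.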

As well as mutations, phenotypes can be caused by chromosomal abnormalities (extra or missing sections of chromosomes). In this case, the mechanism behind the phenotypic differences is the increased or decreased expression of the genes in the affected section of the chromosome.

Proteins can also affect a phenotype indirectly, through protein-protein interaction networks, through interaction with metabolism (the body’s creation of small chemicals, like sugars, fatty acids, and vitamins), and through interaction with the environment of the cell. The environment of the cell is of course in turn influenced by the human-scale environment: what we eat, whether we smoke, the air we breathe, and our body’s response to outside stimuli.

2.5.2.1. Limits

For many disease phenotypes (e.g. breast cancer, asbestosis), a genetic mutation might only predict an increased probability of having the phenotype, given similar environmental conditions. There are also phenotypes which may not be linked to genetic variation at all, but may be entirely influenced by the environment: for example, medical conditions that are the result of poisoning. In these cases, we might imagine that there is a mutation that humans could have that would prevent or reduce the poison reaction, but since no one has it, we can’t study this by looking at human mutations.

To get a little philosophical (metaphysical) for just a paragraph, some phenotypes may not even exist. That is, they might not be natural categories, such that there is a straightforward, physical thing that decides membership of the category[35]. As an example, consider an imaginary poorly-understood syndrome. It might be diagnosed if you have some of a list of symptoms, but the syndrome might actually be four separate diseases with four totally separate causes, and the treatments might only work for one of these diseases. Some phenotypes might even be social constructs; there is a long-running debate among psychologists about whether some mental health conditions and other psychological and behavioural concepts are socially constructed[36,37]. If phenotypes are not based in the physical, then we will likely have difficulty accurately predicting them from genetics.

2.5.2.2. Ethical considerations

Aside from the fact that predicting non-physical concepts is difficult, there are also ethical considerations in trying to predict socially constructed phenotypes. If we try to predict sexual orientation from genetics[38], then we might turn out to be measuring something else which indirectly influences sexual orientation, for example a protein that influences how open people are to new experiences, or something that in turn influences that. Similarly, in trying to predict intelligence from genetics, we are likely finding associations with variables like how much you have practised IQ tests or whether you are in the same cultural group as those that created them[39], reinforcing racist ideas[40].

Even if all phenotypes were natural concepts, predicting the genetic basis of some phenotypes could be harmful[41]. For example, finding a “gay gene” could be motivated by, or lead to, a search for “treatments” to “cure” homosexuality, even if it did have a physical basis.

Physical measurements can also be problematic for similar reasons. Take measurements of facial features, for example: this brings to mind the image of Nazis measuring skulls. Where physical measurements are proxies for measuring the social construct of race, these kinds of phenotypes can be similarly worrying. Facial recognition technology[42] is often criticised on this basis[43].

It is for these reasons that the majority of modern concepts of phenotype are based in medical concepts, where looking for a link between genotype and phenotype can have a potentially life-saving or life-improving benefit. This scenario still comes with serious ethical considerations, however. Many disabled people do not want cures for their disabilities[44], and people are also worried that the development of genetic screenings for disabilities will effectively result in a genocide of disabled people[45].

Another concern is that people may accidentally find out about phenotypes that they are predisposed to and do not wish to know about. This is particularly worrying if there are no existing preventative or proactive measures to avoid a future diagnosis, and if they are not able to access genetic counselling. For example, 23andMe have a system whereby you must opt in to viewing reports about your health for some illnesses.

2.5.3. The future computational biologists want

The eventual destination of this field is a full understanding of how our individual genomes and their interaction with the environment affect us. With this understanding, we would anticipate a much wider application of both personalised medicine and gene therapies. These therapies are not yet a common occurrence: the eleven approved cell and gene therapies come from a pool of 500 such clinical trials[22].

Perhaps it makes sense that we are not finding drug targets quickly, as we still don’t know the functionality of approximately 20% of human genes[46]. And genes are only a small part (1-2%) of our DNA[47], and the part we understand best. Beyond our DNA, there are many other aspects of our cellular and social environments that will have an effect on which parts of our genes are being actively used, and how much. This section provides an overview of our current scientific model for how DNA affects phenotype, so that we can identify the sources of information that we do have and can make use of.

Despite what we don’t know, this is also the moment when we have a unique hope to unravel some of these mysteries. We have openly available, expertly curated databases containing the great collective knowledge of many experiments about our DNA, how it is being used, and what traits it affects. With the advent of new technologies, these databases are being filled at an alarming speed by researchers around the world. Perhaps it is now possible to begin to synthesise some of this collective knowledge into a fuller understanding of complex traits.