Chapter 4 Discussion

If altered gene expression caused by DNA methylation changes contributes to the identified phenotypic adaptations between light conditions, we would expect to see observable trends in our PCA plots and correlation heatmaps. We would also expect to see significant results in our differential methylation analyses.

The PCA plots and correlation heatmaps do not present any clear trends in overall methylation percentage at either the CpG site or the region level. Neither the clear nor the lilac replicate populations consistently grouped together in these analyses. The green replicate populations did consistently group together, which provides some evidence of an overall methylation distinction. However, the green populations were also closely related to many of the other populations on both the PCA plots and the correlation heatmaps. While never completely separated from the other populations in any of the analyses, the foundation population was observably separated from the main cluster of populations at the tile scale. This provides some support for the idea that the artificial light conditions created an overall methylation percentage distinction from the foundation population.

While these results are far from strong evidence that methylation changes drive the adaptations to altered light environments, they do not detract from the results of our differential methylation analyses. This is because the PCA plots and correlation heatmaps capture overall trends across the entire methylome, much of which is unlikely to be altered by the experimental conditions even if methylation changes were driving phenotypic differences between populations. The adaptations to altered light conditions are more likely to be found at specific genes whose repression or activation contributes to the known phenotypic changes in the guppy populations.

None of the dmrseq differential methylation results were statistically significant, and the only DMRs that came close to significance could not be annotated to a gene body or to the promoter region of a gene. Taken alone, these results would suggest that methylation changes did not meaningfully contribute to the sexually selected adaptations to the altered light environments. However, we cannot draw this conclusion from these results alone, as the absence of significant results may reflect a lack of statistical power in dmrseq to detect methylation changes.

This is why it is important to use a second tool for the differential methylation analysis. dmrseq fits a generalised least squares model and then performs a Wald test on the model to assess the strength of the covariate of interest, in this case the light condition comparison (Korthauer et al. 2019). Methylkit, on the other hand, performs differential methylation analysis using logistic regression when biological replicates are available (Akalin et al. 2012). As dmrseq and methylkit approach the analysis in different ways, applying both to the same data gives us a wider view of DNA methylation differences between the light condition groups.

The methylkit analysis at the finest scale, focusing on individual CpG sites, also yielded limited results. The methylation percentage differences between the light groups are very large for the CpG sites in the clear versus lilac comparison, and yet these sites are still not considered statistically significantly different. This indicates that the limited number of results is due to insufficient statistical power rather than minimal methylation changes. The differentially methylated CpG sites for each light condition that could be annotated to a promoter region were all annotated to the NASP gene promoter region. The tile-level methylkit analysis yielded thousands of results which could be annotated to both genes and promoter regions, suggesting that aggregating CpG sites into regions provided far greater statistical power for the methylkit tool. These results indicate the opposite of what the dmrseq results suggest: that methylation changes are playing a role in the adaptations to altered light environments in these guppies.
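The gain in power from aggregation can be illustrated with a minimal sketch of how CpG-level read counts might be pooled into fixed 1 kbp tiles. This is not methylkit's own code: the function names, the input layout and the choice to sum counts within each window are assumptions made for illustration, and the example sites are hypothetical values, not data from this study.

```python
from collections import defaultdict


def tile_counts(sites, tile_size=1000):
    """Pool per-CpG read counts into fixed, non-overlapping tiles.

    `sites` is an iterable of (chrom, pos, methylated, unmethylated) tuples
    with 1-based positions. Summing counts per window is an assumption of
    this sketch, not a statement about any particular tool.
    """
    tiles = defaultdict(lambda: [0, 0])
    for chrom, pos, meth, unmeth in sites:
        # 1-based start coordinate of the tile containing this CpG.
        start = ((pos - 1) // tile_size) * tile_size + 1
        tiles[(chrom, start)][0] += meth
        tiles[(chrom, start)][1] += unmeth
    return {key: tuple(counts) for key, counts in tiles.items()}


def methylation_percentage(meth, unmeth):
    """Percentage of reads supporting methylation; NaN if there is no coverage."""
    total = meth + unmeth
    return 100.0 * meth / total if total else float("nan")


# Hypothetical example sites (not taken from the study data).
example_sites = [
    ("chr1", 10, 3, 7),
    ("chr1", 950, 5, 5),
    ("chr1", 1001, 8, 2),
]
example_tiles = tile_counts(example_sites)
```

Under this assumption, each tile carries the combined read depth of all the CpG sites it contains, so the methylation percentage of a tile is estimated from more reads than that of any single site, which is one plausible reason why region-level tests reach significance more readily.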

The enormous discrepancy in results between the two tools raises two questions: is dmrseq not sensitive enough to detect differential methylation, or is methylkit too sensitive in detecting differential methylation? The answer to both of these questions may be yes, meaning that only some of the tiles identified by methylkit are actually differentially methylated. What we know for sure is that the two tools utilise different techniques to detect DMRs and to control the false discovery rate (FDR). The tool dmrseq uses a generalised least squares model to perform an inference that generates region-level statistics (Korthauer et al. 2019), whereas methylkit performs the detection on individual CpG sites and then simply aggregates them within a 1 kbp range (Akalin et al. 2012). The false discovery rate is the expected proportion of false positives among the results called significant. Controlling it requires an adjustment of the p-values (generating q-values), which is necessary when performing large numbers of statistical tests because the number of false positive results expected by chance increases with each test. Both tools use the Benjamini-Hochberg procedure (Benjamini and Hochberg 1995; Korthauer et al. 2019); however, methylkit also uses a sliding linear model to control for this (Akalin et al. 2012). These differences between the two tools are likely to be responsible for the discrepancy in sensitivity.
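To make the p-value adjustment concrete, the following is a minimal sketch of the Benjamini-Hochberg procedure written in plain Python. It is not the code used by either dmrseq or methylkit, and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never decrease as the raw p-values increase.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values, not results from this study.
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
example_q = benjamini_hochberg(example_p)
n_significant = sum(q <= 0.05 for q in example_q)
```

In this hypothetical example, four of the five raw p-values fall below 0.05 but only two of the adjusted values do, which shows how the adjustment becomes more demanding as the number of tests grows.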

Given the question of whether methylkit is too sensitive in detecting differentially methylated regions, it is safest to focus on the tiles associated with extremely low q-values. The two tiles with the lowest q-values in the clear versus green and green versus lilac comparisons were annotated to the promoter regions of the NASP and gpc5b genes.

The CpG sites and tiles were annotated only to the promoter regions of the NASP and gpc5b genes and not to the gene bodies, and both lilac and clear have a greater methylation percentage in these tiles. This would suggest that the NASP gene and/or the gpc5b gene are being repressed in the green populations (Bock 2012). When observing the bar charts, violin plots and beeswarm plots of these two NASP and gpc5b related tiles (section 4.6), we can see that the foundation population has methylation percentages similar to those of the green 1 and 3 populations. This suggests that the increase in gene activity of NASP and/or gpc5b granted a selective advantage in the clear and lilac populations. It makes sense that we see similar results for these two light conditions, as they both simulate small gap natural light environments, whereas the green light simulates forest shade natural light environments (section 2.1.2).

The NASP gene codes for the nuclear autoantigenic sperm protein, which is a histone chaperone (Finn et al. 2008). Only a small percentage of genes have been studied experimentally. Most genes’ molecular and biological functions are derived using a bioinformatics tool called the Phylogenetic Annotation and Inference Tool (PAINT), which uses experimental data and phylogenetic trees to infer the function of a gene (Gaudet et al. 2011). No experimental research has been done on the NASP gene’s function in guppies, nor have any inferences been made. In the zebrafish, the closest model organism to the guppy, NASP is assumed to have the molecular function of histone binding and to play a role in the biological processes of CENP-A containing nucleosome assembly and DNA replication-dependent nucleosome assembly. The gene is expressed in the eye, immature eye, pectoral fin, pharyngeal arch, and ventricular zone (Ruzicka et al. 2019). Both of these fish are far less studied than humans, and therefore we are likely to find far more information regarding the gene’s function in the human literature. In humans there is experimental evidence which supports the gene’s molecular functions being histone binding (Zhang et al. 2016) and protein-containing complex binding (Kato et al. 2015). From experimental evidence combined with phylogeny-based inference, the gene is assumed to play a role in numerous biological processes. These include blastocyst development, DNA replication, nucleosome assembly and histone exchange (Tagami et al. 2004), and the G1/S transition of the mitotic cell cycle (Gaudet et al. 2011).

Since we obtained the DNA from the eyes and brain of the fish, it is important that this gene is expressed in the eye. How repressing the gene’s activity could confer a sexually selected advantage under altered light conditions is not particularly clear. A reasonable speculation relates to histone function and nucleosome assembly. Histones play an essential role in the epigenetics of an organism, and therefore altering the expression of the NASP gene may contribute to other epigenetic changes which confer an evolutionary advantage (Virani et al. 2012). Further investigation of this gene’s function in guppies and of the nucleosome organisation of the populations would be essential to make more conclusive claims on this matter.

No experimental research has been done on the gpc5b gene in guppies or zebrafish. The gene is assumed to play a role in the regulation of signal transduction in guppies. In zebrafish, the gene codes for a cell surface proteoglycan that bears heparan sulfate and is expressed in the blastomere, central nervous system, floor plate, intestine and notochord (Ruzicka et al. 2019). The gene is assumed to play a role in cell migration, positive regulation of the canonical Wnt signaling pathway and the regulation of protein localisation to the membrane (Gaudet et al. 2011). Both the central nervous system and ectoderm form from the cells of the neural plate, a key embryonic developmental structure (University of Lausanne). From this we can draw an admittedly tenuous connection between this gene, which is expressed in the central nervous system and may have control over cell migration, and the surface colour patterning in male guppies. The arrangement and colour of the male ectoderm play an important role in sexual selection; therefore, altering the expression of genes which control pigment cell migration may confer an evolutionary advantage. Further investigation of this gene’s expression in other areas of the guppy and of its biological function in guppies would be required to make claims beyond tenuous speculation.

One of the most significantly differentially methylated tiles in the clear versus lilac comparison was annotated to the promoter region of the gene zgc:153964, now called the ptk6b gene. No experimental evidence exists and no inferences have been made regarding the gene’s molecular or biological function in guppies. In zebrafish the gene codes for a protein tyrosine kinase, and inferences have been made to determine its biological function (Ruzicka et al. 2019). The gene is assumed to play a role in cell differentiation, the innate immune response, and the transmembrane receptor protein tyrosine kinase signaling pathway (Gaudet et al. 2011). A direct link from these functions to the experimental conditions is difficult to make. Investigation of the biological function of this gene in guppies may reveal whether alterations to its expression would contribute to an evolutionary advantage for guppies under altered light conditions.

A tile located within the promoter region of a gene named hmx1 was highlighted due to its inferred connection to the biological functions camera-type eye development, retinal cone cell development and retinal morphogenesis in camera-type eyes (Gaudet et al. 2011). While there were minimal methylation percentage differences between the light conditions in the bar chart, violin plot and beeswarm plot, there was a significant methylation percentage difference between the foundation population and the light condition groups. These results suggest that this change in DNA methylation in the experimental populations contributes to phenotypic adaptations to the altered light environments.

In the clear versus green comparison, the NFIB gene was highlighted because it contained four differentially methylated tiles located in its promoter region. The methylation was greater in the green light group, and therefore this gene’s activity is likely repressed in the green populations (Bock 2012). There is no experimental evidence as to the gene’s function in guppies; however, it has been inferred that it plays a role in DNA replication and in DNA-templated regulation of transcription. In humans, the gene codes for Nuclear factor 1 B-type, a transcriptional activator of GFAP which is essential for proper brain development. Through both experimental research and inference, the gene is known to play a role in a very large number of biological functions. The functions that can be connected to adaptive traits under our experimental conditions include brain development (Schanze et al. 2018), commissural neuron axon guidance, glial cell differentiation, and hindbrain development (Gaudet et al. 2011). The brain plays an extremely important role in vision, and therefore changes to these biological functions involved in the brain and central nervous system may confer a selective advantage in altered light environments. Further investigation of the biological functions of this gene in guppies would be needed to support this speculation.

While the mitch enrichment analysis yielded no significantly enriched pathways, there were still some notable results. GTP binding in the clear versus green comparison and GTPase activity in the green versus lilac comparison were both among the most enriched pathways. GTPases are proteins which act as biological switches. They can be activated in response to transmembrane proteins such as those mediating smell, taste and, importantly, vision (Gilman 1987). The GTPase transducin is involved in the conversion of photons into an electrical signal in the retina. The activation of this protein, which occurs when photons hit the retina and activate rhodopsin, governs the intensity of light experienced by the organism (Ebrey and Koutalos 2001). Adjustments to the intensity of light experienced by the fish could be very important adaptations to altered light environments. Another notable result from the green versus lilac mitch analysis is the enrichment of the pathway ‘brain development’, for the reasons mentioned earlier.

Given our questions about methylkit’s sensitivity in detecting differential methylation, we should be skeptical of the clusterProfiler over-representation analysis results in which tiles were selected if they were associated with a q-value of <0.05. The results of the analysis in which tiles were selected only if they were associated with a q-value of <0.001 are likely to be more reliable.

The clear versus green over-representation analysis where all tiles associated with a q-value of <0.001 were selected found that the pathway ‘regulation of GTPase’ was significantly enriched. This is important for the reasons mentioned earlier. The green versus clear over-representation analysis where all tiles associated with a q-value of <0.001 were selected found that many pathways related to fatty acids were significantly enriched. This can be linked, again tenuously, to the surface colour patterning of males, as fatty acids are important in the deposition of carotenoids as scale pigments (Kodric-Brown 1989).
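Over-representation analysis is commonly framed as a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of pathway genes among the selected genes by chance. The sketch below illustrates that calculation in plain Python. It is not the clusterProfiler implementation, and the function name, parameter names and example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe, selected, in_term, overlap):
    """One-sided hypergeometric p-value for over-representation.

    universe: number of annotated genes considered in total
    selected: number of genes chosen (e.g. those linked to tiles passing a
              q-value cut-off)
    in_term:  number of universe genes annotated to the pathway
    overlap:  number of selected genes annotated to the pathway
    Returns P(X >= overlap) when `selected` genes are drawn without
    replacement from the universe.
    """
    upper = min(selected, in_term)
    total = comb(universe, selected)
    tail = sum(
        comb(in_term, k) * comb(universe - in_term, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers, not taken from this study.
example_p_value = hypergeom_enrichment_p(universe=20, selected=5, in_term=4, overlap=3)
```

Because the p-value depends on how many genes are selected, a stricter q-value cut-off for the tiles changes both the selected set and the resulting enrichment, which is one reason the two thresholds can give different pathway lists.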

The enrichment of ‘Regulation of cell migration’ in the clear versus green over-representation analysis, where tiles methylated in the positive direction and associated with a q-value of <0.05 were selected, is an interesting result for the reasons given earlier. ‘Neural crest cell’ enrichment in the same analysis is interesting for a similar reason, as neural crest cells are the developmental cells that ultimately differentiate into the chromatophores in guppies (Kottler et al. 2014). ‘Melanocyte differentiation’ enrichment in the clear versus lilac over-representation analysis, where tiles methylated in the positive direction and associated with a q-value of <0.001 were selected, is interesting for similar reasons to the ‘neural crest cell’ enrichment, because melanocytes are pigment-producing cells in many animals, including guppies (Goding 2007). Differentiation in these and neural cells is likely to contribute to the known phenotypic changes seen in these guppy populations. Also notable are the significant enrichment of ‘Retina morphogenesis in camera-type eye’ in the clear versus lilac over-representation analysis, where tiles methylated in the positive direction and associated with a q-value of <0.05 were selected, and of ‘embryonic camera-type eye development’ in the green versus lilac over-representation analysis, where tiles methylated in the negative direction and associated with a q-value of <0.05 were selected.

4.1 Conclusion

There is some conflicting evidence as to whether methylation changes are driving the known adaptations to altered light environments in the guppy populations. However, overall this study has provided a large amount of evidence that methylation changes are in some way contributing to the known, and potentially unknown, phenotypic changes. A number of genes associated with significantly differentially methylated tiles and enriched biological pathways have been highlighted within this thesis. Many more genes could not be highlighted due to the sheer number of identified differentially methylated regions. The expression levels of these genes are likely being altered by methylation differences between the populations. Many of these genes need to be investigated further in order to confirm their importance in contributing to adaptations. Ultimately, these results provide evidence that methylation changes contribute to short-scale evolution in vertebrates.