Abstract The genome sequence available for different Plasmodium species is a valuable resource for understanding malaria parasite biology. However, comparative genomics on its own cannot fully explain all the species-specific differences which suggests that other genomic aspects such as regulation of gene expression play an important role in defining species-specific characteristics. Here, we developed a comprehensive approach to measure transcriptional changes of the evolutionary conserved syntenic orthologs during the intraerythrocytic developmental cycle across six Plasmodium species. We show significant transcriptional constraint at the mid-developmental stage of Plasmodium species while the earliest stages of parasite development display the greatest transcriptional variation associated with critical functional processes. Modeling of the evolutionary relationship based on changes in transcriptional profile reveal a phylogeny pattern of the Plasmodium species that strictly follows its mammalian hosts. In addition, the work shows that transcriptional conserved orthologs represent potential future targets for anti-malaria intervention as they would be expected to carry out key essential functions within the parasites. This work provides an integrated analysis of orthologous transcriptome, which aims to provide insights into the Plasmodium evolution thereby establishing a framework to explore complex pathways and drug discovery in Plasmodium species with broad host range. Abbreviations: IDC, intraerythrocytic developmental cycle; P. falciparum, PF, Plasmodium falciparum; P. vivax, PV, Plasmodium vivax; P. knowlesi, PK, Plasmodium knowlesi; P. berghei, PB, Plasmodium berghei; P. chabaudi, PC, Plasmodium chabaudi; P. yoelii, PY, Plasmodium yoelii; PCC, Pearson correlation coefficient; Ds, Divergent score; Ph, Phase; PBS, Phosphate-buffered saline; K[a], non-synonymous site; K[s], synonymous site; hr, hour; SRCC, Spearman Rank Correlation Coefficient; AU, approximately unbiased; BP, bootstrap probability; MPM, Malaria Parasite Metabolic Pathway Keywords: Comparative transcriptomics, Plasmodium species, Evolution, Microarray, Transcriptome, Drug targets Highlights * • Comparison of variations in mRNA abundance across six different Plasmodium species. * • Transcriptional conservation and divergence of Plasmodium syntenic orthologs. * • Pattern of Plasmodium transcriptome evolution are established. * • Transcriptionally conserved orthologs represent attractive intervention targets. Malaria remains a major public health concern despite global efforts in the fight against this disease. The intraerythrocytic stage of the malaria parasites is currently in the spotlight for anti-malarial intervention and vaccine targets. The primary goal of this study is to generate a comprehensive and directly comparable transcriptome dataset across multiple Plasmodium species originating from different hosts. We establish that specific pathways and intraerythrocytic stages are more transcriptionally diverged than others, reflecting transcriptional evolutionary diversity. We further propose a panel of transcriptionally conserved genes as potential drug targets. 1. Introduction Protozoan belonging to the Plasmodium species are obligate intracellular parasites that display substantial developmental complexity during their life cycle in the vertebrate hosts and mosquito vectors. The development of single nucleated parasite cells into multi-nucleated schizonts through several rounds of mitosis closely resembles embryonic development of multicellular organisms with a majority of genes changing their expression during this period ([43]Piras et al., 2014, [44]Quint et al., 2012, [45]Bozdech et al., 2003, [46]Irie and Kuratani, 2011). The intraerythrocytic developmental cycle (IDC) exhibits a tightly regulated transcriptional cascade in which essentially every gene in the genome is targeted to a specific stage of the parasite development. This precise control of gene expression ultimately governs critical functional processes for Plasmodium species to thrive within the host erythrocytes ([47]Bozdech et al., 2003, [48]Le Roch et al., 2003). We have previously reported that orthologous expression between two human Plasmodium species, Plasmodium falciparum and Plasmodium vivax, showed transcriptional divergence in 30% of the genes, some of which could be linked to known functional differences between these two species ([49]Bozdech et al., 2008). This suggested that while there is high conservation of the overall pattern of genome activity during the IDC of the two Plasmodium species, there is transcriptional variation of a subset of genes that enable the parasites to adapt to the individual host niches presumably defined by variations in nutrients, metabolites, as well as other cellular components ([50]Kafsack and Llinas, 2010, [51]Srivastava et al., 2015). However, no studies so far have addressed the extent of transcriptional diversity of Plasmodium species from distinct host erythrocytes. Here, we developed a comprehensive analysis of the IDC transcriptional profiles; from uniform data processing to extensive orthology annotation, which allow us to directly compare variations in mRNA abundance across six different Plasmodium species and significantly extends previous work using pairwise comparison ([52]Bozdech et al., 2008). The Plasmodium conserved syntenic orthologs show significant transcriptional divergence at the earliest stages of parasite development, with transcriptional phylogeny pattern that strictly follows the Plasmodium species mammalian hosts. Furthermore, changes in the expression of key putative transcriptional regulators are implicated in the transcriptional diversity. Critically the work also provides the tools for the identification of new and so far uncharacterized drug targets, as the orthologs that show a conserved transcription pattern across the species are likely to carry out critical conserved functions in all Plasmodium species. 2. Materials and methods 2.1. Sample collection for microarray analysis 2.1.1. Rodent malaria parasites All studies involving mice were approved by the institutional animal care and use committee (IACUC) of the Nanyang Technological University, Singapore. Male BALB/c mice 6–7 weeks old, bred specific pathogen free (SPF) at the Nanyang Technological University Animal Resource Facility, were infected with either cryopreserved stocks of parasites or by syringe passage from a pre-existing infected mouse. Mice were infected by intraperitoneal injections of Plasmodium berghei ANKA, Plasmodium chabaudi AS or Plasmodium yoelii 17 × parasitized erythrocytes and parasitaemia and parasite stages were monitored by thin blood smears stained with Giemsa. For P. berghei and P. yoelii infection, mice were terminal bleed and the stage-specific parasitized erythrocytes were separated via Nycodenz density gradient. The ring stage interface was isolated, washed and subjected to ex vivo culture, which was then collected every 2 hr over the course of 24 hr over a complete IDC life-cycle. Mice infected with P. chabaudi were terminal bled every 2 hr under anesthesia over the course of 24 hr. Blood was collected and filtered through Plasmodipur filters (Eurodiagnostica, Netherlands) to remove white blood cells and then washed with PBS. The washed blood was flash-frozen in liquid nitrogen and stored at − 80 °C until further use. 2.2. RNA extraction, cDNA preparation and DNA microarray hybridization Total RNA was isolated using a standard protocol using trizol/chloroform extraction as described by ([53]Bozdech et al., 2003). For preparation of the target DNA for microarray hybridization, Switch Mechanism at the 5′ end of Reverse Transcription (SMART) PCR approach was employed ([54]Petalidis et al., 2003). Thereafter, cDNA was synthesized by using the reverse transcriptase (PowerScript, Clontech BD) for 2 hr at 42 °C. This was followed by PCR amplification with Taq Polymerase (NEB) and the resulting PCR product was purified using MiniElute DNA purification kit (Qiagen). The purified DNA was labeled with fluorescent dye Cy5 (Amersham). A reference pool comprising of equal mass of total RNA samples representing all developmental stages of the parasite was prepared and labeled with Cy3 (Amersham). The microarray hybridization was carried at 65 °C in the automated hybridization station (MAUI, USA). In these two channels competitive hybridizations, RNA from each time point was labeled by Cy5 and hybridized against a reference RNA pool labeled with Cy3. Data acquired were analyzed by GenePix Pro software (Axon Instruments USA). 2.3. Microarray data processing and analysis 2.3.1. Reannotation of oligos All the oligonucleotides used in this study for the rodent malaria parasites microarray were designed by OligoRankPick as previously described ([55]Hu et al., 2007). The rodent malaria parasites microarray contained 13,224 60-mer oligos. The unprocessed microarray hybridization spots for P. falciparum, P. vivax and P. knowlesi were derived from previously published data; P. falciparum Dd2 strain ([56]Foth et al., 2011), P. vivax smru1 strain ([57]Bozdech et al., 2008) and P. knowlesi PkHa strain ([58]Lapp et al., 2015). The oligonucleotides used in the current and previous microarray studies were blasted against the genomes of the respective Plasmodium species from PlasmoDB release 8.2. As a result, a total of 5276 P. falciparum genes, 4700 P. knowlesi genes, 5017 P. vivax genes, 6251 P. yoelii genes, 4486 P. chabaudi genes and 4401 P. berghei genes were uniquely represented on the corresponding microarray datasets. 2.3.2. Normalize and data filtering All microarray hybridization spots obtained from all six Plasmodium species were subjected to “normexp” background correction followed by LOWESS (locally weighted scatterplot smoothing) normalization within each array and quantile normalization between arrays using Limma package of R. Log[2] ratios of Cy5 over Cy3 intensities were calculated for each spot to represent expression value of a particular probe except those with signal intensity < 1.5 times the background intensity for both Cy5 and Cy3 fluorescence. For each gene, the expression value was estimated as the average of all probes representing it. Overall, 4750 (90% of genes designed on the microarray) P. falciparum genes, 4670 (99%) P. knowlesi genes, 4884 (97%) P. vivax genes, 5486 (88%) P. yoelii genes, 3990 (89%) P. chabaudi genes and 3787 (86%) P. berghei genes display expression profiles with one or zero missing value across each IDC life cycle. These “processed” microarray expression dataset was used for subsequent analysis. The raw and processed microarray data for P. yoelii, P. berghei and P. chabaudi have been deposited into Gene Expression Omnibus (GEO accession number: [59]GSE80015). 2.3.3. Phaseogram The expression profile of each gene is modeled using sine function which has been described in detail ([60]Lapp et al., 2015). Briefly, the formula is E(t) = A × sin (ωt - α) + C. where E(t) is the log[2] ratio sample/reference control at the t time point of sample collection, A is the amplitude of expression profile across life cycle, C is the vertical offset of profile from zero, ω is the angular frequency given by 2π/h and h is the length of a complete IDC duration expressed in hours (details see below), and α is the horizontal offset of profile from zero. α was projected into an interval ranging from 0 to 2π with, π/2 of α representing the peak expression at early ring stage matching to 0. The converted α was subsequently used as phaseogram (Ph) to indicate the IDC timing where gene expression peaks. Ph for each gene was sorted from early to late IDC for complete visualization of genes expressed in the complete IDC. 2.3.4. Estimation of IDC duration and adjustment of phaseogram The timing for complete IDC varies between the six Plasmodium species and the time points collected for each sample were different. Therefore, h, which is the length of complete IDC duration was optimized and projected within 44 hr to 54 hr for P. falciparum, P. vivax and 20 hr to 36 hr for P. knowlesi and the other three rodent strains to determine the best fit sine function model ([61]Lapp et al., 2015). As a result, the optimized h is 48 hr for P. falciparum, 49 hr for P. vivax, 29 hr for P. knowlesi, 27 hr for P. yoelii, 28 hr for P. chabaudi genes and 29 hr P. berghei. To minimize the effect of asynchronous parasites between species samples, the phaseogram was adjusted for P. vivax, P. knowlesi, P. yoelii, P. chabaudi and P. berghei to P. falciparum reference. For example, for P. vivax phaseogram adjustment, the best matching time points with P. falciparum for each P. vivax time point was estimated by the highest Spearman Rank Correlation Coefficient (SRCC) values between global transcription profiles of syntenic orthologs. Next, the average shift of time using those best matching time point pairs with SRCC < 0.2 was calculated, which resulted in 25 time points for each gene per species on a sine wave model. 2.3.5. Delta phase, ΔPh The dissimilarity of transcription profiles between two genes is given by: [MATH: Phab=minPha-Phb2π-Pha-Phb,ab. :MATH] Where ∆ Ph is the distance of Ph expression timing between gene a and gene b, Ph[a] and Ph[b] is the expression timing of gene a and b respectively. 2.3.6. Medoid gene and Ds value The metric of transcription divergence among multiple genes, given by Ds, is the average ∆ Ph of each gene to the medoid gene. K-medoids clustering methods was applied, where K equal to 1, and took the genes in question as a cluster. The medoid gene of this cluster was determined based on the dissimilarity matrix, which contained ∆ Ph of all gene pairs. Ds value is the average dissimilarity of this cluster based on its medoid. Calculation was conducted using the package ‘cluster’ of R. 2.3.7. Detection of outlier gene and species For a syntenic orthologs, outlier of one species is defined as the ortholog gene with the most divergent expression timing or Ph compare to the other species member within an ortholog group. To detect the outlier for each ortholog group, we construct the dissimilarity matrix of species based on ∆ Ph. Outlier is define as the ortholog gene of a particular species which maximally contributes to the sum of ∆ Ph of that dissimilarity matrix. 2.3.8. Dendrogram of transcriptome relationship Dendrogram was constructed by applying Ward hierarchical clustering method based on the dissimilarity matrix containing the distance of each pair of species, ∆ Ph. The distance of two species is defined as; [MATH: DAB=i=0n PhiAB2,AB :MATH] where D(A,B) is the distance of species A and B, n is the total number of syntenic orthologous gene which is equal to 2312 and ∆ Ph[i] (A,B) if the distance of expression timing between the ith gene of species A and its orthologous gene in species B. The dendrogram cluster was subjected to 100 times bootstrapping to estimate the percentage of approximately unbiased (AU) p-value and bootstrap probability (BP). 2.3.9. K[a]/K[s] ratio K[a]/K[s] ratio is the ratio of the number of nonsynonymous substitutions per non-synonymous site (K[a]) to the number of synonymous substitutions per synonymous site (K[s]). Selective pressure of protein-coding genes between six Plasmodium species was estimated by calculating K[a]/K[s] ratio for each pair of syntenic orthologs for a total of 2312 orthologs. The syntenic orthologs were aligned using ClustalW ([62]Larkin et al., 2007) and K[a]/K[s] were calculated using package ‘seqinr’ of R ([63]Charif and Lobry, 2007). Dendrogram indicating evolution relationship between species was constructed based on the dissimilarity matrix in which the distance of two species was defined by the mode of K[a] values. 2.3.10. Functional enrichment analysis Functional enrichment analysis was carried out by identifying syntenic orthologs sets that are significantly over-represented in studied genes list by comparing to annotated gene sets from P. falciparum MPM database. Hypergeometric test was applied to calculate the level of significance of indicated orthologous gene sets from the MPM database. 3. Results 3.1. Lack of conservation of specific ortholog expression in Plasmodium species' life cycle Despite the long evolutionary time scale of about 100 million years ([64]Escalante and Ayala, 1995, [65]Escalante and Ayala, 1994), the overall genome organization and content across the sequenced Plasmodium species is highly conserved, with about 4000 conserved syntenic genes located within the central core regions of the 14 chromosomes ([66]Hall et al., 2002, [67]Carlton et al., 2008a, [68]Pain et al., 2008, [69]Gardner et al., 2002, [70]Hall et al., 2005). On the other hand, species-specific differences are attributed to the variant multigene families that reside predominantly at the subtelomeric regions. In this study, we generated the IDC transcriptomes for the rodent malaria parasites P. berghei, P. yoelii and P. chabaudi and compared these with the previously generated data for the human malaria species P. falciparum ([71]Foth et al., 2011), P. vivax ([72]Bozdech et al., 2008), and the simian malaria species P. knowlesi ([73]Lapp et al., 2015) (Supplementary Table 1, Supplementary Fig. 1a). The IDC transcriptional profiles of each gene were first best fitted onto a sine-wave function as described in the experimental procedure section ([74]Fig. 1a, Supplementary Fig. 1b). This allowed us to define several parameters that reflect the temporal aspects of gene expression and thus make direct species-to-species comparisons. The quality of the newly generated data was further verified by comparisons with the recently published P. berghei RNA-seq data showing high levels of correlation (Supplementary Fig. 1c) ([75]Otto et al., 2014). Overall, 3787 to 5486 genes (including species-specific genes) exhibited specific IDC transcriptional profiles in the six Plasmodium species ([76]Table 1). In contrast, between 29 and 1436 genes are unique to individual species with specific transcriptional profiles ([77]Table 1, Supplementary Fig. 1d), where the majority of them are genes restricted to subtelomeric ends and are involved in antigenic variation, host-parasite interactions, cytoadherence and erythrocyte aggregation for all six species, similar to previous genomic studies ([78]Kooij et al., 2005, [79]Kyes et al., 2001, [80]Dzikowski et al., 2006, [81]Galinski and Corredor, 2004). Interestingly, while the distribution of peak time expression remains relatively constant for the syntenic orthologs, the species-specific genes show remarkable variability (Supplementary Fig. 1d). Fig. 1. [82]Fig. 1 [83]Open in a new tab Smoothed transcriptome and comparative correlation transcriptomic analyses of P. falciparum, P. vivax, P. knowlesi, P. berghei, P. yoelii and P. chabaudi. (a) Overall intraerythrocytic developmental cycle (IDC) transcriptome profiling for P. falciparum (4750 genes), P. vivax (4884 genes), P. knowlesi (4670 genes), P. yoelii (5486 genes), P. berghei (3787 genes), and P. chabaudi (3990 genes). The phaseograms were generated from the log2 expression ratio and each profile was median centered. The phaseograms also include the expression of 2312 syntenic orthologous genes present in all 6 Plasmodium species. (b) Histograms showed overall distribution of Pearson correlation coefficients (PCCs) calculated from the smoothed transcriptome dataset between P. knowlesi, P. chabaudi, P. berghei and P. yoelii using P. falciparum as reference species. (c) Venn diagram analysis of co-expressed genes with PCC scores of ≥ 0.5–1.00 in different Plasmodium species. The numbers in brackets beside each species pairs (Pf vs Pv, Pf vs Pk, Pf vs Pb, Pf vs Pc, Pf vs Py) represent total number of genes with PCCs of ≥ 0.50. The numbers inside the Venn diagram represent total number of overlapped orthologous genes between the species pair. Table 1. Summary table of number of genes that are syntenic orthologs and with specific transcriptional profile across all six Plasmodium species. Microarray data analysis __________________________________________________________________ Pf Pk Pv Py Pb Pc Reannotation of oligos for microarray data analysis 5276 4700 5017 6251 4401 4486 After data normalization and filtering (related to [84]Fig. 1) 4750 4670 4884 5486 3787 3990 Syntenic orthologs and species-species specific genes Syntenic orthologs present in at least one or more species (OrthoMCL DB) 4148 4286 4275 3577 3708 3783 Syntenic orthologs present in all six species (OrthoMCL DB) 3374 3374 3374 3374 3374 3374 Syntenic orthologs present in at least one or more species with specific transcriptional profile 3175 3182 3190 3161 2951 3025 Syntenic orthologs present in all six species with specific transcriptional profile 2312 2312 2312 2312 2312 2312 Species-specific genes with specific transcriptional profile (non-orthologs, non-syntenic) 84 245 457 1436 29 136 [85]Open in a new tab Previous studies using Pearson correlation coefficient (PCC) to compare P. falciparum to P. vivax had indicated that PCC between 1 and 0.5 represent strongly correlated pattern, 0.5 to − 0.2 represent a moderate timing shift in gene expression, while PCC < − 0.2 indicate dramatic changes of gene expression along the IDC ([86]Bozdech et al., 2008) (Supplementary Fig. 2). Using the same approach, we compared all orthologous gene pairs with P. falciparum as the reference (Pf-Pk, Pf-Pc, Pf-Pb and Pf-Py) and showed high correlations between 50 and 65%, intermediate to low correlations for 18% to 23% and anti-correlation for 18% to 26% ([87]Fig. 1b). P. yoelii appears most divergent from P. falciparum with 725 (23.1%) and 815 (25.9%) gene pairs with low and anti-correlations, respectively ([88]Fig. 1b). The overall distribution of highly correlated orthologous genes (PCC 1–0.50) showed only 474 genes conserved among the six species ([89]Fig. 1c). Even when using any group of five out of the six species the total number of conserved genes would increase by only 104 to 341 genes ([90]Fig. 1c). This suggests that only a relatively small subset of genes remained transcriptionally conserved throughout Plasmodium evolution while the majority underwent (some level of) transcriptional diversification. The results also indicate a relatively limited constraint on transcriptional timing across the syntenic orthologs providing a considerable flexibility for species-specific adaptation among Plasmodium species. 3.2. Pattern of evolution in Plasmodium transcriptional regulation To compare the temporal character of expression between orthologous genes, we define the time-shift (delta phase, ∆ Ph) relationship between the ortholog pairs derived from their IDC gene expression profiles. ∆ Ph between the species' orthologs for each gene expression profile was first projected onto a polar coordinate system with timing of gene expression being depicted by values ranging from 0 to 2π (Supplementary Fig. 3). Phase adjustment, Ph was performed by offsetting the peak of gene expression timing to the peak gene expression time point at the early ring-stage of development ([91]Bozdech et al., 2008) (Supplementary Table 2). This method takes advantage of a uniform periodic character of the gene expression profiles that allows much more efficient and representative normalization of gene expression profiles compared to PCC-based similarity or Euclidean distance tests generally used to study other organisms ([92]Pereira et al., 2009, [93]Glazko and Mushegian, 2010). In total, we established ∆ Ph for the 2312 syntenic ortholog groups with IDC expression datasets for all six Plasmodium species. K-means clustering for the ∆ Ph values revealed eight distinct clusters that represent the overall evolutionary pattern of transcriptional regulation in Plasmodium ([94]Fig. 2a). The eight clusters consist of gene groups that are (i.) fully conserved in all six species; (ii.) show no similarity in gene expression in any species pair; (iii.) genes with diverse expression in one of the species while similar in the other five. In particular, (i) there are 818 genes whose transcriptional timing have been maintained throughout evolution ([95]Fig. 2a), representing 35.4% of the 2312 syntenic orthologs. This transcriptionally conservation suggests a crucial role of these genes in Plasmodium biology. On the other hand, (ii) there are 309 genes that are completely devoid of any transcriptional conservation among the Plasmodium species. Although the role(s) of these genes is generally unknown, this large diversity suggests their involvement in highly dynamic evolutionary processes in Plasmodium. Finally, (iii) there are six clusters ranging from 136 to 357 genes, in which one Plasmodium species shows a diversion from the other five ([96]Fig. 2a). To further clarify these findings we calculated the overall transcriptional divergence (Ds score) among orthologs as the average ∆ Ph between each species gene to the medoid species gene ([97]Fig. 2b). When including all 6 species, about 50% of the ortholog groups have a Ds6 value of < 0.75 (conserved transcription). This can be significantly increased to approximately 80% if one (Ds5) or two (Ds4) outlier gene(s) with the most divergent Ph are removed ([98]Fig. 2b, Supplementary Table 2). This supports the model that the majority of transcriptional diversion occurred in individual single species diverting gene expression from a putative ancestral transcriptional profile. Finally, the ∆ Ph distance measure separates the Plasmodium species into two distinct clades that delineates precisely along the mammalian host species ([99]Fig. 2c). This suggests that transcriptome-based phylogeny is shaped by the adaptation of the different Plasmodium species in their respective mammalian host erythrocytes; the human P. falciparum and P. vivax, the rhesus macaque P. knowlesi and the rodent Plasmodium species. Fig. 2. [100]Fig. 2 [101]Open in a new tab Delta phase (ΔPh) and Ds as the measurement of transcriptional divergence. (a) ΔPh is calculated based on the absolute difference of Ph or peak in gene expression timing between any two orthologs in two different species. Heatmap shows the syntenic orthologs with K-means clustering (100 runs) of ΔPh of species pair P. vivax; Pv, P. knowlesi; Pk, P. chabaudi; Pc, P. berghei; Pb and P. yoelii; Py with reference to P. falciparum; Pf. Yellow represents highly divergent genes with large delta phase, while black represents highly conserved genes with small delta phase compare to P. falciparum. (b) Ds score is the average ΔPh between each orthologs to the medoid ortholog measuring the transcription divergence of a gene across multiple species (see method). Overall frequency distribution and cumulative proportion of Ds measured from 2312 syntenic orthologous genes present in all six Plasmodium species (Ds6), any five Plasmodium species (Ds5) or any four Plasmodium species (Ds4). (c) Hierarchical clustering of the six Plasmodium species based on the IDC phaseogram using Wards algorithm of clustering and dissimilarity matrix defined by ∆ Ph (see methods). Numbers adjacent to the branch points are percentage of approximately unbiased (AU), P-value (in red) and bootstrap probability (BP) (in green). 3.3. Sequence and transcription-based evolution in Plasmodium species In Plasmodium, diversity of genomic sequence is considered to be a major factor of speciation, in particular involving species-specific gene families encoding of factors of host-parasite interaction ([102]Frech and Chen, 2011). In the next step we wished to evaluate the evolutionary role of transcription in the context of genome sequence diversion. For that, we first estimated the number of non-synonymous substitutions per non-synonymous site (K[a]) and the number of synonymous substitutions per synonymous site (K[s]) for the syntenic orthologous gene groups for each species pair ([103]Fig. 3a). The distribution of K[a]/K[s] ratio reflects the level of selection pressure on protein-coding genes throughout the evolution of Plasmodium parasites species ([104]Fig. 3a), with the rodent malaria parasites showing a greater difference in the change of protein-coding genes within the species group. One of the speculated factors of the rodent malaria parasites evolution is the geographical aspect of adaptation to their hosts that subsequently led to significant changes in the coding regions within the rodent species compared to their primate counterparts. Accordingly, the overall transcriptome in the rodent Plasmodium species exhibits a greater divergence compared to the primate-infecting species ([105]Fig. 3b). This suggests broader diversification of the rodent-infecting species driven by both sequence diversity and transcriptional divergence. We next assembled a dendrogram based on non-synonymous K[a] ratio, which reflects the phylogenetic relationships among the Plasmodium species based on the sequence homology. Here, P. vivax and P. knowlesi are more closely related than P. falciparum and P. berghei, which forms a separate cluster together with P. yoelii and slightly more distant P. chabaudi ([106]Fig. 3c). This is consistent with the previously constructed phylogenetic topology based on partial mitochondrial genomes ([107]Escalante et al., 1998, [108]Carlton et al., 2008b) as well as genes encoding surface antigens ([109]Weedall et al., 2008). The sequence-based phylogenetic tree(s) are different from the transcriptional differences that is reflective of its mammalian host lineage (as mentioned above) ([110]Fig. 2c). The sequence-based relationships seem to be driven by other (than host) factors such as AT content of the genome (with P. falciparum having a much higher AT content than P. knowlesi and P. vivax) and possibly reticulocyte preferences ([111]Craig et al., 2012).