Abstract Gene expression of Plasmodium falciparum (Pf) liver-stage (LS) parasites has remained poorly characterized, although they are major vaccine and drug targets. Using a human liver-chimaeric mouse model and a fluorescent parasite line (PfNF54^CSPGFP), we isolated PfLS and performed transcriptomics on key LS developmental phases. We linked clustered gene expression to ApiAP2, a major family of transcription factors that regulate the parasite life cycle. This provided insights into transcriptional regulation of LS infection and expression of essential LS metabolic and biosynthetic pathways. We observed expression of antigenically variant PfEMP1 proteins and the major Pf protein export machine PTEX and identified protein candidates that might be exported by LS parasites. Comparing Pf and P. vivax LS transcriptomes, we uncovered differences in their expression of sexual commitment factors. This data will aid LS research and vaccine and drug target identification for prevention of malaria infection. Subject terms: Malaria, Parasite genomics, Transcriptomics __________________________________________________________________ Transcriptomics of gene expression in Plasmodium falciparum liver-stage parasites reveals transcriptional regulation, metabolic pathways and antigen expression, facilitating vaccine and drug target identification. Main Plasmodium falciparum (Pf) parasites cause the most devastating form of human malaria and are responsible for the majority of clinical cases and deaths, followed by Plasmodium vivax (Pv) parasites in terms of health impact^[68]1. Extensive malaria control and treatment efforts have significantly reduced malaria morbidity and mortality in the past two decades. However, this decline has stagnated over the past 7 years^[69]2,[70]3 and malaria is rising again with 241 million cases and 619,000 deaths annually in 2021^[71]1. This necessitates the search for new interventions based on a better understanding of malaria parasite biology. Plasmodium sporozoites, deposited into the skin of human hosts by Anopheles mosquito bites, invade hepatocytes and develop into exo-erythrocytic schizonts^[72]4, which release merozoites, initiating the symptomatic blood stage (BS) infection of the parasite life cycle. The asymptomatic LS infection is considered a promising target for malaria vaccine development and drug prophylaxis. Relatively low numbers of sporozoites are transmitted by mosquitoes and only a fraction of them successfully infect the liver and undergo complete LS development^[73]5. Furthermore, complete LS elimination prevents the onset of BS infection and onward sexual-stage parasite transmission^[74]6–[75]8. The two licensed malaria vaccines RTS/S and R21 aim to prevent liver infection but do so imperfectly and thus show only moderate efficacy^[76]9. However, vaccination with live-attenuated malaria parasites has demonstrated that complete protection from infection is achievable^[77]10–[78]14. Lastly, drugs targeting LS would be less prone to the development of drug resistance compared with drugs targeting the BS and would provide protection from infection^[79]15. Yet, the gene expression landscape of PfLS remains largely uncharacterized due to technical challenges. Pf sporozoites have an almost exclusive tropism for primary human hepatocytes, infection rates are low in cultured hepatocytes, and abnormal LS development is common in human hepatoma cell lines^[80]16,[81]17. Studies of Pf and PvLS biology advanced with improved in vitro primary human hepatocyte culture systems^[82]18–[83]20 and the human liver-chimaeric mouse models^[84]21,[85]22, the latter of which enable in vivo LS studies. One of these models are FRG NOD mice, transplanted with primary human hepatocytes (FRG huHep mice). These mice are highly susceptible to infection with Pf and Pv sporozoites and support full LS development and parasite transition to BS infection when they are additionally engrafted with human red blood cells (huRBCs)^[86]21,[87]23–[88]25. Using FRG huHep mice in combination with a fluorescent PfNF54^CSPGFP line to obtain LS infections, we herein conducted a comprehensive transcriptome analysis of PfLS development in vivo. Results PfGFP parasites enable isolation of infected hepatocytes We created a fluorescent Pf parasite line (PfNF54^CSPGFP) using the promoter region of the circumsporozoite protein gene (CSP) to drive strong expression of green fluorescent protein (GFP) (Extended Data Fig. [89]1a) in pre-erythrocytic parasite stages, including PfLS^[90]21,[91]24. PfNF54^CSPGFP parasites were characterized throughout the parasite life cycle and showed normal asexual BS growth and gametocyte maturation. Furthermore, parasite infection in mosquitoes was comparable to wildtype (WT) PfNF54 parasites (Extended Data Fig. [92]1b). CSP expression initiates during oocyst development^[93]26, and we confirmed that GFP expression occurred in the different mosquito stages by both western blot and live fluorescence microscopy (Extended Data Fig. [94]1c–e). We then assessed PfNF54^CSPGFP parasite competence in successfully completing LS development. The PfNF54^CSPGFP parasites, injected as sporozoites into FRG huHep mice repopulated with huRBCs, successfully transitioned to BS infection by day 7 post infection (PI) (Extended Data Fig. [95]1f). Extended Data Fig. 1. PfNF54^CSPGFP parasites display normal characteristics throughout life cycle. [96]Extended Data Fig. 1 [97]Open in a new tab A) Strategy to generate the PfNF54^CSPGFP parasite line. Black arrows indicate PCR primer binding location used for the diagnostic PCR analysis. Gel shows PCR results with primer sets to amplify recombinant DNA template of pCSP-GFP construct and show 5’ integration (primers 1 and 3), 3’ integration (primers 4 and 2) and wild type DNA template (primers 1 and 2). Primers for Pf18S rRNA were used as loading DNA control. B) Comparison of the infection prevalence, number of oocysts per midgut and salivary glands sporozoites per mosquito, between PfNF54^CSPGFP line (fifteen individual mosquitoes’ infections) and PfNF54 WT parasites (nine individual mosquitoes’ infections). Data are presented as means ± SD for the first graph and median values for the remaining graphs. C) Live imaging of PfNF54^CSPGFP in the mosquito stages to assess GFP expression, top panel day 11 midguts, middle panel day 14 salivary glands, lower panel day 15 dissected sporozoites. Scale bar 50 μm. D) Percent oocysts expressing GFP under CSP promoter observed in the midguts under fluorescent microscope on Day 7 and 11 post PfNF54^CSPGFP infected blood feed to the mosquitoes. In two biological replicates. Data are presented as means ± SD. E) Western blot analysis of CSP and GFP expression in the lysate prepared from 1 million sporozoites and ten PfNF54^CSPGFP infected mosquito midguts of Day 5, 7, 9 and 11 post infected blood feed. F) Pf18S rRNA measured at different time point by qRT-PCR after the blood was collected from FRGhuHep mice infected with PfNF54^CSPGFP on day 7 post sporozoite infection and then cultured in vitro. To isolate PfLS GFP^+ parasite-infected hepatocytes (PfGFP^+), FRG huHep mice infected with PfNF54^CSPGFP sporozoites were euthanized on days 2, 4, 5 and 6 PI (Fig. [98]1a), and processed by fluorescence-activated cell sorting (FACS). A stringent gating strategy was applied during FACS to isolate pure PfGFP^+ hepatocytes and minimize uninfected hepatocyte contamination (Extended Data Fig. [99]2a). A total of 2,500–3,000 PfGFP^+ cells were isolated for each LS timepoint. Based on FACS analysis, infection rates averaged 0.1–0.15% (Fig. [100]1b). Purity of the PfGFP^+ infected hepatocyte populations were confirmed by live fluorescence microscopy (Fig. [101]1c). Sorted cells were then processed for RNA-seq library creation and next generation sequencing (NGS). Fig. 1. PfNF54^CSPGFP parasites enable the isolation of PfLS-infected human hepatocytes from FRG huHep mice and transcriptome analysis. [102]Fig. 1 [103]Open in a new tab a, Experimental design. FACS isolation of GFP^+ Pf-infected human hepatocytes (PfGFP^+) from FRG huHep mice on days 2, 4, 5 and 6 PI. Created with BioRender.com. b, Histogram showing percentage of PfGFP^+ hepatocytes as compared to the total at each timepoint (n = 3) (refer to Extended Data Fig. [104]2a). Data are presented as mean ± s.d. c, Live imaging of PfGFP^+ hepatocytes (green) before and after sorting. DNA stained with Hoechst dye (blue). Merge with DIC image. Scale bar, 50 μm. d, Ribbon graph showing the mean gene expression of H. sapiens (purple) and P. falciparum (green) transcriptomes over time. Ribbons represent the mean ± s.d. e, Stacked bar plot showing the number of DEGs for each PfLS developmental timepoint compared to sporozoites. f, Volcano plot showing DEGs between sporozoites and PfLS parasites at day 2 PI. g, Bubble plot with GO terms enriched (P[adj] < 0.05) in the upregulated genes at day 2 PI. Bubbles represent unique GO terms (y-axis) and bubble colours were assigned arbitrarily. The size of the bubbles illustrates the number of genes involved in each GO term, colour coded for biological process (red), cellular component (green) and molecular function (blue). Extended Data Fig. 2. Liver stage transcriptome. [105]Extended Data Fig. 2 [106]Open in a new tab A) Representative FACS gating. i) Forward and side scatter gating based on general size to find hepatocytes. ii) Gating to distinguish single cells from doublets. iii) Live hepatocytes gating based on DAPI. iv) Gating of infected primary hepatocytes based on GFP content. B) Bubble plot showing GO term analysis of the enriched genes in Pf sporozoites (p < 0.06, adj. p < 0.23). Bubbles represent unique GO terms (y-axis) and bubble colours were assigned arbitrarily. The size of the bubbles display by different biological processes is positively correlated with the number of genes involved in each pathway. Color-coded (pink) biological process, (green) cellular component and (blue) molecular function. C) Heatmap showing time course cluster analysis of Pf sporozoites (red), Day 4 (light blue), Day 5 (green) and Day 6 (dark blue). Gene expression values are shown as Z-scores. D) Line-graph showing TPM expression of lisp1 and AP2-EXP2 during the progression of LS development. In three biological replicates. Data are presented as means ± SD. E) RNA-FISH analysis of AP2-EXP2 (green) and LISP1 (magenta) in PfLS schizont from FRGhuHep mice at Day 7 PI. DNA of the parasites is visualized by DAPI staining (blue). Scale bar: 10 μm. The PfLS parasite transcriptomes We sequenced a total of 23 PfGFP^+ LS RNA-seq library samples and in addition, generated a PfNF54^CSPGFP sporozoite transcriptome that was used alongside four previously generated Pf sporozoite transcriptome datasets^[107]27 to comparatively analyse steady-state RNA levels. The average normalized transcript abundance of LS-infected human hepatocytes showed detection of mostly human gene transcripts at day 2 PI, the early timepoint of infection. The proportion of the Pf transcript representation increased over time, reaching the highest value at day 6 PI (Fig. [108]1d and Supplementary Table [109]1). This was consistent with PfLS cell mass expansion and genome replication during LS schizogony^[110]28. Using a cut-off of ≥1 CPM (counts per million) in the three biological replicates, we detected over 4,000 Pf gene transcripts for each timepoint, except for the day 2 timepoint where we detected 1,525 genes passing the threshold in at least one replicate (Supplementary Table [111]1). Differential expression analysis during early (day 2) and mid (day 4) LS development revealed that most of the differentially expressed genes (DEGs) were less abundant (~70%), with only ~28–36% of transcripts showing higher transcript abundance when compared with sporozoites. Conversely, the proportion of more abundant transcripts during late LS (days 5 and 6) increased to ~57–59% (Fig. [112]1e and Supplementary Table [113]1). To explore the gene expression patterns underlying LS parasite transformation into trophozoites, we investigated the DEGs of the PfLS day 2 dataset in comparison to sporozoites (Fig. [114]1f). Gene ontology enrichment analysis (GO term) of transcripts less abundant in LS trophozoites (false discovery rate (FDR) ≤ 0.5) revealed pathways involved in sporozoite motility ([115]GO:0048870, [116]GO:0071976) and cell invasion ([117]GO:0044409, [118]GO:0005515) (Extended Data Fig. [119]2b and Supplementary Table [120]1), including transcripts encoding MyoA, CSP, CelTOS, TLP, GAMA and TRAP (Fig. [121]1f and Supplementary Table [122]1). Conversely, transcripts that were significantly increased in LS trophozoites showed enrichment in translational regulation ([123]GO:0006412) and ribosome biogenesis ([124]GO:0005840, [125]GO:0003735) pathways (Fig. [126]1g). At day 2 PI, genes associated with the modulation of host cells ([127]GO:0020013) were upregulated in comparison to sporozoites. Genes enriched in this pathway included two classes of the clonally variant genes families (CVGs), 14 members of the rifin gene family and one member of the stevor gene family. Furthermore, 30 of the upregulated transcripts at day 2 encoded apicoplast-targeted proteins ([128]GO:0020011) (Supplementary Table [129]1). These genes were further analysed by metabolic pathway enrichment analysis using KEGG and MetaCyc databases. This showed a preponderance of gene products mediating fatty acid biosynthesis and isoprenoid biosynthesis, which has previously been shown to be essential for rodent malaria parasite LS development^[130]29–[131]31. To further assess transcriptomes associated with PfLS development, we performed time-course gene expression analysis on the mid-LS timepoint (day 4) and the two later-LS timepoints (days 5 and 6), excluding the day 2 timepoint due to its relatively low parasite sequence read counts. Using sporozoites as a reference, we identified 1,510 genes significantly associated with temporal changes in LS development that were further categorized into 9 unique transcriptional clusters (Extended Data Fig. [132]2c and Supplementary Table [133]1). To identify key biological processes, GO term analysis found that ~25% of the LS-expressed genes are not annotated, resulting in low statistical significance for this analysis. To increase statistical power, co-expression clusters were categorized by expression trends in 5 major transcriptional profiles (I–V) (Fig. [134]2a and Supplementary Table [135]1). Profile I (Clusters 1 and 2) included genes downregulated throughout LS infection, including pathways mediating cell motility and adhesion ([136]GO:0071976, [137]GO:0048870 and [138]GO:0098609) (Fig. [139]2b). Profile II (Clusters 3 and 8) included genes that were expressed at lower levels in comparison to sporozoites. The pathways enriched in this group were for genes that play a role in infection of the host ([140]GO:0044409), regulation of transcription ([141]GO:0006355) and lipid metabolic processes ([142]GO:0006629) (Fig. [143]2b). Profile III (Clusters 4 and 6) included genes for which transcripts increased significantly (with a log[2](fold change (FC)) ≥ 2) during LS development when compared with sporozoites. The enriched pathways represented key biological processes such as RNA binding ([144]GO:0003723), ribosome biogenesis ([145]GO:0022625, [146]GO:0005840, [147]GO:0003735 and [148]GO:0022627) and translation ([149]GO:0006412, [150]GO:0005852 and [151]GO:0002181). Profile IV (Clusters 5 and 9) included genes highly expressed in LS. Within this profile, genes were largely associated with mitochondrial processes ([152]GO:0005739) and amino acid metabolism. Furthermore, we observed upregulation of oxidative stress responses ([153]GO:0006979), indicating a parasite response to regulate redox homeostasis. Cluster 7 was the sole member of Profile V, characterized by genes upregulated during late LS development. Most of the genes fall into iron-sulfur cluster assembly ([154]GO:0016226), indicating the importance of iron metabolism during LS maturation. These results summarize the core PfLS transcriptome during an in vivo infection and we validated two LS-expressed transcripts—liver specific protein 1 (LISP1) and the transcription factor (TF) AP2-EXP2—using RNA-fluorescence in situ hybridization (RNA-FISH) in day 7 LS schizonts (Extended Data Fig. [155]2d,e). Fig. 2. Time-course cluster analysis of the PfLS transcriptome. [156]Fig. 2 [157]Open in a new tab a, Boxplot showing the expression trend of the 9 clusters identified in Extended Data Fig. [158]2c. The expression was further grouped into 5 major transcriptional profiles (I–V). Boxplots display the median (50th percentile), 75th and 25th percentiles (upper and lower hinges), and the largest and smallest value within 1.5× interquartile range (IQR) from the hinge (whiskers). b, GO term analysis of the genes with significant temporal expression changes in the five profiles. Bubbles represent unique GO terms (y-axis) and bubble colours were assigned arbitrarily. The size of the bubble illustrates the number of genes involved in each GO term, colour coded for biological process (red), cellular component (green) and molecular function (blue) (P < 0.01, P[adj] < 0.22). GO term enrichment analysis was computed using the hypergeometric test for overrepresentation, and Benjamini–Hochberg adjusted P values were used to account for multiple comparisons. c, Maximum-weight connected subgraphs (MWCS) of each profile, identifying functional modules. Nodes (proteins) identified within a time-course cluster have larger point size, while their interaction partners have smaller point size; node colours show the mean log[2]FC based on the colour bars in panel d, across days 4, 5 and 6 versus sporozoites. Edge weights represent the random forest classifier score, where higher scores indicate higher-confidence interactions; weights and protein complex IDs are those reported in ref. ^[159]32. d, log[2]FC of genes identified in protein-protein interaction (PPI) subnetworks associated with time/temporal changes in expression for days 4, 5 and 6 liver-stage samples when compared to sporozoites. We next integrated an interactome analysis of Pf protein–protein interaction (PPI) networks, generated from BS schizonts of three Plasmodium species^[160]32, with the LS gene expression data by utilizing the time-course cluster assignments and differential expression analysis to identify functional PPI modules (Fig. [161]2c and Supplementary Table [162]1). Time-course transcriptional profiles I and II comprised genes that were found to be downregulated, while profiles III–V were upregulated compared with sporozoites (Fig. [163]2d). Thus, maximally scored subgraphs for profiles I and II were restricted to the genes in that temporal cluster and the set of downregulated genes, where a gene must be downregulated in at least 2 of 3 LS timepoint datasets. Similarly, modules identified for profiles III–V were limited to genes in each time-course cluster and genes upregulated in at least 2 of 3 timepoints. Profile I modules identified downregulation of proteolysis proteins (ROM1/4) and inner pellicle membrane complexes (IMC1g and ELC), among others (Fig. [164]2c,d). Profile II functional networks revealed decreased expression of the AP2-EXP TF and chaperone complexes (HSP101, ClpY) (Fig. [165]2c,d). Subnetworks in Profile III were enriched in translation/ribosome and immunoproteasome-associated proteins (Fig. [166]2c,d). Transfer RNA (tRNA) ligase networks, including isoleucine, cysteine and alanine tRNA ligases, as well as messenger (m)RNA translation/ribosome, were among the highest-scored functional modules for Profile IV. Profile V identified an iron-sulfur cluster assembly protein and clustered-asparagine-rich protein (SufC/CARP) network, RPB8 RNA polymerase protein–protein interactors and proteins that may be involved in DNA protein kinase complexes (Fig. [167]2c,d). Furthermore, we noticed that only 20/124 gene IDs from Profile V were present in the PPI identified in the BS, confirming that this specific gene cluster should receive attention for additional investigation. Transcriptional regulation during LS development Regulation of Plasmodium gene expression involves complex interactions between TFs, DNA and various regulatory elements, including chromatin and epigenetic marks^[168]33,[169]34. Plasmodium has a limited number of identified TFs, with the well-studied apicomplexan AP2 (ApiAP2) family members being the most prominent. These play critical roles in regulating asexual BS infection, sexual-stage development and mosquito infection^[170]35–[171]37 but the roles of TFs in PfLS development remain mostly unknown. We therefore assessed ApiAP2 expression patterns throughout LS development (Extended Data Fig. [172]3a) and associated them with enriched DNA-target motifs of genes within co-expression clusters (Fig. [173]3a). Notably, on day 2 PI, motifs for ApiAP2 factors regulating RBC invasion, AP2-I and an uncharacterized ApiAP2 (PF3D7_0420300), were significantly enriched in the set of upregulated genes (DEGs, FC > 2.0, P[adj] < 0.05) when compared with sporozoites (Extended Data Fig. [174]3b)^[175]38. Conversely, ApiAP2 members with established roles in gametocyte and mosquito stages, such as the female-specific AP2-FG, AP2-O2 and the heterochromatin component AP2-HC, were overrepresented in the downregulated genes^[176]39–[177]41. AP2-L exhibited a similar transcript abundance with Cluster 5, where its DNA motif was also enriched (Extended Data Fig. [178]3c and Supplementary Table [179]1). This finding suggests a role of PfAP2-L in regulating the expression of LS-specific genes, as previously shown for rodent malaria parasites^[180]42. Furthermore, uncharacterized ApiAP2s, including PF3D7_1305200, PF3D7_0613800 and the transcriptional repressor AP2-G5, were enriched during LS development based on motif enrichment analysis of the time-course transcriptional clusters^[181]43. One of the ApiAP2 TF motifs that exhibited significant enrichment during LS was for AP2-G2. In accordance with previous observation^[182]40,[183]44,[184]45, PfAP2-G2 might act as a repressor during PfLS development, resulting in decreased target gene expression (Fig. [185]3a, Cluster 2). In addition, an AP2 (PF3D7_1222400) which was highly upregulated in LS and highly correlated with multiple expression clusters (Supplementary Table [186]1) is restricted to malaria parasite species that infect primates (Extended Data Fig. [187]3a), suggesting a unique role in LS infection of primate hosts^[188]46. Extended Data Fig. 3. AP2s expression in the liver stage. [189]Extended Data Fig. 3 [190]Open in a new tab A) Heatmap displaying the AP2 motifs that are enriched in upregulated and downregulated DEGs at the early time point (Day 2 PI) vs sporozoites, and on the right, it showcases the corresponding enriched DNA motifs identified. B) Heatmap showing the expression levels of all AP2 genes during the progression of LS development. C) Line plot depicting the gene expression trends of AP2 genes within each cluster exhibiting enriched motifs. Fig. 3. Motif enrichment analysis. [191]Fig. 3 [192]Open in a new tab a, ApiAP2s (red). b, Newly identified orthologous transcription factors (TFs) (blue). Motif enrichment analysis was carried out for each timepoint cluster. Left: line plot illustrating the gene expression (log[2]CPM, y axis) trend within each cluster at each timepoint (x axis); individual points represent the mean gene expression per biological replicate (green ribbon). Ribbon widths display the mean ± s.d. expression across all replicates at each timepoint. Expression of the TF whose motif is most significantly enriched within that cluster is highlighted in the grey and non-green ribbons; points represent expression of the TF for each biological replicate and ribbon width is defined as above. Middle: heat map representing the log[2] positive enrichment values of all TF motifs overrepresented in the cluster. Motif enrichment analysis was carried out using the binomial test with background sequences randomly sampled from the Pf genome; enrichment values represent the log[2]-transformed ratio of the observed number of sequences containing the TF motif. Motif enrichment analysis P values and Benjamini–Hochberg P[adj] for each cluster are reported in Supplementary Table [193]1. Right: the most highly enriched DNA motif for each cluster. We next employed OrthoMCL^[194]47 to identify novel binding motifs and TFs not belonging to the ApiAP2 family. We gathered a dataset (1,533,059 OrthoMCL accessions) corresponding to 772,570 UniProt accessions, encompassing 571 different species orthologue groups. Subsequently, we cross-referenced this dataset with 1,956 motifs sourced from the Jaspar 2022 Database. This extensive investigation yielded 12 motifs (Supplementary Table [195]1), neither identical nor reverse complements of each other, predicted to bind previously identified putative Pf TFs^[196]48. These TFs belong to transcription factor families, such as TATA-box-binding proteins and the initiation co-activator heteromeric CCAAT-binding factors NFYB (MA0502.2, UniProt [197]P25208) and NFYC (MA1644.1, UniProt [198]Q13952), which exhibited enrichment and displayed a negative correlation with the genes expressed in Clusters 1, 6 and 9 (Fig. [199]3b). PfLS PTEX, PEXEL and PNEP expression Protein export is well studied during PfBS infection but remains largely unstudied during PfLS development. Most exported proteins contain the plasmepsin V cleavable Plasmodium export element (PEXEL), characterized by a pentameric R/KxL/IxE/D/Q motif^[200]49–[201]54. Pf also expresses PEXEL-negative exported proteins (PNEPs)^[202]55,[203]56. Both PEXEL and PNEP proteins converge at the Plasmodium translocon of exported proteins (PTEX), which mediates their transport across the parasitophorous vacuole membrane (PVM) into the host cell^[204]57,[205]58. PTEX comprises EXP2, HSP101 and PTEX150 and is flanked by the auxiliary proteins PTEX88 and TRX2 (ref. ^[206]57). We first examined whether PfLS express the PTEX machinery. We found transcripts for each PTEX component on days 4, 5 and 6 PI (Fig. [207]4a), with EXP2 showing the most abundant transcripts, probably due to its dual role as a member of the PTEX complex and as a nutrient channel^[208]59,[209]60. We confirmed the expression and localization of PTEX proteins at the PVM by immunofluorescence microscopy (Fig. [210]4b), including expression of HSP101, which performs ATP-dependent unfolding of substrates for translocation across the PVM^[211]61,[212]62. Fig. 4. Expression of the PTEX translocon, and PEXEL and PNEP-encoding genes during LS development. [213]Fig. 4 [214]Open in a new tab a, Normalized median expression values (log[2]TPM) of genes encoding the PTEX translocon across the course of LS development in 3 biological replicates. Boxplots display the median (50th percentile), 75th and 25th percentiles of expression (upper hinge and lower hinge), and the largest and smallest values within 1.5× IQR from the hinge (whiskers). b, Immunofluorescence images showing expression of PTEX components in PfLS schizonts from FRG huHep mice on day 7 PI. Top to bottom: anti-EXP2, anti-PTEX150 and anti-HSP101 (red). The sections were counterstained with DAPI (blue) and anti-PfCSP (green). Scale bar, 10 μm. c, Heat map displaying the Z-score of the mean TPM from the 3 biological replicates for genes encoding PEXEL and PNEP proteins on days 4, 5 and 6 PI. The accompanying pie charts illustrate the percentage of genes with peak expression on each of days 4, 5 and 6 PI (top) and the percentage across the combined 3 timepoints (bottom). d, Heat map demonstrating the Z-score of the TPM averages from the 3 biological replicates for genes encoding ‘Secreted + PLM (PEXEL-like motif)’ proteins on days 4, 5 and 6 PI. The accompanying pie charts illustrate the percentage of genes with peak expression on each of days 4, 5 and 6 PI (top) and the percentage across the combined 3 timepoints (bottom). LOGOs representing PEXEL residues used to identify the Secreted + PLM proteins. Highly conserved amino acids such as the initial arginine (R) are represented as larger letters, and more interchangeable residues appear smaller in the motif (top: canonical PEXEL motif, bottom: relaxed PEXEL motif). We next bioinformatically mined the PfLS transcriptomes for PEXEL-^[215]54,[216]63,[217]64 and PNEP-encoding transcripts^[218]55,[219]56. Approximately 19% of genes encoding PEXEL proteins (PEXEL genes) and 32% of genes encoding PNEPs (PNEP genes) were transcribed during LS development. Subsets of PEXEL and PNEP genes peaked in expression on each day, indicating transcription dynamicity during PfLS development (Fig. [220]4c and Supplementary Table [221]2). PEXEL motifs are typically located 15–30 residues from the C terminal of the signal peptide cleavage site^[222]50,[223]54. However, its position in LISP2 is greater than 300 residues from this position^[224]65,[225]66. We therefore searched for additional genes to identify proteins similar to LISP2 and compiled a list of 589 genes encoding ‘Secreted + PEXEL-like motif’ (PLM) proteins of which ~60% showed transcripts during PfLS development (Fig. [226]4d and Supplementary Table [227]2). The PLM motifs comprised both canonical pentameric sequences and relaxed, 6-residue motifs^[228]54. In addition, we observed transcripts of genes encoding a potential exportome on day 2 PI, comprising 8 PEXEL proteins, 11 PNEPs and 23 Secreted + PLMs (Extended Data Fig. [229]4). Extended Data Fig. 4. Analysis of PfLS exportome. [230]Extended Data Fig. 4 [231]Open in a new tab A) Heatmap displaying the Z-scores for TPM averages from the three biological replicates are displayed for genes encoding PEXEL, PNEP and ‘Secreted + PLM’ proteins on Day 2, 4, 5 and 6 PI. The accompanying pie chart illustrates the percentage of genes with peak expression on Day 2 PI. PfLISP2 knockout parasite characterization in FRG huHep mice To further characterize an LS-specific protein, we analysed PfLISP2 expression and localization at different LS developmental timepoints. Immunofluorescence analysis using antibodies against PfLISP2 (ref. ^[232]66) and PfCSP showed protein expression in accordance with transcript expression (Fig. [233]5a,b). PfLISP2 protein was detectable in LS starting at day 2 PI, and robust expression of LISP2 was then observed at days 4 and 6 PI, with a localization similar to CSP (Fig. [234]5b). To investigate PfLISP2 function during LS, we deleted the lisp2 gene (Pflisp2^−) from the Pf genome (Extended Data Fig. [235]5a,b) using established methodologies^[236]67. Pflisp2^− parasites showed normal BS and mosquito stage development compared with WT PfNF54 (Fig. [237]5c and Extended Data Fig. [238]5c). To assess LS development, PfNF54 WT and Pflisp2^− sporozoites were used to infect FRG huHep mice, and LS size was assessed comparatively. Interestingly, at day 6 PI, Pflisp2^− LS were significantly smaller when compared with PfNF54 LS (Fig. [239]5d and Extended Data Fig. [240]5d), and this difference was more pronounced at day 7 PI (Fig. [241]5d,e), indicating a potential role for PfLISP2 in late LS growth. To further assess the impact of the growth defect, we tested LS-to-BS transition of Pflisp2^− parasites (Fig. [242]5f)^[243]21. While 3 FRG huHep mice infected with PfNF54 WT sporozoites showed parasite transition to BS infection (Fig. [244]5g), 3/15 mice infected with Pflisp2^− sporozoites remained BS negative and in vitro culture did not recover any parasites from these mice, further showing the negative impact of the LISP2 gene deletion on LS development. Fig. 5. PfLISP2 localization and gene deletion. [245]Fig. 5 [246]Open in a new tab a, TPM values of CSP and LISP2 at days 2, 4 and 6 PI in 3 biological replicates. Data are presented as mean ± s.e.m. b, PfLS schizonts from FRG huHep mice immunostained with DAPI (blue), anti-CSP (green) and anti-LISP2 (red) at days 2, 4 and 6 PI. Scale bars, 10 μm. c, Comparison of the number of salivary gland sporozoites per mosquito between the PfNF54 line and Pflisp2^–. Each dot represents a biological replicate. Data are presented as mean ± s.e.m. d, Comparison of LS parasite size (based on area at the parasite’s largest circumference) between PfNF54 (3 mice) and Pflisp2^– knockout parasites (15 mice) on days 6 and 7 PI. Data are presented as mean ± s.e.m. Statistical analysis was carried out using two-way analysis of variance (ANOVA) using a mixed-effects model (REML). ***P < 0.001, *P < 0.05, ^NSP > 0.05. e, Pflisp2^− LS schizont from FRG huHep mice immunostained with DAPI (blue), anti-MSP1 (green) and anti-mTiP (red) at day 7 PI. Scale bar, 5 μm. f, Experimental design to analyse BS breakthrough of Pflisp2^– on days 6 and 7 PI; huRBCs were injected intravenously to enable transition of LS parasites to BS infection. On day 7, mice were euthanized, blood was collected and used for RT–qPCR analysis to detect parasite 18S rRNA. The remaining blood was transferred to in vitro asexual parasite culture. Created with BioRender.com. g, Parasite BS densities in the blood of FRG huHep mice at the time of transition on day 7 PI, with 3 mice in the control group (PfNF54) and 15 mice in the Pflisp2^− group. Data are presented as mean ± s.e.m. Extended Data Fig. 5. Pf lisp2^– parasites analysis. [247]Extended Data Fig. 5 [248]Open in a new tab A) The schematic depicts generation of Pflisp2^– via CRISPR/Cas9-mediated gene editing. Primers used for confirming parasite genotypes are shown and the length of PCR amplicons are indicated in the table. The agarose gel electrophoresis indicates gene deletion of Pf lisp2^– in two clones A3 and H1. Expected amplicon sizes for WT and PfLARC2 and the pFC plasmid are indicated in the schematic. B) The schematic and the corresponding Southern blot for the Pf NF54 and Pflisp2^– locus. Genomic DNA from Pf NF54 and Pflisp2^– clones A3 and H1 was digested with SpeI and HindIII and probed with a 5′ LISP2 probe to confirm gene deletion of Pf LISP2. C) Asexual BS parasitemia is compared between Pf NF54 (black line) and Pflisp2^– clones A3 (green line) and H1 (red line) over three replication cycles. Data represent mean ± SD. n = 3 biological replicates. Statistical analysis was carried out using 2-way ANOVA. ns, not significant. P > 0.05 is taken as ns. Graphs comparing (D) Immunostaining of Pflisp2^– schizont from FRGhuHep mice at Day 6 PI, immunostained with DAPI (blue), anti-CSP (green) and anti-LISP2 (red). Scale bar corresponds to 10 μm. LS transcriptome conservation and differences between Pf and Pv To assess conservation of pathways in LS of human malaria parasites, we next compared the transcriptomes of late Pf and Pv LS schizonts. Since the two species show differences in the duration of LS maturation, namely, 6.5 days for Pf and 9 days for Pv^[249]21,[250]23, we selected Pf day 6 PI and Pv day 8 PI and confirmed LS maturity by the expression of MSP1 (Supplementary Table [251]1). Because PvLS cannot be isolated by FACS, we performed total RNA extraction of Pv-infected FRG huHep mouse livers at day 8 PI, followed by enrichment for Pv mRNA using magnetic pulldown with custom-made baits tiling the entire Pv genome^[252]19,[253]68. Hybrid capture selection was followed by Illumina sequencing. All sample sequences were aligned to the H. sapiens, M. musculus and Pv P01 genomes as performed for the Pf dataset (Extended Data Fig. [254]6a). We first assessed the relatedness of Pf and Pv coding genomes and identified 4,485 Pf genes with Pv orthologues (78%) (Fig. [255]6a and Supplementary Table [256]1). We observed a substantial overlap of gene expression, with 79% of orthologous genes expressed at ≥1 TPM in all biological replicates of late LS for both species. Among the genes concordant in expression, we found gene products that play key roles during LS development, including LISP2 (Supplementary Table [257]1). GO term enrichment analysis for co-expressed genes showed enrichment in pathways important for parasite development, such as mitochondria-targeted gene products ([258]GO:005739) and protein transport-associated gene products ([259]GO:006886, [260]GO:0015031) (Extended Data Fig. [261]6b). Extended Data Fig. 6. P. falciparum and P. vivax expressed orthologues and pathways. [262]Extended Data Fig. 6 [263]Open in a new tab A) Density plot showing the mean gene expression (log[2]TPM) aligning against H. sapiens (purple) and P. vivax (blue). B) Bubble plot showing GO term analysis of the expressed genes in Pf and Pv (p < 0.06, adj. p < 0.23), GO terms enriched (p < 0.06, adj. p < 0.19) in genes expressed only in PvLS transcriptome and terms enriched (p < 0.03, adj. p < 0.21) in those expressed only in Pf. The size of the circles display by different biological processes is positively correlated with the number of genes involved in each pathway. Color-coded (pink) biological process, (green) cellular component and (blue) molecular function. C) RNA-seq shows var gene transcription during PfLS at day 6 PI. Data are presented as median value of three biological replicates. the box limits are the upper and lower quartiles. Transcripts corresponding to each var gene (shown on the x axis) are indicated as log[2]TPM, indicated on the y axis. Boxplots (box and whiskers plot) display the median (50th percentile), upper hinge as the 75th percentile, and lower hinge as the 25th percentile of expression; whiskers extend from the hinge to the largest or smallest value within 1.5 * IQR (interquartile range) from the hinge, respectively. D) Heatmap illustrating the expression patterns of P. vivax clonally variant gene families in the three biological replicates. Fig. 6. Comparative transcriptome analysis of late Pf and Pv liver stages. [264]Fig. 6 [265]Open in a new tab a, Venn diagram of orthologous genes expressed at ≥1 TPM in all 3 biological replicates of Pf or Pv. Overlapping section identifies genes detected in both Pf and Pv transcriptomes. b, Heat map illustrating the expression patterns of clonally variant gene families throughout the course of LS development. c, PfLS schizonts from FRG huHep mice immunostained with DAPI (blue), anti-PfHSP70 (red) and anti-PfATS (green) at day 7 PI. Scale bar, 10 μm. d, TPM values of a selection of gametocyte genes in the Pf and Pv datasets. Results are presented as mean ± s.d. of 3 independent experiments; **P[adj] = 0.007. e, RNA-FISH analysis of AP2-g (yellow) in PvLS schizont from FRG huHep mice at day 9 PI (top) and PfLS schizont from FRG huHep mice at day 7 PI (bottom). DNA of the parasites is visualized by DAPI staining (blue). Scale bar, 10 μm. f, Quantification of the number of PvLS and PfLS schizonts from FRG huHep mice at days 9 and 7 PI, respectively, expressing AP2-g by RNA-FISH. Results are presented as mean ± s.d. of 3 independent experiments; two-way ANOVA, ****P < 0.0001. g, Number of PfLS parasites positive for CSP and GFP recognizing PfNF54^AP2GGFP parasites from FRG huHep mice at day 7 PI. Results are presented as mean ± s.e.m of 3 independent experiments; two-way ANOVA, ****P < 0.0001. h, Flow cytometry analysis of DNA dye+ (blue) and DNA dye+AP2-GFP+ (red) BS parasites following PfNF54^AP2GGFP sporozoite infection in FRG huHep mice. The dotted line denotes the limit of detection (LOD), defined as the mean percentage of DNA dye+AP2-GFP+ events from the uninfected RBCs control. N = 3 mice. Results are combined and presented as mean ± s.e.m. of 2 independent experiments. Among the PfLS-expressed genes without Pv orthologues, we found expression of multiple members of the CVG families (Fig. [266]6b). We focused on var genes (Extended Data Fig. [267]6c), encoding PfEMP1 proteins, which mediate infected red blood cell (iRBC) attachment to the vascular endothelium and are antigenically variant during PfBS infection^[268]69. Previous findings suggested that var gene members can be transcribed in gametocytes, ookinetes, oocysts and sporozoites^[269]27,[270]70–[271]72. We found transcripts for multiple var genes during PfLS development. Notably, there was consistent transcript detection of 10 var genes in all 3 biological replicates (Fig. [272]6b and Extended Data Fig. [273]6c), belonging to the UpsB and UpsC subtypes^[274]73. The PF3D7_0421300 and PF3D7_0809100 var genes, the latter of which is also expressed in sporozoites^[275]27, showed the highest var gene transcript abundance during LS development. We further investigated PfEMP1 protein expression using a pan antibody recognizing the PfEMP1 semi-conserved intracellular region (ATS) to examine expression in day 7 PfLS parasites. A punctate expression pattern was observed (Fig. [276]6c), confirming PfEMP1 expression in day 7 PfLS schizonts. Similarly, among the PvLS genes without Pf orthologues, we identified transcripts of multiple members of the pir and vir CVG families (Extended Data Fig. [277]6d). We next assessed whether Pf sexual commitment initiates during LS development, which might result in the release of exo-erythrocytic Pf merozoites ready to differentiate into gametocytes, as previously shown at the transcriptomic level for PvLS schizonts^[278]19,[279]74–[280]76. We did detect transcripts for the sexual-stage-specific proteins Pvs16 and Pvs25 and the AP2-g transcription factor in PvLS schizonts, the latter of which is the master regulator of sexual commitment during BS infection^[281]77–[282]79 (Fig. [283]6d). In contrast, we did not detect the expression of AP2-g in the PfLS transcriptomes (Fig. [284]6d). To further evaluate whether mature PfLS schizonts exhibit sexual commitment, we performed RNA-FISH on day 7 Pf-infected livers and day 9 Pv-infected livers with PfAP2-g and PvAP2-g specific probes, respectively. RNA-FISH confirmed the absence of AP2-g transcripts in all PfLS schizonts examined (Fig. [285]6e,f and Extended Data Fig. [286]7a), while AP2-g RNA signal was detected in a subset of PvLS schizonts (Fig. [287]6e,f). To further investigate when Pf forms sexually committed parasites, we used an endogenously GFP-tagged AP2-G reporter line (PfNF54^AP2GGFP) (Extended Data Fig. [288]7b,c)^[289]80. The PfNF54^AP2GGFP line showed no transmission defect (Extended Data Fig. [290]7d), and sporozoites were generated and used to infect 3 FRG huHep mice. The infected mice were infused with huRBCs at day 7 PI and subsequently, liver tissue and infected blood samples were collected. The liver tissue sections were stained with anti-GFP antibodies previously validated on BS parasites (Extended Data Fig. [291]7e) to assess AP2-G protein expression in mature PfLS schizonts. The blood samples were further processed for flow cytometry to evaluate whether LS merozoites and/or early ring BS parasites expressed AP2-G. Similarly to the transcriptomes and RNA-FISH results, we did not detect any GFP-positive PfLS schizonts (Fig. [292]6g and Extended Data Fig. [293]7e). Flow cytometry analysis of the iRBCs showed no AP2G-GFP-positive BS parasites at day 7 PI (Fig. [294]6h and Extended Data Fig. [295]7f,g), confirming that PfLS parasites do not directly produce sexual-stage progeny from any egressing exo-erythrocytic merozoites. We further cultured the iRBCs in vitro and sampled cultures every 48 h to assess when GFP expression, and thus sexual commitment, occurred. AP2-G-GFP-positive BS parasites were first observed at 48 h post exo-erythrocytic merozoite egress (day 9 post sporozoite infection). The proportion of AP2-G-GFP-positive BS parasites then increased 96 h post exo-erythrocytic merozoite egress (Fig. [296]6h). Extended Data Fig. 7. AP2s expression in Pf liver stage. [297]Extended Data Fig. 7 [298]Open in a new tab A) RNA-FISH analysis of AP2-g (yellow) in PfNF54 enriched schizonts to assess probe specificity. DNA of the parasites is visualized by DAPI staining (blue). Scale bar: 10 μm. B) The Agarose gel electrophoresis indicates six clones of the PfNF54^AP2GGFP transgenic parasite line. Clone G8 (bold) was selected for further propagation and used in functional assays. C) Sexual conversion assay of the PfNF54^AP2GGFP G8 clone, was performed to estimate the sexual conversion rate (CR) and the parasite multiplication rate (PMR) in comparison with NF54 WT. D) Analysis of the infection prevalence, number of oocysts per midgut and salivary glands sporozoites per mosquito, of PfNF54^AP2GGFP line. In two biological replicates. E) Immunostaining of PfNF54^AP2GGFP erythrocytic enriched schizonts (i) and PfLS schizont from FRGhuHep mice at Day 7 PI (ii), immunostained with DAPI (blue), anti-GFP (green) and anti-MSP2 or anti-CSP (red). Scale bar corresponds to 10 μm. F) Gating strategy. Single cells were gated based on side scatter (SSC-H vs SSC-A) followed by gating on the red blood cells. Any possible white blood cells were avoided from further analysis by gating on CD45 negative population. Finally, the BS parasites were gated based on positive DNA staining by SPY650-DNA dye, whereas AP2 expressing parasites were gated based on positive DNA staining and GFP content. G) Representing flow layouts. N=3 mice. Results are combined and presented as means ±SEM from two independent experiments. Source data are provided as a Source data file. Discussion PfLS infection is an attractive target for vaccine and drug development since LS elimination prevents the onset of symptomatic BS infection and onward parasite transmission. Yet, comprehensive PfLS gene expression profiles have not been reported. Here we successfully conducted a transcriptome analysis of PfLS throughout their intrahepatocytic development. Our analysis revealed active core genes and pathways driving PfLS development. The successful mosquito-to-human parasite transition of sporozoites requires rapid availability of gene products that mediate the successful establishment of the intrahepatocytic LS replication niche^[299]4. We observed downregulation of the sporozoite translational repression machinery at day 2 post hepatocyte infection, which is likely critical for sporozoite transformation into intracellular trophozoites and their establishment within hepatocytes^[300]33,[301]81. These data extend previous findings indicating that Plasmodium parasites use translational repression during sporozoite transmission, storing transcripts in the sporozoite stage, which are then presumably translated after hepatocyte infection^[302]81. Furthermore, the LS transcriptome analysis delineated gene expression patterns and pathways that sustain the parasite’s drastic metamorphosis from sporozoite to trophozoite stage and its ability to then enter exo-erythrocytic schizogony. During LS development, fatty acid and isoprenoid biosynthetic metabolic pathways were enriched as early as day 2 PI^[303]44,[304]82,[305]83, and later in LS development, amino acid synthesis and numerous metabolic pathways were upregulated. In addition, gene products involved in redox homeostasis, probably used to counteract the oxidative stress generated in infected hepatocytes, were also upregulated^[306]44,[307]84. To analyse putative regulation of gene expression in LS, we examined TF DNA-binding motif enrichment in the upstream sequences of co-expressed genes. Our findings uncovered a link between the uncharacterized ApiAP2 TF genes PF3D7_0420300 and PF3D7_0613800 and Cluster 7, which was enriched for genes that are upregulated during late LS development. In addition, ApiAP2 PF3D7_1222400, exclusive to primate malaria parasites, was upregulated during LS development. This AP2 could potentially control expression of LS genes playing unique roles in primate hepatocyte infection. Despite dramatic differences in host cellular environments between erythrocytes and hepatocytes^[308]85–[309]87, LS might depend on the export of proteins to counteract hepatocyte intracellular defenses, obtain nutrients and thereby support LS survival and growth in the liver. We found that PfLS parasites expressed all the components of the PTEX machinery of protein export at the transcript and protein level, showing the presence of EXP2, PTEX150 and the unfoldase HSP101 (refs. ^[310]61,[311]62). Our identification of HSP101 in LS confirms that all the members of the PTEX machinery are expressed in LS and this supports the notion that protein export might play a crucial role in intrahepatocytic parasite development. In agreement with this, a recent conditional knockout study of PfEXP2 and PfPTEX150 showed the important role of these PTEX components in LS development^[312]66. On the basis of the LS transcriptomes, we predicted an LS exportome consisting of 102 PEXEL, 136 PNEP and 353 Secreted + PLM proteins. Our further study of PfLISP2 by gene knockout revealed that this putative exported protein plays a role in LS growth. However, the analysis of LISP2 cellular localization found no evidence that this PEXEL-motif-containing protein is exported beyond the LS compartment. We identified transcripts of members of the var gene family and observed PfEMP1 (ref. ^[313]69) protein expression in mature LS schizonts. These findings suggest that during parasite transmission and infection of new human hosts, resetting of var gene expression begins in the mosquito stages and is finalized in the LS, explaining the broad var gene repertoire observed in the first generation of BS parasites that emerge in human participants experimentally infected with Pf sporozoites^[314]88,[315]89. Our comparative transcriptome analysis of Pv and Pf late LS developmental stages revealed that although there is considerable conservation of LS gene expression across the core conserved genomes of these distantly related human malaria parasites^[316]90, unique features emerged. Importantly, we found that in contrast to late PvLS, late PfLS did not show any evidence of commitment to sexual-stage formation, hence the latter cannot generate a population of exo-erythrocytic merozoites committed to sexual-stage development. Our analysis of Pf parasites that had transitioned from LS to BS infection further ascertained this notion, showing that Pf must undergo at least one full asexual replication cycle in the blood for sexual commitment to occur. This finding might have implications for Pf transmission dynamics in malaria-endemic areas. Finally, the datasets presented herein have limitations and future work is needed to further explore new research questions that have arisen from the data. We measured transcript abundance in pooled isolated LS parasites; single-cell studies will be crucial in uncovering any parasite heterogeneity during liver-stage development. In addition, the use of FRG huHep mice, which lack a functional immune system, limited our ability to assess the role of the human host responses in shaping LS parasite transcriptomic profiles. Furthermore, our findings on LS protein export underscore the need to experimentally validate protein export into the PfLS-infected hepatocyte for predicted exported proteins and to establish potential functional roles of PfEMP1 variant proteins in LS. Methods Creation of Pf parasite line expressing GFP under CSP promoter The plasmid pEFGFP used to create 3D7HT-GFP was modified by replacing the EF1α promoter with a 1.2 kb DNA fragment from the 5′ UTR immediately upstream to the start codon of the PfCSP gene (CSP promoter)^[317]91. Plasmid integrity was confirmed by DNA sequencing and the confirmed plasmid was used for transfection of PfNF54. PfNF54 parasite culture was synchronized at ring stage with 5% sorbitol 2 days before transfection. On the following day, trophozoites were selected by incubation in 0.7% gelatin solution. Ring stages were transfected by electroporation at 0.31 kV and 950 µF with a BioRad Gene Pulser (BioRad)^[318]92. Cultures were put under drug pressure starting at 6 h post transfection using 5 nM WR99210 (Jacobus Pharmaceuticals). Integration was confirmed by PCR on parental population and clones obtained by limiting dilution as previously described using primers as follows: forward primer 5F 5′-TGAGCTTATTTCAATTGTTGTGT-3′; reverse primer 6R 5′-GGTACCGCTAATTTTCTCATCATGAATTGT-3′; forward primer 1F 5′-GGAGGGAAAAAATAATCGTACA-3′; reverse primer 3R 5′-TCAACCAAGTCATTCTGAGA-3′; forward primer 4F 5′ -GTAATTTATGGGATAGCGAT-3′; reverse primer 2R 5′-GTTTTTCATCGAAATGCGTA-3′. Mosquito rearing and sporozoite production Anopheles stephensi mosquitoes were reared and maintained following standard procedures outlined in Methods in Anopheles Research MR4 ([319]https://www.beiresources.org/Publications/MethodsinAnophelesResea rch.aspx). Mosquitoes were kept at 27 °C and 75% humidity in temperature- and humidity-controlled incubators on 12 h light/dark cycles within a secured ACL2 Facility. Cotton pads soaked in 8% dextrose and 0.05% PABA solution were placed daily on the top net of mosquito cages. PfNF54 WT, PfNF54^CSPGFP, PfNF54^lisp2− and PfNF54^ap2gGFP asexual parasites were maintained by subculturing at 2% parasitemia in RPMI-1640 medium (25 mM HEPES, 2 mM l-glutamine medium containing 10% human serum and 50 μM hypoxanthine) and maintained at 37 °C in an atmosphere of 5% CO[2], 5% O[2] and 90% N[2]. Gametocyte cultures were set up at 1% parasitemia and 5% haematocrit, and the asexual parent cultures had a parasitemia between 3 and 7%. The media of the gametocyte cultures were changed daily for 15 days while keeping the plates/flask on a slide warmer during media change to prevent dramatic temperature changes. Mature gametocyte cultures were spun at 800 g for 2 min at 37 °C in a temperature-controlled centrifuge. Parasitized red blood cell pellet was resuspended at 0.5% gametocytemia in 50:50 serum:blood mix and used for standard membrane feeding as previously described^[320]93. For Pf sporozoite production, 3–7-day-old female mosquitoes were used for every infectious blood meal. Mosquito infections were evaluated on day 7 by checking oocyst prevalence and oocyst number in 10–12 dissected mosquito midguts. Sporozoite numbers were determined by dissecting and grinding salivary glands on day 15 post feeding. These sporozoites were used for infecting FRGhuHep mice. Mice FRG huHep mice (female, >4 months of age) were purchased from Yecuris and were maintained in pathogen-free BSL2+ conditions with 12 h light/dark cycle, 72 °F temperature and 45% humidity at the Center for Global Infectious Disease Research, Seattle Children’s Research Institute (SCRI). All animal procedures were performed following the regulations of the SCRI’s Institutional Animal Care and Use Committee (IACUC). The animal procedures were approved by the IACUC under the 00480 protocol. Repopulation of human hepatocytes in FRG huHep mice was confirmed by measuring human serum albumin levels, and only animals with human serum albumin levels >4 mg ml^−1 corresponding to 70% repopulation of human hepatocytes were used as previously described^[321]21. Animals were cycled on 8 mg l^−1 of NTBC drug once a month for 4 days to maintain hepatocyte chimaerism. Mice were taken off NTBC before and during experimentation. Analysis of PfNF54^CSPGFP LS-to-BS transition in FRG huHep mice by quantitative PCR with reverse transcription (RT–qPCR) FRG huHep mice were intravenously (i.v.) infected with 1 million PfNF54^CSPGFP sporozoites. To support parasite transition from LS to BS, mice were injected with 400 μl of human RBCs i.v. at 70% haematocrit on days 6 and 7 post infection. The blood was then collected by cardiac puncture after exsanguinating mice on day 7. Fifty microlitres of blood were added to NucliSENS lysis buffer (bioM rieux) and frozen immediately at −80 °C, and the rest of the blood was transferred to in vitro culture. Fifty microlitres of blood samples were collected from in vitro culture on days 9, 11 and 15 post infection in mice (that is, days 2, 4 and 8 of in vitro culture), added to NucliSENS lysis buffer and frozen at −80 °C. All samples were processed and analysed for the presence of 18S rRNA as follows: The RT–qPCR reaction was performed using 35 µl SensiFAST Probe LoROX one-step kit (Bioline) and 15 µl of extracted eluate. Plasmodium 18S rRNA primers/probes (LCG BioSearch Technologies) were as follows: forward primer PanDDT1043F19 (0.2 µM): 5′-AAAGTTAAGGGAGTGAAGA-3′; reverse primer PanDDT1197R22 (0.2 µM): 5′-AAGACTTTGATTTCTCATAAGG-3′; probe (0.1 µM): 5′-[CAL Fluor Orange 560]- ACCGTCGTAATCTTAACCATAAACTA[T(Black Hole Quencher1)]GCCGACTAG-3′[Spacer C3]. Cycling conditions were reverse transcription (RT) (10 min) at 48 °C, denaturation (2 min) at 95 °C and 45 PCR cycles of 95 °C (5 s) and 50 °C (35 s). Isolation of PfNF54^CSPGFP LS-infected primary human hepatocytes FRG huHep mice were intravenously injected with 2–3 million PfNF54^CSPGFP sporozoites per mouse. The primary hepatocytes were collected by perfusing and digesting the livers on days 2, 4, 5 or 6 post sporozoite infection using modified protocol^[322]94. Briefly, the mice were deeply anaesthetized with ketamine (100 mg kg^−1 body weight)/xylazine (10 mg kg^−1 body weight) solution, and the livers were perfused and digested with perfusion buffers I (0.5 mM EGTA in 1× DPBS without Ca^2+ and Mg^2+) and II (50 μg ml^−1 liberase TL with 800 μM CaCl[2] in 1× DPBS), respectively. The hepatocytes were dispersed in 1× DMEM complete medium and centrifuged twice at low speed (50 g) for 2 min at 10 °C. The cell pellet was resuspended in the complete medium and viability was tested using trypan blue staining. The final cell concentration was adjusted to 2 × 10^6 per ml and further used for FACS. Approximately 2,500–3,000 GFP^+ and GFP^− cells were sorted directly into QIAzol lysis buffer. RNA-seq library preparation For Pf RNA-seq libraries preparation, total RNA was extracted from sorted infected primary human hepatocytes using miRNeasy micro kit (QIAGEN) according to manufacturer instructions, including on-column DNase digestion. Libraries were prepared using SMART-seq v.4 Ultra Low Input (Clontech) and were sequenced on the Illumina NextSeq 500 as 75-bp paired-end reads. The resulting data were demultiplexed using bcl2fastq2 (Illumina) to obtain fastq files for the downstream analysis. A minimum of 3 biological replicates were analysed; technical replicate libraries for each biological replicate were also sequenced. Additional raw sequence reads from Pf sporozoite RNA-seq samples (N = 4) were retrieved from the Sequence Read Archive ([323]PRJNA344838)^[324]27. For Pv RNA-seq library preparation, total RNA was extracted from the FRGhuHep mice livers infected with 1 million Pv sporozoites (field strain) using TRIzol (Thermo Fisher) and purified using RNeasy mini kit (Qiagen) according to manufacturer instructions. A SureSelect XT custom oligo library was designed with Agilent (Design ID: S0782852) to enrich for Pv-specific complementary (c)DNA among the pool of human, mouse and parasite cDNA obtained from the RNA extraction from the humanized mouse liver. A total of 85,000 probes of 120-bp size were tiled every 100 bp across the entire Pv Sal I genome. Sequences >30% similar to human sequences were excluded. Sequencing libraries were prepared according to the SureSelect XT RNA Target Enrichment for Illumina Multiplexed Sequencing protocol from Agilent (Ref: 5190-4393). Libraries were analysed using a BioAnalyzer and quantified using qPCR. Illumina libraries were sequenced on Mi-seq as 75-bp single-end reads. Data analysis Quality control of fastq files was performed using FastQC v.0.12.1 software; fastqs from paired biological and technical replicate liver-stage samples were concatenated to increase sequencing depth and coverage. Reads were aligned to a custom reference genome containing H. sapiens (GRCh38, Ensembl gene annotations (v.106)^[325]95), M. musculus (PRJNA20689, GRCm39, Ensembl gene annotations (v.103)^[326]96) and P. falciparum genome (PlasmoDB, PlasmoDB-58_Pfalciparum3D7 (ref. ^[327]97)) with STAR 2.7.9. The alignment was completed with default parameters with the addition of ‘–twopassMode Basic’ and ‘–quantMode GeneCounts’ to produce a gene count matrix. Gene-level counts were normalized to TPM. P. vivax samples were processed identically, except with the reference genome containing H. sapiens, M. musculus and the P. vivax genome ([328]PRJEB14589, PlasmoDB-58_PvivaxP01 (ref. ^[329]68)). Differential expression and clustering analysis All analyses were conducted in the R v.4.1 statistical environment. The raw gene count matrix for liver-stage PfLS and sporozoite samples, which included the publicly available data from [330]PRJNA344838, underwent batch correction using ComBat-seq (sva v.3.42.0) with default parameters^[331]98. The adjusted counts were used in differential gene expression analysis with Limma voom v.3.50.3; genes with absolute log[2]FC > 1 and FDR < 0.05 were retained^[332]99. Batch-corrected gene counts were normalized to the trimmed mean of M-values (TMM) and converted to log[2] scale before differential expression analysis, unsupervised hierarchical clustering and time-course regression analysis^[333]100. Examination of changes in PfLS gene expression over time compared to sporozoites was conducted using maSigPro v.1.66.0 with a polynomial degree of 2 (quadratic)^[334]101. Comparison of Pf and Pv samples was carried out with TPM-normalized gene counts (unadjusted). For each species independently, genes were selected if expressed at ≥1 TPM in all 3 biological replicates; the expressed genes were converted into Pf orthologues using PlasmoDB v.58. Orthologues from Pf and Pv were overlapped and visualized using ggVennDiagram (v.1.2.2)^[335]102. Gene ontology analyses were performed for biological processes, cellular component and molecular function terms using ClusterProfiler (v.4.2.2)^[336]103, and were retained for further analysis if −log[10](P) > 1.2 (P < 0.06). Available GO terms for Pf were downloaded from PlasmoDB v.58. Protein–protein interaction network analysis Protein–protein interaction (PPI) networks were retrieved from ref. ^[337]32; in brief, the PPIs were defined using quantitative mass spectrometry of protein isolates from 3 species of Plasmodium schizonts in the erythrocytic stage. The ref. ^[338]32 PPI graph contained 1,642 nodes and 23,544 edges by including only those proteins found to be expressed in liver-stage Pf RNA-seq data from this study. Maximally weighted connected subgraphs (MWCS) were utilized to identify a subset of nodes in the PPI graph that are most connected using the BioNet v.1.54.0 R package. For each time-course profile (5 total profiles), MWCS ‘modules’ were identified by fitting a beta-uniform mixture (BUM) model to a P-value distribution from the genes’ time-course regression analysis and differential expression analysis. The modules were scored using the fitted BUM model, with the FDR cut-off being the 5th or 10th percentile most-significant adjusted P values (FDR ≤ 6.14 × 10^−10 for all profiles). MWCS for each profile were then selected using the fast Heinz algorithm v.2.0. The final MWCS modules for each profile contained the proteins whose gene expression was found to be associated with temporal changes via the time-course regression analysis as well as their interaction partners that exhibited differential expression. Motif enrichment analysis Motif enrichment analysis was completed with the monaLisa v.1.0.0 Bioconductor package^[339]104 with background=‘genome’, where query promoter sequences were compared to promoter region sequences randomly sampled from the Pf genome using default parameters, except genome.oversample=10 and binomial enrichment test. Motifs were retained if −log[10](P) > 1.2 (P < 0.06). Genomic DNA sequences were derived from PlasmoDB-58_Pfalciparum3D7_Genome.fasta and promoter regions were defined as −1,000 bp from transcript start site using coordinates from PlasmoDB-58_Pfalciparum3D7.gff. PfAP2 transcription factor position weight matrices were downloaded from the supplementary materials of ref. ^[340]35. Pf transcription factor orthologues were identified using the OrthoMCL^[341]47 orthologue group IDs provided on PlasmoDB v.58. Stable gene identifiers from all available species belonging to orthologue groups from Pf PlasmoDB v.58 were retrieved from OrthoMCL DB v.6.14 (orthomcl.org). The stable gene identifiers were converted to species-specific Uniprot identifiers using UniProt.ws v.2.38.1 and used to retrieve the orthologues’ transcription factor motif position weight matrices from the JASPAR2022 v.0.99.7 Bioconductor annotation package. PTEX, PEXEL, PNEP and Secreted + PLM analysis The PfLS transcriptome was analysed to identify transcripts corresponding to published PfPEXEL and PNEP proteins (see Supplementary Table [342]2 for proteins and references). For transcriptional analysis