Abstract Artemisinin, the key antimalarial drug, is synthesized in Artemisia annua glandular secretory trichomes (GSTs), yet their development and artemisinin’s precise cellular origins are unclear. Utilizing single-nucleus RNA sequencing and spatial transcriptomics, we construct a high-resolution cellular atlas mapping metabolic dynamics across GST development. We define three developmental states: the initiation phase, transcriptional activation of core metabolic pathways establishes fundamental cellular machinery; the intermediate phase, marked lipid metabolism activation with coordinated fatty acid and wax biosynthesis, accompanied by active photosynthetic activity; the terminal differentiation phase, metabolite specialized through spatial compartmentalization of terpenoid and lipid biosynthetic pathways. Notably, we discover that six specific secretory cells within the 10-cell GSTs constitute the primary site for artemisinin production. We identify hundreds of hub genes potentially contributing to trichome development or artemisinin biosynthesis. Overall, this study systematically elucidates GST development and artemisinin biosynthesis, revealing its spatial production mechanism and providing essential cellular and genetic foundations for metabolic engineering and fundamental trichome biology. Subject terms: Plant morphogenesis, Secondary metabolism, Transcriptomics __________________________________________________________________ This study provides a cellular atlas of A. annua glandular trichomes using single-nucleus and spatial transcriptomics. It defines three developmental states and identifies hundreds of hub genes for trichome morphogenesis and artemisinin synthesis. Introduction Artemisinin is a sesquiterpene lactone characterized by its endoperoxide bridge. As the cornerstone of artemisinin-based combination therapies (ACTs), it’s also the WHO-recommended first-line treatment for Plasmodium falciparummalaria^[38]1. Beyond its antimalarial efficacy, emerging evidence demonstrates artemisinin and its derivatives’ therapeutic potential against diverse pathologies^[39]2–[40]4, substantially expanding its clinical relevance. This multifaceted bioactive compound is derived from A. annua (Asteraceae), within which specialized glandular secretory trichomes (GSTs) act as the biosynthetic factories for the production of artemisinin^[41]5. These epidermal structures are multifunctional sites for artemisinin synthesis, secretion, and storage^[42]6,[43]7, therefore GST density, developmental maturity, and metabolic efficiency critically govern yield. The artemisinin biosynthetic pathway originates from the precursor molecules isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP), synthesized via the spatially compartmentalized methylerythritol phosphate (MEP) pathway in plastids and the mevalonate (MVA) pathway in the cytosol, respectively^[44]8,[45]9. Although the enzymatic cascade from farnesyl pyrophosphate (FPP) to dihydroartemisinic acid (DHAA) has been partially elucidated - involving amorpha-4,11-diene synthase (ADS)^[46]10, cytochrome P450 monooxygenase CYP71AV1 (CYP71AV1)^[47]11, alcohol dehydrogenase 1 (ADH1)^[48]12, double bond reductase 2 (DBR2)^[49]13, and aldehyde dehydrogenase 1 (ALDH1)^[50]14, recently, a dihydroartemisinic acid dehydrogenase (AaDHAADH) has been identified, which catalyzes the bidirectional conversion between artemisinic acid (AA) and DHAA^[51]15- the terminal conversion from DHAA to artemisinin remains mechanistically ambiguous. Particularly contentious is whether the final conversion involves exclusively non-enzymatic photooxidation or requires undiscovered enzymatic catalysis^[52]16–[53]18, a key knowledge gap hindering microbial bioengineering approaches^[54]19–[55]21. Beyond artemisinin biosynthesis, the GSTs of A. annua function as evolutionarily optimized metabolic center, producing commercially pivotal specialized metabolites including monoterpenes, sesquiterpenes, flavonoids, aliphatic compounds, coumarins, and essential oils^[56]22,[57]23. Notably, these secretory micro-organs represent a phylogenetically conserved adaptation, serving as dedicated biosynthetic factories for bioactive compounds essential to human societies. This functional convergence is exemplified by cannabinoids in Cannabis sativa., nicotine in Nicotiana tabacum., menthol in Mentha canadensis., and α-tomatine in tomato (Solanum lycopersicum). Despite two decades of research dissecting artemisinin biosynthesis within A. annua GST, fundamental understanding of these multicellular ‘chemical factories’ remains fragmented, particularly regarding their developmental programming and metabolic regulatory networks^[58]24. While non-glandular trichome development is well characterized in model systems like Arabidopsis thaliana (which lacks GST), the mechanism of GST morphogenesis and metabolic specialization remains obscure. Consequently, a systematic understanding of GST development that integrates morphogenesis with spatiotemporal control of specialized metabolism is imperative. Such knowledge is essential both to maximize artemisinin production and to unlock the biotechnological potential of uncharacterized GST-derived metabolites across plant species. The structural complexity and diversity of GSTs are evident across plants. For instance, tomato possesses seven distinct trichome types, four of which (types I, IV, VI, and VII) are GSTs. Tobacco contains two types of GSTs (long glandular and short glandular trichomes). In A.annua, a single GST type exists, yet it comprises ten cells organized into specialized substructures. Specifically, these GSTs consist of two stalk cells (SCs), two basal cells (BCs), and six secretory cells partitioned into two apical secretory cells (ACs) and four subapical secretory cells (SAs) surrounding a subcuticular cavity (Fig. [59]1a, Supplementary Fig. [60]1a)^[61]6,[62]25. This intricate cellular architecture suggests functional specialization that conventional bulk tissue analyses fail to resolve. Earlier laser microdissection studies produced conflicting conclusions: initial work detected no artemisinin pathway genes (ADS, CYP71AV1, DBR2) in SAs^[63]26, whereas subsequently refined protocols identified biosynthetic activity across both ACs and SAs^[64]27,[65]28. These discrepancies underscore the limitations of low-resolution cellular sampling and emphasize the necessity for single-cell interrogation. Fig. 1. Artemisinin dynamics in A. annua leaves. [66]Fig. 1 [67]Open in a new tab a Schematic of leaf cross-section showing GST structure and TT. b Leaf development stages (S1-S4) in one-month-old plants. Left: Leaf morphology under bright field (scale bars: 2 cm, 1 mm). Right: Leaf morphology under 488 nm excitation, and the autofluorescence signals indicate chloroplasts (white triangles), mature GSTs (yellow triangles), and immature GSTs (red triangles) (scale bars: 1 mm, 0.5 mm). c GST density (mean ± SD, n = 5 biologically independent samples) and LC-MS/MS quantification of artemisinin (AR) and its precursors artemisinic acid (AA), dihydroartemisinic acid (DHAA) (mean ± SD, n = 3 biologically independent samples). Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful high-throughput technology enabling the comprehensive analysis of gene expression profiles at the single cell resolution, effectively addressing the inherent cellular heterogeneity in complex tissues and organs^[68]29,[69]30. This innovative methodology has been effectively implemented in plant systems research, driving significant progress in model organisms and agronomically important species^[70]31–[71]35. Recent advancements in single-nucleus RNA sequencing (snRNA-seq) circumvent the technical artifacts induced by protoplast isolation, preserving native transcriptional states while addressing cellular heterogeneity^[72]36–[73]39. When integrated with spatial transcriptomics and metabolomics, this approach enables unprecedented resolution in mapping biosynthetic activities to specific cellular niches^[74]40. While single-cell omics are now widely applied in plant biology^[75]41–[76]43, their implementation in medicinal plants remains challenging due to specific technical hurdles, including secondary metabolite interference, cell wall complexity, or limited reference genomes. Our study pioneers integrating snRNA-seq with spatial transcriptome to deconvolute the developmental-metabolic continuum in A. annua GST, establishing a transformative framework for plant secretory structure research. We verified four GST-specific marker genes and identified hundreds of promising hub genes potentially regulating artemisinin biosynthesis or trichome development, providing mechanistic insights into the functional diversification and compartmentalization of plant secretory architectures and the metabolomic orchestration underlying specialized secondary metabolite biogenesis. Results Developmental dynamics of GST and artemisinin biosynthesis in A. annua leaves Bright-field and epifluorescence microscopy (488 nm excitation) of A. annua leaves across four developmental stages (30-day-old plants) revealed stage-specific morphological and metabolic transitions in GSTs (Fig. [77]1b). Stage 1 (S1) leaves exhibited curled morphology and predominantly immature GSTs with red autofluorescence. Transition to Stage 2 (S2), leaf showed partial expansion accompanied by a 2.6-fold increase in GST density, and the autofluorescence of GSTs mainly changed to yellow. Mature Stage 3 (S3) leaves exhibited fully expanded morphology with GSTs demonstrating predominant yellow autofluorescence, while residual red-fluorescent GSTs represented only a minor fraction of the total GST population. Concurrently, the GSTs density underwent a 45% reduction compared to S2 peak levels. Senescing Stage 4 (S4) leaves exhibited dark-green pigmentation and completely loss of red-fluorescent GSTs, retaining only yellow-fluorescent GSTs. Our preliminary studies, encompassing systematic observations of numerous intact leaves, consistently revealed this developmental shift in GST autofluorescence from red (immature) to yellow (mature). Figure [78]1b presents a representative image from these observations. Quantitative LC-MS/MS analysis revealed dynamic artemisinin metabolite profiles across these developmental stages (Fig. [79]1c). Artemisinin content increased 7.2-fold from S1 to S3 (from 0.5 to 3.6 mg/g), directly correlating with GST maturation. Subsequently, the senescence of GSTs in S4 led to a 22% reduction in artemisinin (2.8 mg/g). Notably, S2 exhibited a peak in GST density compared to adjacent developmental stages, defining the S1-to-S2 transition as the pivotal morphogenetic window for GST proliferation. This developmental transition aligns spatiotemporally with leaf expansion dynamics and provides an ideal model to investigate the molecular mechanisms underlying GST initiation, particularly the temporal coordination of specialized metabolite biosynthesis. Generation of the GST atlas for A. annua Obtaining high-quality single-nucleus transcriptomic data from GSTs encountered particular technical challenges compared to whole-leaf analyses. In A. annua leaves, GSTs constitute rare epidermal protrusions ( < 1% of total leaf cells), with each GSTs comprising a 10-cell structure. Mature GSTs measure ~50 μm in diameter (Supplementary Fig. [80]1a) and are encased in a cuticular sheath that resists conventional enzymatic protoplast isolation protocols. Whole leaf snRNA-seq approaches are at risk of GSTs exclusion during quality filtering due to their scarcity and fragility. A mild mechanical extraction method was designed to circumvent these constraints, enabling efficient isolation of intact GSTs from over 400 leaves across S1 and S2 developmental stages (Fig. [81]2a, Supplementary Fig. [82]1b-j). Nuclear extraction was optimized for both GST-enriched and whole-leaf samples, followed by droplet-based snRNA-seq. Flow cytometry eliminated cellular debris before chromium controller loading (Supplementary Fig. [83]2). Sequencing on the Illumina platform yielded 688 million reads. Post-quality control, 8,334 leaf nuclei (936 mean genes per nucleus) and 7,995 GST-enriched nuclei (565 mean genes per nucleus) were retained (Supplementary Data [84]1). Fig. 2. Single-nucleus atlas resolves cellular heterogeneity and validates GST clusters in A. annua leaves. [85]Fig. 2 [86]Open in a new tab a Workflow for snRNA-seq: nuclei from GST-enriched sample (up) and the whole leaves(down) were isolated, sorted, and processed for library construction. b Integrated UMAP of 15 transcriptionally distinct clusters, color-coded by cell identities: GST, TT, EC, mesophyll cell (MC), guard cell (GC), vascular cell (VC), and proliferating cell (PC). c Dot plot of cell type-specific marker genes. Dot size represents expression proportion, and color intensity shows mean normalized expression (log (TPM + 1)). d Consensus annotation of 15 clusters into seven broad cell populations: GST, TT, MC, EC, PC, VC, and GC. e GST identity validation: Left, Violin plots of GST-specific genes ([87]AA618140, [88]AA036960, [89]AA529190); Right, RNA in situ hybridization experiments were performed with three independent biological replicates, all yielding consistent results. Representative images are shown. Red arrows indicate GST-specific signals; black arrows mark TT regions lacking expression (scale bar: 50 μm). Integrated analysis revealed 15 transcriptionally distinct clusters via UMAP visualization (Fig. [90]2b). GST-enriched clusters (7,8,9,11,12,14,15) exhibited spatial segregation from leaf-dominant clusters (1,2,4,6,10), with differential abundance patterns (Supplementary Fig. [91]3a, b). Differential expression analysis revealed 1,341 genes with significant transcriptional divergence, comprising 685 upregulated and 656 downregulated genes in GST-enriched sample. Gene Ontology (GO) functional annotation and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the 685 upregulated DEGs in GST-enriched sample revealed that these genes are primarily involved in the biosynthesis of fatty acids, cutin, lignin, terpenoids, wax, sphingolipids, and flavonoids, as well as biological processes such as fatty acid metabolism, cuticle development, and lipid catabolism (Supplementary Fig. [92]3c). These findings highlight the functional relevance of these genes to specialized metabolic pathways and developmental processes in GST. The 15 clusters were precisely annotated through a multi-evidence integration strategy combining: (1) β-Glucuronidase (GUS) staining validated genes exclusive to GST in A. annua, (2) evolutionarily conserved marker genes from orthologous systems, and (3) Pearson correlation-based cluster kinship analysis (Fig. [93]2c, Supplementary Data [94]2, Supplementary Fig. [95]4). Four clusters (7,11,12,14) were unequivocally identified as GST through two hallmark features: Spatial enrichment in trichome-associated zones and elevated expression of GST-specific genes (AaGSW1^[96]44, AaGSW2^[97]45, ADS, DBR2, CYP71AV1, ALDH1, ADH1). Clusters 8,9,15 displayed developmental proximity to GST but lacked characteristic GST markers (AaGSW1, AaGSW2), defining them as T-shaped non-glandular trichome (TT). Based on the reported marker gene expression patterns in Arabidopsis thaliana, clusters 3, 13, 6, and 2 were annotated as epidermal cell (EC), guard cell (GC), vascular cell (VC), and proliferating cell (PC), respectively. Clusters 1, 4, 5, and 10 were identified as mesophyll cell (MC). Consequently, seven distinct cell populations were ultimately delineated (Fig. [98]2d). Spatially resolved RNA in situ hybridization of three GST-specific markers ([99]AA618140, [100]AA036960, and [101]AA529190) confirmed the annotation accuracy (Fig. [102]2e, Supplementary Fig. [103]5). Furthermore, multiple cell types marker genes were identified through the analysis of differentially expressed genes across distinct cell populations and validated by qRT-PCR (Supplementary Fig. [104]6a, b; Supplementary Data [105]3). These findings provide valuable molecular resources for future investigations into the developmental processes of A. annua leaf and GST, as well as the specialized metabolic pathways associated with these cell types. Spatial transcriptomics reveals the distribution and molecular signatures of GSTs in A. annua leaves Spatial transcriptomics mapping of A. annua leaves resolved GST/TT spatial architectures and validated seven snRNA-seq clusters (Fig. [106]3a). Given the unique anatomical features of A. annua leaves, we developed specific protocols for cryosections and tissue permeabilization, establishing an optimized workflow for spatial transcriptomics in this medicinal species (Supplementary Fig. [107]7a). From the leaf samples, we obtained 849 million sequencing reads, with 96.1% of barcodes identified as valid, ensuring high data quality. Four leaves exhibiting dense GSTs with intact morphology were selected for high-resolution spatial transcriptomic profiling at 14 μm resolution, yielding 1,178 spatially resolved transcriptomic spots and 25,162 detected genes (Supplementary Data [108]4, Supplementary Fig. [109]7b). Fig. 3. Spatial mapping of cell types and artemisinin biosynthesis in A. annua. [110]Fig. 3 [111]Open in a new tab a Workflow: Fresh leaf tissues were embedded, vertically sectioned, stained with TBO, and processed for spatial transcriptome sequencing. b Spatial distribution of cell clusters in the leaves of A. annua. Representative images from four independent leaves are shown. Top, overlay of spatial clusters (14 μm resolution). Bottom, toluidine blue-stained leaf section. Red arrows indicate GSTs, black arrows mark TTs. c Spatial expression of GST top ten marker genes (violin plots: log (CPM + 1)). d RNA in situ hybridization experiments of the [112]AA050580 gene were performed with three independent biological replicates, all yielding consistent results. Representative images are shown. Red arrows indicate GST-specific signals; black arrows mark TT regions lacking expression (scale bar: 50 μm). e Integrating snRNA-seq with spatial transcriptomics via Seurat. f Co-localization of artemisinin biosynthetic genes in GST domains (color gradient: blue → red, low → high). g Integrative validation of snRNA-seq and spatial transcriptomic concordance using SciBet, SingleR, RCTD, and CARD algorithms. Spatial clustering analysis revealed five distinct tissue domains (Fig. [113]3b): EC/TT (Cluster 0), MC (Cluster 1), GST (Cluster 2), and VC/PC (Clusters 3 and 4). UMAP visualization revealed spatially segregated GST domains proximal to TT/EC but distinct from other cell types, with distribution patterns concordant with snRNA-seq projections (Supplementary Fig. [114]7c). Differential gene expression analysis identified cluster-specific markers, with GST-enriched Cluster 2 showing striking high expression of terpenoids biosynthetic pathway components. Four key enzymes (ADS, ALDH1, DBR2, ADH1) involved in artemisinin (sesquiterpene) biosynthesis and two camphene synthases^[115]46 ([116]AA068290, [117]AA251590) responsible for monoterpene biosynthesis ranked among the top 10 GST markers (Fig. [118]3c). Spatial validation through RNA in situ hybridization revealed specific expression of a marker gene [119]AA050580 of GST, and conclusively confirmed the GST identity of Cluster 2 (Fig. [120]3d, Supplementary Fig. [121]5). Through anchor-based integration of snRNA-seq and spatial transcriptomics, we mapped fine-grained subpopulations to tissue coordinates (Supplementary Fig. [122]8). Notably, GST spatial zonation patterns were consistent across modalities (Fig. [123]3e), and artemisinin biosynthetic genes (FPS, ADS, CYP71AV1, ADH1, ALDH1, DBR2, CPR) within GSTs demonstrated concordant expression (Fig. [124]3f), validating biologically relevant niche capture by our cross-modal alignment. Besides, four analytical tools were used to a computational framework to link spatial and single-nucleus transcriptomic datasets. SingleR and SciBet analyses revealed that the GST cluster in snRNA-seq data exhibited the strongest correlation with Cluster 2 in the spatial transcriptomics data (Fig. [125]3g). However, cell type classification diverged between the two methods: SingleR grouped GST with PC, whereas SciBet associated GST with EC. RCTD and CARD analyses yielded consistent results, with GST spatial localization aligning with both SingleR and SciBet findings (Fig. [126]3g). These integrative analyses consumingly confirmed the accuracy of the GST cluster annotation in the snRNA-seq data, providing a reliable foundation for subsequent investigations. Pseudotime analysis unravels the differentiation trajectory and functional specialization of GST The differentiation mechanisms governing both GST and TT, which are epidermal derivatives, remain incompletely understood, especially regarding metabolic specialization during GST maturation^[127]47. To address this gap, pseudotime analysis was used to reconstruct cellular differentiation hierarchies and identify regulatory gene networks^[128]48. Applying this approach to A. annua EC, GST, and TT revealed a bifurcating developmental trajectory. Epidermal progenitor cells diverged into GST and TT lineages along the pseudo-temporal axis (Fig. [129]4a). Gene expression dynamics across this developmental trajectory segregated into four functional modules (Fig. [130]4b, Supplementary Fig. [131]9). Module 4 genes exhibited high expression in the mid-pseudotime axis, primarily involved in cell wall organization, cellulose biosynthesis process, and lipid catabolic process. Notably, Module 1, functionally tethered to GST morphogenesis, exhibited hierarchical activation of terpenoid biosynthetic process and fatty acid metabolic networks, with de novo fatty acid biosynthesis emerging as the most enriched pathway. This metabolic signature highlights fatty acid metabolism as a central orchestrator of GST development, likely coordinating precursor provisioning for GST terpenoid synthesis and dynamic membrane biogenesis during secretory cell differentiation. Fig. 4. Developmental trajectory reconstruction in spatiotemporal coupling of artemisinin biosynthesis and GST differentiation. [132]Fig. 4 [133]Open in a new tab a Pseudotemporal trajectory of EC differentiation into GST and TT. Top, UMAP colored by cell identity. Bottom, Pseudotime axis (dark → light: early → late). b Functional annotation of gene modules along differentiation trajectories. Heatmaps show expression dynamics of EC → TT and EC → GST branches (color gradients: blue → red, gene expression levels Z-score-normalized). Representative GO terms are labeled. c UMAP of reclustered GST subpopulations (k = 7): G5 (apical cells, AC), G1/G7 (subapical cells, SA), G3 (basal cells, BC), G2 (stalk cells, SC), G4/G6 (immature GST cells, IGC). d Transcriptional correlation heatmap of GST subclusters. The color gradient represents correlation coefficients (r), with increasing red intensity indicating higher degrees of inter-cluster similarity. e Pseudotime reconstruction of GST development. Root anchored at IGC (G4/G6), with a bifurcation point G1/G5/G7 and G3 from SC (G2). f Heatmap of pseudotime-dependent gene expression across GST subclusters (color gradients: blue → red, gene expression levels Z-score-normalized). Representative GO terms are labeled. g Spatiotemporal validation of representative genes. Top, Pseudotime-dependent upregulation of ADS([134]AA415620), CYP71AV1, ALDH1 and DBR2. Bottom, UMAP showing spatial enrichment in secretory subclusters (G1/G5/G7; color intensity (blue → red) indicates the relative transcript level for the indicated gene in each cell). Our refined reclustering of GST-associated clusters identified seven distinct subclusters (G1-G7) (Fig. [135]4c). Notably, G6 and G7 (21 and 22 cells, respectively) displayed markedly lower cellular abundance relative to other clusters, suggesting their role as short-lived intermediates in pivotal developmental transitions (Supplementary Fig. [136]10a). Analysis of pearson correlation coefficients based on average gene expression profiles across cell clusters (Fig. [137]4d) revealed that G1, G3, G5, and G7 exhibited strong co-expression patterns, as evidenced by their high correlation coefficients, indicative of shared developmental origins and conserved functional characteristics. G2, G4, and G6 formed another cluster. Seven GST subclusters (G1-G7) were annotated through the integration of photosynthetic gene expression profiles (Supplementary Fig. [138]10b) and intercellular correlation networks, building upon Duke et al.‘s developmental framework^[139]6. G5 was annotated as AC due to minimal photosynthetic gene expression (1.9%). G1, identified as SA, was the most abundant (554 cells) and expressed more photosynthetic genes (10.6%). G7, strongly correlated with G1, was also annotated as SA. G3 was identified as BC, lacking detectable photosynthetic gene expression and clustering with secretory cells, potentially serving as support cells for GST structure maintenance and substance transport. G2 was annotated as SC, showing a low correlation with secretory cells but expressing some photosynthetic genes (1.7%). G6 specifically expressed H2A, a classic proliferation marker (Supplementary Fig. [140]10c). Meanwhile, GO enrichment analysis revealed that G4 primarily functions in translation, translational elongation, cutin biosynthesis, fatty acid metabolism, and plant epidermis morphogenesis. The strong Pearson correlation coefficient between G4 and G6 further supported their biological similarity (Fig. [141]4d). Therefore, G4 and G6 were annotated as immature GST cells (IGCs). Pseudotime trajectory analysis elucidated the developmental progression of GST subpopulations, revealing at least three distinct functional states during GST development (Fig. [142]4e). The initiation phase (G4/G6, IGC) was predominantly associated with fundamental cellular processes, including translation (GO: 0006412), nucleosome assembly (GO: 0006334), and cytoplasmic translation (GO: 0002181), establishing the essential cellular machinery for subsequent development. The intermediate phase (G2/G3/G1, SC, BC, and SA) exhibited a metabolic shift toward lipid metabolism, marked by the activation of the fatty acid biosynthetic process (GO: 0006633), wax biosynthetic process (GO: 0010025), and very long-chain fatty acid biosynthetic process (GO: 0042761). Notably, this phase co-expressed active photosynthetic pathways (GO: 0018298, GO: 0009768, GO: 0015979) alongside these biosynthetic processes, demonstrating a concurrent metabolic capacity during this transitional stage. Most significantly, the late developmental stage (G5/G7 clusters, SA and AC) exhibited upregulation of specialized metabolic pathways, particularly terpenoid biosynthesis (GO: 0016114) and fatty acid metabolism (GO: 0006633). This molecular profile aligns with the evolutionary role of GSTs as specialized biosynthetic compartments for secondary metabolite production (Fig. [143]4f, Supplementary Fig. [144]11). Reconstruction of the developmental trajectory via pseudotime analysis enabled determination of the dynamic expression of key genes in artemisinin biosynthesis during GST development (Fig. [145]4g). The expression of four downstream key genes in artemisinin synthesis gradually increased with trichome development. By resolving the pseudotemporal architecture of GST maturation, our work provides unprecedented insights into the dynamic functional transitions during GST development, offering a refined framework for understanding the coordination mechanisms underlying their differentiation and maturation. Artemisinin biosynthesis exhibits strict spatial confinement to six secretory cells within GST of A. annua While GSTs are recognized as the primary biosynthetic sites for artemisinin in A. annua, the precise cellular specialization within their 10-cell architecture has remained unresolved. Single-nucleus transcriptomic profiling of all known key genes in the artemisinin biosynthetic pathway was performed to determine cell-type-specific biosynthesis (Supplementary Data [146]5). GST clusters (C7, C11, C12, C14) displayed coordinated high expression of upstream terpenoid precursor pathways (MVA and MEP) and downstream biosynthetic genes (FPS^[147]9, ADS, CYP71AV1, CPR^[148]9, ALDH1, DBR2, ADH1). In contrast, TT clusters exhibited only basal expression of DXS^[149]9, HDS^[150]9, FPS, and CPR, while epidermal and mesophyll cells showed negligible pathway activity (Fig. [151]5a). These results conclusively demonstrate evolutionary compartmentalization of artemisinin biosynthesis within GST cells of A. annua. Fig. 5. Cell type-specific expression of artemisinin biosynthetic genes across GST subclusters. [152]Fig. 5 [153]Open in a new tab a Single-nucleus resolution of artemisinin pathway gene expression. Dot size: proportion of expressing cells; color gradient: mean normalized expression (log (TPM + 1)). Corresponding legend panels (left for All cell type clusters, right for GST subclusters) specify visualization parameters. b Single-nucleus resolution of transcription factors expression. c Stage-specific transcriptional regulation of the artemisinin pathway. Bulk RNA-seq (log2(FPKM + 1), Z-score normalized) shows coordinated upregulation of MVA/MEP and downstream genes in young GSTs (YG, S1/S2; n = 3) versus old GSTs (OG, S3/S4; n = 3). d qRT-PCR validation of core biosynthetic genes in whole leaf (L) versus GST-enriched (G) samples across stages (S1-S3). Data: mean ± SD (n = 3 biologically independent samples; *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, two-tailed t-test). Actin ([154]EU531837) is used for normalization. Integrated analysis of the artemisinin biosynthetic pathway, which spans terpenoid precursor pathways (MVA/MEP) and downstream genes, across GST subclusters revealed cell-type-specific metabolic partitioning. ACs and SAs exhibited coordinated upregulation of both precursor pathways and all known downstream enzymatic components (ADS, CYP71AV1, ALDH1, DBR2) (Fig. [155]5a), indicating their autonomous capacity for the production of enzymatic precursors in artemisinin biosynthesis. Conversely, basal cells (BCs) showed attenuated expression of late-stage genes (FPS, ADS, CYP71AV1, DBR2) but maintained active transcription of early MVA pathway genes (HMGS, HMGR)^[156]9, suggesting specialized roles in precursor provisioning for neighboring secretory cells. Strikingly, SCs and IGCs displayed minimal transcriptional activity across the entire pathway, confirming their metabolic exclusion from artemisinin production. High-resolution single-nucleus RNA-seq spatial mapping definitively restricted artemisinin production to six specialized secretory cells within A. annua GSTs, establishing these compartments as the principally biosynthetic site for this antimalarial compound. In addition, we analyzed the expression patterns of 48 previously reported transcription factors (TFs) associated with GST initiation or artemisinin biosynthesis at single-nucleus resolution (Supplementary Data [157]6). Figure [158]5b illustrates representative TF expression profiles across all A. annua leaf cell types and within GST subclusters. Key regulators of GST initiation, such as AaTLR3^[159]49, AaTAR1^[160]50, AaMIXTA^[161]51, and AaWIN1^[162]52, showed high expression in IGCs and/or SCs. Conversely, TFs modulating artemisinin biosynthesis, including AabHLH1^[163]53, AaMYC2^[164]54, AaWRKY9^[165]55, AaBBX21^[166]56, AaGSW1, AaWRKY17^[167]57, exhibited high expression in ACs and SAs. Notably, BCs co-expressed TFs implicated in both GST development (AaHD8^[168]58) and artemisinin synthesis (AaABI5^[169]59, AaERF1^[170]60, AaMYB108^[171]61). Our snRNA-seq data corroborated with published GUS localization studies. TFs validated by GUS assays for GST-specific expression, such as AaGSW1 and AaGSW2 also showed cluster-specific enrichment in GST within our scRNA-seq data. TFs documented to express in TT (AaMYB1^[172]62, AaTAR1, AaABF3^[173]63) showed high expression in corresponding snRNA-seq clusters. Similarly, AaTCP15^[174]64 expression, previously observed in mesophyll cells via GUS staining, was confirmed in the MC cluster. While AaTCP14^[175]65 was reported to exhibit non-specific GUS staining but high trichome expression, our analysis precisely localized its high expression to SA and PC (Fig. [176]5b). Spatiotemporal patterns were resolved at single-nucleus resolution. The artemisinin regulator AaHY5^[177]56 and AabZIP1^[178]66 both showed high expression in IGCs. The salicylic acid-associated bZIP TF TGA6^[179]67 was highly expressed in both IGCs and SAs. Intriguingly, while AaGSW2 (a reported GST initiation factor) showed high expression within the overall GST cluster, subcluster analysis revealed its specific enrichment in ACs and SAs rather than stalk or IGCs. This suggests AaGSW2 likely functions as a multifunctional TF, potentially regulating secretory cells metabolism (such as artemisinin biosynthesis) beyond its role in GST initiation. A notable observation concerns AaMYB15^[180]68, a negative regulator of artemisinin biosynthesis previously localized to both GSTs and TTs by GUS staining. Our data confirmed its high expression in both trichome types, with particularly strong enrichment in BCs. This expression pattern further supports the hypothesis that BCs may supply precursors for artemisinin biosynthesis in secretory cells. Bulk RNA-seq profiling of A. annua GSTs across developmental stages revealed pronounced enrichment of artemisinin biosynthetic genes during the young GSTs compared to old GSTs (Fig. [181]5c). This transcriptional signature was substantiated by developmental stage resolved metabolomic profiling (Fig. [182]1c), revealing precisely coordinated artemisinin accumulation dynamics that indicate biosynthesis is most active during the young secretory phase of GSTs, with a progressive decline in synthesis rate as the GST age. To establish the robustness of our single-nucleus and bulk transcriptomic datasets, we conducted independent validation using qRT-PCR to quantify expression dynamics of core artemisinin biosynthetic genes across A. annua leaf tissues and GST-enriched samples (Fig. [183]5d). Quantitative analysis demonstrated coordinated expression of four core artemisinin biosynthetic genes (ADS, CYP71AV1, ALDH1, DBR2), with pronounced enrichment in GST relative to leaf tissues and progressive attenuation from young to old GSTs. It confirms the technical reproducibility of our sequencing datasets while offering robust evidence for cell-type-specific metabolic specialization in artemisinin biosynthesis. hdWGCNA reveals hub genes associated with GST development and artemisinin biosynthesis in A. annua Integration of snRNA-seq with high-definition weighted gene co-expression network analysis (hdWGCNA) uncovered co-expressed gene modules linked to GST development and artemisinin biosynthesis, revealing functional coordination between these processes through distinct co-expression networks. Scale-free network analysis determined an optimal soft threshold of 2 (network fit index = 0.9; Supplementary Fig. [184]12a), yielding four functional modules (Fig. [185]6a, Supplementary Data [186]7). Core genes within each module were identified via gene connectivity (kME), with hub module eigengenes (hMEs) visualized (Fig. [187]6b). The orange module showed predominant expression in EC, while the brown module was enriched in GSTs, indicating spatial specialization. Fig. 6. hdWGCNA identified hub genes of trichome development and artemisinin biosynthesis. [188]Fig. 6 [189]Open in a new tab a Hierarchical clustering dendrogram of co-expression modules (hdWGCNA; soft threshold power = 2, R² = 0.9). 3,727 genes partitioned into four modules (colored branches); grey: unassigned genes. b Spatiotemporal distribution of module eigengenes. UMAP colored by module eigengene scores (kME values). c Cell type-specific expression dynamics of co-expression modules. Heatmap: average log2(fold change) in GST vs. other cell types. Significance: *FDR < 0.05, **FDR < 0.01, ***FDR < 0.001. d Co-expression networks between selected unreported hub genes and known artemisinin biosynthetic genes or transcription factors. e Violin Plots of expression patterns for six hub genes in artemisinin biosynthesis and trichome development. f qRT-PCR validation of three artemisinin biosynthesis genes in whole leaf (L) versus GST-enriched (G) samples across stages (S1-S3). Data: mean ± SD (n = 3 biologically independent samples; *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001, two-tailed t-test). Actin ([190]EU531837) used for normalization. Module-trait relationships confirmed a strong association of the brown module with GST clusters (Fig. [191]5c). Functional enrichment revealed the module’s core enrichment in specialized metabolic pathways critical to sesquiterpene biosynthesis: terpenoid biosynthesis, fatty acid metabolism, isopentenyl diphosphate metabolism, and flavonoid biosynthetic process (Supplementary Fig. [192]12b), suggesting its close association with the artemisinin biosynthetic pathway. Differently, the orange module associated broadly with EC, TT, and GST clusters (Fig. [193]5c) and was enriched for cuticle biosynthesis, very long-chain fatty acid (VLCFA) metabolism, lipid catabolism, and plant epidermal morphogenesis (Supplementary Fig. [194]12b). This spatial partitioning, where cuticle-related processes preceding GST specialization, suggests metabolic preparation may facilitate subsequent developmental transitions. Using our developed scoring system that integrates intramodular connectivity (kME) and association with known biological pathways, we established interaction networks for the top 30% hub genes in the brown and orange modules to identify key genes involved in artemisinin biosynthesis and GST development (Fig. [195]6d). Strikingly, the brown module network integrated both upstream (MEP) pathway genes (DXS, HDS) and six downstream enzymatic genes (ADS, CYP71AV1, ADH1, ALDH1, DBR2, CPR) that collectively span the complete artemisinin biosynthetic cascade. Furthermore, the network encompassed an ABC transporter gene (PDR1, [196]AA090620) functionally validated in Nicotiana tabacum for artemisinic acid transport implicated^[197]69. This comprehensive coverage of key genes in the artemisinin biosynthetic pathway significantly enhances the reliability and biological relevance of the constructed network. Six genes from brown and orange modules were prioritized as key candidates influencing artemisinin biosynthesis or GST initiation/development, particularly given the unique structural and functional specialization of plastids within GST cells. In the brown module, [198]AA013090 (annotated as a photosystem antenna protein-like based on sequence homology) is co-expressed with core artemisinin biosynthetic genes. Although its exact role in GST plastids remains unverified, this gene may contribute to light-dependent processes that indirectly support artemisinin production. This role is supported by tobacco mutant studies, where loss of light-harvesting chlorophyll a/b-binding proteins (LHCB) fully abolished glandular secretory capacity^[199]70. Although [200]AA013090 is not a direct ortholog of tobacco LHCB, the convergent GST dysfunction in both systems underscores the critical dependence of GST activity on photosystem components. Moreover, [201]AA244300 (annotated as a dimethylaniline monooxygenase) may be involved in oxidative modifications of artemisinin intermediates. Two aldehyde dehydrogenase family genes ([202]AA036960 and [203]AA405270) may influence artemisinin synthesis by modulating oxidoreductase activity (acting on the aldehyde or oxo group of donors, with NAD or NADP as acceptor). The orange module candidates comprised [204]AA393400 (annotated as Lipase, GDSL) and [205]AA372410 (non-specific lipid-transfer protein 3). Two genes potentially functioning in processes critical for cuticle and wax metabolism^[206]71,[207]72, thereby influencing GST initiation and physical integrity. Spatial expression profiling (Fig. [208]6e) confirmed functional relevance; brown module candidates showed GST enrichment, while orange module candidates exhibited broad expression across EC/TT/GST cell types. qRT-PCR validation further demonstrated that [209]AA036960, [210]AA013090, and [211]AA244300 exhibited expression patterns consistent with four core artemisinin genes (Fig. [212]6f). Notably, [213]AA036960 was validated as a marker gene for GST via RNA in situ hybridization (Fig. [214]2e, Supplementary Fig. [215]5). More comprehensive functional validation studies using CRISPR-Cas9 knockout are currently underway. Discussion Advances in single-cell transcriptomics have revolutionized molecular genetics, offering unprecedented opportunities to understand the cellular compartmentalization of specialized metabolism^[216]73–[217]75. Here, we established a single-nucleus atlas of GST in A. annua, resolving spatial organization of the artemisinin biosynthetic pathway. Integration of snRNA-seq with spatial transcriptomics yielded a high-resolution spatiotemporal dataset of A. annua GST at the single-nucleus level. This dataset provides a foundation for uncovering unknown pathways in artemisinin biosynthesis and elucidating the developmental mechanisms of GSTs, while also serving as a model for understanding the morphogenesis and regulatory mechanisms of GSTs in other species. It also offers insights into cell-specific synthesis of natural products. scRNA-seq of A. annua GSTs presented a major challenge. GSTs constitute less than 1% of leaf cells. Mature GSTs are small, encapsulated in a cuticular shell, and show reduced transcriptional activity typical of metabolite-storing cells^[218]47,[219]76. Conventional enzymatic methods failed to isolate intact GST protoplasts due to structural resistance and low RNA access, leading us to develop an optimized mechanical-enzymatic workflow. Focusing on the S1-S2 developmental stage where GST formation and peak biosynthesis coincide, we processed over 400 leaves. This yielded a high-resolution single-nucleus atlas with 1,864 GST nuclei and 1,469 TT nuclei, resolving seven distinct cell populations representing known subtypes. While spatial transcriptomics captured only about 200 trichome cells from limited sections, snRNA-seq profiled 1,864 cells across stages. This discrepancy arises from inherent biological constraints. GSTs are stochastically distributed, smaller than standard section thickness, and hindered by dense metabolites and cuticles, limiting capture efficiency despite optimization efforts. RNA in situ hybridization confirmed dataset accuracy and identified four GST marker genes. This integrated approach overcomes technical barriers, establishing A. annua as a cellular resolution model for specialized metabolism. Our atlas reveals unprecedented functional specialization within GSTs. Artemisinin biosynthesis is evolutionarily compartmentalized to secretory cells (SAs and ACs), specifically to just six specialized secretory cells possessing the all currently known enzymatic machinery for de novo production. BCs express early MVA genes but show reduced late-stage gene expression, suggesting a potential role in supplying precursors to adjacent secretory cells. This indicates an evolutionary strategy optimizing resource allocation via pathway compartmentalization. Notably, SCs and IGCs showed minimal artemisinin pathway gene expression, confirming metabolic activity is strictly confined to mature secretory cells. Using single-nucleus resolution, we clarified the distinct spatial specificity of transcription factor expression and localization. For example, AaGSW2, previously thought broadly expressed, is specifically enriched in apical/subapical cells, indicating potential dual roles in regulating secretory metabolism. High expression of the repressor AaMYB15 in basal cells further supports their involvement in precursor supply. These findings resolve the cellular basis of artemisinin synthesis and provide precise targets for metabolic engineering. Pseudotemporal trajectory analysis defined three GST developmental stages. The initiation phase establishes basic cellular machinery by activating fundamental metabolic pathways. The intermediate stage activates lipid metabolism, coordinating fatty acid and wax biosynthesis alongside photosynthesis. The terminal differentiation phase achieves metabolic specialization by spatially partitioning terpenoid and lipid biosynthetic pathways. This cascade aligns with the GST role as a specialized metabolite factory^[220]77. Applying hdWGCNA identified two key modules linked to GST development and artemisinin biosynthesis. The brown module, co-expressed with GST clusters, contained all known core artemisinin pathway enzymes. Significantly, this module encompasses three of the five candidate hub genes previously identified through WGCNA analysis of bulk RNA-seq data from A. annua trichomes^[221]78, reinforcing its central role in participating in the pathway and suggesting strong concordance between bulk and single-nucleus approaches. This convergence strongly suggests the module orchestrates artemisinin biosynthesis. The orange module is associated with EC and trichomes. Its enrichment in cuticle biosynthesis, VLCFA metabolism, lipid catabolism, and epidermal morphogenesis aligns with the reconstructed GST trajectory and known GST regulation. Cuticle layer composition and activation of its biosynthesis by AaMIXTA1^[222]51, which enhances GST density, confirm this module’s role. TAR1^[223]50 regulates artemisinin genes while maintaining cuticular wax, and TLR3^[224]49 constrains metabolic overflow by negatively regulating cuticle organization and branching. Together, those validated the orange module’s functions in epidermal structural integrity and lipid control. We speculate that flux through these cuticle/lipid biosynthetic pathways may generate signals (such as lipid-derived molecules, mechanical stress from cuticle deposition) or establish a permissive epidermal environment, thereby facilitating or influencing the downstream transcriptional regulatory cascade controlling GST fate specification. To functionally investigate these modules and test this hypothesis, six high-scoring genes within them are undergoing functional characterization via CRISPR-Cas9 knockout, and their detailed functions will be elucidated in subsequent publications. This study used an integrated multi-omics approach to reconstruct the spatiotemporal development of A. annua GSTs at single-cell resolution and identify hub genes regulating artemisinin biosynthesis and GST development. To overcome limitations of prior trichome studies of other species, which used whole-leaf samples with few GST cells^[225]34,[226]79. We created a GST atlas using enriched GST samples and leaf controls. This strategy addresses the current lack of similarly enriched GST atlases for other key species. Such atlases are essential for cross-species integration to evaluate the universality and evolutionary conservation of GST mechanisms identified here. Future work should apply this enrichment strategy to create high-resolution atlases for diverse GST metabolite-producing species. Integrating these datasets will define conserved developmental programs and species-specific adaptations. Functional characterization of the hub genes using gene editing is crucial to elucidate their roles in GST morphogenesis and artemisinin biosynthesis and test their conservation. This comparative and functional genomics work forms our core follow-up, with full validation to be detailed later. Methods Plant materials and growth conditions Seeds of A. annua (cultivar ‘Huhao 1’) were sourced from the School of Agriculture and Biology, Shanghai Jiao Tong University. Plants were grown in a 2:1 (v/v) nutrient soil-perlite mixture under controlled greenhouse conditions (25 °C, 16/8-hour light/dark photoperiod) for one month. Nuclei isolation and 10× single-nucleus RNA-seq library construction GSTs were enriched from 100 one-month-old A. annua plants by gently brushing leaf surfaces in 0.4% D-mannitol, filtering through a 70 μm strainer, and centrifuging (1000 rpm, 5 min). GST-enriched and whole leaf samples were lysed in pre-chilled buffer (0.44 M sucrose, 1.25% Ficoll, 2.5% Dextran T40, 20 mM HEPES, 10 mM MgCl2, 0.5% Triton X-100, 1 mM DTT, 1× protease inhibitor, 1000 U/mL RNase inhibitor), homogenized 20 min, and filtered through 40 μm and 20 μm sieves. Nuclei were collected by centrifugation (500 g, 5 min, 4 °C) and resuspended in 500 μl buffer (1× DPBS, 0.04% BSA, 1000 U/mL RNase inhibitor). For sorting, nuclei were stained with propidium iodide (PI) and isolated by flow cytometry based on PI signal and size. After pelleting (1000 g, 5 min, 4 °C), nuclei were resuspended in PBS, assessed by fluorescence microscopy, and 30,000 nuclei were loaded onto a 10X Genomics chip. Libraries were constructed using the 10X Chromium Single Cell 3’ Solution v3.1 kit for Illumina sequencing. snRNA-seq data processing Raw sequencing data were processed with Cell Ranger (v8.0.1) for quality control and alignment to the reference genome (PRJNA416223). Further quality filtering was performed using Seurat^[227]80 (v4.0.0), excluding cells with gene or UMI counts beyond mean ± 2 standard deviations. Doublets were removed using DoubletFinder^[228]81 (v2.0.3). Data normalization and identification of the top 2000 highly variable genes (HVGs) were conducted using Seurat’s NormalizeData and FindVariableGenes functions, respectively. Dimensionality reduction was achieved via principal component analysis (PCA) and uniform manifold approximation and projection (UMAP), followed by clustering using the shared nearest neighbor (SNN) algorithm (resolution = 0.4). Marker genes were identified with FindAllMarkers (logfc.threshold > 0, min.pct > 0.25), and differentially expressed genes (DEGs) were detected using FindMarkers (p-value < 0.05, fold change > 1.5). Functional enrichment analysis was performed using GO^[229]82 and KEGG^[230]83 databases. Spatial transcriptomic experiments and library construction Spatial transcriptomics was conducted using the BMKMANU S3000 platform. Leaves from the same developmental stage as snRNA-seq samples were co-embedded in OCT compound within single embedding molds (multiple leaves per block). During embedding, all leaves were uniformly oriented with abaxial epidermis facing upward and adaxial epidermis facing downward, with positional markings applied to the molds. Samples were flash-frozen in liquid nitrogen and stored at −80 °C. Sectioning was performed by obtaining transverse sections (10 μm thickness) through the mid-region of each leaf using 90° vertical cuts. Sections were mounted on BMKMANU S3000 Tissue Optimization and Gene Expression Slides. Tissue optimization determined the optimal permeabilization time, followed by Toluidine Blue O (TBO) staining, imaging, permeabilization, and cDNA synthesis. Libraries were prepared following the manufacturer’s protocol and sequenced on an Illumina platform. stRNA-seq data processing Clean data were processed with BSTMatrix software for quality control. Cell boundary segmentation and fluorescent staining were performed using Cellpose^[231]84 (v2.0). Spots overlapping tissue sections were selected for analysis. Data were aligned to the reference genome (PRJNA416223), and a spot-gene expression matrix was generated. Subsequent analysis in Seurat^[232]80 included normalization via the sctransform (SCT) algorithm, identification of the top 3,000 highly variable genes (HVGs), principal component analysis (PCA), graph-based clustering, and UMAP visualization. Integration analysis of stRNA-seq and snRNA-seq We integrated snRNA-seq and spatial transcriptomics data using anchor-based integration (Seurat)to transfer cell-type annotations from single-cell clusters to spatially resolved spots. This enabled deconvolution of spot-level compositions and revealed subpopulation spatial distributions at near-cellular resolution. To align clusters from single-nucleus and spatial transcriptomic data, we employed multiple computational tools: SingleR^[233]85, SciBet^[234]86, RCTD^[235]87, and CARD^[236]88. SingleR assigns cell type labels by calculating gene expression similarity between spatial transcriptomic spots and single-nucleus transcriptomic cell types, selecting the highest similarity score. SciBet uses supervised learning and Kullback-Leibler (KL) divergence to predict cell-type probabilities based on reference distributions. RCTD, implemented via the spacexr R package (doublet_mode = ‘doublet’), estimates cell type proportions in spatial data and visualizes them on tissue sections. CARD applies a non-negative matrix factorization model to integrate snRNA-seq reference data with spatial transcriptomic data, inferring cell type composition at each spatial location. The code for SingleR ([237]https://github.com/dviraran/SingleR) and SciBet ([238]https://github.com/ PaulingLiu/scibet) is publicly available. Pseudotime analysis Pseudotime analysis was performed using Monocle^[239]89. Highly variable genes across cells were selected for dimensionality reduction. A minimum spanning tree (MST) was constructed, with the longest path representing the differentiation trajectory of transcriptionally similar cells. The trajectory root was determined based on cell cluster or subcluster identity. Genes significantly varying along pseudotime (q-value < 0.1) were identified using the different Gene Test function and clustered via plot_pseudotime_heatmap. For GST clusters, dynamically expressed genes were visualized using plot_genes_in_pseudotime. High-dimensional weighted gene co-expression network analysis (hdWGCNA) hdWGCNA was applied to high-dimensional single-cell RNA-seq data using the hdWGCNA R package^[240]90(v0.2.19). Genes expressed in ≥5% of cells were selected via the SetupForWGCNA function. The k-nearest neighbors (KNN) algorithm aggregated similar cells to generate a meta-cell gene expression matrix^[241]91. Parameter scanning and soft threshold determination were performed using the TestSoftPowers and ConstructNetwork functions, respectively. Module eigengenes (MEs) summarized co-expression module profiles, and gene connectivity (kME) was quantified using the ModuleConnectivity function. Genes with high kME values were identified as hub genes. Downstream visualizations and analyses were conducted to explore module functions. To identify the top candidate genes within these functionally relevant modules, we employed a dual strategy focusing on both intramodular connectivity and association with known key players: We identified the Top 30% genes within each module (Brown and Orange) based on their kME values. For each of these Top 30% genes, we calculated their TOM similarity to a curated set of known key genes involved in artemisinin biosynthesis and GST initiation. We developed a scoring system to integrate these two important aspects: [MATH: Score=(50%*kME+50%*< mi mathvariant="normal">TOM_to_KnownGenes)*< mn>100 :MATH] kME = Module membership value for the candidate gene (range 0 to 1). TOM_to_KnownGenes = Mean Topological Overlap Measure between the candidate gene and the set of known artemisinin/trichome genes (range 0 to 1). RNA in situ hybridization Fresh A. annua leaves (stages 1 and 2) were fixed in FAA solution at 4 °C for 24 h, dehydrated through an ethanol gradient, embedded in paraffin, and sectioned into 10 μm slices using a Leica microtome. Sections were baked at 62 °C for 2 h, dewaxed in xylene and ethanol, rehydrated in DEPC water, and digested with proteinase K (Beyotime, Cat# ST537). After washing with PBS, sections were hybridized overnight with DIG-labeled RNA probes, incubated with mouse anti-DIG-alkaline phosphatase (Jackson, Cat# 200-052-156), and developed using BCIP/NBT substrate (Boster, Cat# AR1023). Images were acquired with a Nikon microscope. Probes were 5’-DIG-labeled oligonucleotides, with details provided in the supplementary Data [242]8. GST morphological analysis GST morphology and developmental dynamics were analyzed in A. annua leaves using a stereomicroscope (Motic SM7). High-resolution imaging was performed under standardized conditions using the ImageView software to ensure consistency across biological replicates. Leaves were systematically sampled from identical nodal positions of three individual plants per developmental stage (S1-S4) to minimize developmental-stage variability. GST density was quantified by manual enumeration of all GSTs across the abaxial leaf surfaces and normalized to GSTs per unit area. Whole-leaf phenotyping was performed with a Nikon (Z MC 105 mm) camera under controlled LED lighting. Metabolite profiling by LC-MS/MS Artemisinin and its precursors (artemisinic acid [AA], dihydroartemisinic acid [DHAA]) were quantified using an optimized LC-MS/MS protocol^[243]50. Three independent biological replicates were analyzed for each sample (n = 3). Leaf samples were dried at 60 °C, and 20 mg of powdered tissue was extracted in 1 mL methanol via ultrasonication (40 min) and centrifugation (14,000 rpm, 10 min). The supernatant was filtered (0.25 μm) and diluted 500-fold for analysis. LC-MS/MS analysis was performed on an Agilent 1200 Infinity LC system coupled to a 6410 Triple Quadrupole mass spectrometer. Separation was achieved using a ZORBAX SB-C18 column (2.1 × 100 mm, 3.5 μm) with isocratic elution (acetonitrile/0.1% formic acid, 65:35 v/v) at 0.3 mL/min. The mass spectrometer operated in positive electrospray ionization mode with multiple reaction monitoring (MRM), with parameters set to: nebulizer gas pressure (40 psi), source temperature (350 °C), gas flow rate (10 L/min), and capillary voltage (4000 V). Data were acquired and processed using MassHunter Workstation software (vB.08.00). Bulk RNA sequencing Total RNA was extracted from leaves (stages 1-3, one-month-old plants, 3 biological replicates per stage) and GST-enriched samples (stages 1-2: young GSTs; stages 3-4: old GSTs; 3 replicates of 100 plants each). Libraries were prepared using the HiEff NGS Ultima Dual-mode mRNA Library Prep Kit (Yeasen Biotechnology) and sequenced on an Illumina NovaSeq 6000 platform (150 bp paired-end reads). Raw data were processed and quality-controlled using the BMKCloud platform ([244]www.biocloud.net). RNA isolation and quantitative real-time polymerase chain reaction (qRT-PCR) Total RNA was extracted using the TransZol Up Plus RNA Kit (Tiangen Biotech) according to the manufacturer’s protocol. First-strand cDNA was synthesized from 1 μg RNA using the TransScript First-strand cDNA Synthesis SuperMix (TransGen Biotech). qRT-PCR was performed with SYBR-Green PCR Mastermix (Takara) on a Thermal Cycler Dice system. Gene expression levels were normalized to actin ([245]EU531837) as an internal control. All reactions were conducted in triplicate, with primer sequences detailed in Supplementary Data [246]8. Statistical analysis Statistical analyses were conducted using GraphPad Prism (v9.8.0.200). Data normality and homogeneity of variance were assessed before analysis. Two-tailed Student’s t-tests were used for comparisons between the two groups. Significance levels were denoted as: *P < 0.05, **P < 0.01, ***P < 0.001, and ****P < 0.0001. Reporting summary Further information on research design is available in the [247]Nature Portfolio Reporting Summary linked to this article. Supplementary information [248]Supplementary Information^ (2.5MB, pdf) [249]41467_2025_63770_MOESM2_ESM.pdf^ (179.6KB, pdf) Description of Additional Supplementary Files [250]Supplementary Data 1-8^ (402.4KB, xlsx) [251]Reporting Summary^ (104.7KB, pdf) [252]Transparent Peer Review file^ (781.4KB, pdf) Source data [253]Source Data^ (71.4KB, xlsx) Acknowledgements