Graphical abstract graphic file with name fx1.jpg [35]Open in a new tab Highlights * • Global methylation identified significantly different proportions of mCs in hybrid * • Of common DMRs, 33.08% showed different methylation in hybrid from the mid-parental value * • Negatively correlated DEG pDMR-genes were enriched in metabolic pathways * • Significant higher expression of metabolites and DE-Gs were identified in the F[1] hybrid __________________________________________________________________ Biological sciences; Natural sciences; Plant biology; Plant Genetics. Introduction The Capsicum (2n = 2x = 24) crop, which belongs to the family Solanaceae, is one of the most important vegetable crops. Although more than 38 species are reported under the Capsicum genus, only Capsium annuum, C. frutescens, C. chinense, C. pubescens and C. baccatum are cultivated ([36]Bosland and Votava, 2000; [37]Ibiza et al., 2012). Besides their use as a spice, Capsicum, commonly known as chilli peppers contains several important secondary metabolites and other phytochemicals compounds which makes them an economically important crop with high industrial and pharmacological potential ([38]Ramchiary et al., 2014; [39]Sarpras et al., 2020; [40]Sricharoen et al., 2017; [41]Sinisgalli et al., 2020; [42]Olatunji and Afolayan, 2019; [43]Baenas et al., 2019). For instance, the naturally found compounds such as phenolics, flavonoids, carotenoids, capsaicinoids, vitamins, and alkaloids in Capsicum can be used for multipurpose industry usage viz. in increasing food shelf-life, improving the sensory properties, seasoning, and coloring of foods, in cosmetic and pharma industry ([44]Baenas et al., 2019). Several studies reported the presence of intra and interspecific diversity of fruit traits, metabolites content, and other agronomic traits in Capsicum species ([45]Ahmad et al., 2021a, [46]2021b; [47]Araceli et al., 2009; [48]Chhapekar et al., 2020; [49]Liu et al., 2020; [50]Sarpras et al., 2016; [51]Sarpras et al., 2019). Furthermore, many studies on marker development, QTL/gene(s) mapping of economically important traits, whole transcriptome, and genome sequencing have been reported on Capsicum species ([52]Ahn et al., 2014; [53]Chhapekar et al., 2021; [54]Dubey et al., 2019; [55]Jaiswal et al., 2020; [56]Kim et al., 2020; [57]Liu et al., 2019; [58]Martínez et al., 2021; [59]Razo-Mendivil et al., 2021). Few studies reported that inter- and intraspecific hybrids of cultivated Capsicum species i.e. C. annuum, C. frutescens, C. chinense, C. baccatum, and C. pubescence, showed vigor for different traits like fruit yield, plant growth, and earliness ([60]Sood and Kumar, 2010; [61]Shrestha et al., 2011; [62]Singh et al., 2014; [63]Gözen et al., 2018). Heterosis breeding has been extensively used in Capsicum crop to increase the production of economically important traits ([64]Herath et al., 2021). A recent study reported the expression of higher capsaicinoids and sugars in heterotic hybrid (derived from crossing between C. chinense x C. frutescens) as compared to their parental lines ([65]Zamljen et al., 2020). Also, the higher accumulation of several metabolites such as sugars (viz. glucose, fructose, and sucrose) and primary metabolites like citric acid was observed in leaves of Capsicum hybrids compared to the parents at lower temperatures ([66]Shiragaki et al., 2021). These observations in different studies suggest that the higher expression of genes and metabolites in F[1] hybrids compared to the parental lines might have a direct role in the manifestation of heterosis in F[1] hybrids. Furthermore, it was reported that DNA methylation plays an important role in the regulation of the expression of genes, which would eventually influence the biosynthesis of metabolites ([67]Sinha et al., 2020; [68]Lauss et al., 2018), thereby, suggesting further the need for integrated analysis of DNA methylation, and expression of genes and metabolites content for detail understanding of the manifestation of heterosis, as no such study is reported in Capsicum crops until date. Although, primarily, DNA methylation causes the silencing of the gene(s) ([69]He et al., 2011, [70]2021); however, a few studies also reported higher expression of genes despite methylation found in the gene body ([71]Arechederra et al., 2018), and the exact mechanism is still not understood. In plants, DNA methylation occurs in three contexts: CG, CHG, and CHH (where H is any nucleotide other than G) and are maintained by MET1, CM3, and DRM2, respectively ([72]He et al., 2021). Studies have also found that DNA methylation also inherits loyally across the generations ([73]Becker et al., 2011; [74]Schmitz et al., 2011) when progenies are derived by selfing. However, the inheritance trend of DNA methylations may be deferred in hybrid genotypes derived by crossing ([75]He et al., 2010; [76]Chodavarapu et al., 2012); and the degree of differential methylation pattern is directly proportional to the divergence of parental genotypes. These allelic and epi-allelic variations and their corresponding effect on the expression of genes in hybrids are considered one of the major causes of heterosis ([77]Lauss et al., 2018). Recently, a number of studies were conducted to explore the DNA methylation pattern and their possible role in heterosis in model plants ([78]Greaves et al., 2012; [79]Shen et al., 2012; [80]Chen, 2013; [81]Groszmann et al., 2013; [82]Dapp et al., 2015; [83]Zhang et al., 2016; [84]Lauss et al., 2018) as well as important crops ([85]He et al., 2010, [86]2013; [87]Shen et al., 2017; [88]Li et al., 2018; [89]Sinha et al., 2020; [90]Zhou et al., 2021). In pigeon pea, the global DNA methylation and transcriptome study identified an overall increase in the methylation in F[1] hybrids ([91]Sinha et al., 2020). The same study reported the differential expression of genes involved in phytohormone signaling, defense response and growth. More than 65% of differentially expressed genes (DE-Gs) were also found to be differentially methylated in hybrids, thereby indicating the active role of DNA methylations in gene expression and ultimately hybrid vigor in F[1] ([92]Sinha et al., 2020). Similarly, in the case of Brassica, higher methylation was also observed in F[1]s than their respective parents ([93]Shen et al., 2017; [94]Li et al., 2018). In Brassica napus (allotetraploid), the DNA methylation was also associated with higher expression of small RNA clusters of that region and was also involved in mechanisms of active growth and flower development ([95]Shen et al., 2017). In B. oleracea, differential DNA methylations were found to contribute to curd yield heterosis ([96]Li et al., 2018). Likewise, efforts were also made to understand the role of DNA methylations in hybrids in other major food crops like rice ([97]He et al., 2010; [98]Zhou et al., 2021), maize ([99]He et al., 2013) and so forth. However, despite the availability of whole genome sequences of Capsicum spp. ([100]Kim et al., 2014, [101]2017; [102]Qin et al., 2014), the integrated analysis of global DNA methylome and its correlation with gene(s) and metabolites expression are not studied in Capsicum hybrids, although few studies explored the role of epigenetics in fruit development and ripening using whole genome methylome ([103]Rawoof et al., 2020; [104]Xiao et al., 2020) and small RNA sequencing ([105]Lopez-Ortiz et al., 2021; [106]Chhapekar et al., 2021), and observed that the co-networking/correlation of methylome and small RNAs were potential players in fruit development. Keeping this knowledge gap in mind, in this study for the first time we report the integrated analysis wherein we observed significant correlations of DNA methylation with the expression of genes and metabolites in heterotic F[1] Capsicum hybrid and their parental lines at the early plant growth stage. Furthermore, our study revealed the complex and dynamic relationships of the genomes of two Capsicum spp. and their inheritance in the interspecific F[1] hybrid. Results Phenotypic analysis of F[1] hybrid and parental lines The interspecific Capsicum F[1] hybrid derived from crossing between two contrasting Capsicum species, C. chinense (Choco, female parent) and C. frutescens (Frut4, pollen donor) was evaluated for heterotic traits along with their parental lines. The parent C. chinense bears large fruits and with extreme pungency, while the C. frutescens bears small and a large number of fruits, with robust plant type and medium pungency. Since the performance of the plants is determined by the early establishment and robust growth of plants from the seedling stage, we chose 60 days old plants for the present study. The developed F[1] hybrid plants showed heterosis over the parental genotypes for phenotypic traits studied ([107]Figure 1A). The F[1] plants showed vigorous performance from early seedlings compared to the plants of parental genotypes. For example, 29.7% mid-parent heterosis (MPH) and 20.0% better parent heterosis (BHP) were observed for leaf lengths (LL). In hybrids, leaves were broader than parental genotypes with high value of heterosis (MPH = 63.2%, BHP = 60.1%) ([108]Figures 1B and 1C). Furthermore, at the time of the flowering height of parents was 31.5 (Choco) and 53.2 cm (Frut4), respectively, however, hybrids reached to a height of 83.6 cm with 97.6% MPH and 56.6% BPH. Similarly, for plant width (PW), upto 80 and 58.8% MPH and BPH, respectively, were observed in hybrids. At the time of flowering, vigourisity were also observed for leaf traits like number of leaves (NL; MPH = 95.5%, BPH = 79.2%), leaf length (LL; MPH = 75.5%, BPH = 52.6%), leaf width (LW; MPH = 84.9%, BPH = 63.7%) in hybrids. These results emphasized that the heterosis in plants was manifested from the early seedling stage. Thus we used DNA/RNA from two-month-old plant stage for methylome and transcriptome analysis. Figure 1. [109]Figure 1 [110]Open in a new tab Trait variation observed in F[1] hybrid and their parents (A) Phenotypic variability observed in adult plants. (B and C) Phenotypic variability in young plants (two months old). Blue line showing mid parent value. PH = plant height, PW = plant width, NBS = number of secondary branches, NL = number of leaves, LL = leaf length, LW = leaf width, TCC = total chlorophyll content. MPH = mid parent heterosis, BPH = better parent heterosis. Whole-genome bisulfite and RNA sequencing of Capsicum hybrid and its parents It is expected that the methylation of two distinct parental lines belonging to different Capsicum species might potentially contribute to the methylome of their F[1] hybrid. Therefore, to obtain an in-depth understanding of the global dynamics of cytosine methylation/epigenetic regulation owing to DNA methylation during hybrid formation at single-base resolution, whole-genome bisulfite sequencing of parental lines and their hybrid was performed. A total of 1660 million (∼553 million/library including biological replicates) of length 150 bp clean bisulfite sequencing reads were obtained from parental lines and their hybrid. Around 1038 million reads from all samples were aligned to the reference genome with an average mapping rate of 62.54%. Overall, 384825557, 278031457, and 375544686 reads from Choco, Frut4, and hybrid were uniquely mapped to the C. chinense reference genome, respectively, thereby providing more than 10X coverage ([111]Table S2A). The reproducibility of the results was also tested using Pearson correlation analysis and a high correlation (0.9–0.95) among the biological replicates of each sample was found ([112]Figure S1). An average of 0.4–0.5% mapping efficiency of each sample against lambda DNA reference genome indicated >99% bisulfite conversion efficiency of sample replicates from parents and hybrids ([113]Table S2B). In addition, to understand gene(s) expression changes along with methylome dynamics in the hybrid compared to its parents, RNA sequencing (RNAseq) was performed using the same leaf tissue used in WGBS. Approximately 476 million of 150 bp clean reads (∼159 million/library including replicates) were obtained from parents and their hybrid was aligned to the C. chinense reference genome. On an average, 84.5% reads of Choco, 93.8% of Frut4, and 96.8% reads of the hybrid aligned to the C. chinense reference genome. ([114]Table S2C). Global methylation landscape in Capsicum hybrid and its parental lines The genome-wide dynamics of cytosine methylation at CG, CHG, and CHH context varied in parental line and their hybrid. Choco showed the highest global methylation in all three contexts as compared to hybrid and Frut4. The hybrid showed higher global average methylation in CG and CHG contexts and lower methylations in the CHH context compared to parent Frut4 ([115]Figure 2A). Also, the frequencies of mCs in Capsicum hybrid and its parental lines ranged from 32.8% to 36.02% ([116]Table S3). In particular context, the fraction of mCs observed was approximately 22.5–24.9%, 31.7–34.9% and 40–45.7% at CG, CHG, and CHH, respectively, in three genotypes ([117]Figure 2B). Notably, compared to both the parents, hybrid showed 1.2-fold higher mCs proportion (24.9 and 34.9%, respectively) at symmetric (CG, CHG) context, while contrasting to this, it showed 1.12-fold lower mCs proportion in asymmetric (CHH) context ([118]Figure 2B). Segmentation of chromosome-wise average methylation showed higher methylation at symmetric CG and CHG context than asymmetric context throughout the 12 chromosomes in all three genotypes ([119]Figure 2C). Also, at segments of chromosome (chr) chr2 to chr5, chr7, and chr11, both hybrid and Frut4 showed slightly higher methylation than Choco. Howbeit, the same chromosomes in hybrid showed slight decrease in methylation than Frut4 ([120]Figure 2C). Figure 2. [121]Figure 2 [122]Open in a new tab Global DNA methylation profile observed in hybrid and parental lines Choco and Frut4 (A) Average methylation, (B) methylated cytosine (mCs) proportion, and (C) average methylation distribution across 12 Capsicum chromosomes at CG, CHG, and CHH methylation context. Methylation pattern at various genomic regions and transposable elements The pattern of mCs at different genomic regions/elements was analyzed in hybrid and its parental line ([123]Figures 3 and [124]S2). Both parental line and hybrid showed the highest average gene-body methylation (GBM) at CG followed by the CHG context, while lowest at the CHH context ([125]Figure 3A). At CG context, the level of GBM looked similar to that in the 2 kb flanking region and was highest in Choco, followed by hybrid, and lowest in the case of Frut4. The GBM at CHG and CHH context showed notably less methylation as compared to the 2 kb flanking region of genes. Unexpectedly, the average GBM at CHG and CHH context was lowest in the hybrid; however, both parental lines showed similar methylation frequencies ([126]Figure 3A). To avoid the potential noise (arising owing to transposable elements), the GBM was segmented into exonic and intronic regions. We observed that methylation at exon body was lower than its flanking regions, while it was higher at intron body than its flanking region at both symmetric and asymmetric methylation contexts in three genotypes ([127]Figures S2A and S2B). On comparing methylations at three different contexts in hybrid and parents, we found that at CHG and CHH contexts, both exonic and intronic body methylations were lower in hybrid than its parents. However, methylation in CG context was at par with Frut4 in exon, lower and higher than Choco and Frut4, respectively in intronic regions ([128]Figures S2A and S2B). Genome-wide proportion/distribution of cytosine methylation was analyzed at different genomic regions including exons, introns, intergenic, and promoter regions across 4 different methylation bins categorized based on percent methylation, i.e. <20%, 20–50%, 50–80%, and >80% DNA methylation ([129]Figure S2C). At CG, these genomic regions showed a higher proportion of mCs with >80% methylation in Choco and hybrid than Frut4. However, with <20% methylation, the hybrid showed a slightly higher methylation proportion than its parental line across all genomic elements. Also, within 50–80% methylation bin at CG and CHG context, both hybrid and Frut4 showed a higher proportion of mCs than Choco ([130]Figure S2C). Figure 3. [131]Figure 3 [132]Open in a new tab Metaplots representing methylation distribution at genomic elements and their flanking regions (A) gene-body, (B–F) transposable-elements (TEs) body including copia, gypsy, LINE, SINE, and DNA-elements, respectively, along with their 1.5 kb flanking regions. Methylation at gene/TE -body and flanking regions was analyzed across 20 and 30 bins, respectively. In addition, methylation pattern within transposable-elements body (TE-body) and their 2kb flanking regions was analyzed for long terminal repeats (LTRs; copia and gypsy), non-LTRs (LINEs and SINEs) and DNA-elements ([133]Figures 3B–3F). Overall, higher TE-body methylation than its flanking region was observed at both symmetric (CG, CHG) and asymmetric (CHH) contexts in parental lines and their hybrid. The methylation in LTRs at both TE-body and flanking regions was consistently higher at all contexts ([134]Figures 3B and 3C), while in non-LTRs and DNA-elements, it was preferentially higher at TE-body than their flanking regions ([135]Figures 3D–3F and [136]S2D). Across all TE methylation, at both CG and CHG contexts, hybrid showed lower methylations than Choco and higher methylation than Frut4, while in the CHH context, hybrid showed peculiarly lowest TE methylation ([137]Figures 3B–3F). Likewise, in hybrid, at CHG context, SINE (non-LTRs) and at CHH context all TE (LTRs, non-LTRs, and DNA-elements) were hypo-methylated than its two parents ([138]Figure S2E). Altogether, as far as the CG context is concerned, the hybrid had lower methylation than Choco and higher methylation than Frut4 in all the genomic regions (GBM, TEs, DNA elements); however, in the CHH context, hybrid had lower methylations than both the parents. In the case of the CHG context, hybrid has lower methylations than both the parents in GBM and SINE, and in COPIA, GYPSY, LINE, DNA elements, and hybrid had lower and higher methylations than Choco and Frut4, respectively. Differentially methylated regions in Capsicum parental lines and their hybrid Analysis of differentially methylated regions (DMRs) between parental lines (Choco, Frut4) and their hybrid led to the identification of a total of 70597, 108797, and 38418 unique DMRs at CG, CHG, and CHH methylation context, respectively ([139]Tables S4, [140]S5, and [141]S6 and [142]Figure S3A). In Choco vs hybrid and Frut4 vs hybrid, around 8.8–9.2% and 10.1–11.2% DMRs in symmetric CG and CHG context, respectively while, 19.9% DMRs in the CHH context, were uniquely methylated ([143]Figures S3B–S3D; [144]Table S7A). It was noted that, in both symmetric contexts, Frut4 had a high number of hypo-methylated DMRs, while in the CHH context; it had high hyper-methylated DMRs compared to hybrid and Choco ([145]Figure S3A). Also, hybrid had a low contribution of DMRs, i.e. 27.7%, 28.3 and 13.9% from CG, CHG, and CHH context respectively when compared to Choco, indicating hybrid was hypo-methylated than parent Choco. However, when compared to the other parent Frut4, hybrid showed an increased contribution of DMRs, i.e. 80.8%, 75.2%, and 15.8% from CG, CHG, and CHH context, respectively, indicating that hybrid is hyper-methylated than parent Frut4 ([146]Table S7B). Then the density of DMRs in hybrid and its parents was checked across 12 Capsicum chromosomes. It was observed that at CG and CHG context, the density of hyper DMR was more in Choco vs hybrid, while it was low in Frut4 vs hybrid and Frut4 vs Choco ([147]Figures S4A–S4C). However, when analyzed at the individual chromosome level, low-density of hypo DMRs was observed in Choco vs hybrid across segments of chr2 to chr5, chr7, and chr11 at both symmetric contexts ([148]Figure S4A). But in Frut4 vs hybrid at CG and CHG context and in Frut4 vs Choco at CG, CHG, and CHH context, the same aforementioned chromosome segments showed increased density of hypo-methylated DMRs ([149]Figures S4B and S4C). After this, the genomic annotation/distribution of these DMRs was analyzed and observed that most of the DMRs in hybrid and its parents were from intergenic regions of the genome ([150]Figures 4A–4C and [151]Table S7C). Of total DMRs in Choco vs hybrid, around 81.9–87.12%, 3.8–6.7%, and 6.8–11.3% DMRs were from intergenic, intragenic, and promoter regions respectively ([152]Figure 4A and [153]Table S7C). Similarly, in Frut4 vs hybrid, of total DMRs, 81.4–89.3%, 3.6–7.7%, and 5.8–10.8% were from intergenic, intragenic, and promoter regions, respectively ([154]Figure 4B and [155]Table S7C). Also, in Choco vs hybrid, all intergenic, intragenic, and promoter DMRs were hyper-methylated at all three methylation contexts ([156]Figures 4A and [157]S5A). However, in Frut4 vs hybrid and Frut4 vs Choco at CG and CHG context, the intergenic DMRs were hypo-methylated ([158]Figures 4B, 4C, [159]S5B, and S5C). Figure 4. [160]Figure 4 [161]Open in a new tab Genomic distribution of differentially methylated regions (DMRs) at three methylation context Barplots representing DMRs distribution at intergenic, intragenic, and promoter regions in (A) Choco vs hybrid, (B) Frut4 vs hybrid, and (C) Frut 4 vs Choco. Furthermore, we identified a total of 4891 (2206 CG, 2353 CHG, and 332 CHH) common DMRs between hybrids and both of its parents ([162]Table S8.). Based on common DMRs, the predictive level of methylation in hybrid was estimated against the mean average parental methylation level or mid-parental methylation values (MPMV). Of common DMRs, only 33.08% (1618) showed significant different methylation (p value < 0.01; FDR<0.05) in hybrid from MPMV and DMRs in hybrid having higher methylation than MPMV were categorized as trans-chromosomal methylation (TCM), while those having methylation less than MPMV were categorized as trans-chromosomal demethylation (TCdM). Most of these DMRs (86.5%) were located in intergenic regions ([163]Table S8). Furthermore, the percent methylation contribution of both parents in hybrid as well as newly originated mCs in hybrid were identified. A total of 77994412 mCs in hybrid were from both parents out of which 78.92% (61551482 mCs) were from Choco and 21.08% (16442930 mCs) were from Frut4 ([164]Tables S9A and S9B). Also, around 76.5–76.7% mCHH sites in hybrid were contributed by both parents, while only 8.8–9.1% mCG and 14.4–14.5% mCHG sites arose from both parental lines. Additionally, a total of 15523997 new mCs originated in hybrid, which contributed to 76.99% mCHH, 13.94% mCHG, and 9.07% mCG sites ([165]Tables S9A and S9B). Allele-specific methylation variations The significantly methylated alleles (ASM) at single nucleotide positions were analyzed across the DMRs spanned over various genomic regions in parental lines and their F[1] hybrid. Around 8803–11379 CG, 10964–13966 CHG, and 4901–7299 CHH SNVs associated with pDMRs negatively correlated with their corresponding gene expressions were identified from Capsicum F[1] hybrid and its parental lines. Of these SNVs, 1449–2338 in Choco, 1310–1783 in Frut4, and 1468–2315 in hybrid were significantly methylated ASM (adjusted p value <0.05) and were associated with 523, 753, and 734 pDMRs genes at CG, CHG, and CHH context, respectively ([166]Figure 5A and [167]Tables S10A–S10C). In addition, significant methylated ASMs (adjusted p value <0.05) associated with TCM and TCdM DMRs were also identified in Choco (404 CG, 605 CHG, 282 CHH), Frut4 (1236 CG, 1098 CHG, 244 CHH) and their F[1]hybrid (3240 CG, 3686 CHG, 340 CHH, [168]Figure 5B and [169]Tables S10D–S10F). Furthermore, when comparing the ASM between the hybrid and its parents, a total of 3577 CG, 5130 CHG, and 3916 CHH ASM associated with promoter DMRs were identified collectively ([170]Figure 5C), while, a total of 4669 CG, 5163 CHG and 680 CHH ASM associated with TCM and TCdM DMRs were collectively identified from the parental lines and hybrid ([171]Figure 5D). Furthermore, at CG and CHG context, the majority of ASM associated with TCM and TCdM DMRs were from intergenic regions and both parents had comparatively lower number of ASM than hybrid ([172]Figures 5E and 5F). Although, at CHH context, the number of intergenic ASMs associated with TCM DMRs was lower than TCdM DMRs in both parental lines and their F[1] hybrid ([173]Figure 5G). Besides, the ASM showed a differential methylation pattern between both Capsicum parental lines and their F[1] hybrid. For instance, ASM with G>A/T alleles in the promoter region of gene BC332_22173 showed preferentially high methylation in the hybrid compared to both parents and it was observed that methylation in hybrid was preferentially affected by the A/T allele from parent Frut4 ([174]Figure 6A). Also, the expression of the gene showed the overall negative correlation with the promoter methylation ([175]Figure 6D). Another genes such as BC332_29730, BC332_09755, BC332_14352, BC332_20263 and BC332_23927 with ASM having C>T allele showed preferential contribution of methylation in hybrid from parental lines ([176]Figures 6B–6F). The ASM associated with the promoter of BC332_20263 and BC332_23927 genes from the glyoxylate and dicarboxylate metabolism pathway showed low or no methylation for allele T in parent Frut4. Also, in hybrid overall methylation for allele T was slightly lower than mid-parent methylation, while the expression of gene was also preferentially high in hybrid than in parent Frut4 ([177]Figure 6G). Additionally, ASM associated with TCM and TCdM DMRs showed preferential methylation and demethylation ([178]Figure 7). For example, in TCdM associated ASM with G>T allele at chr 9 ([179]CM008439.1:4821650), the allele T was hypo methylated in hybrid compared to parent Frut4 ([180]Figure 7A), while in ASM with G>A allele at chr 10 ([181]CM008440.1:221199113), the allele G was hypo methylated in hybrid compared to parent Frut4, and the methylation at allele A was transferred to hybrid from parent Choco ([182]Figure 7B). Similarly, in TCM-associated ASM with C>T allele, allele T showed hypo-methylation in both hybrid and parent Frut4 compared to other parent Choco, while allele C was preferentially hyper-methylated in hybrid compared to both parental lines ([183]Figure 7C). Figure 5. [184]Figure 5 [185]Open in a new tab Summary of allele-specific methylation (ASM) and single nucleotide variants (SNV) identified in Capsicum F[1] hybrid and its parental lines at CG, CHG, and CHH methylation context SNV and identified significant ASM associated with (A) Promoter DMRs (pDMRs), and (B) trans chromosomal methylation (TCM) and demethylation (TCdM) DMRs, common and unique ASM from (C) pDMRs and (D) TCM/TCdM DMRs, (E, F, and G) ASM distribution across the genomic region (promoter, intragenic and intergenic) at three different methylation context between the F[1] hybrid and its parental lines. Figure 6. [186]Figure 6 [187]Open in a new tab Allele-specific methylation (ASM) associated with promoter differentially methylated regions (pDMR) genes in Capsicum F[1] hybrid and its parental lines. (A–F) Representative ASM locus at CG context associated with heterozygous SNV sites in DMRs located in promoters region of genes and (G) expression pattern of genes in both parental lines Choco and Frut4 with respect to F[1] hybrid. Figure 7. [188]Figure 7 [189]Open in a new tab Allele-specific methylation (ASM) associated with non-additive trans chromosomal methylation and demethylation (TCM and TCdM) differentially methylated regions (DMRs). Representative heterozygous ASM locus with (A) G>T allele and (B) G>A allele associated with TCdM DMRs while (C) C>T allele associated with TCM DMRs in Capsicum F[1] hybrid and its parental lines. Expression profiling in Capsicum hybrid and its parents The expression profiling of transcripts and differential expression of genes between the Capsicum hybrid and its parents (Choco and Frut4) was analyzed using RNAseq data. A total of 6875 genes were significantly (p value < 0.001 and FDR <0.01) differentially expressed (DE-Gs) between hybrid and its parents. A total of 2003 (1083 up-regulated; 920 down-regulated) genes were DE-Gs between Choco vs hybrid and 4815 DE-Gs (2187 up-regulated; 2628 down-regulated) in Frut4 vs Hybrid, while 4316 DE-Gs (1924 up-regulated; 2392 down-regulated) in Frut4 vs Choco, were significantly expressed ([190]Table S11). Furthermore, it was observed that these DE-Gs were significantly enriched (adjusted p value < 0.01) in different pathways, including photosynthesis (ko00195), glyoxylate and dicarboxylate metabolism (ko00630), photosynthesis-antenna proteins (ko00196), carbon metabolism (ko01200), retinol metabolism (ko00830), and ubiquinone and other terpenoid-quinone biosynthesis (ko00130) ([191]Table S12A). Also, gene ontology (GO) terms from biological processes (BP) such as photosynthesis and light harvesting (GO:0009765, GO:0015979), transport of inorganic anion (GO:0015698), precursor metabolites and energy generation (GO:0006091) and organic acid catabolic process (GO:0016054) were found to be significantly enriched. Although GO terms related to the photosystem I and II (GO:0009521, GO:0009523, GO:0009522) under cellular component (CC) and GO terms such as those related to binding of carbohydrate, chlorophyll, monosaccharide (GO:0030246, GO:0016168, GO:0048029) and related to protein serine/threonine kinase activity (GO:0004674) were significantly enriched (p value < 0.01; FDR<0.05) under molecular function (MF) category ([192]Table S12B). Correlation between methylation instances and expression of genes Karl Pearson’s correlation analysis was performed to understand the correlations between the overall expression of genes and their average methylation in Capsicum hybrid and its parents Choco and Frut4. Overall, a significant negative correlation (R = −0.16 to −0.22; p value < 0.001) between the expression of genes and their CG and CHG methylation instances was observed in all the samples ([193]Figures 8A and 8B). However, no negative correlation was observed at CHH methylation ([194]Figure 8C). Furthermore, pDMRs were screened for their associated genes, and a total of 4332, 4782, and 4127 pDMRs underlying in 3154, 3476, and 3213 genes were identified at CG, CHG, and CHH context, respectively, between hybrid and its parental lines ([195]Tables S13, [196]S14, and [197]S15). Furthermore, 15.3–24.25% pDMR-Genes at CG, 13.8–21.64% pDMR-Genes at CHG and 15.54–25.62% pDMR-Genes were significant DE-Gs between hybrid and its parental lines. Also, only 55 CG, 49 CHG, and 3 CHH pDMRs were common in hybrid and its parental lines ([198]Figures 8D–8F). Out of total pDMRs, 1052 (662 hyper, 390 hypo) CG, 1122 (598 hyper, 524 hypo) CHG, and 1123 (846 hyper, 277 hypo) CHH pDMRs were identified in Choco vs Hybrid ([199]Table S7C). In Frut4 vs hybrid, a total of 1459 (453 hyper, 1006 hypo) CG, 1622 (826 hyper, 796 hypo) CHG, and 1307 (973 hyper, 334 hypo) CHH pDMRs were identified, while, in parents (Frut4 vs Choco), a total of 2940 (809 hyper, 2131 hypo) CG, 3151 (1252 hyper, 1899 hypo) CHG and 2379 (1190 hyper, 1189 hypo) CHH pDMRs were identified ([200]Tables S13, [201]S14, and [202]S15). Additionally, a total of 1854 (496 with DE-Gs) CG pDMRs, 1960 (532 with DE-Gs) CHG pDMRs, and 1800 (463 with DE-Gs) CHH pDMRs showed a negative correlation with the expression of genes between Choco vs hybrid, Frut4 vs hybrid and Frut4 vs Choco comparisons ([203]Tables S13, [204]S14, and [205]S15 and [206]Figures S6A–S6C), respectively. Moreover, these negatively correlated DEG pDMR-Genes were significantly enriched (adjusted p value < 0.01) in pathways ([207]Table S16A), including photosynthesis (ko00195), oxidative phosphorylation (ko00190) as well as were significantly enriched (p value < 0.01; FDR<0.05) in several GO terms such as those related to hydrogen and proton transports (GO:0006818, GO:0015992, GO:1902600) under BP, transmembrane transporter activity of monovalent inorganic cation and hydrogen ions (GO:0015077, GO:0015078) under MF and mitochondrial inner membrane (GO:0005743), organelle envelope (GO:0031967) under CC ([208]Figures S7 and [209]S8A–S8C; [210]Table S16B). Figure 8. [211]Figure 8 [212]Open in a new tab The correlation of promoter methylation with gene(s) expression Scatter plot showing a correlation between promoter methylation and gene expression in (A) Choco, (B) Frut4 and (C) hybrid at CG, CHG, and CHH context. Venn diagram representing common and unique promoter DMRs along with their gene count (in red text) among at (D) CG, (E) CHG and (F) CHH methylation context. qRT-PCR expression analysis The expression of 24 DE-Gs which showed a negative correlation with their promoter methylation at CG, CHG, and CHH context was validated using qRT-PCR in hybrid and its parental lines. Of the 24 genes, 22 (91.66%) showed similar expression patterns with their respective RNAseq expression data ([213]Figure 9A). The expression of these genes was higher while promoter methylation was lower in hybrid than both parents Choco and Frut4. Genes such as Cellulose synthase A catalytic subunit 9 (BC332_17838), Nudix hydrolase 8 (BC332_20283), Dead-box ATP-dependent RNA helicase 9 (BC332_18776), and Glycerol-3-phosphate dehydrogenase (BC332_12420) at CG context, Bifunctional aspartokinase/homoserine dehydrogenase 1 (BC332_02193), ATP-dependent zinc metalloprotease FtsH (BC332_10767), Photosystem II reaction center W protein (BC332_17671), Eceriferum 3 (BC332_18054), hypothetical protein BC332_20287, Chlorophyll a-b binding protein 3C (BC332_21229), hypothetical protein BC332_22614, Bifunctional enolase 2/transcriptional activator (BC332_22856), Diacylglycerol kinase 5 (BC332_25145), hypothetical protein BC332_28523, and Elongation factor 1-gamma 2 (BC332_29084) at CHG context, while Leucine-rich repeat receptor-like protein kinase PXL2 (BC332_03097), protein AUXIN SIGNALING F BOX 3 (BC332_04750), Chlorophyll a-b binding protein of LHCII type III (BC332_07524), 3-ketoacyl-CoA synthase 11 (BC332_10084), protein ASPARTIC PROTEASE IN GUARD CELL 2 (BC332_12969), putative peroxygenase 5 (BC332_13123), CBL-interacting protein kinase 5 (BC332_17800), and NAD(P)H-quinone oxidoreductase subunit H (BC332_27111) at CHH context, had significantly (q-value <0.001) high promoter methylation in both Choco and Frut4 compared to hybrid. Also, genes such as Photosystem II protein D1 (BC332_02125) at CG and CHG context, Chlorophyll a-b binding protein of LHCII type III (BC332_07524) at CHH context, had significantly (q-value <0.001) high promoter methylation in Frut4 compared to Choco. The qRT-PCR expression trend of these genes (except BC332_17838, BC332_17671 at CG and CHG in Frut4 vs Hybrid) between hybrid and both parents was similar to their RNAseq expression data. Also, the qRT-PCR expression of all genes was significantly different (p value < 0.05, p value < 0.01 and p value < 0.001) among samples, except for genes BC332_27111 in Choco vs hybrid, BC332_29084 in Frut4 vs hybrid and BC332_2125 in Frut4 vs Choco - which did not show a significant difference in their expression ([214]Figure 9A). Figure 9. [215]Figure 9 [216]Open in a new tab Gene expression observed in quantitative real-time (qRT) PCR analysis (A) Twenty-four (24) significantly (p value <0.01) differentially expressed genes (DE-Gs) showing negative correlation with promoter methylation in parental line Choco, Frut4 and their hybrid. Genes showing qRT-PCR expression similar to their RNAseq data were represented in blue boxes. LFC represents expression in log2 fold change and significant p values were denoted by ∗ <0.05; ∗∗ <0.01; ∗∗∗ <0.001. (B) Differential expression pattern of DNA methyltransferase and demethylase genes in parental lines Choco and Frut4 compared to F[1] heterotic hybrids analyzed using RNA sequencing (RNA-seq) and qRT-PCR expression data. Letters a, b, and c shown on bar indicate the significant expression difference of a gene between each parent and hybrid with p value less than 0.001, 0.01, and 0.05, respectively. Additionally, we have examined the expression patterns of 10 genes belonging to enzymes including DNA methyltransferases (METs), chromomethylases (CMTs), demeter-like (DML), and domain rearranged methyltransferases (DRMs) using qRT-PCR and RNA-seq analysis ([217]Figure 9B). We observed similar expression patterns of 8 out of 10 genes in both the expression analysis. The expression pattern of genes such as CcMET1 (BC332_12132) and CcDRM5-like 1 (BC332_03361) was higher in parental lines compared to F[1] hybrid and was significantly different between Choco-vs hybrid and Frut4-vs-hybrid (p value <0.001; p value <0.01). Similarly, but opposite expression pattern (low in parental lines than hybrid) of genes including CcDRM6-like (BC332_13992), CcDML1-like (BC332_22864), CcCMT2-like (BC332_20529) and CcCMT3-like (BC332_33051) was observed. Significant metabolites in hybrid Our GC-MS analysis identified more than 100 metabolites, of which only 71 were found in both parents and hybrid were used for further analysis. The overall changes in metabolic content between parents and their hybrid were analyzed in term of log2 fold change or ratio of normalized area. Most commonly altered metabolites in hybrid compared to parental genotypes were amino acids and their derivatives, and carboxylic acid and sugar derivatives groups. For instance, L-proline, L-serine, L-threonine, L-5-oxoproline, 1-amino-propan-2—Ol and 4-aminobutanoic acid from the amino group, 1,2,3-propanetricarboxylic acid from the carboxylic group and galactinol from sugar derivatives were significantly different in hybrid compared to parents ([218]Figure 10). Also, metabolites such as octanoic acid, sucrose, palmitic acid, quininic acid, cyclohexane, D-fructose, and silanol were significantly higher (p > 0.001) in the parent Choco than the hybrid, while most of the sugar and its derivatives including D-fructose, D-glucitol, sucrose, D-psicofuranose, and amino acid L-threonine were significantly higher in hybrid compared to the parent Frut4 ([219]Figure 10A). Moreover, we have observed that two pathways namely glyoxylate and dicarboxylate metabolism and glycine, serine, and threonine metabolism were most significantly (p value <0.01) enriched ([220]Figures 10B and 10C). In hybrid, metabolites such as glycine, L-threonine, succinate, and 2-oxoglutarate showed significantly (p value < 0.05 & p value < 0.01) high response ratio compared to both the parental genotypes, while metabolites L-serine, glyceric acid, and D-glycerate showed low response ratio than in both the parental genotypes ([221]Figures 10D–10J). In addition, few metabolites showed slightly higher content in the F[1] hybrid compared to both parents but were not significantly different from either or both of the parental genotypes. For example, the content of 2-ketoglutaric acid (2-oxoglutarate) was higher in hybrid compared to both parents but was not significantly different ([222]Figures 10A and 10J). Likewise, the content of metabolite butanedioic acid (succinate) was higher in the F1 hybrid than both parental genotypes but was only significantly different (p value <0.01) compared to parent Choco ([223]Figures 10A and 10I). Figure 10. [224]Figure 10 [225]Open in a new tab Global metabolite profiling in hybrid and parental genotypes (A) Heatmap representing metabolite changes in term of log2 fold change (ratio) of normalized area, (B) pathway enrichment of metabolites commonly identified in hybrid and parental genotypes, (C) pathway map of significantly enriched pathways, and (D–J) metabolites with significantly high and low response ratio compared to both parents Choco and Frut4. Metabolite response difference between Choco vs. hybrid, Frut4 vs hybrid and Fru4 vs. Choco are represented as a, b, and c, respectively and the significant level are indicated as ∗ for p value <0.5, ∗∗ for p value <0.01 and ∗∗∗ for p value <0.001. Correlation analysis of promoter methylation with the expression of genes and metabolites The integrated analysis between significant DE-Gs with their promoter methylation and metabolites biosynthesis pathway was performed to find the plausible impact of expression of genes and/or promoter methylation in Capsicum hybrid and parental genotypes. Our analysis observed the expression of 43 significant DE-Gs which were enriched in glyoxylate and dicarboxylate metabolism pathways. Of the total, 28 and 31 DE-Gs showed increased or higher expression in F[1] hybrid than parent Choco and Frut4, respectively ([226]Figure S9). Also, the expression of 28 DE-Gs was higher in hybrid than the mid-parent expression value. Few genes including Glycerate dehydrogenase, Glutamine synthetase, Glycine cleavage system H protein, Aminomethyltransferase, Aconitate hydratase, and Serine hydroxymethyltransferase showed 1-3 fold higher expression than mid-parent expression values, while genes such as ribulose bisphosphate carboxylase, and Malate dehydrogenase showed 2-47 fold higher expression than mid-parental expression value. Furthermore, the promoter regions of these genes was scrutinized for methylated cytosine which revealed that the upstream promoter regions of 14 genes including Glutamine synthetase, Glycine cleavage system H protein, Serine hydroxymethyltransferase, ribulose bisphosphate carboxylase, and Malate dehydrogenase were less/low methylated than mid-parental methylation value (MPMV) and hence showed higher gene expression of those genes compared to the mid-parent expression values ([227]Figure S9). Discussion Hybrid breeding is one of the most preferred methods in many crop plants owing to its heterotic performance and higher yield, among many advantages. However, the detailed mechanism of the heterotic performance of F[1] hybrids compared to parental lines is not still understood well especially with respect to the DNA methylation and its correlation with the expression of genes and metabolites. In this study, we developed an interspecific hybrid by crossing genotypes (Choco and Frut4) belonging to two different Capsicum species and found heterosis in the developed hybrid ([228]Figure 1). Through earlier, different genetic models like dominance and overdominance hypothesis have been proposed to explain hybrid vigour/heterosis in F[1] hybrids ([229]Jones, 1917; [230]Crow, 1948, [231]1999; [232]Schnable and Springer, 2013); however, these models do not provide sufficient explanation or prediction of heterosis. Recently, the role of DNA methylation in hybrid vigour has also been reported in several important plants, including Arabidopsis ([233]Shen et al., 2012; [234]Kawanabe et al., 2016), tomato ([235]Raza et al., 2017), rice ([236]He et al., 2010; [237]Chodavarapu et al., 2012), pigeon pea ([238]Junaid et al., 2018; [239]Sinha et al., 2020), and Populus ([240]Gao et al., 2014). Using methylation-sensitive amplified polymorphism (MSAP) analysis, [241]Xu et al. (2015) reported the adjustment of methylation patterns in heterotic reciprocal heterotic hybrids of two hot peppers. In the present study, we conducted an integrated high-resolution mapping of genome-wide DNA methylation at single-base resolution in interspecific Capsicum hybrid and its parental genotypes. This study also identified gene(s) expression signatures and showed expression correlation with their promoter methylation in both parental lines and their F[1] hybrid. Therefore, this study may prove to be a useful asset for research and investigations of molecular mechanisms influencing the heterotic phenomenon in Capsicum. DNA methylations are highly variable in three different contexts i.e. CG, CHG, and CHH across the different plant species ([242]Alonso et al., 2015; [243]Niederhuth et al., 2016; [244]Takuno et al., 2016). For instance, B. vulgaris showed maximum (∼81%) methylations at CHG; however, Eutrema salsugineum showed the lowest (9.3%) methylation at CHG context ([245]Niederhuth et al., 2016). In the case of tomato, CG methylations were prominent (71.6–72.8%) followed by CHG (52.5–53.0%) and CHH (10.7–12.5%) methylations ([246]Zuo et al., 2018). Earlier 82.8–89.6% CG, 77.8–83.9% CHG, and 22.4–25% CHH methylation was reported in fruits of three different Capsicum species, i.e. Capsicum annuum, C. chinense, and C. frutescens ([247]Rawoof et al., 2020). The present study also identified similar methylation i.e. 87.9–90.6% CG, 78–83.7% CHG and 17.2–21.1% CHH methylation context in leaves of hybrid and its parental lines ([248]Figure 2A). In Arabidopsis, it was reported that CMT2 and RNA-directed DNA methylation (RdDM) facilitate and maintain methylation in the CHH context ([249]Zemach et al., 2013). In hybrid, it was observed that the fractional proportion of CHH methylation was the lowest than both Choco and Frut4 parental lines, which suggested that CMT2 and RdDM mediated CHH methylation instances were less active in Capsicum hybrid ([250]Figure 2B). Patterns of DNA methylation are ubiquitous in nature and occur throughout the genome, including genic regions, inter-genetic regions, transposable elements; DNA elements and so forth and their density/distribution across these regions may provide biological information regarding functional processes and methylation regulation ([251]Takuno and Gaut, 2012, [252]2013). Furthermore, it was reported that promoter methylation affects the expression/transcription of genes. Also, high CHG and CHH methylation in flanking regions than gene-body was reported in fruits of Capsicum ([253]Rawoof et al., 2020). In Arabidopsis and rice, high CHG methylation at GBM along with flanking regions was observed with low transcript/gene expression ([254]Saze et al., 2008; [255]Zemach et al., 2010; [256]Klosinska et al., 2016). A similar GBM pattern was observed in Capsicum hybrid and its parental lines Choco and Frut4 ([257]Figure 3A). However, in pur study, the hybrid showed the lowest CHG and CHH GBM compared to both parental lines, suggesting methylation via RdDM pathway is less active in the Capsicum hybrid. One of the most important functions of DNA methylation is to stabilize transposable elements (TEs) ([258]Chan et al., 2005; [259]Law and Jacobsen, 2010). In this study, it was found that TE body along with flanking region of the majority of TEs including LTRs, non-LTRs, and DNA elements were densely methylated ([260]Figures 3B–3F and [261]S3), suggesting their possible role in genome stability. Methylation at both TE-body and TE flanking regions regulates transposon silencing ([262]Inagaki and Kakutani, 2012). It was observed that TE-body methylation was higher than GBM ([263]Figures 3 and [264]S3), which was similar to earlier studies ([265]Rawoof et al., 2020; [266]Sinha et al., 2020); however, at CHH context, methylation in both GBM and TEs is lower in Capsicum hybrid than its parents Choco and Frut4, suggesting that de novo maintenance of CHH methylation instances in TEs and GBM via CMT2 pathway is less active in Capsicum hybrid. The interaction between parental genomes and epigenomes within the nuclei of hybrid alters the cytosine methylation patterns ([267]Chodavarapu et al., 2012; [268]Greaves et al., 2012; [269]Shen et al., 2012). These alterations of mCs occur at particular chromosomal segments/regions which are differentially methylated (DMRs) between hybrid and parents ([270]Greaves et al., 2014). In this study, we observed that at CG and CHG context, hypo and hypermethylated DMRs were the most abundant in hybrid when compared with parent Choco and Frut4, respectively ([271]Figures S3 and [272]S4), while at the CHH context, hybrid had a higher abundance of hypo DMRs than both parents ([273]Figures S3 and [274]S4). It was reported that DNA methylations are associated with regulatory function and are more prominent in intergenic regions than in genic regions ([275]He et al., 2010). In this study, the genomic distribution of DMRs showed that the majority of DMRs in Capsicum hybrid and its parental lines belonged to the intergenic regions, followed by promoter regions and genic regions ([276]Figures 4 and [277]S5), which indicated their potential role in gene regulation and heterosis ([278]Zhao et al., 2007; [279]Jin et al., 2008; [280]He et al., 2010; [281]Greaves et al., 2012; [282]Sinha et al., 2020). There are two types of genetic regulators, first consisting of cis-acting regulators including promoter sequences, and the other being trans-acting regulators like transcription factors ([283]Springer and Stupar, 2007). It has been suggested that trans-acting elements are mainly involved in non-additive variations; however, additive variations may be caused by both cis- and trans-acting elements ([284]Swanson-Wagner et al., 2006; [285]Springer and Stupar, 2007; [286]Guo et al., 2008; [287]He et al., 2010). The non-additive methylation changes in Capsicum hybrid were attributed to TCM and TCdM whereas compared to the parental segments; methylation in hybrid was higher in case of TCM and lower in case of TCdM ([288]Greaves et al., 2014). Both TCM and TCdM are considered major epigenetic factors driving wide changes in hybrid methylome; however, their possible role in transcript expression regulations in heterotic hybrids remained unclear ([289]Greaves et al., 2014). During the present study, 4891 TCM (2272) and TCdM (2619), majorly located in intergenic regions of the genome, were identified in Capsicum hybrid, and only 33.08% (1618) were significantly different from mid-parent-methylation ([290]Table S8), suggesting significant trans-acting effects in Capsicum hybrid that may play an important role in heterosis. In plants, the methylation of the DMRs in the promoter of genes not only involved in the alteration of gene expression but is also used to understand the maintenance of parental variations in hybrid ([291]Zemach et al., 2010; [292]Ma et al., 2021). This study identified several allelic methylation associated with pDMR gene which was preferentially methylated in heterotic Capsicum hybrid ([293]Figures 5, [294]6, and [295]7). It was reported that the genetic diversity of parental genomes even at single nucleotide polymorphism might impact the expression of functional genes in hyrbid owing to the expression of either of the parental allele and thus may contribute to heterosis ([296]Botet and Keurentjes, 2020; [297]Zebell, 2021). This study observed that the expression of Malate dehydrogenase (BC332_23927) gene from glyoxylate and dicarboxylate metabolism pathway in F[1] hyrbid was higher than mid-parent expression, while its promoter region was hypo-methylated than mid-parent methylation ([298]Figures 6F and [299]S9). Additionally, the allele T with ASM (C>T) was preferentially high methylated than one of the parental alleles from Frut4 ([300]Figure 6F). Likewise, another gene Glycine cleavage system H protein (BC332_20263) showed significantly higher expression in hybrid than both the parental lines ([301]Figure 6G), while in its promoter region, methylation of allele T with ASM (C>T) was contributed from parental allele from Choco ([302]Figure 6E), suggesting that parental allele associated with pDMRs with their corresponding gene expression might be involved in heterosis, however, its molecular mechanism remains elusive. Earlier in rice, it was reported that parental alleles associated with CG and CHG DMRs were stably transferred in hybrid ([303]Ma et al., 2021). Therefore, to get insight into the conservation and transfer of the parental DMRs (TCM & TcdM) in the Capsicum hybrid, this study analyzed the methylation at the single base allele (ASM) associated with these TCM and TCdM DMRs in hybrid and its parental lines ([304]Figures 7A–7G). This study observed that at both CG and CHG contexts, high number of ASM were associated with TCM and TCdM DMRs, suggesting that the variations at CG and CHG methylation contexts are substantially maintained for parental alleles in F[1] Capsicum hybrids, which is in concordance with the previous finding. The stability of newly formed methylation instances in F[1] hybrid and the inheritance of methylated cytosines from parental lines to successive generations can be assessed using WGBS at single base resolution ([305]Becker et al., 2011; [306]Schmitz et al., 2011; [307]Van Der Graaf et al., 2015). In Arabidopsis, the inheritance of ∼45% of newly methylated sites was stable and maintained between generations ([308]Hofmeister et al., 2017). In Capsicum hybrid, the majority of mCs (78.9%) were inherited from parent Choco, while only 21.1% were inherited from Frut4 ([309]Table S9), suggesting major mCs in hybrid were inherited from parent Choco. The use of Choco as female parent in the development of hybrid could be one of the reasons for the higher proportion of mCs (78.9%) from Choco than Frut4. Furthermore, only a small fraction (9% of total inherited mCs) of newly formed methylated sites was identified in hybrid and of which ∼76% of newly formed mCs belonged to the CHH context and only ∼23% mCs originated at CG and CHG context ([310]Table S9). Altogether, our data suggested that although the majority of methylations are stanchly inherited across the generations as suggested previously ([311]Zhang et al., 2008; [312]Becker et al., 2011; [313]Van Der Graaf et al., 2015; [314]Hofmeister et al., 2017), however, newly formed methylations and demethylation events in hybrid may play a possible role in providing heterosis to the plants which need to be further confirmed. Hitherto, a number of studies have suggested that DNA methylation may differently regulate the transcript expression in hybrids and may contribute to the heterosis ([315]Greaves et al., 2015; [316]Zhang et al., 2016; [317]Raza et al., 2017; [318]Shen et al., 2017; [319]Junaid et al., 2018; [320]Sinha et al., 2020). Therefore, to understand the role of DNA methylations on gene expressions, we developed a complex of dynamic methylome and transcriptome and the associated gene functions; and thus conducted transcriptome analysis of the same samples. In Capsicum hybrid, ∼46% significant DE-Gs were up-regulated compared to parent Choco, while ∼55–56% significant DE-Gs were up-regulated in both hybrid and Choco compared to the other parent Frut4 ([321]Table S11). Also, it was observed that several DE-Gs with higher expression in hybrid were having low promoter methylation than both parental genotypes, suggesting methylation may involve in expression regulation during the manifestation of heterosis in Capsicum. For instance, in hybrid DEG DEAD-box ATP-dependent RNA helicase which was reported to be involved in plant development and abiotic stress responses ([322]Liu and Imai, 2018; [323]Macovei et al., 2012) was observed with higher expression. Similarly, DEG D. kinase (DGK) which is a key enzyme involved in lipid signaling, signal transduction from hormones, growth factors, and in build-up metabolic network in responses to several stresses ([324]Kue Foka et al., 2020), was also found to be highly expressed in Capsicum F[1] hybrid in both parental genotypes. Likewise, in hybrid around 2-fold increased expression of DEG B. enolase 2/transcriptional activator which is important for plant growth and development ([325]Eremina et al., 2015) was observed than both parents ([326]Figure 9A). In Arabidopsis, heterotic phenotypes showed an increment in the total amount of photosynthesis than parental lines ([327]Fujimoto et al., 2012; [328]Kawanabe et al., 2016). This study observed that photosynthesis-related gene Chlorophyll a/b binding protein of Light-Harvesting Complex (LHC) II was highly expressed in hybrid (4–11 LFC) than both parents and at the same time it showed low promoter methylation than MPMV of both parents suggesting increased photosynthesis in heterotic F1 hybrid with promoter methylation may interplay during heterosis in Capsicum ([329]Figure 9A). The enrichment analysis showed that DE-Gs in Capsicum hybrid and parental lines were significantly enriched in pathways and GO terms related to photosynthesis, glyoxylate and dicarboxylate metabolism, and metabolite biosynthesis ([330]Table S12), which was similar to earlier reports in hybrids ([331]Fujimoto et al., 2012; [332]Kong et al., 2020; [333]Li et al., 2020a, [334]2020b). In this study, we also observed significant enrichment of several metabolites from sugar and sugar derivatives, amino acid and their derivatives, and carboxylic acid group between hybrid and parental genotypes ([335]Figure 10). A similar observation was reported by recent studies on Capsicum hybrids wherein they observed the increased metabolite (mainly sugar, citric acid, and capsaicinoids) content in hybrids compared to parental genotypes ([336]Zamljen et al., 2020; [337]Shiragaki et al., 2021). Also, our metabolite profiling analysis showed that metabolites related to glyoxylate and dicarboxylate metabolism pathway were most enriched (p value < 0.01) between Capsicum hybrid and parents ([338]Figure S8; [339]Table S12), and it was observed that the expression of genes (∼65–72% of enriched genes) related to this pathway were also higher in F[1] hybrid than mid-parent average expression values of parental genotypes ([340]Figure S9). The promoter methylation analysis showed that around 32.6% of total significant DE-Gs enriched in glyoxylate and dicarboxylate pathway were having low promoter methylation than MPMV and increased expression than both parental genotypes ([341]Figure S9), which suggested that, methylation in hybrid might potentially involve in the regulation of gene expression thereby increasing certain metabolite content in hybrid leading to early heterosis in Capsicum F[1] hybrid. Additionally, the overall significant negative correlation of promoter methylations with gene expression ([342]Figure 8) suggested the overall repressive nature of DNA methylations which is also widely reported in different plant species ([343]Zilberman et al., 2007; [344]Zemach et al., 2010; [345]Lang et al., 2017; [346]Raza et al., 2017). Earlier, it was reported that promoter CHH methylation positively correlated with the gene expression ([347]Bhatia et al., 2018; [348]Rawoof et al., 2020) and similar results were observed in Capsicum hybrid and parental lines; however, the molecular mechanism behind this correlation is still unknown. To get insight into the regulation of expression in hybrid, we investigated the promoter DMR (pDMR) of genes and observed that the majority of genes showing increased expression had low/decreased promoter methylation in the hybrid compared to their parents ([349]Figure S6). Also, these pDMR DE-Gs were significantly enriched in photosynthesis and oxidative phosphorylation-related pathways ([350]Table S16; [351]Figures S7 and [352]S8), suggesting DNA methylation in Capsicum hybrid may be involved in enhanced photosynthesis and may contribute to heterosis, which further needs to be investigated. Furthermore, we have validated the expression of randomly selected 24 DE-Gs which notably have low promoter methylation and high expression in hybrid compared to both parental lines ([353]Figure 9A). The qRT-PCR expression analysis also showed that most of these genes had significant differential expression in hybrid, which was much more than that observed in both parental lines and suggested that promoter methylation in Capsicum hybrid is correlated with altered expression of genes. In plants, the establishment of DNA methylation at all context is carried out by DNA methyltransferase Domain Rearranged Methyltransferase 2 (DRM2) via siRNA-guided RdDM pathways, while, methylation maintenance at CG, CHG, and CHH context is carried out by DNA methyltransferase 1 (MET1), chromomethylase 3 (CMT3), and CMT2, respectively. Additionally, the demethylation of methylated cytosine is actively arbitrated by several DNA demethylases including demeter (DME), demeter-like (DML), and repressor of silencing 1 (ROS1) ([354]Liang et al., 2019). Therefore, to understand the expression dynamics of the DNA methylase and demethylase genes, we have analyzed the expression of 10 different genes (1 MET1, 3 DRM, 3 DML, and 3 CMT) using RNA-seq and qRT-PCR analysis ([355]Figure 9B). Of these, 8 genes including CcMET1-like (BC332_12132), CcDRM5-like 1 (BC332_03361), CcDRM6-like (BC332_26991), CcDML1-like (BC332_22863), CcDML3-like (BC332_29933), CcCMT2-like (BC332_20529), CcCMT3-like (BC332_33051), and CcCMT4-like (BC332_01667) showed similar expression trends among the samples in both RNA-seq and qRT-PCR analysis. We observed that compared to F[1] hybrid, both the parents (Choco and Frut4) showed the upregulation of CcMET1-like, while CcCMT4-like was upregulated in parent Choco and downregulated in Frut4 compared F[1] hybrid. Also, in Arabidopsis hybrids generated from met1-RNAi knockdown, it was observed that MET1 is not involved in heterosis ([356]Kawanabe et al., 2016). In this study, we observed that the high expression of MET1 and CMT4 with relatively low CG and CHG methylation in F[1] hybrid than parental lines and their role in heterosis needs further investigation. Furthermore, the increased expression of CcCMT2-like than the mid-parent expression and additional newly formed mCs (9% of total inherited mCs), chiefly from CHH context was observed in the F[1] hybrid ([357]Table S9). Previously, it was reported that in addition to CHH methylation maintenance via RdDM, CMT2 along with a decrease in DNA methylation1 (DDM1) can alternatively also be involved in the generation of new methylation marks ([358]Greaves et al., 2015; [359]Zemach et al., 2013) and in corroboration with, our observation suggests that CMT2 plausibly might be involved in the establishment of new mCs partially in F[1] Capsicum hybrid, which in turn might lead to heterosis. In conclusion, the present study identified the inheritance pattern of DNA methylation marks from parent to hybrids. Some of the regions were hyper/hypo methylated with significant MPMV in hybrids as compared to parental genotypes. Expression dynamics and global metabolite profiling suggested the potential role of cross-talk between the three components (methylation, expression of genes, and metabolites) in the development of early heterosis in the F[1] hybrid of Capsicum. The expression analysis of genes identified a significant number of DE-Gs in hybrids which overall showed negative correlations with the promoter methylation. Furthermore, our metabolite profiling showed significantly higher expression of metabolites such as glycine, carboxylic acid, D-fructose, D-glucitol, sucrose, D-psicofuranose, and L-threonine in hybrid. Glyoxylate and dicarboxylate metabolism pathway was significantly enriched for metabolites and DE-Gs in F[1] hybrid and parental lines. Of the total, 65–72% enriched genes in this pathway showed overall increased expression (of these 32.6% with low promoter methylation than MPMV) in F[1] hybrid than mid-parent value. Altogether, the present study provided insightful dimensions into the plausible integrated role of DNA methylation, and expression of genes and metabolites toward the early heterosis in Capsicum hybrid. Limitations of the study Out of two Capsicum spp used to develop hybrid, the reference genome was available only for C. chinense, and thus present study might escape the detection of some essential epigenomic regions associated with heterosis that varies among both parental genomes (such as SNP/InDels) can be explored in future on the availability of reference genome of C. frutescens. Secondly, the present study focused on early-stage heterosis, however, hybrid vigor for fruit characters is also a potential area to be explored. STAR★Methods Key resources table REAGENT or RESOURCE SOURCE IDENTIFIER Biological samples __________________________________________________________________ Capsicum chinense (Choco) This study N/A Capsicum frutescens (Frut4) This study N/A Capsicum F1 Hybrid (Choco X Frut4) This study N/A __________________________________________________________________ Chemicals, peptides, and recombinant proteins __________________________________________________________________ Acetone Sigma-Aldrich 34850 CTAB Sigma-Aldrich H5882 Zymo EZ-DNA Methylation-Gold Kit Zymo Research D5005 NucleoSpin RNA plant Kit Macherey-Nagel 740949.50 Illumina TruSeq RNA sample preparation kit Illumina RS-930-2001 PrimeScript IV 1^st strand cDNA Synthesis Mix TaKaRa Bio 6215A TB Green Premix Ex Taq II Master Mix TaKaRa Bio RR820L Methanol Sigma-Aldrich 34860 Adonitol (Adonite, Ribitol) Sigma-Aldrich A5502 Methoxyamine hydrochloride MP Biomedicals 02155405-CF Pyridine Sigma-Aldrich 360570 N-Methyl-N-(trimethylsilyl) trifluoroacetamide Sigma-Aldrich 69479 __________________________________________________________________ Deposited data __________________________________________________________________ Raw Data for WGBS and RNA sequencing This study PRJNA770309 Capsicum chinense reference genome ASM227189v2 NCBI ([360]Kim et al., 2017) [361]https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/002/271/895/GCA_00227 1895.2_ASM227189v2/ __________________________________________________________________ Oligonucleotides __________________________________________________________________ Gene specific Primers for qRT-PCR analysis, see [362]Table S1 This study N/A __________________________________________________________________ Software and algorithms __________________________________________________________________ FastQC tool [363]Andrews (2010) [364]https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ TrimGalore N/A [365]https://github.com/FelixKrueger/TrimGalore Bismark [366]Krueger and Andrews (2011) [367]https://github.com/FelixKrueger/Bismark Hisat2 [368]Kim et al. (2019) [369]http://daehwankimlab.github.io/hisat2/ featureCounts(v1.5.1) [370]Liao et al. (2014) [371]http://subread.sourceforge.net/ methylKit [372]Akalin et al. (2012) [373]https://bioconductor.org/packages/methylKit/ edgeR [374]Robinson et al. (2010) [375]https://bioconductor.org/packages/edgeR/ Circlize [376]Gu et al. (2014) [377]https://github.com/jokergoo/circlize GenomicFeatures [378]Lawrence et al. (2013) [379]https://bioconductor.org/packages/GenomicFeatures/ CGmapTools [380]Guo et al. (2018) [381]https://cgmaptools.github.io/ Pheatmap [382]Kolde (2019) [383]https://CRAN.R-project.org/package=pheatmap agriGO tool [384]Tian et al. (2017) [385]http://systemsbiology.cau.edu.cn/agriGOv2/ clusterProfiler [386]Yu et al. (2012) [387]https://github.com/YuLab-SMU/clusterProfiler Primer3 [388]Untergasser et al. (2012) [389]https://primer3.ut.ee/ GC-MS Shimadzu GCMS-QP2010 Plus Reference mass spectral library databases NIST14 [390]https://www.nist.gov/ Reference mass spectral library databases Wiley 9 [391]https://www.sisweb.com/software/ms/wiley.htm MetaboAnalystR [392]Pang et al. (2021) [393]https://www.metaboanalyst.ca/docs/RTutorial.xhtml __________________________________________________________________ Other __________________________________________________________________ CFX96 Touch Real-Time PCR Detection System Bio-Rad CFX96 HiSeq X Ten Sequencing System Illumina N/A HiSeq 4000 Sequencing System Illumina N/A [394]Open in a new tab Resource availability Lead contact Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contacts, Vandana Jaiswal ([395]vandana.jaiswal2009@gmail.com). Materials availability This study did not generate new unique reagents or material. Method details Plant material and phenotyping To generate Capsicum hybrid, two Capsicum accessions Choco and Frut4 (pollen donor) belonging to Capsicum chinense and C. frutescens, respectively, were crossed by hand pollination. The resultant F[1] seeds were collected, sown and allowed to grow at standard growth conditions (day/night temperature: 27°C/19°C and 16 h/day light) in the greenhouse of the School of Life Sciences, Jawaharlal Nehru University, New Delhi, India. Then leaf samples were collected from two months old plants from two parents and hybrid, each with two biological replicates. Leaves from each sample were immediately frozen and preserved in liquid nitrogen and used for DNA/RNA extraction and metabolome profiling. Hybrids and their parents were phenotyped for seven traits at the time of flowering each with five replicates. These traits include plant height (cm), plant width (cm), number of secondary branches, number of leaves, leaf length (cm), leaf width (cm), and total chlorophyll content (μg/mg). The total chlorophyll content was estimated according to the method of [396]Arnon (1949). Fresh leaves were taken, washed, blotted in tissue paper to dry the external moisture and the leaf weight of the material was recorded. Approximately, 100 mg leaf samples were homogenized in a pre-chilled mortar using cold 80% acetone and tissue was pulverized completely. The homogenate was centrifuged at 3200 rpm for 30 min and the supernatant was collected. 80% acetone was taken as blank and the optical density of the acetone extract was measured at 645 and 663 nm in a spectrophotometer (Thermo Scientific). Total chlorophyll content was calculated from three biological replicates using the following formula: [MATH: TotalChlorophyll=20.1XO.D.(at645nm)+8.02XO.D(at663nm) :MATH] Mid parent heterosis (MPH) and better parent heterosis (BPH) were also calculated for these seven traits using following formula- [MATH: MPH=[( F1MP)/MP]×100 :MATH] [MATH: BPH=[( F1BP)/BP]×100 :MATH] where, MP and BP are mid parent and better parent trait values respectively. Whole-genome bisulfite sequencing and data processing From each sample, high molecular weight genomic DNA was extracted using the cetyltrimethylammonium bromide (CTAB) method. Around 50–100 ng of DNA was used for bisulfite conversion using Zymo EZ-DNA Methylation-Gold kit (Zymo Research, Irvine, USA), following manufacturer’s protocol. Briefly, fragmented DNA of 100–300 bp size was subjected to anneal to random primers having tagged sequences, providing ssDNA copy. Then, using ssDNA as a template, dsDNA was synthesized using terminal tagging oligo (TTO) with 3′ end blocked. Further, polymerase chain reaction (PCR) was performed with Illumina TruSeq indices followed by assessment of fragment size, concentration and library size using Agilent 2200 TapeStation. At last, six 2 × 150 bp paired-end libraries from three samples (Choco, Frut4 and hybrid; each with two biological replicates) were constructed on Illumina’s HiSeq X Ten platform (Illumina, San Diego, USA). Each sample’s raw sequencing reads were converted to FASTQ format and quality was checked using FastQC tool ([397]Andrews, 2010). Adaptor sequences and reads with low quality scores were filtered out using TrimGalore ([398]https://github.com/FelixKrueger/TrimGalore). The clean reads were then aligned to bismark-transformed C. chinense ([399]Kim et al., 2017) reference genome using Bismark ([400]Krueger and Andrews, 2011) and genome-wide CG, CHG and CHH methylation at single-base resolution were identified using criteria as described earlier by ([401]Rawoof et al., 2020). Bisulfite conversion efficiency was determined using spiked-in un-methylated 1% lambda DNA (GenBank: [402]J02459.1). Average methylation at 12 Capsicum chromosomes was represented across bins of equal 1MB length using the R circlize package ([403]Gu et al., 2014). RNA sequencing and data processing Total RNA from each sample was extracted using NucleoSpin RNA plant Kit (Macherey-Nagel, Germany) following standard protocol. The quality and integrity of RNA were checked using an Agilent bioanalyzer and then RNA with RNA integrity number (RIN) > 8 was sequenced as described earlier ([404]Rawoof et al., 2020; [405]Islam et al., 2021). Briefly, an equal amount of total RNA (5–10 μg) from each sample was used for library preparation using Illumina TruSeq RNA sample preparation kit (Illumina, San Diego, CA) following manufacturer’s protocols and paired-end sequencing was performed on Illumina HiSeq 4000 to develop 150 bp paired-end reads from three samples (two biological replicate of each sample). The raw RNAseq were subjected for quality check and then adapter sequences, as well as reads with quality score <30, were filtered out. Further, clean reads were aligned to the C. chinense reference genome ([406]Kim et al., 2017) using Hisat2 ([407]Kim et al., 2019) program with default parameters. Read counts of transcripts were quantified using the featureCounts (v1.5.1) ([408]Liao et al., 2014). Raw read counts were normalized using TMM methods ([409]Robinson and Oshlack, 2010), and differentially expressed genes (DE-Gs) between two sample comparisons were identified using the glmQLFit and glmQLFTest functions in the edgeR package ([410]Robinson et al., 2010). Genes with p value < 0.01 and false discovery rate (FDR) < 0.01 were considered significantly differentially expressed between the two samples. Analysis of differential methylation levels and their correlation with gene expression For the identification of differentially methylated regions (DMRs) at each methylation context and their genomic annotation, parameters earlier described by [411]Rawoof et al. (2020) were used. In brief, differential methylation analysis was performed using a sliding window of 100bp with 50 bp step size and overlapping DMRs were merged, and methylation information of each DMR was assessed using the R methylKit package ([412]Akalin et al., 2012) with parameters including Fisher’s exact test, sliding linear model for q-values and p values. Also, each window having bases >10X coverage, with at-least 3 mCs for CG, CHG and CHH context along with methylation difference (meth.diff) > 25% with q-values <0.01 were considered as significant DMRs. For common DMRs between hybrid and its parents, all samples were analyzed together. Then for additive methylation level in hybrid, mid-parental methylation values (MPMV) were calculated and common DMRs with higher methylation than MPMV were categorized as trans-chromosomal methylation (TCM), while lower than MPMV were categorized as trans-chromosomal demethylation (TCdM). A significance test was performed using pairwise t-test with both FDR and p value <0.01. DMRs were annotated using an in-house built txdb SQLite database of C. chinense using R GenomicFeatures ([413]Lawrence et al., 2013) and were categorised as intergenic, intragenic and promoter DMRs. Repeat-elements information as reported by ([414]Rawoof et al., 2020) was used for methylation analysis at transposable elements (TEs). Gene-body methylation (GBM) and TE methylation were analyzed using CGmapTools ([415]Guo et al., 2018). Moreover, to avoid noise from TEs at GBM, methylation at exonic and intronic levels along with 2 kb flanking regions were also analyzed. Methylation and gene expression correlation was performed using Pearson correlation analysis between the RNAseq expression of total C. chinense genes and their promoter methylation at each methylation context, and correlations with p value <0.001 were considered as significant. Additionally, promoter-DMR genes (PDG) having differential expression between parents and hybrids were investigated for methylation and their correlation with expression was represented as a heatmap using R pheatmap package ([416]Kolde, 2019). Analysis of allele-specific methylation (ASM) associated with DMRs To analyze allele specific methylation, single nucleotide variants (SNVs) were identified using CGmapTools snv function ([417]Guo et al., 2018). The SNVs detection was performed using bismark mapped methylated reads (>10x coverage) from hybrid and parental lines against C. chinense reference genome with BayesWC mode and dynamic p value cut-off vale of <0.001. Prior to ASM calling, the SNVs associated with TCM, TCdM and promoter DMRs were filtered out using vcftools --recode function. These DMRs specific SNVs were further used for ASM calling using asm mode of CGmapTools with filter criteria of minimum 5 reads for each allele linked sites, adjusted p value < 0.05 and high methylation threshold ≥0.8 to identify significant ASM associated with DMRs at CG, CHG and CHH methylation context. The tanghulu plots representing ASM were visualized using CGmapTools. Gene ontology and pathway enrichment analysis Gene ontology (GO) analysis of DE-Gs between hybrid and parental lines and DE-Gs having a negative correlation between their expression and promoter methylation, was performed using agriGO tool ([418]Tian et al., 2017). For GO information, a hypergeometric test with p value <0.01 and FDR <0.01 was used. For pathway enrichment analysis, SQLite-based custom annotation package for C. chinense was prepared in R, and then enrichment analysis was performed using R clusterProfiler ([419]Yu et al., 2012). GO terms and pathways with adjusted p values <0.01 were considered as significant. qRT-PCR expression validation of DE-Gs negatively correlated with promoter methylation At least 24 DE-Gs genes that negatively correlated with promoter methylation were selected for qRT-PCR validation, and gene-specific primers ([420]Table S1) were designed using Primer3 ([421]Untergasser et al., 2012). RNA was extracted from leaves of hybrid and parental lines (same samples used in WGBS and RNAseq) using NucleoSpin RNA plant kit (Macherey-Nagel, Germany) following manufacturer’s protocol. The quality and integrity of extracted RNA was checked using NanoDrop 1000 (Thermo Scientific) and 1% Agarose gel, respectively. Equally, ∼1 μg of total RNA was used to synthesize cDNA using PrimeScript IV 1 st strand cDNA Synthesis Mix (Takara). The real-time PCR reaction was set up using TB Green Premix Ex Taq II Master Mix (Clontech, USA) and run on the Bio-Rad CFX96 Real-Time System (Bio-Rad, USA) using the thermal profile of initial denaturation at 95°C for 30 s, followed by amplification step with 40 cycles of 95°C for 5 s and 60°C for 30 s. Then final amplification was completed at 65°C for 5 s, as described earlier ([422]Dubey et al., 2019; [423]Ahmad et al., 2021b). Further, qRT-PCR expression was compared with in silico RNAseq expression data. EF1-alpha was used as internal control, and significant expression difference was analyzed using two-way ANOVA test. Untargeted GC-MS based metabolomics analysis The global metabolome of hybrid and parental genotypes, three biological replicates each, was analyzed using gas chromatography coupled with mass spectrometry (GC-MS) following the method described by [424]Lisec et al. (2006) and [425]Ahmad et al. (2021b) with slight modification. Briefly, 100 mg of each leaf sample was homogenized with mortar and pestle using liquid nitrogen and then 1400 μL of pure methanol (chilled) was added. For internal quantitative standard, 60 μL of ribitol (0.2 mg mL^−1) was added and kept for shaking for 10 min at 70°C, followed by vortexing at 11000g for 10 min. Further, 750 μL of chloroform and 1500μL of dH[2]O were added and centrifuged for 15 min at 2,200g. Finally, the supernatant of each sample was dried in a vacuum concentrator without heating. For derivatization, 40μL of methoxyamine hydrochloride (20 mg mL^−1) in pyridine was added in all samples, vortexed and kept in a shaker for 2 h at 37°C. For trimethylsilylation, 70 μL of N-Methyl-N-(trimethylsilyl) trifluoroacetamide (MSTFA) as a silylating agent was added and then vortexed and incubated with shaking for 30 min at 37°C. The GC-MS running conditions includes: oven temperature – 100°C for 2 min followed by increment in oven temperature upto 250°C at the rate of 5°C per min, then finally 280°C at the rate of 2°C per min. Further, 1μL of derivatised sample was injected at 10:1 split ratio into a GC-MS (Shimadzu QP2010 Plus), equipped with a Rtx- 5 MS capillary column (0.25 mm film thickness, 0.25 mm internal diameter, and 30 m in length) with helium as the carrier gas with flow-rate of 1.2 mL/min. The quantification of detected metabolites was analyzed using GCMS post run Analysis Program (Shimadzu Scientific Instruments) following criteria described by [426]Ahmad et al. (2021b). In short, metabolites were identified by MS detector via full scan mode using their distinct peak fragmentation pattern and peaks of identified metabolites were compared with retention indices and mass spectra from reference mass spectral library databases (NIST14 and Wiley 9). At last, relative quantification was based on peak area ratio of analyte to internal standard ribitol. The metabolites in hybrid were assessed by relative changes in abundance as response ratios compared to its parents. The response ratio was calculated by dividing the average metabolite concentration from both parents (Choco and Frut4) by the average metabolite concentration from the hybrid. Metabolite differences between parents and hybrid with fold change >2 and p value < 0.05 were considered significant. Pathway enrichment of metabolites were analyzed using MetaboAnalystR ([427]Pang et al., 2021) and pathways number of metabolites >3 and pathway impact >0.2 along with p value <0.01 were considered as significantly enriched. Acknowledgments