Abstract

This study examines the combined effects of seasonal climate variation and MIPS1 gene mutations on the germination rates of soybean cultivars TW-1 and TW75. Through metabolomic and transcriptomic analyses, we identified key KEGG pathways significantly affected by these factors, including starch and sucrose metabolism, lipid metabolism, and amino acid biosynthesis. These pathways were notably disrupted during the spring, leading to an imbalance in metabolic reserves critical for seedling development. MIPS1 gene mutations further altered these pathways, exacerbating the metabolic disturbances. Our results underscore the intricate network of environmental and genetic interactions influencing soybean seed vigor and highlight the importance of understanding these pathways to enhance agricultural resilience and seed quality in fluctuating climates.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12870-024-05957-x.

Keywords: Soybean germination, MIPS1 mutation, Multi-omics, Seasonal climate variation, Low phytic acid

Introduction

The soybean (Glycine max (L.) Merr.) is a leading oilseed crop worldwide. Its seeds are pivotal: they contribute essential plant-derived proteins and fats to human nutrition, and they are a critical element in the plant’s reproductive cycle because they enable propagation. Seed quality is therefore crucial for preserving genetic resources and advancing agricultural yields. Soybean seed development unfolds in three primary stages [1]: embryonic development, featuring embryo growth, cell division, and morphogenesis; maturation, characterized by the substantial accumulation of nutritive reserves; and a seed drying phase involving desiccation and dormancy onset.
These stages are central to determining seed quality, germination, and viability, and they coincide with comprehensive changes in gene expression, protein profiles, and metabolite levels, alongside notable metabolic shifts over space and time. The nutrients stored within the seed are the energy source for germination and seedling growth. The quality of soybean seeds stems from a complex interaction between innate genetic factors and external environmental conditions. Environmental factors profoundly affect plant growth and metabolic functions. The interplay between the soybean’s genetics and various climatic stressors, such as temperature fluctuations [2, 3], drought stress [4, 5], and sunlight exposure, affects soybean physiology in complex and unpredictable ways. While soybeans are generally thermally resilient, the seed development stage is particularly sensitive to extreme temperatures, potentially leading to reduced germination, increased susceptibility to disease, and decreased seed value [6, 7]. Genetic background also strongly shapes seed development: genetic deviations, such as low phytic acid mutations, are sometimes associated with inferior seed quality. These genetic variations tend ultimately to manifest in reduced seed yield and viability [8–12]. The work of Meis et al. [13] illuminated the phenotypic consequences of such genetic alterations, finding that homozygous mips genotypes from the LR 33 lineage display significantly lower field emergence rates than their wild-type (WT) counterparts. Another important observation from that study was the varying impact of the seeds’ origin: seeds produced in temperate climates suffered less in field emergence than those from tropical regions. Low phytic acid mutants associated with the MRP5 gene also exhibit reduced seed viability [14].
However, mutations in the IPK1 gene, whose product catalyzes the last step of phytate synthesis, do not impair soybean seed quality [15, 16]. Genotype therefore plays a decisive role in seed viability among low phytic acid mutants. The MIPS1 gene encodes the enzyme that catalyzes the first step of phytate synthesis, and mutations in this gene result in a substantial reduction in phytic acid content and also affect raffinose metabolism in soybean seeds [9, 10]. Many studies have addressed the metabolic mechanisms behind the poor emergence of low phytic acid seeds [17–19] and ways of improving the emergence of low phytic acid varieties [20, 21]. In our previous studies [22–24], TW-1, a mutant of Taiwan75, exhibited a notably different field emergence rate from its parent variety, Taiwan75. We have previously investigated the differential expression profiles of this mutant during seed germination through comprehensive proteomic and transcriptomic analyses. However, the impact of parental traits on the F1 generation’s subsequent germination remains largely unexplored. By examining how parental characteristics affect the germination potential of the F1 generation, we can gain valuable insights into the genetic mechanisms of seed development and into agronomic practices that could enhance seed quality and field performance. Metabolomics has emerged as a powerful tool for delineating metabolic processes across different crops, soybeans included [25]. Metabolomic studies employing liquid chromatography-mass spectrometry (LC-MS) and gas chromatography-mass spectrometry (GC-MS) have yielded critical insights into the metabolic alterations in stress-exposed seeds. The evolution of next-generation sequencing technologies has paved the way for new omics fields such as transcriptomics [26, 27], genomics [28], and proteomics [23].
Combining metabolomics with transcriptomics in multi-omics analyses has become a prevalent method for investigating the complex interactions within organisms, shedding light on the regulatory networks that control metabolic pathways [29–31]. Our previous research has extensively explored the agronomic traits of the soybean genotypes TW75 and TW-1 [22–24]. Field emergence rates for these varieties were influenced by a complex interaction between their genetic constitution (MIPS1 mutation) and environmental factors (specifically, temperature variation associated with seasonal changes). Notably, seeds collected during autumn in Hangzhou showed high field emergence rates for both parental and mutant lines, reaching around 85%. In contrast, seeds from the spring season exhibited significantly lower emergence rates, with the parental genotype TW75 achieving 45% and the mutant TW-1 only 25%. These results highlight the significant impact of seasonal variation on seed germination rates and suggest that mutations in the MIPS1 gene may affect field emergence differently under varying environmental conditions. In the current study, we used transcriptomic and metabolomic analyses to map the regulatory network involved in mips1 mutant soybean seed development. Our aim was to explore the impacts of seasonal and genetic variation on field emergence rates, thereby clarifying the relationship between MIPS1 gene expression and metabolic profiles in the context of soybean mutant seed germination. This research contributes to a more comprehensive understanding of how seasonal and genetic factors influence the accumulation of germination-promoting factors in soybean seeds.

Materials and methods

Plant materials and growth conditions

The experiment used the low phytic acid soybean mutant line Gm-lpa-TW75-1 (referred to as TW-1) and its corresponding wild-type parental variety, Taiwan 75 (referred to as TW75).
The TW-1 mutant line was developed through gamma irradiation. TW75 is a widely cultivated vegetable soybean variety in Zhejiang Province, China. For comparative analysis, seeds were harvested from plants grown in adjacent plots within the same field during the spring and autumn seasons of 2020 at the Experimental Farm of the Zhejiang Academy of Agricultural Sciences in Hangzhou, Zhejiang. Sampling was performed at three developmental stages: the early stage (ES) at 15–20 days after flowering (DAF), the middle stage (MS) at the R6 stage, and the late stage (LS) at the R8 stage, for both spring and autumn. All samples were immediately frozen and stored at -80 °C until metabolite analysis and RNA extraction.

Sample preparation and metabolomic analysis by LC-MS

Sample preparation

A 30 mg sample was accurately weighed into a 1.5 mL Eppendorf tube containing steel balls. Internal standards, comprising 20 µL of 2-chloro-L-phenylalanine (0.3 mg/mL) and 20 µL of Lyso PC 17:0 (0.01 mg/mL) dissolved in methanol, were added. Subsequently, 1 mL of a methanol-water mixture (7:3, v/v) was introduced to each tube. The samples were then subjected to a series of treatments: a 2-minute freeze at -20 °C, grinding at 60 Hz for 2 min, vortexing, 30-minute ultrasonication at room temperature, and another 20-minute freeze at -20 °C. After centrifugation at 13,000 rpm and 4 °C for 10 min, 300 µL of the supernatant was dried in a freeze-concentration centrifugal dryer. The dried residue was reconstituted with 400 µL of a methanol-water mixture (1:4, v/v), vortexed for 30 s, and ultrasonicated for 2 min. Following another centrifugation under the same conditions, 150 µL of the supernatant was syringe-filtered through a 0.22 µm microfilter and transferred to LC vials, which were stored at -80 °C until LC-MS analysis. Quality control (QC) samples were generated by pooling aliquots of all the samples.
LC-MS analysis

We employed an ACQUITY UHPLC system coupled with an AB SCIEX Triple TOF 5600 System for metabolic profiling in both ESI positive and negative ion modes. Chromatographic separation was carried out on an ACQUITY UPLC BEH C18 column using a binary gradient elution system comprising solvent (A) water with 0.1% formic acid (v/v) and solvent (B) acetonitrile and methanol (2:3, v/v, with 0.1% formic acid). The gradient protocol was as follows: 1% B at the start, 30% B at 1 min, 60% B at 2.5 min, 90% B at 6.5 min, 100% B from 8.5 to 10.7 min, and a return to 1% B at 10.8 min, held until 13 min. The flow rate was set at 0.4 mL/min with a column temperature of 45 °C. Samples were maintained at 4 °C during analysis, with an injection volume of 1 µL. Data acquisition was conducted using full scan mode over an m/z range of 50 to 1000, combined with IDA mode. Mass spectrometry parameters were set as follows: ion source temperature at 115 °C for both positive and negative modes; capillary voltages at 2500 V (+) and 2500 V (-); declustering potential at 40 V (+) and 40 V (-); collision energy at 4 eV (+) and 4 eV (-); desolvation temperature at 450 °C for both modes; desolvation gas flow at 900 L/h for both modes; scan time of 0.2 s and interscan delay of 0.02 s. QC samples were interspersed throughout the run, with one inserted every 10 samples to facilitate the assessment of analytical repeatability.

Sample preparation and metabolomic analysis by GC-MS

Sample preparation

A 60 mg sample was accurately weighed and transferred into a 1.5 mL Eppendorf tube. To each sample, 360 µL of cold methanol and 40 µL of a 0.3 mg/mL solution of 2-chloro-L-phenylalanine in methanol, serving as an internal standard, were added. The samples were then chilled at -20 °C for 2 min and ground at 60 Hz for another 2 min. After this, the samples were subjected to ultrasonication at room temperature for 30 min.
This was followed by the addition of 200 µL of chloroform and vortex mixing. An additional 400 µL of water was added and the mixture was vortexed again. The samples underwent a second round of ultrasonication at room temperature for 30 min, followed by centrifugation at 12,000 rpm for 10 min at 4 °C. A QC sample was prepared by pooling aliquots from all the samples. A 200 µL portion of the supernatant was then transferred to a glass sampling vial and vacuum-dried at room temperature. The dry sample was reconstituted with 80 µL of a 15 mg/mL methoxylamine hydrochloride solution in pyridine, vortexed for 2 min, and incubated at 37 °C for 90 min. Subsequently, 80 µL of BSTFA (with 1% TMCS) and 20 µL of n-hexane were added, followed by 10 µL of 11 internal standards (C8/C9/C10/C12/C14/C16 at 0.8 mg/mL; C18/C20/C22/C24/C26 at 0.4 mg/mL, all prepared in chloroform). The sample was vortexed for another 2 min and derivatized at 70 °C for 60 min. The samples were then left to equilibrate at room temperature for 30 min prior to GC-MS analysis.

GC-MS analysis

The derivatized samples were analyzed using an Agilent 7890B gas chromatograph coupled with a 5977A MSD system. A DB-5MS fused-silica capillary column was employed for derivative separation. High-purity helium (>99.999%) served as the carrier gas at a flow rate of 1 mL/min. The injection volume was 1 µL in splitless mode, with the injector temperature set at 260 °C. The initial oven temperature was 60 °C and was programmed to increase to 305 °C using a multi-step temperature gradient. The MS quadrupole and ion source temperatures were maintained at 150 °C and 230 °C, respectively, with an electron impact energy of 70 eV. Mass spectral data were collected in full-scan mode, scanning from m/z 50 to 500. To ensure analytical consistency, a QC sample was injected after every 10 sample analyses.
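The QC scheduling described for both platforms (one pooled QC injection after every 10 samples) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `build_injection_queue`, the `"QC"` label, and the sample identifiers are hypothetical and do not come from the study's acquisition software.

```python
from typing import List


def build_injection_queue(samples: List[str], qc_label: str = "QC",
                          every: int = 10) -> List[str]:
    """Return an injection order with one pooled-QC injection
    inserted after every `every` study samples."""
    if every < 1:
        raise ValueError("`every` must be a positive integer")
    queue: List[str] = []
    for position, sample in enumerate(samples, start=1):
        queue.append(sample)
        # After each complete block of `every` samples, inject the pooled QC.
        if position % every == 0:
            queue.append(qc_label)
    return queue


# Hypothetical run of 25 samples: QC injections follow samples 10 and 20.
samples = [f"S{i:02d}" for i in range(1, 26)]
queue = build_injection_queue(samples)
```

Spacing QC injections evenly through the run is what allows drift in retention time or signal intensity to be assessed against a constant reference material.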
Metabolomic data processing

The LC-MS raw data were processed using Progenesis QI software, with a precursor tolerance of 5 ppm and a fragment tolerance of 10 ppm. The retention time (RT) tolerance was set to 0.02 min. Peak alignment was conducted without relying on internal standard detection parameters, isotopic peaks were excluded, and the noise elimination threshold was set to 10.00. The cut-off for minimum intensity was set to 15% of the base peak intensity. The data were compiled into an Excel file containing three-dimensional datasets of m/z values, peak RT, and peak intensities, with RT–m/z pairs serving as unique identifiers for each ion. Peaks that were not detected in over 50% of the samples were excluded from the dataset. The internal standard was employed for data quality control, ensuring reproducibility. Metabolites were identified with the Progenesis QI (Waters Corporation, Milford, USA) data processing software, based on public databases such as http://www.hmdb.ca/ and http://www.lipidmaps.org/ and on self-built databases. For GC-MS data, the AnalysisBaseFileConverter software was used to convert raw data from the .D format to .abf files. These files were then imported into MS-DIAL for data processing. Metabolite annotation was done using the LUG database, which is designed for untargeted GC-MS analysis. Following this, a ‘raw data array’ was compiled, including sample information, peak names or retention times and m/z values, and peak intensities. This array was filtered to remove internal standards and any known pseudo-positive peaks resulting from background noise, column bleed, or the BSTFA derivatization process. Peaks with a relative standard deviation (RSD) above 0.3 were discarded. The remaining peak areas were normalized according to retention time periods, using multiple internal standards to adjust for any variations in peak intensity.
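The two peak-filtering rules stated above (excluding peaks not detected in more than 50% of samples, and discarding peaks with RSD above 0.3) can be sketched in Python as follows. This is a minimal illustration, not the processing pipeline used in the study. The function names and the toy intensity tables are hypothetical, treating a missing value or zero as "not detected" is an assumption, and so is computing the RSD across pooled QC injections, since the text does not say over which injections it was calculated.

```python
from statistics import mean, stdev
from typing import Dict, List, Optional

IntensityTable = Dict[str, List[Optional[float]]]


def drop_sparse_peaks(table: IntensityTable, max_missing: float = 0.5) -> List[str]:
    """Keep peaks whose fraction of undetected samples is at most `max_missing`.
    Assumption: None or 0 means the peak was not detected in that sample."""
    kept = []
    for peak_id, intensities in table.items():
        missing = sum(1 for v in intensities if v is None or v == 0)
        if missing / len(intensities) <= max_missing:
            kept.append(peak_id)
    return kept


def rsd(values: List[float]) -> float:
    """Relative standard deviation: sample standard deviation divided by the mean."""
    centre = mean(values)
    return stdev(values) / centre if centre else float("inf")


def drop_unstable_peaks(peak_ids: List[str], qc_table: Dict[str, List[float]],
                        max_rsd: float = 0.3) -> List[str]:
    """Keep peaks whose RSD across QC injections is at most `max_rsd`.
    Assumption: RSD is evaluated on the pooled QC injections."""
    return [p for p in peak_ids if rsd(qc_table[p]) <= max_rsd]


# Toy data (hypothetical): three peaks measured in four samples and three QC runs.
sample_table: IntensityTable = {
    "peak_1": [100.0, 120.0, None, 110.0],   # 25% missing, kept
    "peak_2": [None, None, None, 90.0],      # 75% missing, dropped
    "peak_3": [50.0, 55.0, 52.0, 0.0],       # 25% missing, kept
}
qc_table = {
    "peak_1": [100.0, 102.0, 98.0],          # RSD = 0.02, kept
    "peak_3": [50.0, 90.0, 20.0],            # RSD well above 0.3, dropped
}

detected = drop_sparse_peaks(sample_table)
stable = drop_unstable_peaks(detected, qc_table)
```

In the text the 50% detection rule is described for the LC-MS data and the RSD rule for the GC-MS data; the two functions are kept separate here so that either can be applied on its own.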
Transcriptome sequencing

Total RNA was isolated with the Trizol reagent kit (Invitrogen, Carlsbad, CA) following the manufacturer’s protocol. The integrity and quality of the RNA were assessed using the Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA) in conjunction with RNase-free agarose gel electrophoresis. After DNA digestion with DNase, eukaryotic mRNA was isolated using Oligo(dT) beads. This mRNA was then fragmented in a buffer solution and reverse-transcribed into cDNA with random hexamer primers. To produce the second-strand cDNA, we used a combination of RNase H, DNA Polymerase I, dNTPs, and reaction buffer. The double-stranded cDNA was then purified using the QiaQuick PCR extraction kit (Qiagen, Venlo, The Netherlands), and further processed through end-repair, poly(A) tailing, and adapter ligation for Illumina sequencing. The adapter-ligated fragments were size-selected via agarose gel electrophoresis, PCR-amplified, and sequenced on an Illumina HiSeq 2500 platform by Gene Denovo Biotechnology Co.

Bioinformatics analysis

Raw reads from the sequencing process were pre-processed using Fastp version 0.18.0 to remove low-quality sequences. The resulting clean reads were then aligned against the ribosomal RNA (rRNA) database using Bowtie 2 version 2.2.8 for rRNA removal. An index of the reference genome was generated, and HISAT2 version 2.2.4 was employed to align the cleaned paired-end reads to the soybean reference genome. Expression levels were quantified by calculating FPKM (Fragments Per Kilobase of exon per Million mapped fragments) for each transcript region. Differential expression analysis among samples was performed using DESeq2 version 1.44.0.

Statistical analysis

To assess the correlation between replicates, a principal component analysis (PCA) was conducted using the R package gmodels version 2.19.1.
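As a worked illustration of the FPKM definition given under Bioinformatics analysis, the standard formula can be written as a short Python function. This sketch is not the quantification software used in the study; the function name and the example numbers are hypothetical.

```python
def fpkm(fragments: float, length_bp: float, total_mapped_fragments: float) -> float:
    """Fragments Per Kilobase of exon per Million mapped fragments.

    FPKM = fragments / (length in kb) / (total mapped fragments in millions)
         = fragments * 1e9 / (length_bp * total_mapped_fragments)
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 20 million mapped fragments.
example_value = fpkm(500, 2_000, 20_000_000)
```

Dividing by transcript length and library size makes expression values comparable between transcripts of different lengths and between libraries of different sequencing depth.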
Metabolites with a Variable Importance in Projection (VIP) score exceeding 1, a P-value below 0.05 (Mann-Whitney U test), and an absolute log2 fold change (|log2 FC|) greater than 1 were categorized as differentially accumulated metabolites (DAMs). To gain further insight into the biological significance of the DAMs, an enrichment analysis was performed using the MetaboAnalyst platform (www.metaboanalyst.ca). Differentially expressed genes (DEGs) were identified as those with |log2 FC| greater than 1 and a P-value below 0.05 (Mann-Whitney U test). These DEGs were then subjected to enrichment analyses using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) databases. Correlation and cluster analyses were performed using the R package ComplexHeatmap version 2.20.0 to explore the relationships and patterns of co-expression among the genes of interest.

Results

Metabolome and transcriptome profiling

Using LC-MS and GC-MS, we identified a total of 479 unique metabolites, with the majority (419) detected through GC-MS and the remaining 60 via LC-MS. Principal component analysis (PCA) effectively differentiated the sample groups (Fig. 1A), capturing 55.24% of the variance with the first two principal components. Distinctions among groups were driven primarily by developmental stage and season rather than by MIPS1 mutations. Correlation analysis confirmed distinct metabolic profiles for each sample group (Fig. 1B), particularly when samples were classified by developmental stage and season. A key finding was the significant metabolic differentiation observed at the late developmental stage, suggesting a vital role for metabolic diversity in affecting germination success.

Fig. 1. Metabolome and transcriptome profile in different groups.
A PCA of metabolites detected in different groups. B Correlation analysis of metabolites in different groups. C PCA of genes expressed in different groups. D Correlation analysis of genes in different groups.

In terms of transcriptomics, the assembly and sequencing of RNA libraries for TW75 and TW-1 yielded a total of 1,088,044,580 clean reads across the variants and seasons. PCA revealed tight clustering of biological replicates (Fig. 1C), accounting for 75.15% of the variation, primarily due to the developmental stage. This indicated notable gene expression differences, especially evident in the late developmental phase. Moreover, correlation analysis highlighted distinct seasonal expression patterns within the same developmental stages (Fig. 1D), with a clear distinction between the early and middle stages compared to the late stage. Additionally, the metabolic and transcriptional profiles of parental and mutant soybeans differed to some extent at the same developmental stage and in the same season, and this difference was more pronounced in the spring. These findings emphasize the complex relationship between gene expression and environmental conditions.

Seasonal variation in Differentially Accumulated Metabolites (DAMs)

Given the pronounced seasonal differences in field emergence rates and metabolite profiles, we compared metabolites between spring and autumn at each of the three developmental stages to pinpoint the DAMs. Specifically, we screened for DAMs using VIP values derived from PLS-DA, along with P values and log2 FC (Fig. S1). As the stages progressed, both TW75 and TW-1 exhibited an increase in DAMs, with a marked rise in up-regulated metabolites (Fig. 2A). During the late stage, TW75 exhibited 203 DAMs (24 down-regulated and 179 up-regulated), while TW-1 exhibited 208 DAMs (20 down-regulated and 188 up-regulated) between the two seasons.
Notably, 147 metabolites were consistently regulated across both soybean varieties, potentially driving the seasonal shifts in field emergence rates. These DAMs were categorized into over ten subclasses, primarily including amino acids, peptides, and analogues, carbohydrates and their conjugates, and fatty acids and conjugates (Fig. 2B). The top-enriched KEGG pathways (Fig. 2C) included arginine biosynthesis, glutathione metabolism, the pentose phosphate pathway, arginine and proline metabolism, and pyrimidine metabolism, which are mainly connected to amino acid and energy metabolism. These pathways, especially prevalent in the late stage, are believed to significantly influence the germination process.

Fig. 2. Differentially accumulated metabolites (DAMs) between different seasons. A Sum of up-, down-, and same-regulated DAMs between autumn and spring; B HMDB subclass categories of same-regulated DAMs in different stages; C Most enriched KEGG terms in different stages

Dynamic changes in seed composition during the developmental and maturation phases significantly affect seed quality, highlighting the importance of understanding stage-specific alterations in DAMs during seed development in TW75 and TW-1 (Fig. 3). The DAMs were grouped into four clusters, each with a distinct profile. A significant proportion of amino acids, peptides, and analogues (69.44%) fell into clusters 1 and 3, both of which showed a decline in autumn. During spring, however, cluster 1’s metabolite levels were considerably higher at every stage, whereas cluster 3 displayed a rising trend, diverging from the autumnal patterns. These trends underscore the critical role of free amino acids in seed maturation. The differential accumulation of these compounds from the middle to late stages may distinctly influence seed quality.
Cluster 2 contained a diverse mixture of subclasses, most notably carbohydrates and their conjugates (53.33%), fatty acids and conjugates (44.44%), glycerophosphocholines (70.00%), and alcohols and polyols (100.00%). In this cluster, metabolite levels in spring surged during the late stage while remaining consistent throughout the developmental stages in autumn. The differential build-up of amino acids, carbohydrates, and lipids in this cluster is vital for the germination of soybean seeds, particularly regarding energy metabolism. This variation may contribute to the superior emergence rates seen in autumn-harvested seeds. Additionally, DAMs such as scyllo-inositol and inositol-4-monophosphate in cluster 2, linked to inositol phosphate metabolism, warrant further exploration for their roles in these observed phenomena.

Fig. 3. Trends and cluster analysis of DAMs across different stages

Seasonal expression discrepancy of genes

The distinctive patterns of gene expression observed during different seasons reflected variations in field emergence rates, prompting a detailed analysis of DEGs (Fig. S2). DEGs were filtered in the same way as DAMs, using P values and log2 FC. Among the DEGs regulated in the same direction in both varieties, 639 were identified in the ES (551 up-regulated and 88 down-regulated), increasing to 3,071 (2,167 up-regulated and 904 down-regulated) in the MS and decreasing to 1,636 (771 up-regulated and 865 down-regulated) in the LS (Fig. 4A). The number of DEGs between spring and autumn thus peaked at the middle stage. The DEGs from each stage were categorized within the Gene Ontology (GO) framework, which encompasses three domains: biological processes, molecular functions, and cellular components (Fig. 4B).
The largest enriched category across all three stages pertained to cellular components, specifically cell and cell part (312, 48.82% in ES; 1,265, 41.19% in MS; 747, 52.02% in LS), with the ES also emphasizing intracellular components (289, 45.22%) and binding functions (281, 43.97%). Gene expression in the MS was marked by single-organism processes (668, 21.75%), transferase activities (527, 17.16%), and small molecule interactions (454, 14.78%). In contrast, the LS was dominated by intracellular components (678, 47.21%) and organelle-specific genes (521, 36.28%). A distinct shift in GO terms was evident, with the MS favoring cell membrane-related processes and the ES and LS focusing on intracellular metabolic functions.

Fig. 4. DAMs and DEGs between spring and autumn in different stages. A Sum of up-, down- and same-regulated genes between spring and autumn; B top 30 enrichment GO terms between spring and autumn; C Transcription factor (TF) families enriched in DEGs between spring and autumn; D top 20 KEGG enrichment pathways of ES; E top 20 KEGG enrichment pathways of MS; F top 20 KEGG enrichment pathways of LS

The radial enrichment diagrams (Fig. 4D-F) show the top 20 enriched KEGG pathways on the periphery, the number of genes associated with each pathway and their significance values on the second tier, the regulation status of the genes on the third, and the pathway enrichment ratios at the core. Early-stage development was characterized by pathways such as motor proteins (map04814), photosynthesis - antenna proteins (map00196), ATP-dependent chromatin remodeling (map03082), circadian rhythm - plant (map04712), and homologous recombination (map03440).
As development advanced, the middle stage’s pathways shifted towards plant-pathogen interaction (map04626), amino sugar and nucleotide sugar metabolism (map00520), MAPK signaling pathway - plant (map04016), biosynthesis of nucleotide sugars (map01250), and galactose metabolism (map00052). The late stage was enriched in protein processing in endoplasmic reticulum (map04141), arginine biosynthesis (map00220), glutathione metabolism (map00480), glycerolipid metabolism (map00561), and 2-oxocarboxylic acid metabolism (map01210). Further enrichment was seen in stress-related metabolic pathways during the middle and late stages, in addition to the amino acid pathways previously identified in the analysis of DAMs. Notably, the DEGs across these stages predominantly encoded members of nine key transcription factor (TF) families (Fig. 4C): AP2/ERF, WRKY, bHLH, MYB, HSF, bZIP, TCP, GATA, and NAC. These TFs play a critical role in regulating gene expression essential for initiating and promoting germination. The expression trends of HSF, in particular, point to the impact of temperature on gene expression throughout the developmental stages, highlighting the complex regulatory mechanisms that influence germination in soybean seeds.

Impact of MIPS1 mutations on field emergence rates

TW-1, a soybean mutant line recognized for its low phytic acid levels, has shown variable field emergence rates during the spring season. This variability suggests that the inositol phosphate metabolism pathway plays a pivotal regulatory role in the seed maturation process. To elucidate the genetic factors affecting spring emergence, we compared metabolite levels and gene expression profiles between the mutant TW-1-s and the wild-type TW75-s (Fig. 5).
The comparison revealed dynamic changes in DAMs, starting with 40 DAMs in the ES (30 up-regulated and 10 down-regulated), shifting to 33 DAMs in the MS (24 up-regulated and 9 down-regulated), and peaking at 79 DAMs (50 up-regulated and 29 down-regulated) by the LS (Fig. 5A). Comparative analysis indicated significantly higher counts of DAMs between seasons than between genotypes. Despite this, there was notable commonality in the classes of DAMs, including fatty acids and their conjugates, carbohydrates and carbohydrate conjugates, and amino acids, peptides, and analogues (Fig. 5B). This suggests that seasonal variation and MIPS1 genetic variation may share a similar mechanism underlying reduced soybean emergence, leading to the lowest seed germination rate for TW-1 in spring. However, the regulatory mechanisms during seed development need further discussion.

Fig. 5. Differential metabolites and genes between TW75-s and TW-1-s. A Sum of up- and down-regulated DAMs and DEGs between TW75-s and TW-1-s; B HMDB subclass categories of DAMs between TW75-s and TW-1-s; C top 30 enrichment KEGG pathways in different stages; D TF families enriched in DEGs between TW75-s and TW-1-s

An upward trend was observed in the number of DEGs between TW75-s and TW-1-s, beginning with 232 DEGs (130 up-regulated and 102 down-regulated) in the ES, rising to 1,204 DEGs (746 up-regulated and 458 down-regulated) in the MS, and culminating in 2,028 DEGs (1,027 up-regulated and 1,001 down-regulated) by the LS (Fig. 5A). Pathway enrichment analysis of these DEGs highlighted their significant participation in key metabolic pathways, including carbon metabolism, glycolysis/gluconeogenesis, galactose metabolism, and pyruvate metabolism. The inositol phosphate metabolism pathway was especially prominent among the DEGs, emphasizing its potential regulatory importance (Fig. 5C).
The DEGs between TW75-s and TW-1-s showed marked enrichment in energy metabolism pathways and pathways related to amino acid metabolism. An examination of TFs revealed that the DEGs predominantly encoded AP2/ERF, MYB, bZIP, WRKY, and bHLH transcription factors. This pattern mirrored the differential expression of transcription factors noted between the spring and autumn seasons (Fig. 5D), suggesting that the same regulatory themes are preserved within the seasonal transcriptome.

Correlation network of seasonal and genetic factors

The genes in enriched pathways, TFs among the DEGs, and metabolites among the DAMs were screened based on their relative contents or FPKM values and P values. These elements were then classified using heatmaps and correlation networks (Fig. 6A). The analysis distinguished seasonal influences (contrasts between autumn and spring samples) from genetic contributors (differences between the TW75-s and TW-1-s variants). Among the transcription factors surveyed, families such as AP2/ERF, WRKY, bHLH, MYB, and HSF stood out as potential modulators of field emergence rates. The expression profiles of these TFs were predominantly dictated by seasonal changes, with only a subset being influenced by genetic factors. In pathway analysis, key metabolic routes (glutathione metabolism, galactose metabolism, inositol phosphate metabolism, and carbon metabolism) were found to be significantly responsive to both seasonal and genetic elements. Glutathione metabolism, in particular, was identified as a primary determinant of the observed decrease in seed emergence during spring. Moreover, carbon metabolism genes were consistently implicated under both seasonal and genetic factors, suggesting a compounded effect that potentially leads to further diminished field emergence rates in the TW-1-s line.

Fig. 6. Expression trends and correlation of key metabolites and genes.
A Expressing trends of DEGs of TF families (left) and trends of DEGs and DAMs of key metabolic pathways (right); B Correlation network of TF families and genes in pathways Furthermore, we conducted a correlation analysis to investigate the relationship between field emergence rates and the expression of TF genes, as well as DEGs associated with carbon and glutathione metabolism pathways during specific developmental stages. This analysis revealed a strong correlation (Fig. [101]6B), indicating that variations in field emergence rates might be closely linked to the regulatory roles of specific TFs during the seed development phase. These transcription factors are likely critical in fine-tuning gene expression within key metabolic pathways, thereby influencing the germination process and initial seedling growth. Discussion In our study, we observed that both climatic variability and mutations in the MIPS1 gene have a pronounced impact on the field emergence rates of soybean cultivars TW-1 and TW-75, which is consistent with previous reports about low phytic acid mutants [[102]11, [103]13, [104]24]. We proposed that the cooler and wetter conditions typical of spring in Hangzhou may compromise seed integrity, and suggested that MIPS1 mutations, together with subsequent metabolic pathways, could intensify this decline in seed emergence. The quality of soybean seeds was clearly influential on germination rates; however, the specific factors and mechanisms that account for seasonal variation, MIPS1 mutation and their synergistic effect in soybean germination are not yet fully understood. To shed light on this, we analyzed mips1 mutant and wild type seed samples from both seasons using metabolomic and transcriptomic approaches, aiming to pinpoint DAMs and DEGs, and thereby illuminate the physiological and biochemical changes that accompany seasonal shifts. During seed maturation, a vital phase of development, we observed significant changes of nutritive reserves. 
Notably, the seeds harvested in spring exhibited a marked increase in metabolites such as amino acids, peptides, carbohydrates, and fatty acids, particularly in the late maturation stage. These small molecules play crucial roles in the abiotic stress response, and their high concentration suggests a response to abiotic stresses in spring, such as temperature fluctuations. Macromolecular substances, namely starch, lipids, and proteins, are essential nutrients for seed germination [32]. These stress-induced metabolic changes lead to early energy usage and depletion of seed storage materials, negatively impacting seed development and germination. Starch is a pivotal energy store within the endosperm, broken down by amylases from the aleurone layer at the onset of germination [33]. The increased levels of carbohydrates in spring-harvested seeds may suggest premature starch degradation, which could limit the energy supply needed during early germination and thus reduce overall seed viability.

Moreover, given the critical role of lipids during germination, our observation of increased phosphatidylcholine and lysophosphatidylcholine levels in the late maturation stage of seeds harvested in spring indicates significant membrane synthesis activity. These elevated lipid levels are often associated with stress responses, as plants frequently alter the selective permeability of membranes to adapt to abiotic stress [34]. Lipids, notably phosphatidylcholine and lysophosphatidylcholine, play a crucial role in maintaining the stability and fluidity of cell membranes. These lipids act as key structural components that enhance cellular integrity, especially during the challenging spring conditions characterized by temperature fluctuations and desiccation.
By strengthening membrane structures, they enable cells to endure and recover from these stresses more effectively. This stabilization helps preserve essential cellular functions and prevents the leakage of vital contents, ensuring that cells remain functional even in adverse environmental conditions. This stress response could divert resources from other vital processes, potentially leading to reduced field emergence rates. Notably, phytic acid is the main storage form of phosphorus in plant seeds [9, 10, 15]. In TW-1-s, the synthesis of large amounts of phosphatidylcholine and lysophosphatidylcholine may further affect phosphorus storage, leading to reduced seed quality.

Furthermore, the metabolome dynamics also extend to amino acids and peptides. During germination, these compounds are released through protein hydrolysis, enhancing nutrient accessibility [35]. Typically, free amino acids decrease, and total amino acids (including those incorporated into seed-storage proteins) increase in the late developmental stage of seeds [1]. Our comparative analysis revealed a 4.88-fold increase in these metabolites in spring. Notably, the free amino acids identified as DAMs play fundamental roles in the seed’s central metabolism, serving not only as building blocks for storage proteins but also as precursors for a myriad of other metabolites, including phytohormones and protective secondary metabolites. The metabolome analysis revealed the accumulation of significantly higher levels of stress-related primary metabolites, such as soluble amino acids, soluble sugars, and TCA cycle intermediates, in spring. This indicates that the climatic stress in spring decreased seed quality and further reduced field emergence rates. The climate-related stresses that result in varying seed quality, as revealed by metabolomic analysis, are the direct causes of the low emergence rate.
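For readers less familiar with how a figure such as the 4.88-fold increase relates to the log2 scale commonly used for differential metabolites, a minimal sketch follows. The text does not state how the fold change was computed, so the sketch assumes a simple ratio of mean relative abundances (spring over autumn); the function names and the abundance values are hypothetical and chosen only so that the ratio equals 4.88.

```python
import math


def fold_change(treatment_mean: float, reference_mean: float) -> float:
    """Ratio of mean relative abundance in the treatment vs. the reference group."""
    if treatment_mean <= 0 or reference_mean <= 0:
        raise ValueError("mean abundances must be positive")
    return treatment_mean / reference_mean


def log2_fold_change(treatment_mean: float, reference_mean: float) -> float:
    """Fold change on the log2 scale (positive values mean higher in treatment)."""
    return math.log2(fold_change(treatment_mean, reference_mean))


if __name__ == "__main__":
    # Hypothetical mean relative abundances (arbitrary units), not measured data.
    spring_mean, autumn_mean = 9.76, 2.00
    fc = fold_change(spring_mean, autumn_mean)
    print(f"fold change = {fc:.2f}, log2 fold change = {log2_fold_change(spring_mean, autumn_mean):.2f}")
```

On this assumption, a 4.88-fold increase corresponds to a log2 fold change of a little under 2.3, that is, somewhat more than two doublings.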
Further transcriptomic analysis combined with metabolic analysis elucidated the underlying regulatory mechanisms. The transcriptomic data exhibited patterns parallel to these metabolic trends, particularly in pathways related to glycerolipid metabolism, glutathione metabolism, carbon metabolism, and alanine, aspartate, and glutamate metabolism. Notably, genes involved in glutathione metabolism, known to preserve seed longevity and regulate dormancy [36–38], displayed distinct differential expression between the spring and autumn seasons. Our findings indicated a significant association between glutathione metabolism and carbon metabolism, which could be a crucial factor contributing to the observed differences in field emergence between the spring and autumn seasons (Fig. 7).

Fig. 7. The expression of DEGs and DAMs associated with glutathione metabolism and carbon metabolism in different groups

Examining the differences between TW75-s and TW-1-s, we identified both similarities and distinctions in DAMs and DEGs in response to seasonal variations. These differences suggest that the MIPS1 gene may be involved in the seasonally induced stress response. The reduction in MIPS1 activity can lead to a decrease in myo-inositol levels, consequently impeding the production of important oligosaccharides (Fig. 8). Additionally, phytic acid synthesis includes both lipid-independent and lipid-dependent pathways [9, 10, 12]. The MIPS1 gene is involved in the first step of phytic acid synthesis and affects both pathways, with the lipid-dependent pathway being associated with cell membrane synthesis [39]. Disturbances in phosphorus and lipid metabolism affected the synthesis of cell membranes under abiotic stress, and thus TW-1 produces seeds of lower quality in spring than TW75.

Fig. 8. The expression of DEGs and DAMs associated with inositol phosphate metabolism and galactose metabolism in different groups

Furthermore, transcription factors from the AP2/ERF, WRKY, bHLH, MYB, and HSF families play significant roles in regulating seed development and quality, influenced by both seasonal and genetic factors. For instance, the AP2/ERF family, implicated in water absorption and abscisic acid signaling, likely affects the seeds’ ability to cope with drought and water-related stresses, influencing emergence rates [40]. The bHLH family influences the germination response to temperature, which could explain varied emergence rates under different seasonal temperature conditions [41]. The MYB family, involved in stress responses, likely plays a role in the seeds’ resilience to environmental stresses, including salinity and drought [42]. Lastly, the HSF family emphasizes the role of temperature in stress responses, particularly in the activation of heat shock proteins during high-temperature conditions [43]. These TFs are involved in the regulation of the pathways described above, such as energy metabolism, lipid metabolism, and phosphorus metabolism, in response to abiotic stresses. This regulation can ultimately lead to reduced seed quality and field emergence rates.

Our findings underscore the multifaceted influences on field emergence, rooted in seed development impacted by both seasonal and genetic factors. Stress-induced metabolic imbalances and mutations in the MIPS1 gene contribute to reduced seed quality and emergence rates. Transcription factors likely play a critical role in modulating these metabolic processes. Both the wild-type and mutant seeds were sensitive to seasonally induced stress; however, the mutant TW-1 exhibited a higher sensitivity, leading to significantly lower field emergence rates in the spring compared to TW75.
The MIPS1 gene is associated with stress resistance, potentially due to its role in regulating cell membrane synthesis through phosphorus metabolism. Despite these insights, the intricate nature of seeds and the variability of climate conditions necessitate tailored optimization strategies. Future research will further investigate the germination rates of various mips1 mutants, aiming to identify optimal seeds and conditions for germination, with the overarching goal of enhancing germination rates to improve the quality of soybean varieties.

Conclusion

In conclusion, our study underscores the profound influence of seasonal climatic variability and genetic mutations in the MIPS1 gene on the field emergence rates of soybean cultivars TW-1 and TW75. The marked decrease in seed quality and emergence rates, particularly in the spring, highlights the intricate interplay between environmental conditions and seed physiology. By employing an integrated multi-omics approach, which includes both metabolomic and transcriptomic analyses, we offer a more comprehensive and nuanced understanding of the complex regulatory networks and stress-induced metabolic imbalances, particularly in lipid, carbohydrate, and amino acid metabolism pathways. This multi-omics strategy surpasses the limitations of single-omics approaches by providing a holistic view that reveals the interconnectedness of metabolic and gene expression changes, crucial for understanding the observed variations. Specifically, the MIPS1 gene plays a crucial role in regulating cell membrane synthesis via phosphorus metabolism. Mutations in this gene lead to significant changes in metabolic activities, resulting in increased sensitivity to seasonal stressors. Our observations indicated that the mutant TW-1, in particular, exhibited a higher sensitivity to such stress, manifesting in significantly lower field emergence rates compared to TW75.
The differential accumulation of metabolites and expression of genes during seed maturation stages further elucidates the physiological and biochemical changes underpinning these seasonal effects. The study also highlights the role of various transcription factors, including those from the AP2/ERF, bHLH, MYB, and HSF families, which are instrumental in modulating seed development and stress responses. These factors influence critical processes such as water absorption, temperature response, and adaptation to abiotic stressors, thereby affecting seed quality and emergence rates. Going forward, it is essential to delve deeper into the regulatory mechanisms involving the MIPS1 gene and its associated metabolic pathways. Future research should aim to identify the optimal genetic traits and environmental conditions that enhance seed quality and germination rates of low phytic acid cultivars. Such insights will be invaluable in developing soybean varieties with improved resilience and performance, ultimately contributing to agricultural sustainability and food security. Our findings lay a solid foundation for future studies targeting the optimization of seed emergence through an integrated approach combining genetic, metabolic, and environmental factors. By advancing our understanding of these intricate interactions, we pave the way for innovative strategies to improve the quality and yield of soybean crops, thereby addressing the challenges posed by climatic variability and enhancing global food production.

Supplementary Information

Supplementary Material 1 (11.2 MB, docx)

Acknowledgements