Abstract Nodule development directly affects nitrogen fixation efficiency during soybean growth. Although abundant genome-based information related to nodule development has been released and some studies have reported the molecular mechanisms that regulate nodule development, information on the way nodule genes operate in nodule development at different developmental stages of soybean is limited. In this report, notably different nodulation phenotypes in soybean roots inoculated with Bradyrhizobium japonicum strain 113-2 at five developmental stages (branching stage, flowering stage, fruiting stage, pod stage and harvest stage) were shown, and the expression of nodule genes at these five stages was assessed quantitatively using RNA-Seq. Ten comparisons were made between these developmental periods, and their differentially expressed genes were analysed. Some important genes were identified, primarily encoding symbiotic nitrogen fixation-related proteins, cysteine proteases, cystatins and cysteine-rich proteins, as well as proteins involving plant-pathogen interactions. There were no significant shifts in the distribution of most GO functional annotation terms and KEGG pathway enrichment terms between these five development stages. A cystatin Glyma18g12240 was firstly identified from our RNA-seq, and was likely to promote nodulation and delay nodule senescence. This study provides molecular material for further investigations into the mechanisms of nitrogen fixation at different soybean developmental stages. __________________________________________________________________ Legumes interact with specific soil rhizobia to develop symbiotic relationships that lead to the formation of root nodules. These nitrogen-fixing nodules allow the host plants to grow without the addition of nitrogen fertilizers and are of significant agronomic and ecological importance. Nodule development is initiated by an exchange of chemical signals between legumes and soil rhizobia and is accompanied by a series of signal transductions inside the legume’s root cells[46]^1,[47]^2. The early molecular events involved in nodule initiation are quite well understood[48]^3,[49]^4,[50]^5,[51]^6. Although some legume genes have been implicated in the late developmental stage of nodule development and/or bacteroid differentiation[52]^7,[53]^8,[54]^9,[55]^10,[56]^11,[57]^12 and some studies have investigated the molecular mechanisms that regulate nodule development[58]^13,[59]^14, information on nodule-related genes in the middle or late developmental stage of nodule development is also limited. Soybean (Glycine max), which had a global harvest area in 2015 of more than 1,734 million acres, is an important oil crop, food and feed material, and requires a large amount of nitrogen for growth. However, excessive application of nitrogen fertilizer not only reduces the effective nitrogen utilization of soybean (nitrogen uptake, utilization and fixation efficiencies), but also results in reduced production efficiency, a waste of resources, environmental pollution and other issues[60]^15,[61]^16. Studies on how to reduce the amount of nitrogen fertilizer and increase crop nitrogen utilization efficiency are very important, and there have been some good studies on the high nitrogen utilization efficiency of soybean[62]^17,[63]^18,[64]^19. For symbiotic nitrogen fixation, the premature senescence of nodules can negatively affect nitrogen availability for soybean growth; thus, the development and senescence of nodules require further detailed study[65]^11. Nodule development directly affects nitrogen fixation efficiency, which changes with the growth of soybean[66]^20. A soybean developmental stage classification based on significant changes in external morphological characteristics has been in place since 1971[67]^21 and plays important roles in soybean cultivation and fertilizer application[68]^10; the characteristics of nitrogen fixation at different developmental stages are critical in agriculture and ecology. However, little is known about the molecular mechanisms regulating nodule development and/or nitrogen fixation at different developmental stages. Knowledge of the sequence information related to nodule development in soybean was greatly enhanced by the release of high-quality drafts of the G. max, M. truncatula and Phaseolus vulgaris genomes[69]^6,[70]^22,[71]^23. Moreover, a huge EST collection of more than 390,000 sequences in soybean has been released ([72]http://www.ncbi.nlm.nih.gov/dbEST/) together with approximately 40,000 full-length cDNA clones[73]^24. A comparative analysis of genome sequences in six legumes revealed many of the symbiotic genes in soybean[74]^25, and RNA-Seq transcription data predicted several nodulation-related gene regulatory networks[75]^26. However, this genome-based information is constrained by prior knowledge of gene sequences and limits the patterns of gene expression at various stages of nodule development. We are in a position to significantly improve our understanding of soybean nodule development using RNA-Seq, which is an effective method that produces quantitative data related to transcripts with greater sensitivity, higher reproducibility, and wider dynamic range[76]^27 than other conventional methods. This method also has relatively little variation between technical replicates for identifying differentially expressed genes[77]^28. In this report, we investigated ten comparisons between five important developmental stages (branching stage, flowering stage, fruiting stage, pod stage and harvest stage) of soybean and identified a large number of differentially expressed genes (DEGs), including those encoding soybean symbiotic nitrogen fixation-related proteins, cysteine proteases, cystatins and cysteine-rich proteins, as well as proteins involved in plant-pathogen interactions. Seven cystatins are actively transcribed during nodule development and senescence[78]^11, while two other cystatins (Glyma18g12240 and Glyma18g00690) were identified from our RNA-seq. The symbiotic function of Glyma18g12240 was studied to verify the reliability of our data, and the results showed that it was likely to promote nodulation and delay nodule senescence. The discovered differentially expressed genes (DEGs) and the results described in this study should aid efforts to understand the mechanisms of soybean nodule development and shed new light on the molecular mechanisms of nitrogen fixation at five important developmental stages. Results Nodulation phenotypic characterization at different developmental periods of soybean The nitrogen fixation rate of soybean nodules changes with plant growth. The nodules begin to fix nitrogen when leghemoglobin (Lb) is expressed (branching stage); the nitrogen fixation rate then gradually increases, is relatively high at the flowering and fruiting stages and then gradually weakens from the pod stage; and at the harvest stage, the nodules are senescence and almost incapable of nitrogen fixation[79]^29,[80]^30. To investigate nodule development during soybean growth, we examined the symbiotic phenotypic of soybean inoculated with B. japonicum 113-2 at five developmental stages (branching stage, flowering stage, fruiting stage, pod stage and harvest stage), and the results are shown in [81]Fig. 1 and [82]Table S1. The classification of these five developmental stages according to the popular classification system[83]^21 and the detail defined information for the plants at each stage was shown in materials and methods. The growth of soybean inoculated with B. japonicum 113-2 at these five developmental stages was shown in [84]Fig. 1A–E. At the branching stage, the control (CK, inoculated with media only) had slightly better growth than the soybean inoculated with B. japonicum 113-2 ([85]Fig. 1A), maybe because the symbiotic process required more energy from the host plant and nitrogen fixation was weak during this period. The number of nodules per plant and the dry weight per nodule increased from the branching stage to fruiting stage and decreased from the pod stage ([86]Fig. 1F–J, [87]Table S1). At the branching stage, flowering stage and fruiting stage, the nodules were pink and/or yellow, while at the pod stage, the nodules became brown ([88]Fig. 1G–I). The nodules were senescence and filled with water at the harvest stage ([89]Fig. 1I,J). Moreover, differentially expressed levels of Gm Lb (Glyma10g34280), which is required for nitrogen fixation[90]^31,[91]^32, at all five developmental stages were shown ([92]Fig. 1K). These results indicate notable symbiotic phenotypes among different developmental stages of soybean. Figure 1. Symbiotic phenotype features of five important soybean developmental stages. [93]Figure 1 [94]Open in a new tab (A–E) Soybean growth at five important developmental stages, including the branching stage, flowering stage, fruiting stage, pod stage and harvest stage. (F–J) Nodulation phenotypes were examined at five developmental stages after inoculation with 113-2. (K) qPCR analysis of the transcript levels of Lbc1 (Glyma10g34280) at five developmental stages. Bars, 4 cm (A,B); 5 cm (C–E); 3 cm (F–J). d, days. Quality assessment of RNA-Seq and DEG identification in different developmental periods of soybean RNA-Seq was performed for soybean nodule samples at five developmental stages of soybean, and detailed information is shown in the materials and methods. The proportion of clean reads among the total acquired reads was more than 99.6%, indicating high-quality sequencing ([95]Fig. S1). Comparable numbers of total mapped reads, perfect match reads and unique match reads were obtained in the mapping results for the five developmental stages ([96]Table S2). Sequencing saturation analysis showed that the number of genes represented by clean reads stabilized when the number of total tags reached 2.5 million or higher; therefore, the obtained reads represented full coverage for each sample ([97]Fig. S2). During the RNA-Seq experiment, mRNAs are first broken into short segments by chemical methods and then sequenced. The results showed that the distribution of reads on reference genes was sufficient for subsequent bioinformatics analysis ([98]Fig. S3). These results indicate that the sequencing library was of good quality and contained sufficient information for gene expression analysis. To judge the significance of differences in DEGs between five developmental stages of soybean, a false discovery rate (FDR) ≤0.001 and |log2 ratio| ≥1 were used as criteria for the ten Groups (detailed information is provided in the materials and methods). Information on the DEGs in the ten Groups between the five nodule samples is provided in [99]Table S3, and the numbers of up-regulated and down-regulated DEGs in these ten Groups are shown in [100]Fig. 2. These included 6,099 DEGs (2331 up-regulated, 3768 down-regulated) between the branching stage and flowering stage; 9,562 DEGs (3652 up-regulated, 5910 down-regulated) between the branching stage and fruiting stage; 14,539 DEGs (3863 up-regulated, 10676 down-regulated) between the branching stage and pod stage; 11,341 DEGs (4824 up-regulated, 6517 down-regulated) between the branching stage and harvest stage; 1,030 DEGs (527 up-regulated, 503 down-regulated) between the flowering stage and fruiting stage; 7,891 DEGs (2121 up-regulated, 5770 down-regulated) between the flowering stage and pod stage; 4,441 DEGs (2708 up-regulated, 1733 down-regulated) between the flowering stage and harvest stage; 6,609 DEGs (1558 up-regulated, 5051 down-regulated) between the fruiting stage and pod stage; 2,789 DEGs (1836 up-regulated, 953 down-regulated) between the fruiting stage and harvest stage and 6,975 DEGs (5673 up-regulated, 1302 down-regulated) between the pod stage and harvest stage. Figure 2. Differentially expressed genes (DEGs) between different developmental periods of soybean. [101]Figure 2 [102]Open in a new tab Group 1: branching stage vs. flowering stage; Group 2: branching stage vs. fruiting stage; Group 3: branching stage vs. pod stage; Group 4: branching stage vs. harvest stage; Group 5: flowering stage vs. fruiting stage; Group 6: flowering stage vs. pod stage; Group 7: flowering stage vs. harvest stage; Group 8: fruiting stage vs. pod stage; Group 9: fruiting stage vs. harvest stage; Group 10: pod stage vs. harvest stage. Functional ontology and KEGG pathway enrichment analysis of DEGs An internationally standardized gene function classification system, Gene Ontology, was used to classify the DEGs in the above-mentioned 10 Groups. A total of 18 gene ontology function terms are listed in [103]Fig. 3 and divided into three categories: biological process, cellular components, and molecular function. For all 10 Groups, the biological processes associated with the DEGs mainly focused on metabolic process, cellular process, response to stimulus, single-organism process, establishment of localization and localization. The cellular components mainly included cell, cell part, membrane, organelle, membrane part and organelle part. The main molecular functions of the DEGs were catalytic activity, binding, transporter activity, nucleic acid binding transcription factor activity, molecular transducer activity and antioxidant activity ([104]Fig. 3). Figure 3. Gene Ontology - based functional annotation of DEGs between different developmental periods of soybean. [105]Figure 3 [106]Open in a new tab The three GO domains - biological process, cellular components, and molecular function - are shown. The numbers of genes in each term are shown in histograms. Eighteen GO function terms are indicated: 1, metabolic process; 2, cellular process; 3, response to stimulus; 4, single-organism process; 5, establishment of localization; 6, localization; 7, cell; 8, cell part; 9, membrane; 10, organelle; 11, membrane part; 12, organelle part; 13, catalytic activity; 14, binding; 15, transporter activity; 16, nucleic acid binding transcription factor activity; 17, molecular transducer activity; 18, antioxidant activity. (A) Group 1 (branching stage vs. flowering stage) and Group 2 (branching stage vs. fruiting stage). (B) Group 3 (branching stage vs. pod stage) and Group 4 (branching stage vs. harvest stage). (C) Group 5 (flowering stage vs. fruiting stage) and Group 6 (flowering stage vs. pod stage). (D) Group 7 (flowering stage vs. harvest stage) and Group 8 (fruiting stage vs. pod stage). (E) Group 9 (fruiting stage vs. harvest stage) and Group 10 (pod stage vs. harvest stage). KEGG is the major public database for pathway enrichment analysis[107]^33 and the eleven KEGG pathway subgroups associated with the DEGs are shown in [108]Fig. 4. The pathways with the greatest numbers in each group were metabolic pathways, and other pathways, such as biosynthesis of secondary metabolites, plant hormone signal transduction and plant-pathogen interaction, were also main enriched pathways. Figure 4. Results of KEGG pathway enrichment analyses of DEGs for 11 KEGG pathways. [109]Figure 4 [110]Open in a new tab The x- and y-axes represent pathway categories and the number of genes in each pathway, respectively. a, Biosynthesis of secondary metabolites; b, Metabolic pathways; c, Plant hormone signal transduction; d, Cysteine and methionine metabolism; e, Nitrogen metabolism; f, ABC transporters; g, Plant-pathogen interaction; h, Ubiquitin mediated proteolysis; i, Protein processing in endoplasmic reticulum; j, RNA transport; k, mRNA surveillance pathway. (A) Group 1 (branching stage vs. flowering stage) and Group 2 (branching stage vs. fruiting stage). (B) Group 3 (branching stage vs. pod stage) and Group 4 (branching stage vs. harvest stage). (C) Group 5 (flowering stage vs. fruiting stage) and Group 6 (flowering stage vs. pod stage). (D) Group 7 (flowering stage vs. harvest stage) and Group 8 (fruiting stage vs. pod stage). (E) Group 9 (fruiting stage vs. harvest stage) and Group 10 (pod stage vs. harvest stage). GO functional annotation and a pathway enrichment analysis of DEGs in soybean nodule development revealed no high shift in the distribution of most terms in the ten Groups. DEGs associated with plant-pathogen interactions A legume-derived flavonoid can induce the pathogenic type III secretion system (T3SS) of rhizobia, which injects effector proteins into host cells to modulate nodulation signalling towards nodulation and to abort the nodulation process through recognition by the host defence system[111]^34,[112]^35. To study the regulation of the soybean-soil rhizobia interaction during nodule development, we analysed the plant-pathogen interaction KEGG pathway (obtained by KEGG, [113]http://www.kegg.jp/kegg/kegg1.html) in more detail ([114]Fig. 5). A total of 24 selected KEGG gene sets in this pathway were differentially expressed between different developmental periods of soybean. One gene that matched this pathway (MIN7[115]^36) was only up-regulated in Group 10 and down-regulated in the other Groups, while the other two genes, MAPK kinase (MKK4/5)[116]^37 and coronatine-insensitive protein 1 (COI1)[117]^38, were down-regulated in Group 10 and up-regulated in the other Groups. Jasmonate ZIM domain-containing protein (JAZ)[118]^39 was up-regulated in Group 1 and down-regulated in Groups 6, 8 and 9. Elongation factor Tu 18 (elf18)[119]^40 was down-regulated in Group 2 and up-regulated in Groups 3, 6, 7 and 8. Pathogenesis-related protein 1 (PR1)[120]^41 was down-regulated in Group 1 and up-regulated in Groups 5, 6 and 8, and chitin elicitor-binding protein (CEBiP)[121]^42 was down-regulated in Group 8 and up-regulated in Groups 1–4 and 10. The DEGs in these gene sets with ≥8-fold changes are listed in [122]Table S4; some of these DEGs existed in two or more Groups, for example, Glyma18g51546 and Glyma16g34030 were in Groups 6, 8 and 10, and so on. These results indicate that differential defence responses in soybean nodules are related to nodule development and senescence, and this also uncovers some important genes in the tightly regulated soybean symbiotic process that coordinates nodule development with host soybean immunity defence. Figure 5. DEGs associated with the plant-pathogen interaction pathways between different developmental periods of soybean. [123]Figure 5 [124]Open in a new tab (A) Up-regulated genes are boxed in red, and the red arrows point out the up-regulation of DEGs in Groups of nodule development. (B) Down-regulated genes are boxed in green, and the green arrows point out the down-regulation of DEGs in Groups. Group 1: branching stage vs. flowering stage; Group 2: branching stage vs. fruiting stage; Group 3: branching stage vs. pod stage; Group 4: branching stage vs. harvest stage; Group 5: flowering stage vs. fruiting stage; Group 6: flowering stage vs. pod stage; Group 7: flowering stage vs. harvest stage; Group 8: fruiting stage vs. pod stage; Group 9: fruiting stage vs. harvest stage; Group 10: pod stage vs. harvest stage. DEG-encoding cysteine proteases, cystatins and cysteine-rich proteins in nodule development and senescence Cysteine proteases and cystatins play roles in nodule development; eighteen cysteine proteases and seven cystatins are actively transcribed during nodule development and senescence[125]^11. Among them, ten cysteine proteases and four cystatins were differentially expressed in nodule development and senescence. Two other cysteine proteases (Glyma04g36470 and Glyma17g18440) and two cystatins (Glyma18g12240 and Glyma18g00690) were also identified from the DEGs ([126]Table S5). Additionally, 44 nodule cysteine rich proteins, which included 33 receptor-like protein kinases, nine secretory proteins and two polycomb-like proteins, were identified from the DEGs associated with nodule development by RNA-Seq, and some of the genes encoded the same protein ([127]Table S5). These proteins were characterized by a putative signal peptide and conserved cysteine residues; however, there has been very limited research into the functions of these cysteine-related genes. Analysis of selected symbiotic nitrogen fixation - related genes Fifty-two soybean genes responsible for the broad NF signal pathway and nodulation were identified by searching for homologues of M. truncatula and L. japonicus nitrogen fixation-related genes in soybean genome sequence data[128]^22,[129]^25,[130]^43, and most of these orthologues in soybean have been identified in previous works[131]^44,[132]^45. Among them, 21 genes were identified as DEGs in this report and their potential symbiotic functions were shown in [133]Table S6. Besides, four important marker genes known to be important in symbiotic nitrogen fixation (ApyraseGS52, Calmodulin-like protein, Enod40 and Nodulin 35)[134]^46 were examined here to see how their expression changes throughout nodule development and senescence ([135]Table S6). Verification of RNA-Seq results by qPCR and functional analysis of the candidate gene Glyma18g12240 To verify the RNA-Seq results, first, the expression stabilities of four references genes were evaluated. QACT (GmACT11, Glyma18g52780),