Abstract Background Stipa grandis (Poaceae) is one of the dominant species in a typical steppe of the Inner Mongolian Plateau. However, primarily due to heavy grazing, the grasslands have become seriously degraded, and S. grandis has developed a special growth-inhibition phenotype against the stressful habitat. Because of the lack of transcriptomic and genomic information, the understanding of the molecular mechanisms underlying the grazing response of S. grandis has been prohibited. Results Using the Illumina HiSeq 2000 platform, two libraries prepared from non-grazing (FS) and overgrazing samples (OS) were sequenced. De novo assembly produced 94,674 unigenes, of which 65,047 unigenes had BLAST hits in the National Center for Biotechnology Information (NCBI) non-redundant (nr) database (E-value < 10^-5). In total, 47,747, 26,156 and 40,842 unigenes were assigned to the Gene Ontology (GO), Clusters of Orthologous Group (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, respectively. A total of 13,221 unigenes showed significant differences in expression under the overgrazing condition, with a threshold false discovery rate ≤ 0.001 and an absolute value of log[2]Ratio ≥ 1. These differentially expressed genes (DEGs) were assigned to 43,257 GO terms and were significantly enriched in 32 KEGG pathways (q-value ≤ 0.05). The alterations in the wound-, drought- and defense-related genes indicate that stressors have an additive effect on the growth inhibition of this species. Conclusions This first large-scale transcriptome study will provide important information for further gene expression and functional genomics studies, and it facilitated our investigation of the molecular mechanisms of the S. grandis grazing response and the associated morphological and physiological characteristics. Introduction In China, the grasslands are the dominant landscape, covering 40% of the national land area, of which approximately 78% is located in the northern temperate zone [[33]1]. In the Inner Mongolia Autonomous Region, the steppe grasslands cover 68% of the total land area (up to 791,000 km^2) [[34]2], but 30–50% of this region is affected by deterioration and desertification [[35]3], and overgrazing is considered to be the major factor contributing to this grassland degradation [[36]2–[37]4]. In the grasslands, grazing is normally linked with plant morphological and physiological responses, such as reduced shoot internodes or altered water utilization rates, to adapt to defoliation disturbance and more stressful habitats [[38]5]. Under grazing conditions, plants are expected to develop a resistance strategy (such as small size, being short-lived and fast-growing or having low palatability) to avoid grazing or a tolerance strategy to re-grow / reproduce after damage; decreased plant height is considered to be a positive response to grazing [[39]6, [40]7]. The accumulating data have revealed that grazing leads to growth-inhibition in plants; for example, herbivory by the white-tailed deer on Anticosti Island (Quebec) resulted in stunted bonsai-like plants [[41]8]. Additionally, overgrazing by sheep has significantly reduced plant height when compared with non-grazing grasslands [[42]9]. Stipa grandis (Poaceae, 2n = 44) is a C3 perennial bunch grass and is one of the dominant species in a typical steppe of the Inner Mongolian Plateau [[43]10]. It is a wind-pollinated grass, flowering in mid to late July, with the seeds ripening in late August or early September [[44]10, [45]11]. The mature plants have dense tussocks that are approximately 30 cm high, with long, thin leaves [[46]11]. The S. grandis steppe, which represents the major pasture type in Inner Mongolia, primarily spreads from the Xilingole Plateau and the middle-eastern region of the Hulun Buir Plateau [[47]4] and acts as a natural green barrier to protect the vast area from sandstorms [[48]11]. However, as one of the typical dominant steppes in the Xilin River Basin, the S. grandis steppe has degraded to different degrees, mainly due to overgrazing [[49]4, [50]12]. As a result, the plants not only showed a smaller size but also reduced plant circumference and sexual reproduction [[51]10]. Because there is no sequencing information for S. grandis in the public databases, further research on this steppe plant at the molecular level in response to grazing has been limited. Next-generation sequencing (NGS) technologies, such as the Roche / 454, Solexa / Illumina and AB SOLiD platforms [[52]13, [53]14], have been rapidly developed in recent years, providing more efficient and less costly sequencing than ever before [[54]13]. In the current study, we performed a de novo assembly of the leaf transcriptome of S. grandis in response to grazing using the Illumina HiSeq 2000 sequencing platform and characterized the transcriptional changes by comparing the transcriptomes of overgrazing and non-grazing plants. This information will facilitate further functional genomics studies in S. grandis and aid in the understanding of the molecular mechanisms behind the grazing response in plants and the associated morphological changes. Materials and Methods Ethics statement Regarding to the field study, no specific permits were required for the S. grandis species in the described locations in this manuscript, and the field studies did not involve endangered or protected species. Plant materials and RNA preparation The samples used in this experiment were collected from the non-grazing area (referred to as FS) and the overgrazing area (referred to as OS), in a typical dominant steppe in the Baiyinxile Livestock Farm of Xilinhot, Inner Mongolia (116°40'E, 43°33'N), in the middle of August. The non-grazing area has been fenced since 1983 for grazing-free and is considered to be restored, while overgrazing region right next to the fenced area was freely grazed by sheep and seriously degraded [[55]15]. More details about the non-grazing and overgrazing region can be found in previously study [[56]16]. The climate of this area was described previously [[57]2]. For measuring plant height, ten individual plants were randomly chosen and sampled for each region in the morning (over 12 hours after the last grazing happened), and the plants height were statistically analyzed in [58]S1 Fig. For sequencing, three plants were taken from the non-grazing and overgrazing fields each, and fresh leaves of S. grandis were collected and immediately frozen in liquid nitrogen and then stored at -80°C. 0.1 g leaves of S. grandis were taken from each plant for total RNA extraction using Trizol regent. The total RNA from each of the three plants was pooled to obtain at least 20 μg of RNA, which was further treated with RNase-free DNase I. The RNA integrity was examined using an Agilent 2100 Bioanalyzer. The poly(A) mRNA was isolated using Oligo(dT) Beads. cDNA library construction and sequencing Following the isolation, the mRNA was fragmented into short fragments using fragmentation buffer. The short fragments were used as templates, and the first-strand cDNA was synthesized using random hexamer primers and reverse transcriptase. Following the second-strand cDNA synthesis, the fragments were end repaired, and poly(A) and sequencing adapters were ligated to the fragments. Following the selection of suitable template fragments, enrichment was performed using PCR amplification to create the final cDNA library. The libraries were sequenced using an Illumina HiSeq 2000. Data filtering and de novo assembly The raw reads were filtered by removing the reads containing adaptor sequences, “N” (unknown nucleotides) percentages that were greater than 5%, and low-quality reads (the rate of reads which quality value ≤ 10 is more than 20%) with software “filter_fq”. Then, the transcriptome de novo assembly was performed using the short-read assembly program Trinity [[59]17]. Trinity performs analysis in three steps using different software modules: Inchworm, Chrysalis, and Butterfly. Briefly, Inchworm assembles the cleaning reads into the linear transcript contigs (unique sequences of transcripts) using greedy K-mer extension (k = 25). Then Chrysalis clusters minimally overlapping Inchworm contigs into clusters and builds complete de Bruijn graphs for each cluster, representing the full transcriptional complexity for a given gene. In the last step, Butterfly processed the individual graphs in parallel, compacted the graphs and report full-length transcripts for alternatively spliced isoforms and paralogous transcripts. The sequences generated from Trinity were defined as unigenes. The assembled unigenes from FS and OS were used for further sequence splicing and redundancy removing with clustering software TGICL [[60]18] to get non-redundant unigenes as long as possible. After gene family clustering, the final obtained unigenes were divided into either clusters (shared more than 70% similarity) or singletons. Finally, a Blastx [[61]19] alignment (E-value < 10^-5) was performed between the unigenes and various protein databases, such as the non-redundant protein (nr) database ([62]http://www.ncbi.nlm.nih.gov), the Swiss-Prot protein database ([63]http://www.expasy.ch/sprot), the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database ([64]http://www.genome.jp/kegg) and the Cluster of Orthologous Groups (COG) database ([65]http://www.ncbi.nlm.nih.gov/COG). The best aligning results were used to decide the unigene’s sequence orientation. A priority order of nr, Swiss-Prot, KEGG and COG was used when the outcome from the different databases conflicted. If a unigene did not align to any of the above databases, ESTScan [[66]20] was used to predict its sequence orientation. Unigene annotation and classification For the annotation, the unigenes were first aligned using Blastx against the protein databases, including nr, Swiss-Prot, KEGG and COG (E-value < 10^-5). The protein with the highest sequence similarity was retrieved using the given unigenes and was annotated to each unigene. The GO (gene ontology; [67]http://www.geneontology.org) annotations for the unigenes were performed using the Blast2GO [[68]21] program, based on the best Blastx hits from the nr database (E-value < 10^-5), and the GO functional classification for All-unigenes was performed using WEGO [[69]22]. The COG database was also used for possible functional prediction and classification of the unigenes. The pathway assignments were generated using Blastall software against the KEGG database. Estimation of the expression levels and the differential expression analyses To calculate the unigene expression levels, the FPKM method (Fragments Per kb per Million fragments) was employed, as previously described [[70]23]. The fold changes of the transcripts were calculated using the log[2] formula of OS_ FPKM / FS_FPKM, and 0.01 was used (instead of 0) for the fold change calculation once the value of either OS_ FPKM or FS_FPKM was zero. The differentially expressed genes (DEGs) between the two samples were analyzed with a rigorous algorithm based on Audic’s [[71]24] method. The false discovery rate (FDR) method [[72]25] was used to correct for the P-value in the multiple hypothesis testing. An FDR ≤ 0.001 and an absolute value of log[2]Ratio ≥ 1 were used as the thresholds to judge the significance of the differential gene expression. For the functional and pathway enrichment analysis, the DEGs were then mapped into GO terms (P-value ≤ 0.05) and the KEGG database (q-value ≤ 0.05). Quantitative real-time PCR analysis (qPCR) The qPCR was performed using a Roche LightCycler 480 Real-Time PCR system, as previously described [[73]26]. Actin was used as the internal control gene, and the 2^-ΔΔCT method [[74]27] was used to evaluate the relative quantities of each amplified product in the samples. For each qPCR analysis, three technical replicates were performed. The primers used for the qPCR were provided in [75]S1 Table. Data deposition The clean raw reads data were deposited in NCBI Sequence Read Archive (SRA, [76]http://www.ncbi.nlm.nih.gov/Traces/sra) under the accession number SRP051667. Results Illumina sequencing and reads assembly To investigate the transcriptomic response of S. grandis to overgrazing, the leaves of S. grandis from both FS and OS areas were sampled. When compared with the non-grazing plants, the S. grandis under the overgrazing conditions were significantly reduced in height ([77]S1 Fig). In total, 77,859,274 raw reads for FS and 75,960,480 raw reads for OS were generated using the Illumina HiSeq 2000 sequencing platform. After filtering out the dirty raw reads, 65,954,618 and 64,545,856 clean reads (with an average length of 90 bp) were obtained from the FS and OS samples ([78]Table 1), respectively. Table 1. A summary of the transcriptome sequencing and assembly results in the FS and OS populations of S. grandis. Sample Total Number Total Length (nt) Mean Length (nt) N50 (bp) Raw reads FS 77,859,274 OS 75,960,480 Clean reads FS 65,954,618 OS 64,545,856 Contig FS 193,974 53,395,584 275 375 OS 184,321 50,894,860 276 368 Unigene FS 94,905 61,536,711 648 1130 OS 93,816 59,020,466 629 1103 All-unigenes 94,674 73,795,643 779 1271 [79]Open in a new tab Mean length: Mean length of the assembled sequences. N50: The length of the contig or unigene corresponding to the sequence, which is added to 50% of the total assembled bases when the assembled sequences are sorted from short to long. The Trinity [[80]17] method was adopted to assemble all of the high-quality clean reads into contigs and unigenes. As a result, 193,974 FS contigs and 184,321 OS contigs were generated ([81]Table 1). The majority of the contigs from both samples had similar length distributions in the range of 100–300 bp ([82]S2 Fig). The above contigs were further assembled into 94,905 FS-unigenes and 93,816 OS-unigenes. Notably, all of the unigenes were gap-free sequences. Ultimately, 94,674 unigenes, with a mean length of 779 bp and an N50 of 1271 bp, were obtained by combining the FS and OS unigenes ([83]Table 1), including 44,421 clusters and 50,253 singletons. As shown in [84]Fig 1, the FS and OS unigenes shared similar length distributions; the highest sequence range proportion was 100–500 bp, with 62.18% (59,012 / 94,905) for FS and 63.91% (59,960 / 93,816) for OS. However, All-unigenes dramatically increased in length when the 100-200-bp regions were removed and only the unigenes with long sequences were included; the length distribution percentages increased to 74.31% (70,352 / 94,674) in the 200-1000-bp range. These results suggest that our transcriptome sequencing data were assembled effectively. Fig 1. Length distributions of the FS-, OS- and All-unigenes. [85]Fig 1 [86]Open in a new tab Length distributions of (A) the FS-unigenes, (B) the OS-unigenes, and (C) All-unigenes. The y-axis indicates the number of unigenes, and the x-axis indicates the sequence sizes of the unigenes. Functional annotation of the assembled unigenes For the functional annotation, the unigene sequences were first blasted against the NCBI nr database using Blastx (E-value < 10^-5). Of the 94,674 unigenes, 65,047 (68.71%) were annotated ([87]Table 2), whereas 29,627 unigenes (31.29%) were not matched to any known proteins in the nr database. The E-value frequency distribution analysis revealed that 46% of the sequences shared strong homologies, with E-values ≤ 1.0E-60, while the remaining 54% fell into the range of 1.0E-60–1.0E-5 ([88]Fig 2A). Furthermore, we also observed that 56.9% of the sequences had a similar distribution range between 80% and 100%, but only 5.8% had similarity values less than 40% ([89]Fig 2B). Based on the homologous species identified among the annotated unigenes, 49.0% of the unigene sequences matched to Brachypodium distachyon, followed by Hordeum vulgare subsp. vulgare (17.2%) and Oryza sativa Japonica (12.3%) ([90]Fig 2C). In addition to the nr annotation, 42,233 unigenes (44.61%) were aligned to known proteins in the Swiss-Prot database (E-value < 10^-5) ([91]Table 2). Table 2. A summary of the functional annotations of the assembled unigenes. Public protein database Number of unigene hits Percentage (%)[92] ^a NR 65,047 68.71% Swiss-Prot 42,233 44.61% GO 47,747 50.43% COG 26,156 27.63% KEGG 40,842 43.14% [93]Open in a new tab ^a Represents the proportion of the 94,674 assembled unigenes. Fig 2. Unigene homology searches against the nr database. [94]Fig 2 [95]Open in a new tab (A) The proportional frequency of the E-value distribution. (B) The proportional frequency of the sequence similarity distribution. (C) The proportional species distribution of S. grandis among other plant species. To further functionally categorize the S. grandis unigenes, the genes were matched to the international standardized gene functional classification system (GO). A total of 47,747 unigenes ([96]Table 2) were assigned to one or more GO terms and categorized into 55 functional groups ([97]Fig 3), which belong to three main GO ontologies: molecular function, cellular component and biological process. The results showed that a high percentage of genes were assigned to ‘‘cell”, ‘‘cell part”, ‘‘organelle”, ‘‘membrane”, ‘‘catalytic activity”, ‘‘binding”, ‘‘metabolic process”, ‘‘cellular process” and ‘‘response to stimulus”. However, a few genes were clustered as ‘‘channel regulator activity”, ‘‘metallochaperone activity”, ‘‘nutrient reservoir activity”, ‘‘protein tag”, ‘‘translation regulator activity”, ‘‘extracellular matrix” or ‘‘extracellular matrix part”, ‘‘extracellular region part”, “carbon utilization” and ‘‘locomotion”. Fig 3. Histogram of the GO (gene ontology) classification of the S. grandis unigenes. [98]Fig 3 [99]Open in a new tab All 47,747 unigenes were grouped into three ontologies: molecular function, cellular component and biological process. The y-axis indicates the percentage (left) and number (right) of genes in each term. In an attempt to further evaluate the integrality of our transcriptome library and the effectiveness of the annotation process, COG was used to classify the unigenes. In total, 26,156 unigenes ([100]Table 2) were divided into 25 COG categories ([101]Fig 4), of which the largest group was the cluster ‘‘general function prediction” (9399), followed by ‘‘transcription” (6781), ‘‘function unknown” (6583), ‘‘translation, ribosomal structure and biogenesis” (6523) and ‘‘replication, recombination and repair” (5657). ‘‘Nuclear structure” (9), ‘‘extracellular structures” (23) and ‘‘RNA processing and modification” (257) represented the smallest groups. To further investigate their biological functions, the unigenes were mapped to reference canonical pathways in KEGG. As a result, 40,842 unigenes ([102]Table 2) were KEGG annotated and assigned to 128 pathways. Fig 4. Histogram of the COG (clusters of orthologous groups) functional classification of all of the S. grandis unigenes. [103]Fig 4 [104]Open in a new tab Out of the 94,674 de novo assembled unigenes, 26,156 were annotated and grouped into 25 categories. Differential expression and pathway analyses in the FS and OS populations To reveal the differential expression profiles between FS and OS, the potential DEGs were analyzed. The FPKM method [[105]23] was used to calculate the expression levels. When a threshold of FDR ≤ 0.001 and an absolute value of log[2]Ratio ≥ 1 were used, a total of 13,221 unigenes showed significant differences in response to grazing, of which 6283 unigenes were upregulated and 6938 unigenes were downregulated ([106]Fig 5). Fig 5. Identification of the DEGs (differentially expressed genes) between FS and OS. [107]Fig 5 [108]Open in a new tab The DEGs were determined using a threshold of FDR ≤ 0.001 and an absolute value of log[2]Ratio ≥ 1. The red, green and blue spots represent the upregulated, the downregulated DEGs and the genes without obvious changes in response to grazing, respectively. Base on the nr annotation, a total of 6526 DEGs were assigned 43,257 GO terms and classified into 51 functional categories, which belongs to three main ontologies: biological process (17,746), cellular component (18,307) and molecular function (7204) ([109]S2 Table). By analyzing the “biological process” categories, 18 subcategories, such as “response to wounding” (GO:0009611), “response to water deprivation” (GO:0009414), “response to reactive oxygen species” (GO:0000302), “defense response” (GO:0006952), “proline biosynthetic process” (GO:0006561) and “chitin catabolic process” (GO:0006032), which is related to resistant responses to abiotic and biotic stressors, were observed ([110]Fig 6). Fig 6. The GO term assignments for the DEGs related to the stress response under the overgrazing and non-grazing conditions. [111]Fig 6 [112]Open in a new tab The red bars represent the upregulated DEGs, the black bars represent the downregulated DEGs. To further explore the biological functions of the DEGs, a pathway enrichment analysis was performed. A total of 8747 DEGs were enriched in 32 metabolic pathways (q-value ≤ 0.05), and the top enriched metabolic pathways included “plant-pathogen interaction” (675 DEGs), “RNA transport” (951 DEGs), and “mRNA surveillance pathway” (830 DEGs) ([113]S2 Table). Validation of the transcriptome results via qPCR analysis To validate the expression levels in the transcriptome, 30 genes exhibiting different expression patterns (10 upregulated, 10 downregulated, and 10 unigenes with no obvious changes in transcript abundance, with a |log[2]Ratio| < 1 between FS and OS) were randomly selected and examined using qPCR. As a result, the relative expression patterns of the 30 unigenes tested via qPCR correspond well with the transcriptome data that was determined using the FPKM method, exhibiting clear up- or down-regulation or no change in response to grazing ([114]Fig 7), suggesting that the transcriptomic profiling data were reliable. However, some quantitative differences in the relative expression levels were observed, which may be due to the differential sensitivities of the two techniques. Fig 7. qPCR validation of the gene expression patterns. [115]Fig 7 [116]Open in a new tab A total of 30 genes, including (A) 10 downregulated, (B) 10 upregulated, and (C) 10 unigenes with no difference in expression and with an absolute value of log[2]Ratio < 1, were selected, and the expression levels were confirmed using qPCR. The gray bars represent the changes in transcript abundance determined by the FPKM method. The black bars represent the relative expression levels estimated using qPCR and the 2^-ΔΔCT method. The error bars indicate the standard deviation (n = 3). Genes related to stress and defense Grazing is a process containing multiple stressors, involving both biotic and abiotic stressors and including trampling / wounding, water stress and pathogen infection. Because JAs (jasmonates) play a key role in the wound response [[117]28], 52 DEGs involved in the JA biosynthesis and signaling pathway, along with those regulated by wounds, were identified, of which 38 genes were upregulated and the other 14 were downregulated in response to grazing ([118]Table 3 and [119]S3 Table). The genes involved in JA biosynthesis included AOS (allene oxide synthase, also known as CYP74A) [[120]29], LOX (lipoxygenase) [[121]30], OPR (12-oxophytodienoate reductase) [[122]31], FAD (Fatty acid desaturase) [[123]32] and PLD (phospholipase D) [[124]33]. Other genes that are relevant to the JA signaling pathway included JAZ (jasmonate ZIM domain) / TIFY and COI1 (coronatine insensitive 1) [[125]34, [126]35]. Table 3. The DEGs related to abiotic stress. Genes Gene ID FS FPKM OS FPKM Fold change FDR Genes Gene ID FS FPKM OS FPKM Fold change FDR AOS CL12787.Contig1 2.78 18.15 2.71 4.49E-48 JAZ / TIFY CL513.Contig5 8.37 39.47 2.24 1.86E-122 CL9995.Contig1 0.81 6.61 3.02 5.42E-44 CL513.Contig3 9.79 42.01 2.10 2.25E-116 Unigene7257 32.08 8.54 -1.91 4.58E-35 CL513.Contig2 9.73 33.12 1.77 9.82E-84 CL9995.Contig2 0.83 5.89 2.83 7.11E-35 CL513.Contig1 6.11 21.36 1.80 1.10E-56 LOX CL7512.Contig1 2.51 32.91 3.71 0 CL513.Contig4 4.23 10.46 1.31 4.68E-23 CL789.Contig2 4.33 87.26 4.33 6.07E-142 CL15037.Contig1 10.40 22.98 1.14 3.13E-40 CL9755.Contig1 2.34 53.58 4.52 2.49E-85 DREB CL9452.Contig2 30.09 10.81 -1.48 1.24E-73 CL7512.Contig2 4.99 11.04 1.14 6.90E-11 CL9452.Contig1 28.35 11.09 -1.35 1.98E-62 Unigene11705 2.86 6.82 1.25 7.89E-07 CL1929.Contig6 8.16 20.68 1.34 9.79E-57 FAD CL40.Contig4 245.67 515.41 1.07 0 CL1929.Contig8 5.72 15.96 1.48 1.16E-50 CL40.Contig1 8.69 21.69 1.32 5.58E-24 CL1929.Contig5 5.13 12.43 1.28 9.14E-41 Unigene22017 22.84 46.65 1.03 6.55E-22 CL1929.Contig9 4.45 10.82 1.28 9.95E-35 CL491.Contig1 0.07 2.67 5.28 4.38E-10 CL13869.Contig1 8.48 17.39 1.04 2.97E-11 Unigene40701 0.00 2.22 11.12 8.11E-08 NCED CL8856.Contig4 19.29 39.22 1.02 2.98E-105 CL5434.Contig3 3.13 7.51 1.26 2.00E-07 CL8856.Contig1 11.30 30.62 1.44 4.62E-31 Unigene12113 7.90 3.22 -1.30 5.07E-05 CL12565.Contig1 1.88 6.31 1.74 4.06E-08 OPR CL5872.Contig1 192.88 537.10 1.48 0 CL1476.Contig1 3.54 7.13 1.01 6.82E-07 CL5872.Contig4 166.32 430.50 1.37 0 Dehydrin CL4123.Contig1 145.71 362.95 1.32 0 CL5872.Contig8 140.87 377.14 1.42 0 CL4123.Contig3 47.86 192.08 2.00 0 CL5170.Contig2 4.40 1.60 -1.46 1.06E-05 CL4123.Contig4 43.21 164.64 1.93 0 CL5872.Contig7 0.79 3.41 2.11 5.39E-05 CL5735.Contig3 57.07 139.37 1.29 1.62E-212 PLD CL15321.Contig2 3.75 8.26 1.14 4.37E-14 Unigene15994 20.98 82.84 1.98 9.83E-204 CL112.Contig3 5.12 12.61 1.30 5.36E-93 CL5735.Contig2 42.40 102.93 1.28 2.24E-126 CL3882.Contig2 4.86 1.77 -1.46 8.21E-05 Unigene33265 3.82 0.45 -3.08 5.03E-18 Unigene32822 2.50 0.80 -1.64 1.05E-05 LEA Unigene15689 1.62 60.39 5.22 0 Unigene7167 2.36 0.70 -1.76 1.39E-04 Unigene24764 37.15 277.73 2.90 0 CL3882.Contig1 5.07 1.98 -1.35 5.42E-04 Unigene9383 17.29 51.87 1.59 8.80E-90 COI1 CL13823.Contig1 0.85 3.28 1.95 1.60E-14 CL3750.Contig1 15.01 37.86 1.33 8.90E-52 CL6692.Contig6 1.22 3.21 1.39 2.59E-11 CL6420.Contig4 2.71 12.32 2.18 6.91E-19 CL6692.Contig2 1.00 2.32 1.21 7.71E-07 Unigene19508 4.86 2.36 -1.04 2.40E-07 Unigene18713 13.46 5.73 -1.23 4.60E-20 CL6420.Contig2 2.36 4.85 1.04 6.03E-05 Unigene17090 13.98 6.69 -1.06 6.67E-11 Unigene8570 0.62 3.20 2.36 1.44E-04 [127]Open in a new tab The prefix “CL” represents clusters, and “unigene” represents singletons. FPKM indicates the FPKM values of the unigenes in FS or OS. “Fold change” is equal to log[2] (OS-FPKM / FS-FPKM). “+” indicates upregulated transcription, and “-” represents downregulated transcription. Based on previously published data [[128]36–[129]39], we searched for water-deprivation response genes in our grazing-regulated DEGs, such as the LEA (late embryogenesis abundant), DREB (dehydration-responsive element-binding), NCED (9-cis-epoxycarotenoid dioxygenase) and dehydrin genes. We found that the majority of the transcripts (76.9%, 30 / 39) of these genes were increased by grazing ([130]Table 3 and [131]S3 Table). Together, these results suggest that grazing induces wound- and water-stress response of S. grandis. Overall, we identified a total of 115 defense-response related DEGs among the 675 DEGs in the plant-pathogen interaction pathway ([132]Table 4 and [133]S3 Table), including the pathogenesis-related gene PR1 (Unigene6840), which is considered to be one of the SAR (systemic acquired resistance) marker genes [[134]40], and one PRB1-2-like gene (Unigene2461), which had accumulated in the OS sample, indicating that disease resistance was activated. Six leucine-rich repeat receptor kinase (LRR-RK) transcripts, encoding the FLS2 (flagellin sensitive 2) genes [[135]41], and 11 RPM1 (resistance to Pseudomonas maculicula protein 1) [[136]42] were increased by overgrazing. In contrast, 4 homologs of RIN4 (encoding an RPM1-interacting protein) [[137]42], showed attenuated expression. Additionally, the transcripts of the SGT1 (suppressor of G-two allele of Skp1) [[138]43] and Hsp90 (heat shock protein 90) genes were significantly enhanced by grazing. Notably, a set of WRKY genes, including WRKY-1, -7, -11, -55, and -70, also exhibited altered expression due to overgrazing. These results indicate that the immune response of S. grandis is enhanced and that disease resistance is activated by overgrazing. Table 4. The DEGs related to defense. Genes Gene ID FS FPKM OS FPKM Fold change FDR Genes Gene ID FS FPKM OS FPKM Fold change FDR PR1 Unigene6840 7.12 18.88 1.41 2.84E-28 SGT1 CL3879.Contig2 18.93 47.68 1.33 3.11E-195 Unigene2461 3.16 7.18 1.18 2.46E-04 CL3879.Contig1 33.79 71.54 1.08 4.84E-150 FLS2 Unigene43887 0.15 2.50 4.04 5.06E-04 WRKY6 CL178.Contig2 16.26 6.01 -1.44 2.94E-63 Unigene43888 0.45 3.06 2.76 8.10E-09 CL178.Contig1 17.39 6.67 -1.38 1.04E-63 Unigene12048 0.63 3.83 2.60 6.61E-11 WRKY3 CL6449.Contig1 7.00 2.75 -1.35 1.61E-20 Unigene2737 0.66 3.76 2.51 1.40E-24 CL6449.Contig3 5.78 2.13 -1.44 1.31E-20 Unigene11377 0.53 2.41 2.18 2.35E-04 CL6449.Contig4 4.14 1.68 -1.30 1.07E-13 CL3221.Contig2 1.22 3.95 1.69 2.18E-09 CL6449.Contig2 12.46 4.99 -1.32 3.03E-37 RPM1 CL9260.Contig3 0.07 3.64 5.71 1.73E-27 CL6449.Contig5 3.80 1.63 -1.22 2.83E-10 CL13422.Contig1 0.00 2.92 11.51 2.43E-14 WRKY20 CL511.Contig1 34.97 14.03 -1.32 2.40E-130 CL11573.Contig2 0.98 3.06 1.65 2.98E-10 CL511.Contig9 27.55 11.02 -1.32 1.07E-108 Unigene40051 0.00 2.99 11.55 7.08E-10 CL511.Contig2 35.84 15.82 -1.18 3.09E-113 Unigene17462 1.93 4.93 1.36 1.79E-09 CL511.Contig8 32.29 13.68 -1.24 2.53E-115 Unigene38815 0.09 2.58 4.89 1.54E-07 CL511.Contig5 26.03 11.33 -1.20 1.54E-89 CL15353.Contig1 0.00 1.90 10.89 8.78E-06 CL511.Contig6 27.04 11.47 -1.24 5.94E-97 CL1475.Contig1 0.72 2.40 1.74 3.07E-05 WRKY71 CL2147.Contig3 7.14 3.18 -1.17 2.33E-11 CL467.Contig1 1.61 3.56 1.15 3.60E-05 CL2147.Contig1 22.34 5.57 -2.00 1.75E-72 Unigene39466 1.10 3.34 1.60 4.78E-05 WRKY21 Unigene20256 20.10 7.28 -1.47 6.58E-98 Unigene42608 0.00 1.60 10.64 6.33E-05 WRKY11 CL4223.Contig1 28.86 11.56 -1.32 4.59E-13 CL7239.Contig3 3.64 0.93 -1.96 3.19E-44 WRKY58 CL4473.Contig1 13.82 5.23 -1.40 1.20E-47 Unigene17990 13.76 0.86 -3.99 8.50E-42 CL4473.Contig3 29.24 9.84 -1.57 6.80E-116 CL7239.Contig4 3.52 1.03 -1.77 1.51E-37 WRKY7 CL168.Contig3 41.47 15.13 -1.45 1.86E-59 CL15522.Contig1 7.39 1.89 -1.97 2.12E-36 CL168.Contig4 21.19 7.30 -1.54 6.12E-44 CL3622.Contig6 5.28 2.28 -1.21 5.23E-32 WRKY1 CL14068.Contig1 55.85 11.28 -2.31 8.47E-132 CL4636.Contig4 4.87 2.39 -1.03 3.07E-24 CL14068.Contig2 28.14 10.42 -1.43 6.99E-39 CL11476.Contig2 3.42 0.19 -4.13 6.12E-16 Unigene18532 6.58 2.33 -1.50 1.04E-11 Unigene30488 2.44 0.00 -11.25 2.99E-13 CL5938.Contig1 9.58 3.95 -1.28 4.22E-28 CL11573.Contig1 3.17 0.46 -2.80 2.92E-09 CL1657.Contig4 212.46 101.23 -1.07 0 Unigene7686 5.43 2.48 -1.13 3.48E-07 WRKY55 CL2736.Contig5 1.12 5.94 2.41 5.91E-31 RIN4 CL1992.Contig2 0.75 1.72 1.21 5.88E-04 CL2736.Contig6 0.73 3.43 2.24 4.89E-18 CL10916.Contig3 51.66 19.81 -1.38 6.30E-122 CL2736.Contig3 2.00 4.43 1.14 1.02E-07 CL10916.Contig4 36.82 15.49 -1.25 2.63E-127 CL2736.Contig9 0.74 5.88 2.99 1.78E-42 CL10916.Contig5 32.11 14.17 -1.18 3.41E-54 CL2736.Contig1 1.31 9.04 2.79 7.72E-56 CL10916.Contig6 26.19 12.78 -1.04 8.63E-64 CL2736.Contig4 1.48 6.52 2.14 6.18E-28 HSP90 CL898.Contig1 0.25 48.67 7.61 0 Unigene17732 0.50 2.79 2.47 6.72E-17 CL898.Contig4 0.14 23.27 7.34 6.49E-92 CL11908.Contig1 12.45 6.17 -1.01 2.85E-18 CL898.Contig2 0.12 8.58 6.10 5.98E-37 WRKY33 Unigene4825 44.30 7.94 -2.48 2.91E-73 CL15329.Contig2 0.09 3.80 5.36 1.17E-10 Unigene4405 81.66 24.90 -1.71 3.21E-73 Unigene45696 0.65 14.04 4.43 8.99E-65 CL5906.Contig2 63.08 12.53 -2.33 1.29E-182 CL15329.Contig1 0.73 6.78 3.22 8.21E-27 CL5906.Contig1 78.48 23.50 -1.74 2.23E-157 CL898.Contig10 74.90 573.39 2.94 0 WRKY22 CL4876.Contig1 5.74 1.14 -2.33 2.06E-18 Unigene13091 1.26 5.44 2.11 1.13E-09 CL13979.Contig1 7.54 2.37 -1.67 1.22E-19 CL898.Contig5 40.59 166.29 2.03 0 CL13979.Contig2 9.35 4.09 -1.19 2.22E-04 CL898.Contig6 64.46 222.26 1.79 0 Unigene15232 7.16 3.20 -1.16 7.16E-16 CL898.Contig3 36.85 121.77 1.72 1.02E-238 WRKY4 CL9110.Contig3 14.89 5.92 -1.33 1.13E-32 CL898.Contig8 38.92 125.68 1.69 0 WRKY46 CL6991.Contig2 2.52 7.10 1.49 9.25E-13 Unigene3765 2.53 6.52 1.36 5.29E-07 WRKY70 CL10835.Contig2 4.72 0.55 -3.10 1.97E-15 CL898.Contig9 22.51 112.51 2.32 1.71E-116 Unigene12673 15.69 7.30 -1.10 2.33E-32 [139]Open in a new tab The prefix “CL” represents clusters, and “unigene” represents singletons. FPKM indicates the FPKM values of the unigenes in FS or OS. “Fold change” is equal to log[2] (OS-FPKM / FS-FPKM). “+” indicates upregulated transcription, and “-” represents downregulated transcription. Discussion Sequence assembly Since the Trinity method was developed [[140]17], NGS has become widely used for de novo transcriptome analysis [[141]14], especially in non-model plants lacking genome sequence information [[142]17], such as Reaumuria trigyna [[143]44], Camelina sativa [[144]45] and Hevea brasiliensis [[145]46]. In this study, we used Illumina RNA-Seq technology sequenced two libraries prepared from the FS and OS samples and obtained high quality of de novo assembly data. Further analysis indicated that the unigenes of S. grandis that were annotated and the expression patterns of S. grandis in response to grazing were calculated correctly. This report is the first to comprehensively analyze the transcriptome and to identify the differentially expressed genes in S. grandis under grazing conditions without prior genome information. Prior to this study, Chen et al. (2009) used monocot rice plants, which have been entirely sequenced, as a model and simulated grazing by cutting and dabbing cow saliva to study the corresponding gene expression in response to grazing defoliation; however, that experiment only simulated grazing using rice but not wild plants in the actual habitat. Recently, this research group also reported defoliation treatment response in Leymus chinensis and sorghum bicolor plants [[146]47–[147]49]. In the present study, we selected S. grandis, a typical species found in the grasslands of Inner Mongolia [[148]10], as our starting material, and using transcriptomic RNA-Seq analysis, we provided a complete gene expression profile for grazing, thus facilitating further studies on the complex responses of plants to grazing at the molecular level. The wound response in plants Wounding is an inevitable threat for the survival of all organisms [[149]50] that occurred more frequently under overgrazing condition. Many genes, such as those related to wounds and resistance to potential pathogen attacks, were activated in a grazing simulation study [[150]51]. As a typical wound hormone [[151]52], JA accumulates and modulates the expression of wound-associated genes in response to the plant’s wounds [[152]53]. Reymond et al. (2000) analyzed the expression dynamics of 150 genes in mechanically wounded Arabidopsis leaves using a cDNA microarray technique and found that the genes involved in the synthesis or metabolism of members of the JA family are induced, including LOX, FAD, AOS and OPR. Chen et al. (2014) using the Illumina / Solexa platform sequenced seven cDNA libraries prepared from control, wounded (2, 6 and 24 h) and defoliated (2, 6 and 24 h) L. chinensis plants, by comparing the transcriptomic data, 1836 and 3238 genes were significantly differential expressed between wounding and defoliation treatment within one day. Among these genes, such as LOX, AOS and OPR were commonly activated in response of wounding and defoliation. Consistent with this observation, we also found that a number of the above genes, along with 2 PLDs, were upregulated by grazing ([153]Table 3). Wounding also induced JAZ10 / TIFY9. In the core module (COI1–JAZs–MYC2) of the JA signaling pathway, the JAZ / TIFY genes are transcriptional targets of MYC2, and JA treatment quickly induces their transcripts, followed by the degradation of the SCF^COI1-dependent proteasome and activation of the JA response [[154]34, [155]35]. The activation of these genes indicates that the JA-dependent wounding response was likely activated. Repeatedly wounding the leaves of Arabidopsis resulted in stunted growth and increased endogenous JA content; however, these treatments did not stunt the growth of mutants that were deficient in JA synthesis, such as the aos and opr3 mutants and the fad3-2fad7-2fad8 triple mutant, indicating that wound-induced JA significantly suppresses plant growth [[156]28]. Based on the literature and our own research, it was hypothesized that the wound response is enhanced by overgrazing, and wound-induced JAs are likely to participate in overgrazing-inhibited plant growth. Additionally, PLD is involved in the response to drought stress [[157]54], and JAs confer plants with the capacity to counter multiple biotic stimuli, such as pathogens [[158]55]. These observations suggest that the complex interplay of gene expression patterns most likely also occurs under overgrazing conditions. The drought response in plants In addition to damage the plant, grazing also reduces the soil’s water content [[159]5]. Furthermore, water stress is an important participant in the plant’s response to mechanical wounding [[160]50]; therefore, plants also suffer from drought stress under grazing conditions. In plants, CRT / DRE (C-repeat / dehydration-responsive element) is a cis-acting DNA regulatory element that initiates transcription in response to water deficiency [[161]39]. In Arabidopsis, the expression levels of all of the CRT / DRE binding factors (CBFs / DREB1s) were low under normal growth conditions, but the transcripts were immediately enhanced following drought stress [[162]56, [163]57]. DREB1 overexpression in Arabidopsis not only strengthened the plant’s tolerance to drought but also resulted in plant growth retardation [[164]58]. The constitutive expression of Arabidopsis CBF1 in Brassica napus elevated drought tolerance [[165]57]. The LEA (late-embryogenesis abundant) proteins, expressed by many prokaryotes and eukaryotes, are hydrophilic proteins associated with tolerance to dehydration [[166]36, [167]37]. Many of the genes encoding the LEA proteins in Arabidopsis contain ABRE (ABA responsive element) and the DRE / CRT / LTRE element [[168]37]. The expression of the LEA proteins are highly induced by water stress in peas (Pisum sativum) [[169]36]. We also found that the transcripts of CBF / DREB and LEA were highly induced by overgrazing ([170]Table 3), indicating that grazing involve in plant’s tolerance to dehydration. The plant hormone ABA (abscisic acid) plays a critical role in the plant’s adaptation to abiotic environmental stressors (such as drought). During vegetative growth, ABA accumulates in the plant’s cells and regulates the expression of many genes during drought stress [[171]59]. NCED (9-cis-epoxycarotenoid dioxygenase) is a key enzyme in ABA biosynthesis. NCED expression is induced, coinciding with an increased level of endogenous ABA in Arabidopsis, when exposed to drought stress [[172]60]. AtNCED3 overexpression in Arabidopsis confers improved drought tolerance via an increase in the ABA level, while AtNCED3 antisense plants and the T-DNA insertion mutant show a drought-sensitive phenotype with lower ABA levels when compared to wild-type plants [[173]60]. We observed that NCED and 6 dehydrin transcripts were increased by overgrazing ([174]Table 3). Therefore, we hypothesize that the drought response in S. grandis may be activated and elevated drought tolerance when the plant is subjected to grazing. Plant immunity Plants under grazing conditions are more likely to be attacked by pathogenic organisms when compared to non-grazing plants. On the one hand, the feces and urine deposition of livestock contain various microorganisms, and an increasing richness and abundance of microorganisms have been identified in soil following grazing due to trampling and fecal and urine deposition [[175]61, [176]62]. On the other hand, the open wound damage to plant tissues caused by mechanical wounding provides a potential infection site for pathogen invasion [[177]50, [178]52]. Plants use two innate immune system modes to protect against microbial pathogen attacks: pattern-triggered immunity (PTI), which is triggered by the perception of microbe-associated or pathogen-associated molecular patterns (MAMPs or PAMPs) through pattern recognition receptors (PRRs); or effector-triggered immunity (ETI), which is triggered by the recognition of pathogen effectors [[179]63]. The perception of microbial pathogens by plants can be described in four phases using the “zigzag” model [[180]64]. In the present study, six FLS2 genes, which function as PRRs in plants to perceive bacterial flagella [[181]41], showed upregulated expression following overgrazing ([182]Table 4). Many pathogen virulence effectors are secreted through the type III secretion system (TTSS) and are recognized by plants with corresponding R genes [[183]55]. In Arabidopsis, the nucleotide-binding leucine-rich repeat (NB-LRR) classes of the R protein RPM1 confer resistance to Pseudomonas syringae strains expressing the avirulence genes avrB and avrRpm1 [[184]42, [185]65]; loss-of-function of RPM1 plants show sensitive symptoms to P. syringae [[186]66]. However, the RPM1-interacting protein RIN4, which is the target of the type III virulence effector and required for RPM1 activation [[187]42], acts as a negative regulator of the plant’s basal defense response [[188]67]. Consistent with this observation, we found that the transcripts of 11 RPM1 genes were enhanced, whereas 4 RIN4 genes were compromised by overgrazing ([189]Table 4). We also observed that the genes required for R gene activities were induced by grazing. For example, SGT1, a component of SCF (Skp1-Cullin-F-box protein) ubiquitin ligases and a regulator, was active early in the plant’s R gene-mediated defense; the SGT1b mutation of Arabidopsis was defective in the plant’s early defenses [[190]68]. Additionally, the molecular chaperone HSP90, which is critical for disease resistance in Arabidopsis due to its interaction with SGT1 [[191]69], showed increased gene expression under the grazing condition. Therefore, we hypothesize that grazing elevates the resistance of S. grandis to pathogenic organisms. Several transcription factors (TFs), including WRKYs, also show altered expression during ETI and PTI [[192]70]. Accumulating data implicate the WRKY TFs in the plant immune response, acting as both positive and negative regulators of disease resistance [[193]70]. For example, overexpression of the pathogen-inducible OsWRKY31, also known as WRKY55 [[194]71], enhances disease resistance in transgenic rice plants [[195]72]. In contrast, WRKY7 and WRKY11 function as negative regulators in the plant defense response [[196]73, [197]74]. In Arabidopsis, the loss-of-function WRKY70 mutant showed enhanced disease susceptibility to Erysiphe cichoracearum, while WRKY70-overexpressing transgenic plants increased resistance to E. cichoracearum and suppressed the JA-induced defense response [[198]75]. The altered expression of WRKY genes in present study ([199]Table 4 and [200]S3 Table) suggest that WRKY TFs may be an important component in the resistance of S. grandis to pathogenic organisms to provide grazing tolerance. However, the real roles of the WRKY TFs in S. grandis still need to be further identified. Based on the genes discussed above, we conclude that grazing not only subjects plants to biotic stressors and activates the plant’s immune response, it also confers resistance to pathogenic organisms. In conclusion, to cope with the stressful occurrences that are associated with grazing, plants develop an avoidance / resistance strategy for optimal growth, such as size reduction. It is well documented that pathogens and abiotic stressors, such as wounding and drought stress, severely impact plant performance and productivity [[201]28, [202]76, [203]77]. Dwarf plants under these stressors always show enhanced tolerance and altered gene expression [[204]28, [205]58, [206]78]. Therefore, our observation supports the notion that the dwarf phenotypes of S. grandis are induced by overgrazing, which is at least partially caused by the additive effects of multiple stressors, including wounding, drought and immunity signals. Supporting Information S1 Fig. The growth status of S. grandis under the overgrazing and non-grazing conditions. (DOC) [207]Click here for additional data file.^ (2.1MB, doc) S2 Fig. The length distributions of the FS and OS contigs. (DOC) [208]Click here for additional data file.^ (163.5KB, doc) S1 Table. The primers used for the qPCR analysis. (XLS) [209]Click here for additional data file.^ (18KB, xls) S2 Table. The GO functional classification and the KEGG pathway analysis of the DEGs. (XLS) [210]Click here for additional data file.^ (45KB, xls) S3 Table. A summary of the genes related to abiotic stress and defense. (XLS) [211]Click here for additional data file.^ (174KB, xls) Acknowledgments