Abstract
Switchgrass is an important bioenergy crop typically grown in marginal
lands, where the plants must often deal with abiotic stresses such as
drought and salt. Alamo is known to be more tolerant to both stress
types than Dacotah, two ecotypes of switchgrass. Understanding of their
stress response and adaptation programs can have important implications
to engineering more stress tolerant plants. We present here a
computational study by analyzing time-course transcriptomic data of the
two ecotypes to elucidate and compare their regulatory systems in
response to drought and salt stresses. A total of 1,693 genes (target
genes or TGs) are found to be differentially expressed and possibly
regulated by 143 transcription factors (TFs) in response to drought
stress together in the two ecotypes. Similarly, 1,535 TGs regulated by
110 TFs are identified to be involved in response to salt stress. Two
regulatory networks are constructed to predict their regulatory
relationships. In addition, a time-dependent hidden Markov model is
derived for each ecotype responding to each stress type, to provide a
dynamic view of how each regulatory network changes its behavior over
time. A few new insights about the response mechanisms are predicted
from the regulatory networks and the time-dependent models. Comparative
analyses between the network models of the two ecotypes reveal key
commonalities and main differences between the two regulatory systems.
Overall, our results provide new information about the complex
regulatory mechanisms of switchgrass responding to drought and salt
stresses.
Introduction
Switchgrass (Panicum virgatum L.) is a warm-season C4 grass native to
North America, and is considered a major biofuel crop for cellulosic
ethanol production. Because of its strong adaptability and rapid growth
[[36]1–[37]4], switchgrass is primarily grown in marginal lands
[[38]5], which tend to be affected by abiotic stress, such as drought
and salt stresses [[39]6, [40]7]. Recent studies suggest that these two
are the major factors that may limit the biofuel production
[[41]8–[42]10]. Hence, it represents an important task to understand
how plants adapt to these stresses and how to possibly engineer
switchgrass to make the plant more tolerant to drought and salt
stresses.
Previous studies have reported that lowland ecotypes tend to be more
tolerant to drought and salt than the upland ecotypes determined based
on specific physiological, morphological, or metabolic parameters
[[43]10, [44]11]. Specifically, (i) lowland ecotypes tend to have
higher levels of the following under drought stress: photosynthetic
rates (P[n]), transpiration rates (T[r]), stomatal conductance (g[s]),
intercellular CO[2] (C[i]) level and relative water content (RWC) in
leaves, and lower electrolyte leakage (EL), in comparison with the
upland ecotypes; and (ii) they generally have lower levels of salt
stress-induced cell-membrane damages due to their higher levels of
P[n], g[s], T[r], leaf photochemical efficiency and chlorophyll content
(Chl content) and lower EL content. As of now, no genetic-based
research has been published regarding detailed understanding of what
specifically make one ecotype more tolerant to these stresses than the
other.
Here we present a computational study of public time-course
gene-expression data of Alamo (a lowland ecotype of switchgrass) and
Dacotah (an upland ecotype) treated with drought and salt stresses. The
goals are (1) identification of key biological processes that respond
to each of the two stresses; (2) inference of the functional
relationships among the identified processes, hence providing a systems
level understanding about the response program to each stress; and (3)
comparison of commonalities and differences in the response processes
between the two ecotypes, aiming to gain improved understanding of what
may have made one ecotype more stress tolerant than the other.
A key in addressing these questions, particularly (2) and (3) lies in
identification of the main transcription regulators (TFs) of the
response programs to the two stress types. To accomplish this, we have
identified all differentially expressed TFs in stress-treated vs.
untreated switchgrass samples. In addition, we have also identified
differentially expressed genes (TGs) known to play key roles in stress
response and adaptation, such as biosynthesis of osmoprotectants;
control of the water flux; adjustment of the cellular osmolarity
[[45]12, [46]13]; influx and efflux through plasma membrane and vacuole
relevant to intracellular ion homeostasis [[47]14]; maintaining redox
homeostasis [[48]15]; reprogramming of plant primary metabolisms; and
alteration of the cellular architecture for more efficient energy
supply [[49]16]. Based on the identified genes in these categories, we
have predicted a regulatory network for the response system to each
stress, consisting of key TFs and predicted regulatory relationships
between the TFs and functional TGs involved in these metabolic
processes, which is most consistent with the time-course data collected
from the public domain.
Overall, two regulatory networks are predicted, one with 143 TFs, 1,693
stress-response TGs and 11,784 TF-TG regulatory relationships for
drought stress, and the other having 110 TFs, 1,535 stress response TGs
and 8,411 TF-TG relationships for salt stress. Among the predicted
TF-TG relationships, 46.1% and 43.5% in the drought and salt response
programs are validated against two independent data sources,
respectively. Comparative analyses revealed that Alamo has different
stress response systems from those in Dacotah.
Materials and methods
Data
Two ecotypes of switchgrass: Dacotah and Alamo are used in the study.
The transcriptomic data of drought and salt-stressed Dacotah and Alamo
used in this study are retrieved from the Sequence Read Archive (SRA)
[[50]17] with recorder numbers PRJNA323435 and PRJNA323349,
respectively. Specifically, plants of each ecotype to be
stress-treated, along with the controls, are grown in separate pots in
a greenhouse [[51]11].
Once the plants reach the E5 development stage after two months
[[52]18], they are exposed to one of two stress treatments: drought or
salt stress, for 30 days, separately. Leaf tissue samples from the same
plants of the two ecotypes are taken at six (0d, 6d, 12d, 18d, 24d,
30d) and eight (0h, 12h, 24h, 48h, 6d, 12d, 18d, 24d) time points after
treatment of drought and salt stresses, respectively, along with the
matching leaf samples collected from the untreated plants, for RNA-Seq
analyses. To minimize the leaf age effect, we have four tillers from
the original single tiller transplanted for each pot and at each
sampling time, and cut the same section of the leaf blade (~100 mg) of
the top 2^nd and 3^rd fully developed leaves from tillers with the same
size. For each time point, three biological replicates are made, giving
rise to a total of 72 and 96 samples for the two ecotypes,
respectively. It is noteworthy that 3 pots are used for each treatment
along with 3 controls pots, with each treatment having its own control
pot. All plants for each ecotype are the clonal replicates of a single
mother plant. These two data sets are of high-quality as reflected by
that the biological replicates for each sample have strong correlations
among themselves, with Pearson correlation coefficient (PCC) > 0.9.
The switchgrass genome version 4 (Pvir_v4) and annotation are
downloaded from JGI [[53]19]. Protein sequences encoded in three model
species: Arabidopsis, maize and rice are also downloaded from JGI and
used in this study. 4,134 TFs of switchgrass and GO-based functional
annotation of switchgrass genes are downloaded from PlantTFDB [[54]20].
The predicted TF-TG interactions in Arabidopsis, maize and rice are
collected from the literature [[55]21–[56]26], respectively.
RNA-Seq data processing
The downloaded files of gene-expression data for the 72 and 96 samples
are converted into the fastq format using fastq-dump command in the SRA
Toolkit (version 2.8.2–1) [[57]27]. The following data-processing
steps: adapter-trimming, quality-trimming, filtering and
contaminant-filtering, are applied to each fastq file using the BBDuk
tools (version 35.82) [[58]28], a JGI pipeline [[59]29] for filtering
and trimming RNA reads containing adapters and contaminants. Hisat2
(version 2.1.0) [[60]30, [61]31] is used, with default parameters, to
map RNA reads onto the Pvir_v4 genome. Gene-level raw RNA-read counts
are summarized using FeatureCounts (version 1.5.0) [[62]32, [63]33]
from the aligned sam files. Two and four samples with less than 70%
total alignments to the genome [[64]34] are removed from the two
collections of samples, leaving 70 and 92 samples for drought and salt
stresses, respectively ([65]S1 Table).
We have estimated the scaling factors between samples from two ecotypes
for each stress by using the Trimmed Mean of M-values (TMM) in the
edgeR package (version 3.20.9) [[66]35, [67]36]. The count per million
(CPM) is computed using the ‘cpm’ function provided in the same package
to estimate the gene-expression levels. Only genes with expression
levels higher than 2 CPM in more than 90% of the samples for each
stress type are kept. At the end, 20,581 and 20,836 genes for drought
and salt stresses are used for further analyses, respectively.
Statistical analysis
The raw RNA-read counts are put into the edgeR package [[68]35, [69]36]
and used to determine the differentially expressed genes (DEGs). For
each treated sample, a fold-change value (after log transformation) and
the associated p-value are calculated against the matching untreated
samples. Each gene with absolute fold change > 2 and p-value < 0.05 is
considered as a DEG. A gene in an ecotype is considered as stress
affected if it is a DEG at two time-points or more among the stress
treated samples of the ecotype. For each stress and each ecotype, 1,500
stress affected genes with the lowest p-values are kept to construct a
gene regulatory network (GRN). Overall, 2,441 and 2,429 stress affected
genes for drought and salt stresses are kept for the two ecotypes,
respectively, each referred as stress-responding gene, of which 147 and
125 are TFs.
Construction of genetic regulatory network
The initial GRN is constructed based on co-expression information
measured using the Pearson correlation coefficient. Specifically, a
pair of TF and TG with PCC > 0.70 is predicted to have a regulatory
relationship. Then, this network is put into the Network Component
Analysis (NCA) algorithm (version 2.3) [[70]37, [71]38] to calculate
differential expression profiles of the stress-regulated TFs, where NCA
is a method for reconstructing TF activities (TFAs) and estimating
control strengths (CSs) through estimating the partial and qualitative
network connectivity information based on gene-expression data.
Specifically, an NCA model is described as:
[MATH:
[X]=[A][P] :MATH]
where X is a matrix of gene expression values, with rows for genes and
columns for samples; A is a binary matrix of (predicted) regulatory
relationships between TFs (columns) and TG (rows), and P is a TFA
matrix with columns for samples and rows for TFs; and A[i,j] is 1 if
and only if there is a regulatory relationship between TF j and gene i;
otherwise 0. A desired decomposition of X to A and P is achieved by
minimizing the following objective function:
[MATH: min|<
mo>|[X]−[P][A]||2
mrow>, :MATH]
under the following constraints:
1. [A] has full column-rank;
2. when a node in the regulatory layer is removed along with all the
output nodes connected to it, the connectivity matrix for the
resulting network must have full column-rank; and
3. [P] has full row-rank.
Each TF-TG interaction predicted by NCA and its PCC ≥ 0.75 is regarded
as the final prediction for a regulatory relationship.
Validation of predicted TF-TG interactions
It has been widely observed that TF-TG relationships tend to be
conserved across related species [[72]39–[73]42]. Hence, we have used
known TF-TG relationships (see Data) in related organisms to validate
our TF-TG predictions. InParanoid (version 4.1) [[74]43, [75]44] is
applied to generate orthologous groups between switchgrass and
Arabidopsis, maize as well as rice, respectively. We consider a
predicted TF-TG interaction as validated if it has orthologous pairs of
TF-TG interactions in the above six databases.
Validation of predicted TGs for each TF
To quantify functional relevance of the predicted TGs for each TF, we
have calculated GO semantic similarity scores of the GO terms between
each pair of the co-regulated TGs, using the GOSemSim package (version
2.6.0) [[76]45, [77]46]. Here, we considered only the GO biological
processes. To evaluate the statistical significance of the derived
functional similarity among the TGs regulated by the same TF, we have
randomly selected the same number of genes from the background gene
set, namely a set of all the 25,971 switchgrass genes with GO
annotations, and calculated their functional similarities. Wilcoxon
rank-sum test is used to assess whether the calculated functional
similarity scores among the TGs are significantly higher than those
among randomly selected genes.
Dynamic regulatory maps
The Dynamic Regulatory Event Miner (DREM) (version 2.0.3) [[78]47,
[79]48] is used to reconstruct the relevant regulatory relationships in
each ecotype under each stress. Specifically, DREM takes time-course
fold-changes of TGs and predicted TF-TG interactions as inputs, and
produces a dynamic regulatory map consisting of multiple time-dependent
chains of events. Each chain consists of a collection of co-expressed
genes across all the time points, measured using fold changes in
expression levels with respect to time 0 as the reference, where there
are TFs that co-regulate genes at time t and time (t+1), across all
time points. An important property of the modeling technique is that it
allows the formation of bifurcation events at each time point if
co-expressed genes up till the current point diverge into two distinct
groups of co-expressed genes from the current time point on within each
group but not between the two groups. Each bifurcation event is
associated with two sets of TFs that each regulate a subset of all the
genes, where the TF split cutoff score is 0.001.
GO enrichment analysis
GO term enrichment is conducted over a given gene set using the topGO R
package (version 2.30.1) [[80]49, [81]50], where all annotated genes in
switchgrass is used as the background. A GO function is considered
enriched if the p-value < 0.01.
The protocol for the entire computational analysis pipeline, used in
this study, has been published in protocols.io with the following DOI:
[82]dx.doi.org/10.17504/protocols.io.saxeafn.
Results
Considering that only one switchgrass genome, namely that of Alamo, is
published [[83]19], we have mapped all the RNA-Seq reads of both
ecotypes to the same genome when estimating gene-expression levels in
each ecotype under each stress.
DEGs responding to drought and salt stresses
Comparison between the two ecotypes under each stress type revealed:
they have similar gene expression profiles at the global scale, as
shown by their similar expression-level distributions ([84]Fig 1A) and
high statistical correlations between gene expressions of the two
ecotypes under the same stress type ([85]Fig 1B).
Fig 1. Characterization of gene-expression profiles of each ecotype under
drought and salt stresses.
[86]Fig 1
[87]Open in a new tab
(a) The distribution of log (mean CPM+1) for each ecotype. (b) The
scatterplots for PCCs between expressions of each gene in the two
ecotypes averaged over all relevant samples under drought and salt
stress, respectively. (c) The number of DEGs for each ecotype at each
time point for each stress. The number of up- and down-regulated genes,
marked as red and blue, respectively, for each combination of ecotypes
and stresses.
Our differential gene-expression analyses have revealed that there are
significant differences between the two ecotypes in terms of their
responses to the two types of stresses under study ([88]Fig 1C, [89]S1
File). First, more DEGs are detected under the drought stress than the
salt stress for both ecotypes, suggesting that drought has larger
impacts on the plant than salt, and requires more diverse cellular
responses, which is also observed in Brassica napus [[90]51]. For each
stress, there are more DEGs in Dacotah than in Alamo almost at every
time point (see [91]Methods). We have also noted that the numbers of
up- vs. down-regulated genes are approximately the same for both
ecotypes under each of the two stress types, which is consistent with
previous studies regarding abiotic stress response in Arabidopsis
[[92]52].
Interestingly, there is one time-point (6d for salt and 18d for
drought) (see [93]Methods), where the number of up-regulated genes in
Alamo is higher than Dacotah, which is different from all the other
time points. A key function enriched by the up-regulated genes at this
point for salt stress is photosystem II assembly and stabilization, a
known adaptive mechanism for protecting plants from permanent damages
[[94]53]. Previous studies have observed that Alamo has higher
photosynthetic rates than Dacotah [[95]10]. Hence, we posit that this
energy-related mechanism may play a key role in protecting Alamo from
damages.
Overall, 676 distinct TFs are found to be differentially expressed in
total across the two ecotypes under two stresses (Panel a in [96]S1
Fig), specifically, 207, 266, 361 and 531 for Alamo and Dacotah under
drought and salt, respectively. We not e that most of these TFs are in
the families of bHLH, bZIP, NAC, G2-like, MYB_related and ERF (Panel b
in [97]S1 Fig), which are all known to play important roles in abiotic
stress response [[98]54, [99]55]. Of these TFs, 87 are found in all
four combinations of the two ecotypes under two stresses; and 64
additional (distinct) ones are in Alamo and 233 ones in Dacotah. See
[100]S2 File for details.
GRNs for the two stress types
We have constructed a GRN from the identified stress-responding genes
(see [101]Methods) for each stress type, respectively. The network
consists of a set of stress-responding TFs and TGs regulated by such
TFs, namely TFAs as defined earlier. Overall, 100 and 95 TFs are
identified in Alamo and Dacotah under drought, respectively; and 84 and
77 TFs are identified in Alamo and Dacotah under salt, respectively.
We have then conducted co-expression analyses between the TFs and the
TGs identified for each stress type, where a pair of TF and TG is
considered as having a regulatory relationship if their PCC > 0.70
[[102]25, [103]56]. This results in the initial prediction of the
regulatory relationships in the response system to the drought and salt
stresses, respectively.
Overall, 37,896 and 16,350 regulatory interactions are predicted
between 147 TFs and 2,262 TGs under drought and between 125 TFs and
1,893 TGs under salt ([104]Table 1), respectively. These regulatory
relationships and associated expression data are then fed into the NCA
algorithm (see [105]Methods), after 4 and 15 TFs being removed with the
number of TGs less than two or co-regulating with other TFs, which give
rise to the TFA profiles for the 143 and 110 TFs as the response
systems to drought and salt, respectively, as detailed in [106]S2 and
[107]S3 Figs. All the 143 and 110 TFs exhibit distinct expressions for
two ecotypes across each of the six and eight time-points under the
drought and salt stress, respectively, indicating different cellular
responses at different time points for the two ecotypes under the same
stress.
Table 1. Summary of GRNs of switchgrass in response to drought and salt
stresses, respectively.
Stress Processing #TFs #Targets #Interactions #Positive interactions
#Negative interactions
Drought PCC correlation >0.70 147 2,262 37,896 33,548 4,348
NCA prediction 143 1,844 21,239 18,149 3,090
PCC correlation ≥ 0.75 143 1,693 11,784 10,806 978
Salt PCC correlation >0.70 125 1,893 16,350 14,278 2,072
NCA prediction 110 1,831 14,956 12,906 2,050
PCC correlation ≥ 0.75 110 1,535 8,411 7,577 834
[108]Open in a new tab
Positive and negative interactions represent those with PCCs > 0 and <
0, respectively.
We have next conducted functional analyses of the predicted regulatory
relationships. First, we have applied a higher PCC cutoff 0.75 ([109]S4
Fig) to the predicted regulatory relationships so we can focus on the
most reliable predictions. This gives rise to 11,784 and 8,411
high-confident TF-TG interactions ([110]S3 File) in the combined lists
of the two ecotypes for drought and salt stresses, respectively.
[111]Table 1 summarizes these positive (activation) and negative
(repression) regulatory relationships. From the structure of partial
drought and salt networks ([112]Fig 2, [113]S5 Fig) generated by using
the igraph package (version 1.2.1) in R [[114]57, [115]58], we note
that some TF-TGs primarily occur in the GRN only for one ecotype while
others in both ecotypes.
Fig 2. The predicted GRNs (partial) for 15 TFs with the highest numbers of
TGs of switchgrass under drought stress.
[116]Fig 2
[117]Open in a new tab
TFs and TGs are represented as squares and circles, respectively. Genes
regarded as stress affected DEGs for both ecotypes, only in Alamo and
Dacotah are marked in pink, red and blue, respectively.
Validation of predicted regulatory interactions
We have validated the predicted TF-TG relationships using two
independent data sources: (1) TF-TG relationships in Arabidopsis, maize
and rice as such relationships tend to be conserved across related
plants, and (2) TGs regulated by the same TF should have highly related
or similar functions.
First, we have compared the predicted TF-TG relationships with known
TF-TG pairs in Arabidopsis, maize and rice (see [118]Methods).
Specifically, we have mapped 503 and 408 predicted regulatory pairs of
switchgrass to the orthologous TF-TG pairs in Arabidopsis, maize or
rice for drought and salt network, respectively, hence considered them
as validated as detailed in [119]Table 2 and [120]S4 File.
Table 2. Validation of the predicted TF-TG interactions by orthologous pairs
in Arabidopsis, rice and maize, respectively.
Species Drought Salt Reference
Arabidopsis 66 (38) 77 (37) [[121]24]
182 (15) 45 (14) [[122]21]
4 (3) 6 (4) [[123]25]
202 (48) 261 (47) [[124]26]
Rice 0 (0) 4 (1) [[125]23]
Maize 101 (34) 72 (29) [[126]22]
Total 503 (79) 408 (70) N/A
[127]Open in a new tab
The number inside each pair of parentheses represents the number of TFs
involved in the corresponding interactions.
We have then checked the TGs regulated by the same TFs in terms of
their functional relevance measured using the GO Biological Processes
(see [128]Methods). Among the 11,784 predicted TF-TG interactions,
5,212 interactions between 139 TFs and 815 TGs in the drought response
GRNs have functional relevance (using cutoff 0.3, Panel a in [129]S6
Fig), based on an saturation analysis of functional similarities
[[130]59] among TGs regulated by the TF. Similarly, among the 8,411
TF-TG interactions, 3,425 interactions between 97 TFs and 696 TGs in
the salt GRNs have functional relevance. All these are considered as
validated based on functional relevance information. This clearly
provides strong support to our predicted regulatory networks. The
detailed information is summarized in [131]S5 File.
Overall, 5,434 (46.1%) and 3,657 (43.5%) TF-TG interactions in the
predicted GRNs in response to drought and salt, respectively, are
considered as validated by the above two approaches. Although false
positive interactions may inevitably be present in our prediction, our
benchmark analysis (Panel b in [132]S6 Fig) demonstrates that the
average GO annotation similarities of drought and salt network are
significantly higher than that of a random gene interaction network
(p-value 2.2e^−16).
Functional analyses of the predicted stress-response networks
Massive responses are observed for each ecotype to each stress type
(detailed in [133]S2 and [134]S3 Tables). To put all the response
activities into a more logical organization, we have applied the DREM
program (see [135]Methods) to the predicted TF-TG interactions and the
associated fold-changes of the TGs to construct a temporal model for
how the TFs regulate the observed responses, referred to as regulatory
events, transiting from one time point to the next, represented as an
input–output hidden Markov model (IOHMM) [[136]60], which consists of
multiple chains of events, as illustrated in Figs [137]3 and [138]4.
Each chain from the start (time 0) to end (the last time point)
represents a collection of co-expressed genes, co-regulated by a common
set of TFs. A key property of the model is that it allows the formation
of bifurcation events, where a (partial) chain from the start to the
current time point can split into two when the set of co-expressed
genes up till this time point is split into two subsets of co-expressed
genes from this time point on, while the two subsets are not
co-expressed beyond the current time point. This process may continue
to give rise to multiple bifurcation points. Then, the detailed
functions of co-expressed genes for each chain of each ecotype under
each stress type are described in [139]S6 File. Overall, the
similarities and differences of response activities between two
ecotypes responding to each stress fall into the following function
categories.
Fig 3. A temporal model for chains of events in each ecotype under drought
stress.
[140]Fig 3
[141]Open in a new tab
Each chain represents a collection of co-expressed genes co-regulated
by a common set of TFs in the same collection. The y-value of the chain
at a specific x point represents the average fold-change (after long
transformation) of the expressions of genes represented by the chain at
the current point over the expression levels of the same genes at time
point 0. Each green node represents a bifurcation event in the network.
The TF families (partial) with the highest number of the TFs having
split scores below 0.001 appear next to the split that they regulate.
The GO functions for each chain are displayed in the right part and the
red, green, brown and blue colors indicate different function
categories. In each panel, an artificial node is added at 0h so each
gene has a “fold change” at time point 0. The panel in the top left
corner gives an expanded view of a chain of co-expressed events
consisting of one bifurcation event into three sub-chains, being red,
pink and blue, respectively.
Fig 4. A temporal model for chains of events in each ecotype under salt
stress.
[142]Fig 4
[143]Open in a new tab
Each chain represents a collection of co-expressed genes co-regulated
by a common set of TFs. The y-value of the chain at a specific x
represents the average fold-change (after long transformation) of the
expressions of genes represented by the chain at the current point over
the expressions of the same genes at time point 0. Each green node
represents a bifurcation node. The TF families (partial) with the
highest number of TFs with split scores < 0.001 appear next to the
split that they regulate. The GO functions for each chain are displayed
in the right part and different colors represent different function
categories.
The following summarizes our main observations about how each ecotype
responds to drought ([144]Fig 3 and [145]S2 Table) and how these
responses may influence various metabolisms [[146]11]:
1. TFs in the families of bZIP, NAC, MYB_related, bHLH and WRKY, which
may play roles in the biosynthesis of osmoprotectants such as
proline, choline, trehalose, mannitol, glutamate and
raffinose-family oligosaccharide, to keep membrane stability under
osmotic stress [[147]61–[148]63], are up-regulated in both
ecotypes. The main difference between the two ecotypes is: there
seems to be more accumulation of the osmoprotectants (like
trehalose) in Alamo than in Dacotah, based on the higher expression
levels of the relevant biosynthesis genes in Alamo than those in
Dacotah, which is consistent with the previous switchgrass
metabolic studies under drought stress [[149]11], hence suggesting
that the increased levels of these metabolites play a role in
Alamo’s drought tolerance, as shown in [150]Fig 3, marked in red;
2. Alamo uses a different mechanism to adapt to drought through
utilizing chloro-respiration [[151]64, [152]65] to prevent damages
to the photosynthetic apparatus, which seems to be one of the
mechanisms for Alamo to maintain higher P[r] levels than Dacotah;
and both ecotypes seem to use stomatal movement to avoid excessive
water loss through transpiration [[153]66], but fold-changes of
relevant genes are slightly different, which may be one reason for
why Alamo has higher RWC than Dacotah, as shown in [154]Fig 3,
marked in green;
3. Both Alamo and Dacotah are predicted use the following processes to
respond to drought: biosynthesis of jasmonic acid [[155]67,
[156]68], S-adenosylmethionine [[157]69], a precursor for the
biosynthesis of polyamines known to be involved in response to
abiotic stresses [[158]70], which has shown that Almao has higher
levels of spermine (one of polyamine) than Dacotah, and conversion
of arginine to glutamate [[159]71]; but fold-changes of the
relevant genes are significantly different between the two
ecotypes, as shown in [160]Fig 3, marked in blue; and
4. Alamo and Dacotah seem to use similar mechanisms to maintain their
sodium-ion homeostasis, particularly at the early stage (0d-24d),
and potassium transport for maintaining high K^+/Na^+ ratios and
proton-ATPase for sequestering Na^+ into the vacuole [[161]14,
[162]72], where a previous study has shown transgenic switchgrass
with overexpressed proton-ATPase can enhance growth and salinity
tolerance than the wild-type [[163]73]. But the expression levels
of the relevant genes are substantially higher in Alamo than in
Dacotah. And at a later stage (30d), Alamo expresses higher levels
of the potassium-transporter genes than the proton-ATPase
biosynthesis genes while Dacotah has these genes expressed with the
same fold-changes for the two processes, as shown in [164]Fig 3,
marked in brown.
The following summarizes our key observations about how each ecotype
may respond to the salt stress ([165]Fig 4, [166]S3 Table):
1. Higher levels of accumulation of osmoprotectants such as proline
are predicted in Alamo than in Dacotah, estimated based on the fold
changes of genes involved in biosynthesis vs. degradation in the
two ecotypes, as shown in [167]Fig 4, marked red;
2. Both Alamo and Dacotah are predicted to have employed the same
mechanism to adapt to salt through synthesis of phosphatidylcholine
(PC), the major phospholipid in eukaryotic cell membranes known to
play a function in the adaptation to salt-mediated hyperosmotic
stress [[168]74]. Again more accumulations of PC are predicted in
Alamo than in Dacotah based on expression comparisons of genes
involved in biosynthesis vs. degradation of the PC and higher
expression levels of genes in the assembly and stabilization of
photosystem II [[169]53], as shown in [170]Fig 4, marked in green;
3. Alamo seems to have stronger salt tolerance than Dacotah, based on
our observation that the former has higher expression levels of
genes involved in nitric oxide biosynthesis [[171]72, [172]75] and
the biosynthesis of chlorophyll (Chl), which is consistent with a
previous study that Alamo has higher Chl content than Dacotah under
salt stress [[173]10], as shown in [174]Fig 4, marked in blue,
increases in whose cellular concentrations are known indicators of
salt tolerance in crops [[175]76]; and
4. Alamo is predicted to use a different mechanism to reduce oxidative
stress through biosynthesis of phytyl diphosphate in the medium
stage (6-12d), the prenyl donor for tocopherol biosynthesis
[[176]77], as detailed in [177]Fig 4, marked in brown.
Overall, Alamo uses better regulations to respond and adapt to the two
stress types than Dacotah.
Discussion
Network-based analyses of stress response over time
Plant stress responses, particularly to drought and salt, are dynamic
and involve complex cross-talks among multiple metabolic processes and
regulations, affecting many aspects of a cellular system, such as
osmatic pressure, oxidative stress, ion homeostasis and homeostasis of
water supply and demand [[178]78]. Hence, thousands of genes can be
differentially expressed once a plant is under drought or salt stress.
Traditional gene-centric approaches, such as simple pathway-enrichment
analysis, can provide useful information regarding which pathways are
differentially expressed, the relationships among such pathways could
be very complex to sort out [[179]59]. In this study, we have used a
network-based method, focused on key TFs and metabolic TGs. Through
organizing the observed events, reflected by coordinated
gene-expression changes, as time-dependent Markov models, we have
derived a system-level new understanding about how different metabolic
processes are linked to a stress and with each other when responding
and adapting to a stress. By combining the regulatory networks with the
temporal Markov model, we have not only predicted novel information
about which transcription regulators are possibly responsible for which
parts of the metabolic reprogramming when the plant is stressed, but
also gained new understanding about the relative order in executing
these reprogrammed metabolisms.
From the detailed analyses of the time-dependent chain events, we can
see that (1) similarities as well as differences in terms of when
specific metabolic responses, suggested by gene-expression data, to the
same stress type take place in the two ecotypes, such as proline or
choline taking place approximately at the same time while photosystem
II assembly and stabilization happening at different times (Dacotah at
6d and Alamo at 6d-12d, but higher expression levels in Alamo) in the
two ecotypes responding to salt; (2) similarities as well as
differences between the durations of specific metabolic responses,
again reflected by gene-expression data, in the two ecotypes, such as
mannitol or trehalose biosynthesis and chloro-respiration responding to
drought; (3) responding to salt stress, there is one special time point
6d, a larger decrease in proline and choline biosynthesis in Dacotah
than Alamo, a larger increase in the expressions of genes in Chl and
phytyl diphosphate biosynthesis in Alamo rather than Dacotah.
Future work
In the future, we see a few ways to extend our work. One is to design a
unified mathematic model to capture all the key time-dependent
regulatory and metabolic roles by all genes found to be differentially
expressed, which can answer questions through automated analyses of the
models such as: (1) what the main and detailed differences between the
regulatory systems in response to salt or drought in the two ecotypes
are; and (2) what the shared parts between the response systems to
drought and salt stresses are. Another area that we plan to do is to
enhance our validation procedures for the predicted TF-TG relationships
by using more general and sophisticated tools including ChIP-seq data
from related plants such as Arabidopsis [[180]20] to compare our
predictions against.
Conclusions
Regulatory networks responding to drought and salt stresses are
predicted for two ecotypes of switchgrass, Alamo and Dacotah, based on
publicly available gene-expression data of the two ecotypes treated
with drought and salt stresses. 40+% of our predicted regulatory
relationships are validated against independent data sources. By taking
advantage of the time-dependent expression data, we have also
constructed time-dependent hidden Markov models to connect the key
observed biological activities as reflected by gene-expression data
into multiple chains of time-specific events, hence providing a dynamic
picture of each of the stress-response processes under study.
Supporting information
S1 Fig. A summary of the differentially expressed TFs for each ecotype
under each stress type.
(a) Overlapping statistics of differentially expressed TFs. (b) The
distribution of TF families for differentially expressed TFs. Note:
“Alamo_salt” and “Dacotah_salt” represents the ecotype Alamo and
Dacotah under salt stress, respectively; “Alamo_drought” and
“Dacotah_drought” represents the ecotype Alamo and Dacotah under
drought stress, respectively; and “common” represents the intersections
of both ecotypes under both stress types.
(DOCX)
[181]Click here for additional data file.^ (402.9KB, docx)
S2 Fig. A heat-map of predicted TFA profiles for 143 drought-responding
TFs.
Hierarchical clustering of the 143 TFs performed using PCC similarities
with average linkage method. Rows are for TFs, and columns for samples.
Higher and lower levels of activities are indicated with yellow and
blue color, respectively.
(DOCX)
[182]Click here for additional data file.^ (302.2KB, docx)
S3 Fig. A heat-map of predicted TFAs of 110 salt-responding TFs.
Hierarchical clustering of the 110 TFs performed using PCC similarities
with average linkage method. Rows are for TFs, and columns for samples.
Higher and lower levels of activities are indicated with yellow and
blue color, respectively.
(DOCX)
[183]Click here for additional data file.^ (251.9KB, docx)
S4 Fig. Distributions of correlation coefficients for TF-TG pairs
predicted by the NCA algorithm, for drought and salt responding
networks, respectively.
(DOCX)
[184]Click here for additional data file.^ (24.1KB, docx)
S5 Fig. An inferred GRN for ten TFs with the highest numbers of TGs
under salt stress.
TFs and TGs are represented as squares and circles, respectively. Genes
regarded as stress affected DEGs for both ecotypes, only in Alamo and
Dacotah are marked in pink, red and blue, respectively.
(DOCX)
[185]Click here for additional data file.^ (150.3KB, docx)
S6 Fig. Pair-wise function similarity of co-regulated genes for each
TF.
(a) Cutoff of gene-pairs similarity based on saturation analysis. (b)
Comparison the GO term annotation similarities of pair-wise
co-regulated genes for each TF, for drought and salt network, and the
corresponding random network, respectively. Note: the pair-wise
function similarities for drought and salt network are significantly
higher than random generate network, with both p-value less than
2.2e^−16, determined by Wilcoxon rank-sum test.
(DOCX)
[186]Click here for additional data file.^ (392.4KB, docx)
S1 Table. A summary of samples used in this study.
(DOCX)
[187]Click here for additional data file.^ (13.7KB, docx)
S2 Table. A summary of key up-regulated biological processes for each
ecotype responding to drought stress.
(DOCX)
[188]Click here for additional data file.^ (14.3KB, docx)
S3 Table. A summary of key up-regulated biological processes for each
ecotype responding to salt stress.
(DOCX)
[189]Click here for additional data file.^ (13.1KB, docx)
S1 File. The DEGs for each ecotype at each time point for each stress.
(XLSX)
[190]Click here for additional data file.^ (1.4MB, xlsx)
S2 File. The differentially expressed TFs in each ecotype under each
stress type.
(XLSX)
[191]Click here for additional data file.^ (30.6KB, xlsx)
S3 File. The 11,784 and 8,411 predicted TF-TG interactions for drought
and salt regulated networks.
(XLSX)
[192]Click here for additional data file.^ (256.3KB, xlsx)
S4 File. The 503 and 408 validated TF-TG interactions by three related
species for drought and salt networks.
(XLSX)
[193]Click here for additional data file.^ (28.7KB, xlsx)
S5 File. The 5,212 and 3,425 validated TF-TG interactions based on
functional relevance of co-regulated genes, for drought and salt
networks.
(XLSX)
[194]Click here for additional data file.^ (116.1KB, xlsx)
S6 File. The TFs, genes and its corresponding GO functions for each
chain of each ecotype under each stress type.
(XLSX)
[195]Click here for additional data file.^ (2.3MB, xlsx)
Acknowledgments