Abstract
Abstract. Maize may be exposed to several abiotic stresses in the
field. Therefore, identifying the tolerance mechanisms of natural field
stress is mandatory. Gene expression data of maize upon abiotic stress
were collected, and 560 differentially expressed genes (DEGs) were
identified through meta-analysis. The most significant gene ontology
terms in up-regulated genes were ‘response to abiotic stress’ and
‘chitinase activity’. ‘Phosphorelay signal transduction system’ was the
most significant enriched biological process in down-regulated DEGs.
The co-expression analysis unveiled seven modules of DEGs, with a
notable positive correlation between the modules and abiotic stress.
Furthermore, the statistical significance was strikingly high for the
turquoise, green and yellow modules. The turquoise group played a
central role in orchestrating crucial adaptations in metabolic and
stress response pathways in maize when exposed to abiotic stress.
Within three up-regulated modules, Zm.7361.1.A1_at, Zm.10386.1.A1_a_at
and Zm.10151.1.A1_at emerged as hub genes. These genes might introduce
novel candidates implicated in stress tolerance mechanisms, warranting
further comprehensive investigation and research. In parallel, the R
package glmnet was applied to fit a logistic LASSO regression model on
the DEGs profile to select candidate genes associated with abiotic
responses in maize. The identified hub genes and LASSO regression genes
were validated on an independent microarray dataset. Additionally,
Differential Gene Correlation Analysis (DGCA) was performed on LASSO
and hub genes to investigate the gene-gene regulatory relationship. The
P value of DGCA of 16 pairwise gene comparisons was lower than 0.01,
indicating a gene–gene significant change in correlation between
control and abiotic stress. Integrated weighted gene correlation
network analysis and logistic LASSO analysis revealed Zm.11185.1.S1_at,
Zm.2331.1.S1_x_at and Zm.17003.1.S1_at. Notably, these 3 genes were
identified in the 16 gene-pair comparisons. This finding highlights the
notable significance of these genes in the abiotic stress response.
Additional research into maize stress tolerance may focus on these
three genes.
Keywords: Acid soils, drought, mineral deficiency, waterlogging, WGCNA
__________________________________________________________________
Recognizing that crops often face multiple stresses simultaneously and
that adverse environmental conditions negatively impact crop
productivity, our research aimed to identify key genes that confer
tolerance to these conditions. We conducted a detailed microarray
meta-analysis and identified 560 differentially expressed genes (DEGs)
in maize under abiotic stress. We found that the most significant gene
ontology terms among these DEGs were related to stress response and
chitinase activity, particularly emphasizing the phosphorelay signal
transduction system. Our co-expression analysis revealed seven gene
modules strongly correlated with various stress conditions. Employing
logistic LASSO regression and differential gene correlation analysis,
we confirmed our results and pinpointed three genes, Zm.11185.1.S1_at
(stress-associated endoplasmic reticulum protein 2-like),
Zm.2331.1.S1_x_at (Late Embryogenesis Abundant Protein 41) and
Zm.17003.1.S1_at (transcription factor HHO5), as crucial in the abiotic
stress response. These genes are potential targets for further research
and could be key in developing stress-tolerant maize varieties.
Introduction
Maize (Zea mays), an important cereal, is widely used for human food,
animal feed and biofuel production ([26]Jin et al. 2019). Like other
crops, maize production is threatened by environmental stresses,
including drought ([27]Žalud et al. 2017), mineral deficiency
([28]Mallikarjuna et al. 2020), acid soils ([29]Mattiello et al. 2010)
and waterlogging ([30]Thirunavukkarasu et al. 2013). Abiotic stress
hardly affects the morphological and biochemical traits of plants,
ultimately reducing plant yield severely ([31]Wang et al. 2003). Plants
may confront more than one stress coincidently in the field, which
leads to accomplishing homeostasis by evolving several mechanisms.
Demonstrating the tolerance mechanisms of naturally filed stress is
necessary to accelerate plant adaptation and subsequent growth and
yield ([32]Sharma et al. 2018).
A substantial amount of transcriptomic data has been generated where
the plants are subjected to several abiotic stresses. Profiles of the
responsive genes expressed under single stress may not explain the
molecular mechanisms and the differentially expressed genes (DEGs)
under various stress conditions. Additionally, this type of study can
provide information associated with conserved stress mechanisms.
Of the advantages that transcriptomic data in multiple stresses could
provide a meta-analysis has received much attention ([33]Amrine et al.
2015; [34]Muthuramalingam et al. 2017). There are extensive gene
expression datasets via microarray technology in maize, a well-studied
model plant. Accordingly, meta-analysis can be done by merging distinct
microarray gene expressions. It is beneficial to understand the
pathways as well as DEGs obtained under various stress conditions.
Unraveling the stable co-expression networks across distinct
experiments can be achieved from plant transcriptome data ([35]Shaik
and Ramakrishna 2013; [36]Takehisa et al. 2015).
Co-expression network analysis as a tool for microarray analysis is
increasingly applied to explore gene clusters demonstrating similar
transcription-correlated patterns as well as gene functionality
([37]Langfelder and Horvath 2008). In a weighted gene co-expression
network (WGCNA), genes are represented by the nodes, and the
connections between the nodes are based on the co-expression measure
([38]Zhang and Horvath 2005). In parallel, we explored a diagnostic
gene signature related to abiotic stress using logistic regression and
least absolute shrinkage and selection operator (LASSO) analysis. LASSO
is a powerful feature selection method, along with regularization, that
can enhance the prediction accuracy of the statistical model as well as
its interpretability due to shrieked variables ([39]Tibshirani 1996;
[40]Xiong et al. 2019). The regulatory gene–gene relationship in
biological systems can be investigated via differential gene
correlation (DGC) analysis (DGCA) ([41]McKenzie et al. 2016). DGCA
based on the calculation of the z score is notably better than the
alternative method for the calculation of differential correlation.
Here, we conducted a meta-analysis on maize microarray transcriptomic
datasets of abiotic stress. Additionally, gene ontology (GO) enrichment
analysis was implemented to find enriched MFs, BPs and cellular
components (CCs). After that, LASSO regression and a weighted gene
co-expression network analysis (WGCNA) were constructed for the DEGs to
elucidate potential candidate genes, modules and key pathways related
to the abiotic stress response. As well, the gene–gene regulatory
system was conducted on candidate genes selected through LASSO and
WGCNA. Altogether, the present analysis demonstrates a common mechanism
involved in abiotic stress in maize.
Materials and Methods
Data collection, preprocessing and DEG finding
Data were downloaded from the NCBI Gene Expression Omnibus (microarray,
[42]GPL4032) Archive, containing 17734 probe sets in February 2022. To
collect and filter the data, the keyword ‘maize’ was used under the
platform [43]GPL4032, and the study type set to ‘Expression profiling
by array’ resulted in 19 datasets in CEL format, of which 5 datasets
corresponded to abiotic stress ([44]Table 1). The expression data
within each dataset underwent normalization and background correction
using Robust Multichip Average (RMA) algorithm in the Affy R package
([45]Irizarry et al. 2003). After preprocessing, the datasets were
merged, and batch effects among multiple datasets were removed using
the SVA R package ([46]Leek et al. 2012), and the ComBat function,
which relies on an empirical Bayes method. DEGs were determined by the
Limma package. The FDR (False Discovery Rate) cut-off value was
considered 0.01 to calculate the DEGs between the control and abiotic
stress conditions.
Table 1.
Characteristics of data sets selected for meta-analysis including
accession number, stress type, genotypes, platform, control samples,
treated samples, treated sample conditions, time course, tissue type,
total number of samples, publication date and reference
Reference Public on Total number of samples tissue Time course treated
sample conditions Treated samples Control samples Type of used
genotypes (name of genotypes) Platform Stress type Accession number
[47]Mallikarjuna et al. (2020) 04/01/2021 12 Shoot 12 days after
treatment Hydroponic solution lacking Fe, Zn, and Fe+Zn 9 3 None
Affymetrix Maize Genome Array Fe+Zn [48]GSE122581
[49]Thirunavukkarasu et al. (2013) 14/08/2013 16 Root 32, 35 and 42
days after sowing Waterlogging 4 4 HK1105 (tolerant),
V-372 (susceptible) Affymetrix Maize Genome Array Moderate and severe
stress, post-stress recovery [50]GSE43088
Mattiello et al. (2014) 01/12/2010 12 Leave 3 days Acid soil
(aluminium) 6 6 Cat100-6
(Al-tolerant)
S1587-17
(Al-sensitive) Affymetrix Maize Genome Array Acid soil [51]GSE22479
[52]Mattiello et al. (2010) 20/09/2010 24 Root 1 and 3 days Acid soil
(aluminium) 12 12 Cat100-6
(Al-tolerant)
S1587-17
(Al-sensitive) Affymetrix Maize Genome Array Acid soil [53]GSE21070
([54]Zheng et al., 2009) 03/06/2010 24 Shoot Three-leaf stage seedling
Moderate and severe drought stress, re-watering 18 6 Han21
(drought-tolerant)
Ye478
(drought-sensitive) Affymetrix Maize Genome Array Drought stress
[55]GSE16567
[56]Open in a new tab
GO enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway
analysis
The DAVID web tool ([57]http://david.abcc.ncifcrf.gov/) was used to
implement the GO of DEGs. The significant threshold for the GO terms
was considered to be a P value < 0.1. The important enriched pathways
of DEGs were pinpointed based on the KEGG database.
Co-expression gene network and hub gene identification
To discover the modules, a WGCNA network was constructed for the DEGs
using the WGCNA package. For this end, expression data of 560 DEG were
used as input [Supporting Information—[58]Supplementary file; sheet
S1]. In short, using a Pearson correlation, a similarity matrix [Sij =
|0.5 + 0.5 × cor (xi, xj)|] was formed, and then, utilizing a β of 15
as a soft-thresholding power, it was converted into an adjacency matrix
[A[ij] = (|0.5 + 0.5 × cor (xi, xj)|)β]. Afterward, a topological
overlap similarity measure (TOM) was developed out of the adjacency
matrix, out of which modules were obtained using the dynamic tree cut
algorithm ([59]Langfelder and Horvath 2008) with a deep split level of
2, a height of 0.2 and a minimum module size of 15.
Hub module genes were identified based on the connectivity score using
the ChooseTopHubInEachModule function in the WGCNA package.
GO and pathway enrichment analysis in each module
The KEGG and GO enrichment analyses were carried out on all modules to
specify the function modules. For this, all probe IDs in each module
were uploaded to the DAVID database ([60]https://david.ncifcrf.gov/).
The GO and pathway terms were assumed to be significant when the P
value was <0.1.
Gene selection through the LASSO
To improve variable selection, Tibshirani ([61]Tibshirani 1996)
developed LASSO as a combination of ridge regression. In this method, a
subset of informative features is selected by the regression
coefficients shrinking to zero ([62]Tibshirani 2013). To select
candidate genes reliably associated with abiotic tolerance in maize,
the R package glmnet (Version 4.1.4) ([63]Hastie et al. 2021) was
applied for a logistic LASSO regression model fitting on the DEGs
profile. We performed 10-fold cross-validation for tuning parameter
selection.
Differential Gene Correlation Analysis (DGCA)
The regulatory relationship between genes was analysed through
differential correlations between gene pairs in control and abiotic
stress using the DGCA package in Rstudio. In this method, correlation
coefficients are transformed to z scores, and differences in z scores
are used to calculate P values corresponding to DGC ([64]McKenzie et
al. 2016). The Fisher z-transformation was calculated using the
following formula for normalization:
[MATH: z=atanh(r) =12lo
ge(
1+r1
−r) :MATH]
where r is the correlation coefficient of the sample, log[e] stands for
the natural logarithm function and atanh is the arc-tangent hyperbolic
function. The z-score variance may depend on the correlation type
whether it is Pearson correlation (r[p]) or Spearman correlation (r[s])
([65]Fieller et al. 1957). The variance of a normalized distribution
can be calculated by the formula where n is the sample size of the
correlation:
[MATH: var(rp
mi>) =1n−3
:MATH]
or
[MATH: var(rs
mi>) =1.06n−3
:MATH]
.
Afterward, the difference in z-scores (dz) between the control and
abiotic stress conditions was obtained by:
[MATH: dz=(z1
mn>−z2)|Sz12− Sz22|,
:MATH]
where
[MATH: Sz
mi>12<
/mrow> :MATH]
and
[MATH: Sz
mi>22<
/mrow> :MATH]
are the variance of the z-score in the control and abiotic stress
conditions. Using dz, a two-sided P value was calculated for the
standard normal distribution, and gene pairs were ranked according to
their differential correlation values.
Internal and external validations of hub and LASSO genes
The study employed a data set containing 88 samples, each with 24
attributes (genes), including those selected by LASSO (21 genes) and
hubs (3 genes). Leave-One-Out Cross-Validation was conducted on the
Weka platform. The linear regression algorithm was utilized with the
following parameter settings: a slope parameter (S) of 0 and a ridge
parameter (R) of 1.0E−8. Following model training, an internal
validation was conducted to assess the efficacy of the model on the
test set. The correlation coefficient, the mean absolute error, the
root mean square error, the relative absolute error and the root
relative square error were calculated. To evaluate the model’s efficacy
in applying to new data, an independent dataset was used. The external
validation dataset ([66]GSE122581) included 12 samples; each was
distinguished by 24 attributes. A Venn diagram was created to identify
shared genes selected by LASSO, hub genes and modules ([67]Heberle et
al. 2015).
Results
DEGs screening
Before the analysis of DEGs, background correction and normalization
were done. The similarity in the median expression level of the
different arrays indicated the proper performance of the correction.
One major concern of such comparative gene expression studies is
comparing different microarray samples derived from different studies
under different conditions. So, we corrected batch effects. Batch
effects correction causes other parameters, such as developmental
stage, to be eliminated, and the comparison is just between stress
conditions and non-stress conditions. Then, the distribution of the
samples before and after batch effect correction was controlled by a
PCA plot ([68]Fig. 1). The results of normalized gene expression levels
between before and after batch effect correction of microarray data are
presented in [69]Fig. 2. Analysis of DEGs resulted in a total of 560
DEGs, including 291 up-regulated and 269 down-regulated genes in
abiotic stress compared to normal conditions. The list of DEGs along
with analysis parameters is indicated in the [70]Supporting
Information—[71]supplementary file; sheet S2.
Figure 1.
[72]Figure 1.
[73]Open in a new tab
Panel A illustrates the PCA plot of samples prior to batch effect
removal, while panel B demonstrates the PCA plot of samples after batch
effect removal.
Figure 2.
[74]Figure 2.
[75]Open in a new tab
The box plot of gene expression values of different datasets
([76]GSE122581, [77]GSE43088, [78]GSE22479, [79]GSE21070 and
[80]GSE16567) in maize subjected to various abiotic stresses (A) before
the batch effect correction and (B) after batch effect removal.
KEGG and GO enrichment analysis
To further determine and classify the role of common DEGs under abiotic
stress in maize, we performed GO and KEGG pathway enrichment analysis
on the up-regulated and down-regulated DEGs separately ([81]Fig. 3). Go
enrichment of up-regulated genes revealed ‘response to salt stress’,
followed by ‘glutathione metabolic process’, ‘response to cold’ and
‘response to abscisic acid’ as the prominent BPs ([82]Fig. 3). The
significantly enriched GO term among up-regulated DEGs in terms of the
CC was ‘extracellular region’. As well, ‘chitinase activity’, ‘protein
self-association’ and ‘pyridoxal phosphate binding’ were distinguished
as top-rank in the category of MF ([83]Fig. 3). ‘Metabolic pathways’
were the most significant pathways enriched by the most up-regulated
DEGs ([84]Fig. 3).
Figure 3.
[85]Figure 3.
[86]Open in a new tab
KEGG and GO analysis of DEGs in maize under abiotic stress conditions.
GO analysis of DEGs. BP, MF and (CC) cellular component.
‘Phosphorelay signal transduction system’, ‘protein dephosphorylation’
and ‘tryptophan biosynthetic process’ were identified as the top GO
terms for BPs of the down-regulated genes ([87]Fig. 3). Apart from the
BP, MF enrichment analysis detected ‘protein serine/threonine
phosphatase activity’, ‘metal ion binding’ and ‘transmembrane
transporter activity’ as the top-rank GO terms of the down-regulated
genes. Moreover, GO cell component analysis revealed ‘cytoplasm’,
‘P-body’ and ‘chloroplast inner membrane’ as the top-rank GO in terms
of the down-regulated genes. The most significant pathway enriched by
the most down-regulated DEGs was ‘plant hormone signal transduction’
([88]Fig. 3).
Gene co-expression network analysis
The normalized, background-corrected and batch effect-removed dataset
of DEGs containing 88 samples (31 controls and 57 stresses) was used
for weighted gene correlation network analysis (WGCNA). Genes were
classified into different modules using the WGCNA package. [89]Fig. 4A
illustrates the dendrogram and abiotic stress status heatmap. To
warrant high-scale independence (about 0.8) and low mean connectivity
(about 0), the value of β was set equal to 15 ([90]Fig. 4B). The value
of dissimilarity between the modules was equal to 0.2, and a total of
seven modules were obtained ([91]Fig. 4C). The module sizes span a
range from 26 to 278 genes per module, encompassing diverse functional
gene clusters. Specifically, these modules are denoted as green (n =
26), yellow (n = 30), red (n = 34), brown (n = 44), blue (n = 61),
turquoise (n = 98) and grey (n = 278) ([92]Fig. 4; [93]Supporting
Information—[94]supplementary file; sheet S3). The visual
representation of the module-trait relationship is depicted in [95]Fig.
4C, which offers a comprehensive overview of the relationship between
these modules and abiotic stress. Among these modules, four showed a
clear negative connection with abiotic stress, while three modules had
a noticeable positive relationship. The module trait relationship was
highly positively correlated (0.52, 0.53 and 0.49) with low P values
(2e−07, 1e−07 and 2e−06) in the turquoise, green and yellow modules,
respectively. Therefore, these modules can be used for GO enrichment
and to identify the hub genes pertinent to the abiotic stress response.
Figure 4.
[96]Figure 4.
[97]Open in a new tab
Hierarchical cluster tree of the DEGs in maize under abiotic stress
conditions (A). The colour bands and branch patterns depict the
designated module. The branch ends stand for genes. The scale-free fit
index and mean connectivity of WGCNA (B). Heatmap of the relationship
between modules and traits using the WGCNA package (C). Each cell
displays the correlation score and P value. Positive correlations are
represented by red colour, while negative correlations are indicated by
blue colour.
Hub genes were identified as genes with the highest connectivity within
the up-regulated module. Among these, Zm.7361.1.A1_at ([98]AY108454.1;
MTD1) was identified as a hub gene within the turquoise module.
Additionally, genes like Zm.10386.1.A1_a_at ([99]BU051031;
uncharacterized LOC100274659) in the green module and Zm.10151.1.A1_at
([100]CK986091; PEBP; phosphatidylethanolamine-binding protein family)
in the yellow module were also designated as hub genes.
GO and pathway annotation of co-expressed modules
The turquoise module exhibited enrichment of BP terms including
‘response to cold’, ‘response to salt stress’, ‘glutathione metabolic
process’, ‘l-phenylalanine catabolic process’, ‘ER-associated misfolded
protein catabolic process’ and ‘fatty acid beta-oxidation’. Within the
MF terms, enrichment encompassed ‘chitinase activity’, ‘glutathione
transferase activity’, ‘flavin adenine dinucleotide binding’ and
‘electron carrier activity’. In the context of CC terms, the turquoise
module showed enrichment in ‘extracellular region’, ‘prefoldin
complex’, ‘cytoplasm’ and ‘mitochondrial intermembrane space’.
The pathways enriched in the turquoise module encompassed ‘metabolic
pathways’, ‘propanoate metabolism’, ‘tyrosine metabolism’,
‘beta-alanine metabolism’, ‘fatty acid degradation’, ‘amino sugar and
nucleotide sugar metabolism’, ‘fatty acid metabolism’, ‘biosynthesis of
secondary metabolites’ and ‘carbon metabolism’ [[101]Supporting
Information—[102]supplementary file; sheet S4].
The yellow module did not exhibit any significant enrichment in BP and
MF terms. The only enriched CC term in the yellow module is ‘cytosol’.
Additionally, within the yellow module, the pathways enriched include
‘Plant hormone signal transduction’ and ‘2-Oxocarboxylic acid
metabolism’ [[103]Supporting Information—[104]supplementary file; sheet
S4].
The green module did not display significant enrichment in terms of BP,
CC or pathways. Within the green module, the enriched MF term pertains
to ‘ubiquitin binding’ [[105]Supporting Information—[106]supplementary
file; sheet S4].
LASSO gene selection
A total of 560 DEGs were selected between control and abiotic stress
conditions to fit a LASSO regression model ([107]Fig. 5A). In the next
step, we found the most appropriate values for λ = 0.03858 using
10-fold cross-validation ([108]Fig. 5B). Finally, 21 genes with
non-zero coefficients were identified in abiotic stress in maize
([109]Fig. 5A and [110]B; [111]Supporting
Information—[112]supplementary file, S5].
Figure 5.
[113]Figure 5.
[114]Open in a new tab
Feature selection in maize in different abiotic stress conditions using
the logistic LASSO regression model by 10-fold cross-validation at
lambda.1se. (A) The path of variable coefficient against the L1-norm of
the whole coefficient vector as λ varies with the number of non-zero
coefficients represented on the axis above (B) LASSO coefficients of 21
significant genes in abiotic stress in maize (vertical lines related to
lambda.1se).
Differential Gene Correlation Analysis
To establish a proper prediction model for biological systems, analysis
of the correlation differences between gene pairs in two different
conditions is a solution to investigate gene–gene regulatory
relationships. In the present work, we performed a pairwise analysis in
maize under control and abiotic stress conditions. We considered the
difference in correlation between every two genes from a collection of
genes selected by LASSO regression and those selected as hub genes (24
total), resulting in 276 pairwise comparisons. The P value of
differential gene correlation (DGC) of 36, 16, 12 and 6 pairwise
comparisons was lower than 0.05, 0.01, 0.005 and 0.001, respectively,
indicating a gene–gene significant change in correlation between
control and abiotic stress [[115]Supporting Information—sheet S6]. For
more inquiry, we listed the top 16 different gene pairs in control and
abiotic stress conditions ([116]Table 2).
Table 2.
A pairwise analysis was conducted on maize samples under control and
abiotic stress conditions using the DGCA package to examine the
correlation differences between gene pairs. In the analysis, a total of
24 genes were included, selected through a combination of LASSO
regression and hub gene identification. A significance threshold of P
value 0.05 was applied, resulting in a total of 276 pairwise
comparisons. Among these comparisons, the top 16 gene pairs
demonstrated a heightened level of significance, with a P value of
0.01.
Gene1 Gene2 control_cor control_pVal stress_cor stress_pVal zScoreDiff
pValDiff Classes
Zm.11185.1.S1_at Zm.14951.3.S1_x_at −0.32759 0.072019 0.583284 1.92E−06
4.326496 1.52E−05 0/+
Zm.10003.1.A1_at Zm.1912.1.A1_at −0.47629 0.006757 0.442103 0.000575
4.26408 2.01E−05 −/+
Zm.10386.1.A1_a_at Zm.19036.1.S1_at 0.6025 0.000335 −0.22862 0.087173
−3.99258 6.54E−05 +/0
Zm.3237.1.S1_at Zm.684.1.S1_at 0.617164 0.000217 −0.09671 0.47423
−3.51006 0.000448 +/0
Zm.14951.3.S1_x_at Zm.7865.1.A1_at 0.425244 0.017088 −0.34339 0.008919
−3.48681 0.000489 +/−
Zm.14951.3.S1_x_at Zm.2331.1.S1_x_at −0.03113 0.867969 0.645411
5.95E−08 3.428951 0.000606 0/+
Zm.19066.1.S1_at Zm.3474.1.A1_at −0.48546 0.005634 0.218889 0.101867
3.231693 0.001231 −/0
Zm.1912.1.A1_at Zm.4551.1.A1_at 0.611225 0.000259 −0.02553 0.850492
−3.16218 0.001566 +/0
Zm.10386.1.A1_a_at Zm.684.1.S1_at 0.32041 0.078863 −0.35738 0.00635
−3.03153 0.002433 0/−
Zm.19066.1.S1_at Zm.4776.2.A1_at −0.43654 0.014079 0.232997 0.081122
3.028603 0.002457 −/0
Zm.12118.1.A1_at Zm.14951.3.S1_x_at −0.59717 0.00039 0.014355 0.915599
3.019107 0.002535 −/0
Zm.1489.1.A1_at Zm.7865.1.A1_at 0.739083 2.05E−06 0.242183 0.069515
−3.01171 0.002598 +/0
Zm.17003.1.S1_at Zm.19036.1.S1_at 0.421388 0.01823 −0.18363 0.171523
−2.72723 0.006387 +/0
Zm.1912.1.A1_at Zm.7865.1.A1_at −0.23289 0.207376 0.367679 0.004897
2.675117 0.00747 0/+
Zm.10151.1.A1_at Zm.12118.1.A1_at −0.48276 0.005946 0.093521 0.488973
2.663937 0.007723 −/0
Zm.4551.1.A1_at Zm.684.1.S1_at 0.333594 0.066657 −0.25085 0.05982
−2.59012 0.009594 0/0
[117]Open in a new tab
To further investigate these findings, we presented a list of the top
16 distinct gene pairs (P value 0.01) under both control and
waterlogging stress conditions ([118]Table 2). These differentially
correlated gene pairs were categorized into seven classes. Under the
control condition, three gene pairs showed no correlation, whereas
under abiotic stress, they exhibited a positive correlation (0/+). In
the control condition, five gene pairs displayed a positive
correlation, which was not observed under the abiotic stress condition
(+/0). A single gene pair demonstrated a negative correlation in the
control condition but shifted to a positive correlation under the
abiotic stress condition (−/+). Conversely, one gene pair manifested a
positive correlation in the control setting but presented a negative
correlation under the abiotic stress condition (+/−). In the control
condition, four gene pairs exhibited a negative correlation, which was
absent under the abiotic stress condition (−/0). Furthermore, a gene
pair showed no correlation under control conditions but displayed a
negative correlation under the abiotic stress condition (0/−).
Moreover, one gene pair displayed no correlation in both the control
and abiotic stress conditions. The first two columns in [119]Table 2
show the probe ID of paired genes, the columns third and fourth are
correlation and P value of pair genes under control, the columns fifth
and sixth are correlation and P value of pair genes under stress and
the seventh column shows the change value of Z-score which is an
indicator of the change in correlation between gene pairs.
To further scrutinize the gene–gene relationship for the key common
genes identified through LASSO and WGCNA ([120]Fig. 8), expression
correlations of these three genes were plotted in [121]Fig. 6. As well,
the expression values for all five genes in the three pair comparisons
are presented in [122]Fig. 7. A Venn diagram showing the number of
identified genes and overlap among the LASSO and WGCNA analyses is
presented in [123]Fig. 8.
Figure 8.
[124]Figure 8.
[125]Open in a new tab
Venn diagram showing the number of DEGs under abiotic stress conditions
in maize in each module.
Figure 6.
[126]Figure 6.
[127]Open in a new tab
The top five significant correlated pair genes (P < 0.001) were
identified through DGCA in maize in different abiotic stress
conditions. All gene pairs are correlated inversely in control compared
to stress condition. The x and y axes indicate the gene expression
values, and each dot represents a sample. Coloured lines and shaded
areas show the linear regression lines and their corresponding 95 %
confidence interval for each control and stress condition.
Figure 7.
[128]Figure 7.
[129]Open in a new tab
The values of individual gene expression in maize in control and
abiotic stress from three gene pair comparisons identified by DGCA
across conditions are plotted.
Internal and external validations of hub genes and genes chosen by LASSO
regression
The summary of the internal cross-validation results is as follows:
Correlation Coefficient: 0.854, Mean Absolute Error: 0.208, Root Mean
Squared Error: 0.252, Relative Absolute Error: 45.086 %, and Root
Relative Squared Error: 52.107 %. In addition, external
cross-validation was performed using an independent dataset. The
summary of the external cross-validation results indicated a higher
correlation coefficient of 0.958, suggesting a stronger positive
relationship between the predicted and actual values. The mean absolute
error was reduced to 0.105, indicating a lower average absolute
difference between the predicted and actual values. The root mean
squared error decreased to 0.128, signifying a smaller average
magnitude of the prediction errors. The relative absolute error was
improved to 25.786 %, representing a lower average absolute error as a
percentage of the average actual value. The root relative squared error
was reduced to 27.132 %. These results demonstrate the effectiveness of
the model in both internal and external validation, with improved
performance observed in the external validation.
Discussion
Meta-analysis combines data from multiple studies, resulting in an
increased sample size and statistical efficacy. Some DEGs may not
attain statistical significance in individual studies with smaller
sample sizes due to the decreased statistical power. Additionally,
variations in experimental design, tolerant or sensitive genotypes,
germplasm, treatment, time-point and other factors may exist among
studies included in a meta-analysis. These differences can induce
heterogeneity, and certain genes may exhibit differential expression in
one study but not in another, resulting in inconsistency between
studies. Notably, although some genes may be unique to individual
studies, a subset of genes identified through meta-analysis likely
intersect with genes identified in individual studies. These
overlapping genes are regarded as more robust and dependable research
candidates ([130]Ramasamy et al. 2008). Moreover, to address the issue
of heterogeneity, we have taken specific steps to ensure the
reliability of our results. Firstly, we performed normalization and
batch effect correction on the transcriptome datasets to minimize any
variations introduced during data processing. We aimed to mitigate the
influence of technical artefacts and confounding factors by applying
the rigorous normalization technique RMA algorithm. Secondly, we
carefully chose relevant and comparable genotypes to reduce
heterogeneity due to genome variations. We aimed to minimize potential
biases and improve the accuracy of their analysis by selecting
germplasms with similar genetic backgrounds and stress responses. To
assess the effectiveness of our data correction methods and to validate
removing heterogeneity, we performed principal component analysis (PCA)
on the transcriptome datasets. The PCA plot allowed us to visualize the
clustering patterns of samples and demonstrated the successful
elimination of confounding effects, including variations in
experimental design, tolerant or sensitive genotypes, germplasm,
treatment and time-point. The clear separation of samples in the PCA
plot highlights the efficiency of their data correction strategies in
addressing heterogeneity. The performed PCA, conducted both prior to
and after batch effect removal once on total genes and once on
differential genes, yielded informative results. These results robustly
indicated that the removal of batch effects significantly attenuated
their influence ([131]Fig. 1). We focussed our WGCNA analysis
specifically on DEGs. By doing so, we achieve a finer evaluation of
gene co-expression patterns linked to abiotic stress conditions.
According to GO enrichment analysis, up-regulated DEGs enriched
pathways and GO terms associated with the response to abiotic stress.
Up-regulated genes associated with various stress responses such as
‘response to salt stress’, ‘glutathione metabolic process’, ‘response
to cold’, ‘response to abscisic acid’, ‘response to hydrogen peroxide’,
‘response to heat’, ‘response to water deprivation’ and ‘metabolic
pathways’ suggest the presence of common patterns in response to
different abiotic stresses. The enrichment of various GO terms
associated with abiotic stress can be attributed to a combination of
factors, including cross-talk in stress signalling pathways,
overlapping regulatory networks, energy conservation and adaptive
advantage. To delve deeper, many stress signalling pathways are
interconnected and share common components. For example, stress factors
like reactive oxygen species (ROS) are produced as a result of multiple
stress conditions, leading to the activation of stress-responsive genes
involved in antioxidant defense ([132]Heap et al. 2020).
Furthermore, stress-responsive genes are often regulated by a complex
network of transcription factors. Some of these transcription factors
are involved in multiple stress signalling pathways, resulting in the
activation of similar sets of genes in response to different stresses
([133]Keshan and Rather 2021).
In the face of multiple stresses, the cell may prioritize energy
conservation by employing a ‘generic’ stress response that activates
common protective mechanisms. This can involve the up-regulation of
genes responsible for stress tolerance, regardless of the specific
stressor and the down-regulation of genes involved in growth and
development. This allows plants to allocate resources towards stress
tolerance and survival ([134]Tiwari et al. 2017).
In natural environments, plants often encounter multiple stresses
simultaneously. Genes that respond similarly to different stresses may
enable plants to cope with various challenges at once effectively. This
investigation facilitates the discernment of noteworthy GO terms and
prevailing trends pertaining to a range of stress conditions.
Moreover, ‘phosphorelay signal transduction system’ was the top-ranking
BP enriched by down DEGs. The observed decrease in the expression of
genes associated with the ‘phosphorelay signal transduction system’
(GO:0000160) indicates a potential inhibition of specific signalling
pathways, maybe as a means to preserve energy and allocate resources
more efficiently during periods of stress. The down-regulation of the
‘protein dephosphorylation’ and ‘intracellular signal transduction’
processes suggests the modification of intricate signalling networks.
‘phosphorelay signal transduction system’ functions as a signalling hub
that integrates signals from hormones and the environment into a single
pathway that maintains a balance among growth, development, and
adaptability to the environment ([135]Skalak et al. 2021). Cytokinin is
regarded as the major regulator of phosphorelay signal transduction
([136]Skalak et al. 2021). In this context, there is also a concomitant
decrease in ‘response to cytokinin’ which is another BP prioritized for
down DEGs in our study ([137]Fig. 3). The research ([138]Dobrá et al.
2015) also indicates that stress induction causes an immediate (less
than an hour) rise in ABA content, but a decline in endogenous
concentration of cytokinin and expression of cytokinin receptor genes.
One could assume that the response of maize to diverse abiotic stresses
may have been mediated via the control of phosphorelay signal
transduction by cytokinin and the up-regulation of BPs associated with
abiotic stress response. ‘Chitinase activity’ was distinguished as the
most significant GO term in up DEGs in the category of MF. Plants
normally have very low levels of chitinase activity; however, when
exogenous factors such as pathogens, heavy metals, growth regulators,
elicitors, chemicals and wounding impact them, the chitinase activity
rises noticeably ([139]Hong et al. 2000; [140]Yu et al. 2001; [141]Ding
et al. 2002). Enrichment of chitinase activity shows that under abiotic
stress conditions in maize, there is likely a crosstalk between abiotic
and biotic stress pathways, resulting in the activation of genes
involved in both stress responses. Rapid increases in chitinase
activity have been observed in studies of plants exposed to abiotic
stresses such as high or low temperatures, heavy metal ions, drought,
jasmonic acid and salicylic acid ([142]Davis et al. 2002). Chitinases
are known to play a pivotal role in several abiotic stress responses in
plants, encompassing phenomena such as wounding, osmotic stress, cold,
heavy metal stress and salt stress. These enzymes have a crucial role
in both defense-related systems and the improvement of abiotic stress
tolerance in plants ([143]Liu et al. 2020; [144]Vaghela et al. 2022;
[145]Poza-Viejo et al. 2023). Further information can be obtained from
a complicated network by identifying important hubs and modules while
taking into account network characteristics such as relative graphlet
frequency distribution, degree distribution, diameter and clustering,
([146]Pržulj et al. 2004). A set of genes represents a gene expression
module in which the pairs of genes are extensively co-expressed in the
sample set. Studies have shown that some critical functions can be
expected from a gene module ([147]Ruan et al. 2010). Hubs, highly
interconnected with nodes in a co-expression network, may act as a
controller or regulator of cellular responses to a special
physiological stimulus ([148]Lin et al. 2008) and be functionally
significant ([149]Chen et al. 2021). Gene co-expression analysis was
applied to determine hub genes and gene modules corresponding to the
abiotic stress response in maize.
As outlined earlier, among the seven modules examined, four of them
demonstrated a distinct negative association with abiotic stress, while
the remaining three exhibited a clear positive relationship. This
positive correlation signifies that these three modules experience
enhanced expression levels under abiotic stress conditions.
Consequently, based on this observation, we directed our attention
towards these particular three modules for subsequent analyses,
encompassing tasks such as the identification of hub genes, GO
exploration and enrichment analysis of KEGG pathways.
Uncovering module functions contributes to the comprehension of the
complex mechanisms underlying biological systems, hence enhancing our
knowledge of gene interactions that collectively shape an organism’s
lifecycle and its array of responses. The increased activity of the
turquoise module in metabolic adaptations and stress responses,
including ‘response to cold’, ‘response to salt stress’, ‘glutathione
metabolic process’, ‘l-phenylalanine catabolic process’, ‘ER-associated
misfolded protein catabolic process’ and ‘fatty acid beta-oxidation’
highlights its potential function as a hub for organizing critical
changes in response to environmental challenges. The MF enrichment
results, specifically ‘chitinase activity’, ‘glutathione transferase
activity’, ‘flavin adenine dinucleotide (FDA) binding’ and ‘electron
carrier activity’, suggest that the turquoise module may play multiple
functional roles, such as defense mechanisms, redox regulation and
energy-related activities. FAD is an essential cofactor for
flavoenzymes. Flavoproteins participate in auxin and cytokinin
metabolism ([150]Schall et al. 2020). The response to abiotic stress is
significantly influenced by these phytohormones. Flavoproteins also
take part in the abiotic stress response, for example, wounding, ROS,
salinity and drought ([151]Cona et al. 2006). Electron carriers are
involved in the movement of electrons in the respiratory and
photosynthetic systems. Under adverse environmental conditions, ROS
gives rise. Electron carriers, NAD/NADP and thiol members interact with
ROS and preserve redox homeostasis in plants ([152]Kapoor et al. 2015).
Besides, genes related to electron carriers, as well as several other
genes, were observed to be differentially expressed in OsHOX24
Arabidopsis transgenics under abiotic stresses ([153]Bhattacharjee et
al. 2016). Our study indicated electron carrier activity as one of the
significant functions in abiotic stress response in maize. These MF
enrichments suggest a finely tuned molecular repertoire that enables
the module to participate in a variety of cellular functions.
According to the enrichment of CC terms, the presence of ‘extracellular
region’ and ‘cytoplasm’ suggests potential roles in intercellular
signalling or communication, whereas the enrichment of ‘mitochondrial
intermembrane space’ suggests a role in energy-related processes.
According to the enrichment of pathways within the turquoise module,
these pathways encompass numerous aspects of cellular metabolism, such
as energy production, amino acid breakdown and the biosynthesis of
essential compounds. The centrality of this module in coordinating
these disparate pathways indicates its importance in optimizing the
plant’s response to stress and its capacity for environmental
adaptation.
The results of the enrichment analyses of the turquoise module reveal
that this module emerges as a nexus for integrating various pathways
crucial for the plant’s survival and adaptation. Its ability to manage
such a wide array of functions solidifies its position as a key
regulatory hub, contributing to the plant’s resilience against abiotic
stressors and optimizing its physiological responses.
The enriched CC term ‘cytosol’ from the yellow module implies that the
genes in this module are mainly located in the cell’s cytosol. The
enrichment of pathways such as ‘Plant hormone signal transduction’ and
‘2-Oxocarboxylic acid metabolism’ within the yellow hint at the
module’s potential involvement in signal transduction and metabolic
processes.
The identification of the enriched MF term ‘ubiquitin binding’ in the
green module suggests a possible molecular activity associated with
protein regulation and degradation. This may suggest that genes within
the green module are involved in recognizing and interacting with
ubiquitin, a key protein involved in targeting other proteins for
degradation or altering their functions.
In this survey, Zm.7361.1.A1_at ([154]AY108454.1; MTD1),
Zm.10386.1.A1_a_at ([155]BU051031) and Zm.10151.1.A1_at ([156]CK986091;
PEBP; phosphatidylethanolamine-binding protein family) were identified
as hub genes. The PEBP protein family, which encompasses a phosphatidyl
ethanolamine-binding protein (PEBP) domain, acts as a central component
in the integration of environmental and developmental signals. This
integration effectively directs the regulation of flowering in all
angiosperms, as highlighted by the study conducted by [157]Zheng et al.
(2016). The research further indicates that PEBP genes play a role in
promoting flower bud differentiation and regulating the balance between
vegetative and reproductive growth, as elucidated by [158]Zhao et al.
(2020). Stress-induced flowering stands out as a crucial response to
stress and emerges as the ultimate adaptation strategy, ensuring plant
survival by enabling them to produce seeds when they are unable to live
as individuals under extreme stress ([159]Takeno 2016). Numerous
stresses have been documented to elicit the process of flowering.
Previous reviews have summarized a variety of them, such as varying
levels of light intensity (either high or low), exposure to ultraviolet
(UV) light, fluctuations in temperature (either high or low),
inadequate nutrition, insufficiency of nitrogen, drought conditions,
limited oxygen availability, overcrowding, removal of roots and
mechanical stimulation ([160]Wada and Takeno 2010; [161]Takeno 2012;
[162]Kazan and Lyons 2016). Therefore, stress typically serves to
initiate the process of flowering ([163]Takeno 2016). Considering the
ontology enrichment of the yellow module associated with plant hormone
signal transduction and the pivotal role of the PEBP gene family as a
central component in integrating environmental signals, it appears that
the yellow module plays a significant role in signal transduction in
response to abiotic stress in maize.
Based on our current knowledge, the GO investigations of
[164]AY108454.1, [165]BU051031 and [166]CK986091 did not yield any
relevant GO information in maize. These genes may represent novel
candidates that play a role in stress tolerance mechanisms,
necessitating additional scrutiny and research. The potential
significance of these genes in conferring tolerance to a variety of
stresses emphasizes the need for thorough investigation and analysis.
Therefore, intensive studies are required to unravel the complexity of
their roles. This could provide important insights into stress response
mechanisms and pave the way for the development of strategies to
increase stress resistance in maize. The manipulation of the hub genes
identified within three up-regulated modules, namely [167]AY108454.1,
[168]BU051031 and [169]CK986091, is presented as one strategy for
enhancing abiotic stress tolerance.
According to the logistic LASSO regression result, the abiotic stress
response in maize was found to be associated with 21 genes with
non-zero coefficients. Although the genes found by co-expression
network analysis and Lasso regression were almost different, three of
the genes identified by LASSO regression, Zm.11185.1.S1_at,
Zm.17003.1.S1_at and Zm.2331.1.S1_x_at (Late Embryogenesis Abundant
(LEA) protein), were assigned to the green and turquoise modules
([170]Fig. 8), respectively. Conducting a blastx analysis on
Zm.11185.1.S1_at revealed a significant match marked by an e-value of
5e−09 and a percentage identity of 91.84 %. Notably, this hit
corresponds with the stress-associated endoplasmic reticulum protein
2-like identified within the Triticum dicoccoides genome. In the
analysis, a blastn was performed on Zm.17003.1.S1_at, uncovering a
match with Zea mays clone 240112 DNA binding protein mRNA, complete
cds. The alignment showcased an e-value of 0 and a sequence identity of
91.84 %. Following this, the identified sequence underwent a subsequent
blastp analysis, revealing a notable correspondence with the
transcription factor HHO5 [Zea mays] sequence with an e-value of 0 and
a sequence identity of 100.00 %. By employing blastx, an analysis was
conducted on Zm.2331.1.S1_x_at, leading to the revelation of a
significant hit characterized by an e-value of 4e−06 and a percent
identity of 100.00 %. This alignment is associated with the LEA Protein
41 [Zea mays]. Three genes have been identified through an integrated
approach of LASSO regression and WGCNA analysis. Furthermore, an
inspection using DGCA analysis revealed a set of 16 gene pairs
characterized by differential correlations in response to both control
and stress conditions. These three genes were found to be present
within three distinct gene-pair comparisons, with each gene being a
component of the respective comparison ([171]Fig. 6). This observation
underscores the substantial significance attributed to these three
genes in abiotic stress conditions. The values of individual gene
expression in maize in control and abiotic stress present within three
gene-pair comparisons are provided in [172]Fig. 7.
Our hypothesis regarding the function of these genes in response to
abiotic stress in maize is supported by a strong body of evidence on
the function of these genes in response to stress in a variety of plant
species. For instance, [173]Liu et al. (2013) suggested that
overexpression of ZmLEA3, a maize Group 3 LEA protein, confers
tolerance to osmotic and oxidative stress. LEA proteins were initially
associated with plant embryo desiccation tolerance because gene
expression and protein levels were high in the later stages of seed
development ([174]Galau et al. 1986). However, it has since been
discovered that LEA genes accumulate in vegetative tissues in response
to abiotic stresses such as salinity, drought, cold, desiccation and
heat ([175]Liu et al. 2019, [176]2021; [177]Wang et al. 2019;
[178]Shiraku et al. 2022). Stress-associated ER proteins play a vital
role in the plant stress response by restoring ER homeostasis,
regulating calcium signalling and preserving membrane fluidity and
integrity. These proteins protect the plant from the damaging
consequences of environmental stress and alleviate stress-induced
damage ([179]Yang et al. 2021; [180]Kim et al. 2022; [181]Shen et al.
2022). The HHO transcription factor family has recently been classified
as a distinct subfamily of the G2-like transcription factor family,
which belongs to the larger GARP superfamily. The HHO transcription
factors are characterized by the existence of two conserved domains,
namely the Myb-DNA binding domain and the hydrophobic and globular
domain. HHO transcription factors also play a crucial role in plant
growth and development and in responses to abiotic stresses ([182]Li et
al. 2021). For instance, the nitrogen starvation response (NSR) in
Arabidopsis has been found to be significantly regulated by the HHO
subfamily of GARP transcription factors through two possible methods.
First, the high-affinity nitrate transporters NRT2.4 and NRT2.53 are
directly repressed by HHOs. Second, HHO transcription factors are
involved in the regulation of NSR through ROS-dependent and independent
pathways ([183]Safi et al. 2018, [184]2021). Several studies have shown
that HHO transcription factors are involved in coordinating the
absorption and utilization of nitrogen and phosphorus ([185]Ueda and
Yanagisawa 2019). Notably under the control condition, there was no
correlation observed between Zm.11185.1.S1_at (a stress-associated
endoplasmic reticulum protein 2-like) and Zm.2331.1.S1_x_at (the Late
Embryogenesis Abundant Protein 41) with Zm.14951.3.S1_x_at (extensin).
However, when subjected to abiotic stress, a positive correlation
emerged between them and Zm.14951.3.S1_x_at. Extensins are structural
proteins of the cell wall that are among the key components responsible
for the rigidity of the cell wall ([186]Sujkowska-Rybkowska and Borucki
2014). Salt stress up-regulated the expression of genes encoding cell
wall proteins such as extension in barley roots ([187]Ueda et al.
2007). Under aluminium stress, extensin protein was found to accumulate
in the apoplast of the pea root nodule ([188]Sujkowska-Rybkowska and
Borucki 2014).
In the control condition, Zm.17003.1.S1_at (transcription factor HHO5)
displayed a positive correlation with Zm.19036.1.S1_at (Purple acid
phosphatase 15), which was not observed under the abiotic stress
condition.
Parallelly, we utilized internal and external validations of hub genes
and LASSO genes to confirm the robustness of selected genes. On the
basis of a gene expression dataset, the repeatability of the results
was validated by examining the discriminative potential of hub genes
and genes selected by LASSO regression using the linear regression
algorithm. The linear regression model exhibits promising performance
on both the internal and external validation datasets, giving accurate
predictions and demonstrating fairly low prediction errors. The
evaluation metrics illustrated the potential of three hub genes and
genes selected by LASSO regression to distinguish between the control
and stress samples. These genes seem particularly notable and are
discussed further.
Conclusions
Maize evolved adaptive responses to a wide range of environmental
conditions throughout its life to survive. Meta-analysis sheds light on
the DEGs as well as the underlying molecular mechanisms that occur
under different types of stress. DEG enrichment analysis unravelled the
up-regulation of BPs related to the abiotic stress response as well as
the down-regulation of phosphorelay signal transduction and response to
cytokinin. The co-expression network analysis revealed the presence of
seven modules containing DEGs, with sizes ranging from 26 to 278 genes
per module. Notably, the relationship between these modules and abiotic
stress exhibited a significant positive correlation (0.52, 0.53 and
0.49) supported by remarkably low P values (2e−07, 1e−07 and 2e−06) for
the turquoise, green and yellow modules, respectively. Importantly,
these positive correlations underscore the critical significance of
these modules in responding to abiotic stresses in maize. These
findings jointly emphasize the pivotal role of these modules in shaping
the maize’s response to abiotic stress.
According to module enrichment analysis, the heightened activity of the
turquoise module in metabolic adaptations and stress responses
underscores its central role in orchestrating essential adjustments in
response to abiotic stress. Additionally, the identification of
‘ubiquitin binding’ in the green module implied its involvement in
protein regulation and degradation. Moreover, the enrichment of
pathways such as ‘plant hormone signal transduction’ and
‘2-oxocarboxylic acid metabolism’ within the yellow module suggested
its potential participation in signal transduction and metabolic
processes. These findings collectively provide insights into the
dynamic functions of these modules and their significance in maize’s
stress response mechanism. Moreover, Zm.7361.1.A1_at ([189]AY108454.1;
MTD1), Zm.10386.1.A1_a_at ([190]BU051031) and Zm.10151.1.A1_at
([191]CK986091; PEBP; phosphatidylethanolamine-binding protein family)
were found to act as hub genes, so they can be taken into consideration
for the future. In parallel with the co-expression analysis, the DEGs
profile was fitted to a logistic LASSO regression model using 10-fold
cross-validation to select candidate abiotic response genes in maize.
Moreover, we used DGCA in maize to compare the correlation between each
pair of LASSO regression genes and hub genes under control and abiotic
stress conditions. The P value of DGC for 16 pairwise gene comparisons
was less than 0.01, indicating a significant change in gene–gene
correlation between abiotic stress and control, such as being adversely
correlated or correlated in one condition but not the other. These 16
pairwise gene comparisons can serve as excellent research subjects. Our
study identified 3 specific genes, namely Zm.11185.1.S1_at
(stress-associated endoplasmic reticulum protein 2-like),
Zm.2331.1.S1_x_at (Late Embryogenesis Abundant Protein 41) and
Zm.17003.1.S1_at (transcription factor HHO5), utilizing a combined
approach of LASSO regression and WGCNA analysis. Notably, these 3 genes
were identified in 3 separate gene-pair comparisons within this set of
16 pairs. This finding highlights the notable significance of these
genes in the abiotic stress response.
In general, this study’s results could be applied to further research
into the potential of enhancing abiotic stress tolerance techniques. To
verify the identified genes, additional experimental research is
needed, including overexpression and knockout experiments as well as
qRT-PCR gene expression analysis.
Supporting information
Supplementary file; sheet S1. Expression data of 560 DEG in different
samples.
Supplementary file; sheet S2. The list of DEGs along with analysis
parameters in maize samples in abiotic conditions (drought, mineral
deficiency, acid soils and waterlogging) obtained through microarray
meta-analysis.
Supplementary file; sheet S3. A list of genes obtained for each module
using the WGCNA package.
Supplementary file; sheet S4. GO (BP, MF, CC) and KEGG pathway
enrichment analysis of the modules.
Supplementary file; sheet S5. A collection of genes selected through
LASSO regression and those identified as hub genes and their
corresponding information, such as gene name, tair ID, description, GO,
pathway, domain and family.
Supplementary file; sheet S6. A pairwise analysis was conducted on
maize samples under control and abiotic stress conditions using the
Differential Gene Correlation Analysis (DGCA) package to examine the
correlation differences between gene pairs. In the analysis, a total of
24 genes were included, selected through a combination of LASSO
regression and hub gene identification. A significance threshold of P
value 0.05 was applied, resulting in a total of 276 pairwise
comparisons.
plad087_suppl_Supplementary_Files_S1-S6
[192]Click here for additional data file.^ (818.9KB, xlsx)
Acknowledgments