Abstract
Background
Advances in forward and reverse genetic techniques have enabled the
discovery and identification of several plant defence genes based on
quantifiable disease phenotypes in mutant populations. Existing models
for testing the effect of gene inactivation or genes causing these
phenotypes do not take into account eventual uncertainty of these
datasets and potential noise inherent in the biological experiment
used, which may mask downstream analysis and limit the use of these
datasets. Moreover, elucidating biological mechanisms driving the
induced disease resistance and influencing these observable disease
phenotypes has never been systematically tackled, eliciting the need
for an efficient model to characterize completely the gene target under
consideration.
Results
We developed a post-gene silencing bioinformatics (post-GSB) protocol
which accounts for potential biases related to the disease phenotype
datasets in assessing the contribution of the gene target to the plant
defence response. The post-GSB protocol uses Gene Ontology semantic
similarity and pathway dataset to generate enriched process regulatory
network based on the functional degeneracy of the plant proteome to
help understand the induced plant defence response. We applied this
protocol to investigate the effect of the NPR1 gene silencing to
changes in Arabidopsis thaliana plants following Pseudomonas syringae
pathovar tomato strain DC3000 infection. Results indicated that the
presence of a functionally active NPR1 reduced the plant’s
susceptibility to the infection, with about 99% of variability in
Pseudomonas spore growth between npr1 mutant and wild-type samples.
Moreover, the post-GSB protocol has revealed the coordinate action of
target-associated genes and pathways through an enriched process
regulatory network, summarizing the potential target-based induced
disease resistance mechanism.
Conclusions
This protocol can improve the characterization of the gene target and,
potentially, elucidate induced defence response by more effectively
utilizing available phenotype information and plant proteome functional
knowledge.
Electronic supplementary material
The online version of this article (doi:10.1186/s12870-017-1151-y)
contains supplementary material, which is available to authorized
users.
Keywords: Gene silencing, Plant defence gene discovery, Semantic
similarity, Arabidopsis thaliana, Pseudomonas syringae
Background
Plants are sessile organisms often subjected to several attacks and
invasions from herbivorous and pathogenic organisms. Pathogenic
organisms have been repeatedly reported to cause outbreaks on bean,
cucumber, stone fruit, kiwi and olive tree, as well as on other crop
and non-crop plants [[31]1]. Plants respond to these attacks by
switching on an array of defence pathways whose end products serve to
limit the progression of invading pathogens. The innate response is the
first stratum and earliest form of response reported. It involves
interaction between pathogen associated molecular patterns (PAMP/MAMP)
from the invading pathogen, and the plant’s membrane-localized pattern
recognition receptors [[32]2–[33]4]. Downstream of this is the
hypersensitive response (HR) which is highly specific and requires the
recognition of specific avirulent genes from the pathogen by specific
resistance (R) genes in the plant [[34]5, [35]6]. Further downstream is
a broad spectrum-type of response called systemic acquired resistance
(SAR), a long-lasting defence response which primes the plant against
future pathogenic attack [[36]2, [37]7]. In 1999, Pieterse and Van Loon
[[38]8] highlighted the existence of yet another form of systemic
response called induced systemic response (ISR), which can occur
independently of the HR and SAR.
These strata of defence response pathways are generally governed by
perturbations in the cells redox state, fluctuations in intracellular
ionic concentration, activation/repression of kinases, synthesis of
diverse signalling molecules, including salicylic acid (SA), jasmonic
acid (JA), ethylene (Et), activation/repression of transcriptional
co-regulators, such as the non-expressor of pathogenesis-related 1
(NPR1), activation/repression of transcription factors, such as TGA and
WRKY, and the ultimate expression/suppression of defence end products,
such as pathogenesis-related (PR) proteins, which influences the growth
and progression of the invading pathogens [[39]9–[40]14]. Understanding
the biology of the plant protection and defence response may help in
controlling the performance and survival of plants, which are very
often faced with several types of stresses. The induced resistance
response requires a coordinate action of many defence-related genes
interacting within processes and signalling pathways to control
down-stream responses during stress conditions. Computational modelling
is therefore playing an additional role [[41]15] in improving our
understanding of plant defence and disease resistance mechanisms in a
manner that is less costly, reduces redundancies and increases
robustness. Such models offer attractive alternatives for the
identification of gaps and opportunities in already characterized
biological systems, and could eventually inform the design of more
targeted molecular biology experiments [[42]16].
Pseudomonas syringae (P. syringae) is a rod shaped Gram-negative
bacterium, a major bacterial leaf pathogen that causes diseases in a
wide range of plant species with visible symptoms [[43]17, [44]18]. A
specific variety of P. syringae pathovar tomato strain DC3000
(Pst-DC3000), is known to infect DC3000 tomato and Arabidopsis thaliana
plants [[45]19], using the plant leaf surface to negotiate its entry
into the plant. Both Arabidopsis [[46]20] and Pst-DC3000 [[47]19] have
been sequenced and are genetically tractable, economically and
environmentally convenient, and constitute a model system in research
for studying molecular pathogenesis and dynamics of plant-pathogens
interactions [[48]21]. Interestingly, Arabidopsis thaliana shares high
similarity with high agronomic value crops, such as rice with
similarity of 0.71 [[49]22], suggesting that the findings related to
Arabidopsis thaliana can be assessed to determine whether they could be
applied to these different plants.
In this study, we used NPR1 (or SAI1 for salicylic acid-insensitivity
or NIM1 for non-inducible immunity) as a reference due to the available
experimental data and its central role in SAR signalling pathway, a
broad-spectrum resistance protecting against several pathogenic attacks
[[50]23], which activates antimicrobial genes products in both the
model plant Arabidopsis thaliana and a diverse number of food crops
[[51]4, [52]24–[53]32], including canola, cabbage, broccoli, tobacco,
tomato, potato, corn. NPR1 is also involved in the regulation of
signalling pathways like abscisic acid (ABA) [[54]33], which, as JA-
and Et-dependent activation of defense responses resulting from ISR
[[55]17], offers resistance to necrotrophic pathogens. We propose a
post-gene silencing bioinformatics (post-GSB) protocol, a two-step
approach (see Fig. [56]1), which can be used to efficiently analyze
plant differential responses based on leaf spore count or expression
level experimental data sets. This post-GSB protocol introduces the
closeness score concept between plants to extract differentially
infected plants in a population of Arabidopsis thaliana wild type and
npr1 mutant plants infected with Pst-DC3000 used in further analyses.
It uses a Gene Ontology (GO) based semantic similarity model to
identify proteins collaborating with NPR1 as well as enriched
biological processes involved in the defence response and elucidate
these process occurrence sequences. This new model manages eventual
uncertainty of data, potential noise inherent in the experiment and
redundancy from the semantic-based GO structure to effectively assess
contributions of and validate plant-defence genes, and predict
plant-defence mechanisms.
Fig. 1.
Fig. 1
[57]Open in a new tab
The post-gene silencing bioinformatics (post-GSB) scheme. This
describes the two steps of the model as described in the scheme. The
first step consists of producing experimental data sets and using the
closeness scores between plants to extract the set of differentially
infected plants to be considered for further analyses. The second step
initially focuses on statistical analyses checking whether there is any
difference between the two data sets: wild type and npr1 gene mutant
plants. If this difference is significant, then the Gene Ontology
process annotation based functional analysis is performed to identify
proteins, enriched processes and pathways contributing to the
NPR1-based plant defence mechanisms
Methods
The study was divided into two parts, the first was to check whether
NPR1 plays an important role in plant protection and secondly to
perform functional analyses to predict the biological mechanisms of the
NPR1-based defence response.
Experimental data from Arabidopsis wild type and npr1 mutant plants
Data analyzed in this study was derived from a study conducted at the
John Innes Centre (England) in 2010 [[58]32] and consisted of data from
wild type (Wt) and npr1 mutant Arabidopsis thaliana Columbia (Col – 0)
plants 48 hours post infection with Pst-DC3000 [[59]34]. These plants
were grown and inoculated under controlled greenhouse conditions (23^°
C, 10 h day/14 h dark regime and a relative humidity of 65±5%) in the
John Innes Centre in England [[60]32]. Pst-DC3000 bacteria strain was
cultured as described in [[61]34] using King B’s media [ 20 gL ^−1
proteose peptone (w/v), 1.5 gL ^−1 di-potassium hydrogen phosphate
(w/v), 1.5 gL ^−1 magnesium sulphate (w/v), 1.5% glycerol (v/v) and
1.2% bacterio agar (w/v)] supplemented with 50 mgL ^−1 kanamycin. The
bacteria cells were re-suspended in a 10 mM MgCl [2] solution and the
concentration adjusted to 5×105 colony forming unit cfu.mL ^−1. For
each plant to be inoculated, three leaves were hand-infiltrated using a
1 mL needless syringe containing either a 10 mM MgCl [2] solution or an
inoculum of Pst-DC3000 (5×105 cfu.mL ^−1). Bacteria growth was measured
as spore counts 48 h post treatment using 8 mm leaf discs excised from
the treated plants. Measurements were conducted using a FB12
luminometer with a single photon counter [[62]32, [63]34]. For adequate
representation and statistics purposes, the experiment was conducted
three times using a total of 60 plants (three infected leaves per
plant) per genotype (see raw data in Additional file [64]1).
Sample extraction and analysis
Data from a total of 31 Arabidopsis Wt plants (controls) and 29 npr1
mutant plants (cases) inoculated with Pst-DC3000 were used for further
analysis. Figure [65]2 is a graphical representation of the spore count
data to visually assess the distribution of these spore counts.
Noticing that there was a distribution pattern, samples were further
analysed in three main steps: (1) computing the closeness score of each
individual in a sample and ranking these individuals based on these
scores; (2) extracting the list of differentially infected plants using
Pearson χ ^2 scores based on aggregated closeness scores; and (3)
statistically measuring the significant difference among the two
phenotypes.
Fig. 2.
Fig. 2
[66]Open in a new tab
Plotted heat-maps of spore counts for the case and control raw
datasets. These plots (cases and control datasets on left and right
sides, respectively) show variations of leaf spore counts from
relatively lower (dark orange) to higher (dark green) counts as an
indication of low and high levels of infection. An asterix on a given
plant identifier indicates that the plant is among differentially
infected plants within a dataset
Computing individual closeness scores
Generally, a set of plants within each group consists of a mix of
phenotype tendency associated with different levels (moderate or high)
of infection observed through plant spore counts. Since at a higher
level of infection, it may not be evident to distinguish between the
two groups under consideration, there is a need for a scoring scheme
which can appropriately classify plants with moderate level of
infection, referred to as differentially infected plants, based on
plant spore counts. Here we set up a closeness score approach to
measure the length of the line-segment joining the plant phenotypes
using spore count vectors, enabling the selection of differentially
infected plants. The individual closeness score of an individual
(plant) ı, denoted ICS(ı), in a sample is computed as follows:
[MATH:
ICS(ı)=∑ȷ=1nSıȷ
:MATH]
1
where n is the size of the sample (number of individuals in the sample)
and
[MATH: Sıȷ
:MATH]
is the similarity score between individuals ı and ȷ calculated using
the Manhattan metric defined on a vector of spore counts for three
different readings and converted to similarity measure following the
Resnik-edge-based approach [[67]35, [68]36]. This similarity score
[MATH: Sıȷ
:MATH]
is given by:
[MATH: Sıȷ=<
/mo>2×δxı−dxı,xȷ<
/msub> :MATH]
2
with x [k] a spore count vector of individual k, δ(x [ı])= max{d(x
[ı],x [ȷ]):ȷ=1,…,n} and d(x [ı],x [ȷ]) the distance between spore count
vectors of the two individuals based on the Manhattan metric defined on
a vector of spore counts, computed as follows:
[MATH: dxı,xȷ<
/msub>=min∑k
=1nxık−xȷp(<
/mo>k):p<
/mi>∈P :MATH]
3
where
[MATH: P :MATH]
is the set of all permutations of the ordered set S={1,2,…,n} of n
readings in the sample. The individual or plant with the highest ICS is
referred to as reference individual or plant, whose similarity scores
to other plants are used to extract the set or list of plants of
interest for further statistical analyses.
Extracting sets of differentially infected plants in different samples
In order to minimize the impact of potential biases which may emanate
from highly infected plants, we design a model for selecting sets of
differentially infected plants in which spore counts based phenotypes
are significantly associated with moderate level of infection in
different samples based on closeness or similarity score described in
the subsection above. Using similarity scores between reference plant,
denoted d, and all plants in a sample, the list of these plants were
ranked from the highest to the lowest scores. The ranked plant list is
then used to compute a Pearson χ ^2 score for each plant subset,
reflecting the tendency of plants in a particular set to occur towards
the extremes of the list. The Pearson χ ^2 score of a subset containing
plants re-indexed from ℓ to m, denoted
[MATH:
χP2
ℓ,m
:MATH]
, is given by:
[MATH:
χP2
ℓ,m
=∑ȷ=ℓm<
mrow>Sdȷ−<
/mo>μℓm
2μℓm :MATH]
4
which is known to approximate a χ ^2-distribution with (m−ℓ) degree of
freedom (dof) and where their expected value μ [ℓm] represents
aggregated score (ES) of the extracted subset and estimated as follows:
[MATH:
μℓm=<
mtext>ESm−ℓ+1<
/mrow>=∑ȷ=ℓmSdȷ
:MATH]
5
We identified s top individuals as differentially infected with s the
index fold-change
[MATH:
χP2
:MATH]
. Assuming that the ranked list contains n individuals, s (s≤n) is the
smallest index satisfying the following inequality:
[MATH:
χP21,s
−r∗χP
mi>2s+1,n<
/mrow>1+r2>0 :MATH]
6
where
[MATH:
r=s−1
n−s−1 :MATH]
7
which is the ratio of standard deviations of
[MATH:
χP2
1,s
:MATH]
and
[MATH:
χP2
s+1,n<
/mrow> :MATH]
, and set to 1 if s=1.
The significance of the aggregated score, ES(s), of the subset of
differentially infected plants was assessed using sample randomization.
So, we randomly selected 1000 independent subsets (same size with the
set of differentially infected) of plants and compute ES of each subset
and then perform the Shapiro-Wilk test under the null hypothesis that
the generated sample is drawn from a normal distribution. Following the
rejection or the acceptance of the null hypothesis, we perform the
Wilcoxon- or the T-test to check whether the identified set of
differentially infected individuals is more than expected by chance.
Statistical analyis of npr1 mutant and wild-type plant based disease symptoms
The contribution of NPR1 protein to the plant protection during
infection as the defence co-regulator is statistically assessed on sets
of differentially infected plants extracted from the two population
samples (mutant/case and wild/control). We test for the differences
between the mutant and the wild type plants using parametric or non
parametric statistics given that agreements with respects to the
violations of the assumptions are met. Thus, the normality and variance
homogeneity assumptions of the two samples are tested using the
Shapiro-Wilk and Bartlett tests, respectively. In order to satisfy
these assumptions for datasets which are characterized by extreme
non-normality or heterogeneity of variance, we apply a data
transformation [[69]37, [70]38] based on the Fisher-Pearson coefficient
of skewness [[71]39] to bring the data into greater agreement with the
normality and variance homogeneity assumptions. Depending on whether
normality and variance homogeneity assumptions are met or not, we
performed the analysis of variance (ANOVA) method or the non-parametric
Kruskal-Wallis rank sum test.
Bioinformatics analysis of potential NPR1-based plant defence mechanisms
We used the Gene Ontology (GO) [[72]40, [73]41] and the protein GO
Annotation (GOA) mapping provided by the UniProtKB-GOA project
[[74]42–[75]44] to reveal the potential NPR1-based regulatory network
using enriched biological processes and pathways in which a set of
protein targets are involved. The complete set of GO data and
protein-GO term associations were extracted from the GO and GOA
databases, respectively, accessed on the 11th of October, 2016. For the
pathway enrichment analysis, we use the Kyoto Encyclopeadia of Genes
and Genomes (KEGG) datasets. The whole Arabidopsis thaliana pathway
dataset was extracted from the KEGG database at
[76]http://www.genome.jp/kegg/.
Identification of other putative proteins participating to the plant defence
response
Proteins perform an astonishing range of biological functions in an
organism by collaborating in pathways and processes, and interacting
with the cellular environment in some way to promote the cell’s growth
and function [[77]45, [78]46]. This argues that the induced resistance
response requires concerted biological action of many genes involved in
diverse processes or defence signalling pathways. There are various
computational approaches for identifying proteins co-working or
involved in a similar process, which, in general, rely on a priori
knowledge of protein functional features. This functional knowledge can
be either a protein-protein interaction (PPI) network, in which case,
these proteins interact and influence each other in the same
sub-network or protein functional annotations, in which case, these
proteins are involved in semantically similar processes or in the same
pathway. In this study, since we are looking at the NPR1 biological
role and functional annotations, we use protein biological process
dataset to identify putative proteins which are functionally similar to
Arabidopsis NPR1 by quantifying functional similarity between proteins
based on sets of strict non-redundant processes annotating proteins
under consideration [[79]47].
Given two proteins p and q, the functional similarity between p and q,
BMA(p,q), is computed using the Best Match Average model
[[80]48–[81]51] as follows:
[MATH: BMAp,q
=121n∑s∈Tp
Ss,Tq+1m∑
mrow>s∈Tq
Ss,Tp :MATH]
8
where T [r] is a set of process terms annotating a given protein r and
n=|T [p]| and m=|T [q]| are the number of processes terms in these
sets.
[MATH: Ss,Tr=maxSs,t
:t∈Tr
:MATH]
with
[MATH: Ss,t
, :MATH]
the semantic similarity score between process terms s and t computed
using the GO-universal metric [[82]48]. It is worth noting that
semantic similarity measure has proved its effectiveness to solving
biological issues related to the Gene Ontology annotation-based
knowledge discovery, including predicting and validating
protein-protein interaction (PPI) [[83]52]. With curated PPIs, which
are still relatively scarce, high-throughput biology experiments and
computational methods producing high rate of false positive PPIs
[[84]53], and the lack of appropriate techniques to address these
shortcomings [[85]45], semantic similarity-based approach is more
effective, providing a classification indicator of PPIs in the absence
of reliable information.
Identifying enriched Arabidopsis defence response processes and pathways
For process or pathway enrichment analysis, we used hyper-geometric
test adjusted using the Bonferroni multiple testing correction. For
each process or pathway, the p-value was calculated using its
frequencies of occurrence in the reference dataset (Arabidopsis
thaliana proteome) and target set, which composed of proteins
identified to be functionally similar to NPR1. Thus, this p-value,
which is the probability of observing at least s proteins from a target
gene set of size n by chance, knowing that the reference dataset
contains m annotated genes out of N genes, is given by the following
formula [[86]54, [87]55]:
[MATH: PX≥s
=1−∑k=0s−1
mkN−m
n−kNn :MATH]
9
where the random variable X is the number of genes involved in the GO
process or participating to the pathway under consideration within a
given target gene set contributing to plant defence response. To
account for relationships between GO terms in the GO structure, we used
the concept of the GO term semantic similarity score [[88]55] and the
frequency of occurrence f(t) of the target-associated process t in a
set of proteins P is given by:
[MATH:
f(t)=∑
mo>q∈Pδq(t<
mo>) :MATH]
10
where δ [q] is the q-function indicator given by
[MATH:
δq(t)=1ifSq(
mo>t)>ε0<
/mn>otherwise :MATH]
11
where ε≥0 is the agreement level or customized agreement at which a GO
process is considered to be a possible annotation of the protein q and
[MATH: Sq(
mo>t)=St,Tq :MATH]
, representing the semantic similarity degree at which a related term
is considered to semantically reflect in the specification of the term
t [[89]47]. Here, ε was set to 0 for t∈T [q] or
[MATH: t∈Aq :MATH]
, and to 0.7 for
[MATH: t∈Cq :MATH]
where
[MATH: Aq :MATH]
and
[MATH: Cq :MATH]
are sets of ancestors and descendants of processes in T [q],
respectively. The p-value of each process term was adjusted using the
Bonferroni multiple testing correction.
Results and discussion
In the first part of the post-GSB framework developed in this study, we
have designed a spore count closeness scoring approach. This enables
the identification of differentially infected plants used to ascertain
the role of NPR1 in Arabidopsis plant defence response with bacteria
spore count data collected from Arabidopsis Wt and npr1 mutant leaves
48 h post Pst-DC3000 infection as described in the “[90]Methods”
section. We then predicted the potential NPR1-based regulatory network
based on the identified set of putative proteins, enriched biological
processes and pathways which participate in plant defence response.
NPR1 protein is required for Arabidopsis defence response
The heat map representation of the raw experimental spore count data
from both data sets in Fig. [91]2 revealed that spore distribution
patterns in the Arabidopsis Wt infected leaves are more closely similar
(see Fig. [92]2 [93]b) in comparison to those of the mutant (Fig. [94]2
[95]a). This is in agreement with previous results which showed a
significant difference in spore numbers between the two phenotypes
(data not shown) [[96]32]. In a study of Wt and npr1 mutant plants by
Cao et al. [[97]56], it was observed that infection spread event to
uninfected parts of the leaf is quite rapid due to its high level of
susceptibility. To eliminate the possibility that the presence of
highly infected plants in the mutant population is contributing
significantly to the observed differential response found in this
study, we statistically tested the difference between the two groups of
plants on specific sets of plants from these two groups. These sets
contains only differentially or moderately infected plants in order to
keep out the highly infected plants within the two phenotype groups
using the closeness scoring approach. Outputs generated 13 moderately
infected npr1 mutant plants, and 14 moderately infected Arabidopsis Wt
plants which were used for further downstream analysis (Fig. [98]2,
plants marked with an asterisk). Note that these identified sets of
differentially infected plants for both npr1 mutant and wild-type
plants are statistically significant with p-values < 2e−16.
The graphical representation (Q-Q plots) of the above Wt plant sub-data
indicated that this sub-data was not normally distributed (see
Additional file [99]2: Figure S1) with the Shapiro-Wilk test p–value of
0.000152≤0.05, thereby leading to a two-step statistical analysis
process. Firstly, we tested for statistical differences using the
non-parametric Kruskal-Wallis rank sum test. Results from this test
indicated that within phenotypes, there was no significance difference
(p–value =0.70>0.05), but, between the two phenotypes, there was a
significant difference in the number of bacteria spores (p–values
=4.551e−12≤0.05). Noting that non-parametric tests have weaker
statistical power in comparison to parametric tests, and to further
eliminate the possibility of introducing a type II error [[100]57], we
applied a parametric approach to increase the robustness of our
analysis.
In order to bring different data subsets into agreement with the
normality and homogeneity of variance (equal population variance)
assumptions, these data subsets were log transformed (see Additional
file [101]2: Figure S2) and scaled using their standard deviations.
Note that the choice of log10 transformation was guided by the
Fisher-Pearson skewness coefficient [[102]37, [103]38] of the Wt
dataset with a value of 1.46524, suggesting that its distribution was
highly skewed (substantially positive skewness) [[104]39]. The
Shapiro-Wilk normality test and the Bartlett variance homogeneity
p–values were greater than 0.1169 and 0.98 for both Wt and npr1 mutant
data subsets. Results from our ANOVA using this log transformed data
subsets still confirmed that there was no significance difference
within data subsets (p–value =0.481), but there existed a significant
difference in spore numbers between both phenotypes (p–value < 2e−16)
with the measure of association ω=0.98893. This indicates that 99% of
the variability between Wt and npr1 mutant samples may be attributed to
whether the NPR1 protein is functionally active or not, providing
evidence that NPR1 plays a critical role in plant defence response.
In summary, we applied the Pearson χ ^2-based plant closeness scoring
scheme in order to extract specific sets of plants in populations under
consideration to be used for further statistical inferences. These
experimental data subsets showed evidence that a functional NPR1 is
required to limit bacteria growth in Arabidopsis plants. This finding
correlates with results from previous studies that have equally
demonstrated that the presence of a functional Arabidopsis NPR1 is
important in limiting pathogen growth in infected plants [[105]56,
[106]58, [107]59], playing a key role in the SAR signalling pathway
[[108]23, [109]58, [110]60], the broad-spectrum type of response
providing defence against secondary pathogens after a primary attack.
This Pearson χ ^2-based plant closeness scoring approach enables the
use of more relevant data subsets in further analyses, thus limiting
the effect of the eventual uncertainty of these datasets and potential
noise inherent in the biological experiment used. This provides an
increased robustness of statistical analyses and stronger validation of
inferences from experimental datasets.
Elucidating potential NPR1-based enriched process regulatory network
Similar to other organisms, plants combat the adverse effects of
stressors by activating or deactivating various processes and pathways.
We therefore applied a semantic similarity based computational model to
gene products that are functionally similar to NPR1 and identify
enriched plant defence biological processes using protein GO annotation
mapping from the GOA-UniProtKB dataset [[111]42, [112]44] with the
reviewed proteins in the Arabidopsis proteome retrieved from the
UniProt database [[113]61]. Furthermore, we used the Arabidopsis KEGG
pathway dataset to retrieve enriched pathways participating to the
NPR1-based regulation of Arabidopsis defence response and elucidated
the GO-based regulatory network triggered by NPR1.
Identification of NPR1-associated plant defence proteins
In general, defence mechanism is driven by several gene products that
act dependently or independently. In order to identify NPR1-associated
plant defence proteins, we computed the functional similarity scores
between NPR1 and other 20 545 annotated proteins [[114]62] out of 51874
found in the complete list of proteins in the Arabidopsis proteome. In
order to avoid over-estimating functional similarity scores between
proteins [[115]52], which is induced by the redundant processes from
parent-child relations from the GO structure [[116]47], we used the set
of filtered non-redundant processes in which proteins are involved.
Results indicated that 30 proteins shared high functional similarity to
the Arabidopsis NPR1 (Table [117]1) of which, 25 were reviewed or
manually curated (in Table [118]1; protein with status ✓). A
hierarchical clustering representation of these 25 annotated proteins
together with the Arabidopsis NPR1 protein, which shows functional
relatedness between these 26 reviewed proteins, produced two main
clusters (Fig. [119]3) with Cluster 1 comprising of two sub clusters
(1A and 1B). Sub-Cluster 1A contains proteins closest to NPR1 (at the
maximum distance of approximately 0.245) and distant at 0.275 from
Sub-Cluster 1B, while proteins in Cluster 2 were the farthest subgroup
being distant at 0.375 maximum from Cluster 1. Proteins in the
sub-cluster 1A comprised of the NPR1/NIM-1 interacting protein
(NIMIN-1), transcription factor proteins (TGA3, WRKY70), Resistance to
P. syringae 2 (RPS2), two Enhanced Diseases Susceptibility proteins
(EDS1, PAD4) and Mitogen-activated protein kinase kinase 4 (MKK4).
Those in the sub-cluster 1B comprised of WRKY40, AT1G74360, XBAT34, the
putative calmodulin-like protein 47 (CML47), AT2G23680, the Yellow
Leaf-Specific protein 9 (YLS9), Constitutively Activated Cell Death 1
(CAD1), PUB23, CML40, and ATL2. Proteins in cluster 2 comprised of
Mitogen-activated Protein Kinase 11 (MPK11), Phosphoinositide 4-Kinase
Gamma 4 (PI4KG4), PUB2, two basic Leu zipper (BZIP) transcription
factors (BZIP60 and bZIP28), Molybdopterin biosynthesis protein (CNX1),
Calcium-dependent Protein Kinase 9 (CPK9), and Oxysterol-binding
Protein-Related Protein 1C (ORP1C2).
Table 1.
Proteins highly similar to NPR1 or Non-inducible immunity protein 1
(NIM1)
Entry name Gene names Status Protein name
[120]Q9LY00 WRKY70 or At3g56400 ✓ Probable WRKY transcription factor 70
(WRKY DNA-binding protein 70)
[121]Q9LMM5 MPK11 or At1g01560 ✓ Mitogen-activated protein kinase 11
(AtMPK11) (MAP kinase 11)
[122]O64834 At2g23680 or F26B6.33 ✓ Cold-regulated 413 plasma membrane
protein 3 (AtCOR413-PM3)
[123]Q9ZPY9 PI4KG4 or PI4KGamma4 ✓ Phosphatidylinositol 4-kinase gamma
4 or At2g46500 (AtPI4Kgamma4) (PI4K gamma 4)
[124]Q39054 CNX1 or At5g20990 or F22D1.6 ✓ Molybdopterin biosynthesis
protein CNX1
[125]Q5XEZ8 PUB2 or At5g67340 ✓ U-box domain-containing protein 2
[126]Q9SJ52 YLS9 or NHL10 or At2g35980 ✓ Protein YLS9 (Protein
NDR1/HIN1-LIKE 10)
[127]Q9SGI8 CML40 or At3g01830 ✓ Probable calcium-binding protein CML40
[128]Q9FNZ5 NIMIN-1 or At1g02450 ✓ Protein NIM1-Interacting 1 (NIMIN-1)
[129]Q38868 CPK9 or At3g20410 ✓ Calcium-dependent protein kinase 9
Q9C7SO BZIP60 or At1g42990 ✓ bZIP transcription factor 60 (AtbZIP60)
[130]O80397 MKK4 or At1g51660 ✓ Mitogen-activated protein kinase kinase
4 (AtMKK4)
[131]Q9L9T5 ATL2 or At3g16720 ✓ RING-H2 finger protein ATL2 (Protein
Arabidopsis Toxicos En Levadura 2)
[132]Q9S745 PAD4 or EDS9 or At3g52430 ✓ Lipase-like PAD4 (Protein
Enhanced Disease Susceptibility 9)
[133]Q42484 RPS2 or At4g26090 ✓ Disease resistance protein RPS2
(Resistance to P-syringae protein 2)
[134]Q9SN89 CML47 or At3g47480 ✓ Probable calcium-binding protein CML47
[135]Q9SU72 EDS1 or EDS1A or At3g48090 ✓ Protein EDS1 (Enhanced Disease
Susceptibility 1)
[136]Q8L751 ORP1C or At4g08180 ✓ Oxysterol-binding protein-related
protein 1C
[137]Q39234 TGA3 or BZIP22 or At1g22070 ✓ Transcription factor TGA3
(AtbZIP22)
[138]Q9SAH7 WRKY40 or At1g80840 ✓ Probable WRKY transcription factor 40
(WRKY DNA-binding protein 40)
COLGJ1 At1g74360 or F1M20.4 ✓ Threonine-protein kinase
[139]Q84TG3 PUB23 or At2g35930 ✓ Plant U-box protein 23
Q9FPHO XBAT34 or At4g14365 ✓ Putative E3 ubiquitin-protein ligase
XBAT34
[140]Q9SG86 BZIP28 or At3g10800 ✓ bZIP transcription factor 28
(AtbZIP28)
[141]Q9C7N2 CAD1 or At1g29690 ✓ MACPF domain-containing protein CAD1
(Protein Constitutively Activated Cell Death 1)
[142]Q9C9H6 RLP11 or At1g71390 ✗ Putative disease resistance protein
(Receptor like protein 11)
[143]Q9SJQ8 At2g36470 ✗ Expressed protein (Uncharacterized protein)
[144]Q8RXN8 At5g42050 ✗ Development and Cell Death (DCD)
[145]Q8GYH5 ANK or At5g54610 ✗ Ankyrin repeat family protein
[146]Q9LN03 At1g08050 ✗ C3HC4-type RING finger-containing protein
[147]Open in a new tab
The gene names are given in the second column whilst the full protein
name as downloaded from the UniProt database [[148]61] is given in the
fourth column. A gene can have more than one name, but throughout this
study we use the first names of each gene. The third column indicates
whether the protein has been reviewed (i.e., manually curated: marked
✓) or not (✗)
Fig. 3.
Fig. 3
[149]Open in a new tab
Functional similarity between NPR1 and other associated proteins. A
dendrogram showing the relatedness of 26 reviewed proteins in
Arabidopsis thaliana proteome. This was obtained by applying
hierarchical agglomerative clustering algorithm on the set of annotated
proteins related to NPR1 using a distance retrieved from functional
similarity scores and represented on the horizontal axis
An in-depth literature review demonstrated that these functionally
related proteins could contribute in the same pathway as NPR1 or
independently to either positively or negatively drive defence response
to pathogens and elicitors. For instance, calcium signalling is known
to be associated to the innate response, HR and SAR, acting upstream of
the pathogen-induced SA-signalling pathway to favor the activation of
NPR1 and associated downstream activities, such as TGA binding, to
inhibit pathogen growth [[150]63–[151]65]. The calcium signalling
triggers the activation of a set of Mitogen protein kinase (MPK)
contributing to the plant innate immunity related cell death [[152]66]
and the MAMP defence [[153]67], which is also an innate-form of defence
response, and to other processes that stimulate ROS production during
plant pathogen attack [[154]68]. In this study, two MPK proteins,
namely MPK11 (Cluster 2) and MKK4 (Cluster 1A) which is an important
mediator of plant response to osmotic stress [[155]69], were found to
be closely related to NPR1. Proteins belonging to the EF hand super
family have been shown to harbour motifs that are important for calcium
sensing [[156]70, [157]71]. Typical examples of proteins within this
family are the calmodulin (CaM)-like proteins (CML) and Calcium protein
kinase (CPK) [[158]70, [159]72, [160]73] whose members: CML47, CML40,
CPK9 identified in this study belonged to Clusters 1A and 2. Moreover,
two other proteins identified, NDR1 (YLS9) in Cluster 1B and EDS1 in
Cluster 1A, are known to mediate the Effector Triggered Immunity
through specific R genes [[161]74, [162]75]. As illustration, EDS1 was
found to be a regulatory gene controlling down-stream defence gene
expression in Arabidopsis [[163]76], targeting R genes with TIR-NB-LRR
motifs required for the recognition of Pst-DC3000 avrRps4 proteins
[[164]75, [165]77, [166]78]. R genes, such as RPS2 (Cluster 1A), which
recognizes Pst-DC3000 avrRpt2, are targeted by NDR1 and harbour
LZ-NBS-LRR motifs [[167]74, [168]75, [169]77, [170]78]. YLS9, an
NDR1/HIN1-like 10 (NHL-10) identified in our study has been implicated
in HR and is inducible by several CPKs [[171]79], thereby suggesting
its action downstream of the calcium signalling pathway.
As pointed out previously, proteins functionally related to NPR1 may
contribute to the plant defence mechanism degradation. For example,
CAD1 (Cluster 1A) has been implicated in the repression of the
programmed cell death (PCD) pathway mediated by SA and this negative
function of CAD1 is inhibited under stress conditions to favor the PCD
pathway [[172]80]. However, even under undisturbed conditions, plant
proteins undergo proteolysis as a strategy to effectively turnover its
protein reservoir and to ensure for efficient functioning of biological
processes through an ubiquitination process [[173]81, [174]82]. The
NPR1 protein, for instance, which can switch between two conformational
states (oligomeric and monomeric) or two cell compartments (cytosol and
nucleus), is a typical example of a protein whose nuclei-localized
monomers undergo proteolysis through a CUL-3-based E3 ligase
ubiquitination process to ensure for the effective expression of its
target genes [[175]59, [176]83, [177]84]. Our analysis identified five
putative ubiquitin-like proteins - XBA34, PUB23, PUB2, ATL2, PI4KG4
[[178]85–[179]87]. Although their direct role in NPR1 ubiquitination
has not been reported, their transcript levels, especially those of
ATL2 and PI4KG4, have been shown to increase following exposure of
plants to chitinases [[180]88] and treatment with SA [[181]89],
respectively. The role of these proteins during drought stress has
equally been demonstrated [[182]85]; and proteins, such as XBAT34 have
been shown to be induced by 6-BA (6-benzylaminopurine) and SA
[[183]90]. Additionally, XBAT34 contains two Ankyrin repeats, which are
also core motifs found in NPR1 protein and are important in mediating
interactions with other proteins [[184]58, [185]90–[186]92]. The
importance of the ankyrin repeats within the NPR1 sequence is elegantly
demonstrated in the inability of npr1 mutants to activate PR genes
following pathogen attack due to a distortion in this region [[187]58,
[188]93]. A distorted ankyrin repeat within the NPR1 protein hampers
its ability to bind to transcription factors belonging to the basic
leucine zipper (BZIP) family such as TGA3 (Cluster 1A) for a positive
defence outcome [[189]92, [190]94–[191]97].
Two other BZIPs (BZIP60 and BZIP28) were identified in this study and
have been implicated in Unfolded Protein Response (UPR) during stress
response [[192]98–[193]100]. The ability of the BZIP domain to
translocate to the nucleus during stress could be a key contribution to
its role during stress responses. In fact, although we did not find a
direct relation between BZIP28 and BZIP60 to NPR1 in literature,
existing evidence shows that the inositol-requiring protein-1a
(IRE1a)/BZIP60 branch of the UPR pathway have a role in PR protein
secretion following treatment with SA [[194]99]. In addition to the
BZIP transcription factor (TGA3) identified here, proteins belonging to
the WRKY family of transcriptional factors [[195]101] – WRKY70
(Fig. [196]3; Cluster 1A) and WRKY40 (Fig. [197]3; Cluster 2) were
equally identified. Unlike TGA3, no direct interaction between NPR1 and
WRKY40 was found during our literature search. However, similar to NPR1
monomers, evidence for the nuclei localization of WRKY40 does exist
[[198]102]. Its role in stress response has also been demonstrated and,
its ability to interact with other WRKY proteins has been shown
[[199]102, [200]103]. For instance, co-expression of WRKY18 and WRKY40
led to increased susceptible of plants to P. syringae and Botrytis
cinerea correlating with reduced PR gene expression [[201]102]. WRKY70
on the other hand is known to positively mediate pathogen defence
response in an SA-dependent fashion which is independent of the NPR1
pathway [[202]104]. PAD4, which is another protein identified (Cluster
1A), can act dependently (e.g., for SA-dependent defence response) or
independently (e.g., for basal immunity [[203]105]) of NPR1 for PR-1
gene expression [[204]106, [205]107]. NPR1 also has the ability to
interact with non-transcription factors, such as NIMIN-1, identified in
this study (Cluster 1A) and experimentally verified to bind to NPR1
[[206]108, [207]109] to modulate PR-1 gene expressions during SAR.
Retrieving enriched processes and pathways mediating the Arabidopsis thaliana
plant defence
A total of 114 biological processes were found to be involved in the
Arabidopsis thaliana protection and defence mechanisms, among which 74
were found to be non-redundant considering the feature of the GO
structure [[208]47]. 21 processes were identified to be statistically
significant in plant defence mechanisms and Table [209]2 provides a
summary of the type of response and biological activities in which
these enriched processes are involved. These processes are related to
both biotic and abiotic triggered responses, leading to the initiation
of major pathways involved in the innate response (GO:0010200), HR
(GO:0009626) and SAR (GO:0009862) in plants [[210]2–[211]7]. In
general, these processes are triggered on the basis of the (pathogen)
attacks classified using the mode of plant-pathogen interactions. For
example, a pathogen may survive by using the living plant cells (for
biotrophic pathogens) or by killing plant cells and feeding on these
dead cells (for necrotrophic pathogens) [[212]110]. Interestingly, the
enrichment pathway analysis reveals two enriched KEGG pathways
participating to NPR1-associated plant defence response, namely
Plant-pathogen interaction
([213]http://www.genome.jp/kegg-bin/show_pathway?ath04626) and Plant
hormone signal transduction
([214]http://www.genome.jp/kegg-bin/show_pathway?ath04075) with
p-values of 0.013085 and 0.02070, respectively. These pathways show
evidence of regulating a wide variety of enriched biological processes,
including programmed cell death (PCD) and defence, HR, ABA and SA
dependent responses.
Table 2.
Statistically significant or enriched biological processes protecting
the Arabidopsis thaliana plant
GO ID Process name L p-value ap-value Process functions Stressor Tps RF
GO:0010200 Response to chitin 5 1.40e−11 1.04e−09 Defence against
chitin bearing pathogens such as fungi, exoskeletons of animals and
nematodes. Biotic Biotrophics ✓
GO:0000165 MAPK cascade 9 7.69e−12 5.69e−10 The process involves
programmed cell death as a way of responding to biotic and abiotic
factors. Biotic Biotrophics ✓
GO:0006612 Protein targeting to membrane 8 0.0 0.0 Directs the proteins
to the cell membrane for further transportation and communication of
chemicals. Biotic & abiotic - ✗
GO:0001666 Response to hypoxia 5 8.12e−09 6.01e−07 The response is
triggered when the oxygen goes below, 20.8−20.95%. Abiotic - ✗
GO:0009611 Response to wounding 3 8.61e−06 6.37e−04 A biological
mechanism that takes place indicating the damage of an organism. Biotic
& abiotic Biotrophics ✓
GO:0009738 Abscisic acid (ABA)-activated signalling pathway 7 0.0 0.0
Controls environmental stress for example, drought and plant pathogens
[[215]115]. Abiotic & biotic Necrotrophics ✓
GO:0009409 Response to cold 4 0.0 0.0 Reacts to changes in low
temperatures. Abiotic - ✗
GO:0061025 Membrane fusion 4 0.0 0.0 Ensures life continuity in
organisms. Biotic & abiotic Necrotrophic & biotrophics ✓
GO:0009626 Plant-type hyper-sensitive response 6 5.29e−12 3.91e−10
Killing of cells surrounding the infected area (programmed cell death)
to starve pathogens like biotrophics Biotic Biotrophics ✓
GO:0010310 Regulation of hydrogen peroxide metabolic process 6 3.51e−11
2.60e−09 Controls plant stress and programmed cell death. It is also a
go-between the wounding responses and biotic interactions. Biotic &
abiotic Necrotrophics ✓
GO:0009697 Salicylic Acid biosynthetic process 9 2.72e−11 2.01e−09
Forms salicylic acid which is a fungicide present in some plants.
Biotic Biotrophics ✓
GO:0002679 Respiratory burst involved in defense response 4 2.86e−11
2.12e−09 Ensures life continuity in plants when the defence response
consumption rate increases for instance, the presence of phagocytic
leukocytes. Biotic Necrotrophics & biotrophics ✓
GO:0035304 Regulation of protein dephosphorylation 8 0.0 0.0 Controls
the removal rate of phosphorous, suppresses pathogenesis related
expression and promotes programmed cell death. Biotic Biotrophics ✓
GO:0031348 Negative regulation of defence response 6 0.0 0.0 Controls
the frequency of the defence response. Biotic Necrotrophics ✓
GO:0010363 Regulation of plant-type hyper-sensitive response 7 1.01e−11
7.51e−10 Balances the hyper-sensitive response. Biotic Biotrophs ✓
GO:0009862 Systemic acquired resistance, salicylic acid mediated
signalling pathway 8 4.99e−12 3.70e−10 Salicylic acid plays a vital
role in the expression of pathogenesis related proteins. Biotic
Biotrophics ✓
GO:0043069 Negative regulation of programmed cell death 6 0.0 0.0 Stops
the occurrence of programmed cell death. Biotic Necrotrophics ✓
GO:0050832 Defence response to fungus 6 2.41e−11 1.78e−09 Protects the
cell plants from fungi. Biotic Necrotrophics & biotrophics ✓
GO:0009410 Response to xenobiotic stimulus 3 4.06e−06 3.00e−04 Takes
place when the system identifies foreign compounds in the organism.
Biotic Necrotrophics & biotrophics ✓
GO:0010112 Regulation of systemic acquired resistance 7 1.70e−04 0.0126
Balancing of systemic acquired resistance. Biotic Biotrophics ✓
GO:0016045 Detection of bacterium 6 4.11e−11 3.04e−09 Converts the
signal of presence of bacteria into a molecular signal. Biotic
Necrotrophics & biotrophic ✓
[216]Open in a new tab
For each process identified, following information is provided: Level
(L) in the GO directed acyclic graph structure, p-value and adjusted
p-value (ap-value), functional activities triggered, nature of stress
thwarted, type of pathogens suppressed (Tps) and whether it contributes
to response to fungi (RF) or not
NPR1-associated plant defence proteins share six processes, namely MAPK
cascade (GO:0000165), SAR, SA mediated signalling pathway (GO:0009862),
Regulation of plant-type hyper-sensitive response (GO:0010363), Protein
targeting to membrane(GO:0006612), Defence response to fungus
(GO:0050832) and Negative regulation of defence response (GO:0031348).
The number of processes in which each protein is contributing is shown
in Fig. [217]4 [218]b and [219]a, highlighting different
protein-process associations. Interestingly, NPR1 is involved in all 21
enriched processes identified, suggesting that NPR1 is a potential
regulator of the plant ubiquitin proteolytic system during infection,
controlling associated protein expression levels and functioning at the
cross-roads of several signalling pathways, as well as modulating
antagonistic cross-talk between different signalling pathways [[220]83,
[221]97] critical to the plant defence-mediated response. This suggests
that the success of the plant defence mechanism is a result of these
different concerted and controlled signal transduction events that
sequentially occur in the plant system, leading to an increased
resistance to diverse attacks. As a consequence, the failure of these
processes to function properly may yield increased susceptibility to
the pathogen, which possibly negotiates its entry into the cell by
activating processes to thwart plant defence mechanisms. Thus, we built
a potential enriched process-based regulatory network using the
following binary relation rule: Given two processes s and t, the
process t is triggered after the process s, which is indicated in the
regulatory network by an arrow from s to t, if A [t] ⊂ A [s] and
[MATH:
As⊇P. :MATH]
Under the assumption that a process is triggered after another as a
result of specific degradation of a single or a subset of proteins, and
where A [x] is the set of proteins participating to the process x and
[MATH: P :MATH]
is the set of proteins in at least one of the enriched pathways
identified.
Fig. 4.
Fig. 4
[222]Open in a new tab
Summary results of the biological mechanism of the NPR1-based defence
response. a plotted using the PINV tool [[223]113] is a protein-process
map showing NPR1-associated plant defence proteins and enriched
processes in which they are involved, and the number of enriched
processes in which each protein is involved is shown in (b). Note that
processes in which all proteins are involved are not displayed in (a)
to avoid that high number of links limits visibility. c plotted using
the Cytoscape tool [[224]114] displays NPR1-based process regulatory
network, showing possible process occurrence sequences and (d) shows
the number of proteins involved in or frequency of each enriched
process. The suffix +, ∗, ×, or − were added to enriched process
according to the fact all NPR1-associated plant defence proteins are
involved in the process (+), or set of proteins involved in the process
contains all NPR1-associated plant defence proteins are implicated in
the Plant-pathogen interaction (×) or Plant hormone signal transduction
(−) or both (∗) pathways
Figures [225]4 [226]c and [227]d show a potential enriched
process-based regulatory network constructed and process frequencies
(number of proteins annotated by each process), respectively. This
regulatory network reveals, for example, that there are only five
possible processes that may be triggered following the occurrence of
the negative regulation of PCD (GO:0043069), namely: Response to
xenobiotic stimulus (GO:0009410), Membrane fusion (GO:0061025),
Plant-type hyper-sensitive response (HR) (GO:0009626), Abscisic acid
(ABA)-activated signalling pathway (GO:0009738) and Response to cold
(GO:0009409) processes. Most of these processes are expected to occur
after the occurrence of the negative regulation of PCD, which could be
triggered after the occurrence of processes, such as Negative
regulation of defence response (GO:0031348). This is consistent as the
system may trigger mechanisms susceptible to promote or normalize the
plant defence response by coordinating different actions in order to
offer appropriate response to and to contain the infection. For
example, HR and ABA-activated signalling pathway may relaunch new PCD
by killing the infected cells at the point of invasion [[228]2, [229]6,
[230]7, [231]111]. This regulatory network also reveals that the
occurrence of the negative regulation of PCD can result from the
dysfunction of critical processes, such as MAPK cascade (GO:0000165),
Protein targeting to membrane (GO:0006612) and regulation to hydrogen
peroxide (GO:0010310), which are vital for the regulation of a wide
variety cellular activities, including transportation and communication
of chemical materials, proliferation, differentiation, apoptosis and
stress response [[232]112].
Conclusions
This study proposes a novel bioinformatics protocol that can be readily
applied to plant phenotype based gene knock-out or -down experimental
dataset to assess the contribution of the gene under consideration to
plant-defence mechanisms and to build a potential enriched biological
process-based regulatory network using publicly available biological
data. This protocol enables the extraction of the specific
(differentially infected) samples of the experimental dataset to
eliminate potential artefacts, such as uncertainty of the data set and
potential noise inherent in the experiment, which could mask downstream
analysis. We applied the protocol to the phenotype dataset expressed in
terms of leaf spore counts of 60 npr1 gene mutant and wild-type
Arabidopsis thaliana plants following Pst-DC3000 infection. Results
obtained still showed that NPR1 plays an important role during plant
pathogen response by suppressing the growth of the pathogen. More
specifically, we found that differences in the plant susceptibility to
the Pst-DC3000 infection depend on whether the NPR1 protein is
functionally active or not with about 99% of variability in Pseudomonas
spore growth between npr1 mutant and Wt plant samples. The increased
disease susceptibility phenotypes observed in npr1 mutant plants
following to Pst-DC3000 infection may be due to the fact that the SAR
signalling pathway, which is essential for plant defence response, is
blocked in these plants as a result of knocking out or down the protein
NPR1, leading to the attenuation of PR gene expressions. This SAR
signal requires a functionally active NPR1, which is a key regulatory
protein of SAR, acting as a controller and modulator of PR gene
expressions.
With the increasing number of publicly available biological data which
already demonstrates the complexity of the plant defence network, new
computational approaches are essential to provide an easy-to-analyse
picture which can lead to the identification of gaps and opportunities
for the development of future ‘wet or dry lab’ experiments. The
proposed protocol uses the GO-universal metric based Gene Ontology
semantic similarity model to identify putative proteins collaborating
with NPR1 in regulating plant defence mediated response and retrieve
enriched biological processes involved. We identified 26
NPR1-associated plant defence proteins and 21 highly specific
processes, which related to the major forms of defence processes
reported (innate response, HR, SAR, etc.), and construct the potential
process-based regulatory network, predicting occurrence sequences of
different processes. This NPR1-based enriched process regulatory
network (Fig. [233]4 [234]c) can effectively reveal eventual sequences
of biological processes occurring during the plant defence response. As
illustration, it suggests that the suppression of the SAR signalling
pathway (GO:0009862) may thwart processes that enable the detection of
the pathogen (GO:0016045), control plant stress, promote PCD and
attenuate PR gene expressions (e.g., GO:0010310, GO:0035304) to favor
processes leading to the negative regulation of PCD (GO:0043069) in
order to trigger processes that possibly ensure pathogen life
continuity within plant cells. Different results obtained demonstrate
that this novel protocol is effective for assessing plant-defence genes
and may help better understand the plant defence mechanisms, an
important aspect in the biology of plants.
Additional files
[235]Additional file 1^ (10KB, xls)
Experimental data from Arabidopsis wild type and npr1 mutant plants.
Leaf bacteria spore count (phenotype) dataset of 31 Arabidopsis wild
type (control) and 29 npr1 mutant (case) plants 48 h post Pst-DC3000
infection. (XLS 10 kb)
[236]Additional file 2^ (536.2KB, pdf)
Checking normality assumption using Q-Q plots. Checking normality
assumption for initial and transformed differentially infected wild and
npr1 mutant plant spore count datasets using Q-Q plots. (PDF 536 kb)
Acknowledgements