Abstract
Background
The fertile and sterile plants were derived from the self-pollinated
offspring of the F[1] hybrid between the novel restorer line NR1 and
the Nsa CMS line in Brassica napus. To elucidate gene expression and
regulation caused by the A and C subgenomes of B. napus, as well as the
alien chromosome and cytoplasm from Sinapis arvensis during the
development of young floral buds, we performed a genome-wide
high-throughput transcriptomic sequencing for young floral buds of
sterile and fertile plants.
Results
In this study, equal amounts of total RNAs taken from young floral buds
of sterile and fertile plants were sequenced using the Illumina/Solexa
platform. After filtered out low quality data, a total of 2,760,574 and
2,714,441 clean tags were remained in the two libraries, from which
242,163 (Ste) and 253,507 (Fer) distinct tags were obtained. All
distinct sequencing tags were annotated using all possible CATG+17-nt
sequences of the genome and transcriptome of Brassica rapa and those of
Brassica oleracea as the reference sequences, respectively. In total,
3231 genes of B. rapa and 3371 genes of B. oleracea were detected with
significant differential expression levels. GO and pathway-based
analyses were performed to determine and further to understand the
biological functions of those differentially expressed genes (DEGs). In
addition, there were 1089 specially expressed unknown tags in Fer,
which were neither mapped to B. oleracea nor to B. rapa, and these
unique tags were presumed to arise basically from the added alien
chromosome of S. arvensis. Fifteen genes were randomly selected and
their expression levels were confirmed by quantitative RT-PCR, and
fourteen of them showed consistent expression patterns with the digital
gene expression (DGE) data.
Conclusions
A number of genes were differentially expressed between the young
floral buds of sterile and fertile plants. Some of these genes may be
candidates for future research on CMS in Nsa line, fertility
restoration and improved agronomic traits in NR1 line. Further study of
the unknown tags which were specifically expressed in Fer will help to
explore desirable agronomic traits from wild species.
Background
The novel allo-cytoplasmic male sterility (CMS) system, Nsa CMS line,
and the corresponding restorer system, NR1 line, have been successfully
developed from somatic hybrids between Brassica napus (oilseed rape)
and its wild relative Sinapis arvensis (Yeyou 18, Xinjiang wild mustard
from northwestern China) by fusing mesophyll protoplasts [[40]1,[41]2].
Yeyou 18, a Chinese wild population cataloged into S. arvensis based on
genetic analyses [[42]3], possesses valuable agricultural traits such
as enhanced resistance to Sclerotinia sclerotiorum, Leptosphaeria
maculans and insects, greater tolerance to low temperatures and
drought, low contents of erucic acid and glucosinolates [[43]4], as
well as a low incidence of pod shattering [[44]5,[45]6]. The Nsa CMS
line contains the S. arvensis cytoplasm, which is essentially different
from other rapeseed CMS systems such as ogu[[46]7], nap[[47]8],
pol[[48]9], tour[[49]10] and hau[[50]11], based on their origins and
molecular characterization [[51]12]. And the cytoplasmic sterile gene
in the Nsa CMS line was likely derived from the S. arvensis parent
[[52]13]. Furthermore, the Nsa CMS line is more stable to temperature
changes compared to pol and nap[[53]2]. NR1, as a B. napusS. arvensis
disomic alien addition line, carries one pair of homologous chromosomes
from S. arvensis and 19 chromosome pairs from B. napus, and displays
important agricultural characters which arise from the alien
chromosomes, such as fertility restoration ability to Nsa CMS line, low
erucic acid and low glucosinolate contents, S. sclerotiorum resistance
and pod shattering resistance [[54]14].
Fertile and sterile plants were derived from the self-pollinated
offspring of the F[1] hybrid between the novel restorer line NR1 and
the Nsa CMS line. Because NR1 contains one S. arvensis homologous
chromosome pair, on which the restorer genes reside, F[1] hybrids from
NR1 crossed to Nsa CMS line are monosomic [[55]14]. The fertility
segregation was observed in self-pollinated plants of F[1] hybrid
because of the loss of added chromosome, producing fertile and sterile
plants, which possess the identical cytoplasmic genetic background
arising from Nsa CMS line and similar nuclear genetic background
arising from B. napus, except one or two members of the added S.
arvensis alien chromosome pair in fertile plants. Floral morphology of
fertile plants are normal, whereas sterile plants have stamens reduced
in size, abnormal anthers and no pollen produced.
To elucidate gene expression and regulation caused by the A and C
subgenomes, alien chromosome and cytoplasm from S. arvensis during the
development of young floral bud, especially stamens, we performed a
genome-wide high-throughput transcriptomic sequencing for young floral
buds of sterile and fertile plants. The transcriptome is the complete
set and quantity of transcripts in a cell at a specific developmental
stage or under a physiological condition, providing information on gene
expression, gene regulation, and amino acid content of proteins
[[56]15]. Therefore, transcriptome analysis is essential to interpret
the functional elements of the genome and reveals the molecular
constituents of cells and tissues. Because of the deep coverage and
single base-pair resolution provided by the next-generation sequencing
instrument, digital gene-expression (DGE), driven by Solexa/Illumina
technology, is an efficient method to analyze transcriptome data. Base
on genome-wide expression profiles by sequencing, DGE is able to
identify, quantify, and annotate expressed genes on the whole genome
level without prior sequence knowledge, opening doors to higher
confidence target discovery and pathway studies. This technique has
also been widely used in plant research. DGE analysis using Solexa
sequencing was performed to identify candidate genes encoding enzymes
responsible for the Siraitia grosvenorii triterpene biosynthesis
[[57]16]. High-throughput tag-sequencing analysis based on the Solexa
Genome Analyzer platform was applied to analyze the gene expression
profiling of cucumber plant and revealed the comprehensive mechanisms
of waterlogging-responsive transcription [[58]17]. Using the Solexa
sequencing system, the transcriptomes were compared between seedlings
of two soybean varieties to find genes associated with nitrogen use
efficiency [[59]18]. Early developing cotton fiber was analyzed by
deep-sequencing, and differential expression patterns of genes in a
fuzzless/lintless mutant were revealed [[60]19]. DGE signatures were
also used to study maize development, and the results from that study
provided a basis for the analysis of short-read expression data and
resolved specific expression signatures that will help define
mechanisms of action of the maize RA3 gene [[61]20]. In addition,
Solexa / Illumina technology was used to analyze gene expression during
female flower development [[62]21] and gene expression of Sinapis alba
leaves under drought stress and rewatering growth conditions [[63]22].
Overall, the DGE approach has provided more valuable tools for
qualitative and quantitative gene expression analysis than the previous
micro array-based assays.
Currently, no progress has been made in identifying the genes that
specify CMS and fertility restoration in Nsa CMS and NR1 system, and
little is understood about how expression of these genes influences the
physiological, developmental or biochemical processes such that pollen
formation is disrupted. In this study, the transcriptomes of young
floral buds were compared between fertile and sterile plants using the
Solexa sequencing system. By investigating the expression of genes
related to young floral bud development, a number of candidate genes
that are important in this process were identified. Here, to our
knowledge, this study is the first to characterize genome-wide
comparative analysis of gene expression in young floral buds between
fertile and sterile plants. And these sequencing datasets will serve as
a valuable resource for novel gene discovery from Yeyou 18, the Chinese
wild population of S. arvensis, laying the foundation for elucidating
the mechanisms of cytoplasmic male sterility, fertility restoration and
improved resistance.
Results
Selection of fertile and sterile plants
To obtain the plant materials with the closest genetic background,
fertile and sterile plants were made from the self-pollinated offspring
of the F[1] hybrid between novel restorer line NR1 and Nsa CMS line.
The cytoplasmic genetic background of fertile plants is the same as
that of sterile plants, arising from Nsa CMS line. Both the fertile and
sterile plants had the complete set of chromosomes from B. napus,
however, there was one or two members of the added S. arvensis alien
chromosome pair in the fertile plants. And thus there were prominently
morphological differences appearing at the early anther stage between
fertile and sterile buds which were less than 2 mm in diameter
(Figure [64]1A, B). Anthers of fertile bud were larger and fuller than
those of sterile bud (Figure [65]1B). The corresponding cross sections
of anthers at this stage showed that in contrast to this early stage in
fertile bud, during which four laterally symmetrical locules developed,
with two identical adaxial locules and two identical abaxial locules,
in sterile bud the loss of locules per anther was evident
(Figure [66]1C, D). At the late anther stage during which mature
pollens were released, floral morphology and architecture of fertile
plants were normal, whereas sterile plants had defective male flower
organs (stamens reduced in size, abnormal anthers and no pollen
produced), and had otherwise normal flowers (Figure [67]1E, F).
Figure 1.
Figure 1
[68]Open in a new tab
Comparation of floral morphology between sterile and fertile plants.A,
young buds of sterile and fertile plants (left, fertile; right,
sterile); B, Anthers corresponding to young buds (left, sterile; right,
fertile); C, the cross section of fertile anthers; D, the cross section
of sterile anthers; E, mature flowers of sterile and fertile plants
(left, fertile; right, sterile); F, Anthers corresponding to mature
flowers (left, fertile; right, sterile). Scale bars = 50 μm.
CMS-related abnormalities may involve different stages during the
formation of male organs and pollen. In many CMS types, the anther
development or pollen maturation are impaired. The defect can appear
before, during or after meiosis of pollen mother cells resulting in
gametophytic or sporophytic CMS types [[69]23]. Although the morphology
of normal anther and flower development had been characterized in this
study, no detailed morphological comparison of the development of
fertile versus sterile anthers was available. To address this, serial
sections of fertile and CMS floral buds, representing different
development stages of anther, would be compared in further work.
Solexa sequencing
Table [70]1 showed that the result of Solexa sequencing and the
distribution of distinct clean tags. A total of 4,415,866 and 4,244,140
raw tags were sequenced in sterile plants (Ste) and fertile plants
(Fer) libraries, respectively. After filtering out low quality data
(tags containing unknown base N and only adaptor tags), 4,199,934 and
4,103,291 tags (designated herein as “clean” tags) remained in Ste and
Fer libraries, respectively. To increase the robustness of the
approach, single-copy tags in the two libraries (1,439,360 in Ste and
1,388,850 in Fer library) were excluded from further analysis. As a
result, a total of 2,760,574 and 2,714,441 clean tags remained from the
two libraries, from which 242,163 (Ste) and 253,507 (Fer) distinct tags
were obtained. The Fer library had higher number of distinct tags, and
showed higher ratio of number of distinct clean tags to total clean
tags, and lower percentage of distinct high copy number tags than Ste
library. These data suggested that more genes were detected in the Fer
library than in the Ste library, and more transcripts were expressed at
lower levels in Fer library. There were 11,344 more distinct tags in
the Fer than in the Ste library, possibly representing genes related to
floral stamen development. The percentage of distinct tags rapidly
declined as copy number increased, indicating only a small portion of
the transcripts were expressed at high level in the conditions tested.
The frequency of these distinct tags was shown in Table [71]1, which
listed the copy numbers in the range 2–100 or higher, in which the
majority of distinct tags (more than 80% from each) were present at low
copy numbers (< 10 copies), and less than 20% tags from each library
were counted between 11 and 100 times. Only approximately 1.3% tags
were detected more than 100 times. The data used in this publication
have been deposited in NCBI's Gene Expression Omnibus [[72]24] and are
accessible through GEO Series accession number [73]GSE42513
([74]http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE42513).
Table 1.
Solexa sequencing tags and the distribution of distinct clean tags in
Ste and Fer libraries
Ste Fer
total raw data
__________________________________________________________________
4415866
__________________________________________________________________
4244140
__________________________________________________________________
tags containing N
__________________________________________________________________
17065
__________________________________________________________________
17032
__________________________________________________________________
adaptors
__________________________________________________________________
198867
__________________________________________________________________
123817
__________________________________________________________________
tag copy number =1
__________________________________________________________________
1439360
__________________________________________________________________
1388850
__________________________________________________________________
total clean tag
__________________________________________________________________
2760574
__________________________________________________________________
2714441
__________________________________________________________________
total distinct tag
__________________________________________________________________
242163
__________________________________________________________________
253507
__________________________________________________________________
2–5
__________________________________________________________________
162169
__________________________________________________________________
170723
__________________________________________________________________
6–10
__________________________________________________________________
32958
__________________________________________________________________
35528
__________________________________________________________________
11–20
__________________________________________________________________
22111
__________________________________________________________________
23436
__________________________________________________________________
21–50
__________________________________________________________________
16065
__________________________________________________________________
15826
__________________________________________________________________
51–100
__________________________________________________________________
5383
__________________________________________________________________
4848
__________________________________________________________________
>100 3477 3146
[75]Open in a new tab
Annotation analysis of the distinct tag
The full genome sequence for B. napus is not available. As described by
‘U’s triangle’, the three diploid species B. rapa (AA genome), Brassica
nigra (BB genome) and B. oleracea (CC genome) formed the amphidiploid
species B. juncea (AABB genome), B. napus (AACC genome) and B. carinata
(BBCC genome) by hybridization [[76]25]. In addition, there was a close
relationship between the S. arvensis and B. nigra[[77]26]. Therefore,
to identify the genes corresponding to the distinct tags in each
library, all distinct sequencing tags were annotated using all possible
CATG+17-nt sequences of the genome and transcriptome of B. rapa[[78]27]
and those of B. oleracea (data not shown, the B. oleracea Genome
Project headed by the Oil Crops Research Institute of the Chinese
Academy of Agricultural Sciences) as the reference sequences,
respectively, allowing only a 1-bp mismatch (Table [79]2). Altogether,
45, 557 genes (99.56%) and 40, 908 ones (99.35%) had the CATG sites,
resulting in a total number of 291, 392 and 282, 216 unambiguous
reference tags, respectively, for B. oleracea and B. rapa. By assigning
the experimental distinct Solexa tags to the virtual reference ones, we
observed that 67, 633 (27.93%) and 72, 088 (28.44%) tags were perfectly
matched to the B. oleracea reference genes, 60, 153 (24.84%) and 64,
735 (25.54%) tags were perfectly matched to the B. rapa reference
genes, respectively, in Ste and Fer libraries. Out of the tags matched
to reference genes, approximately 10% were mapped to multiple
locations, including low complexity tags with poly(A) tails and tags
which derived from repetitive sequences and were mapped to highly
conserved domains shared by different genes. In addition, approximately
3% tags in two libraries were mapped to the antisense strands,
demonstrating that those regions might be bidirectionally transcribed.
The tags unmatched to the reference genes were then blasted against the
genome of B. rapa and B. oleracea, respectively, and approximately 10%
tags were matched to the genomic sequences in the two libraries. These
might represent non-annotated genes or noncoding transcripts that
derived from intergenic regions. As a result of the significant
sequencing depth of Solexa technology, incomplete annotation of the
genome of B. rapa or B. oleracea and one added S. arvenis chromosome,
however, about 25% unmatched tags in each library were observed.
Table 2.
Annotation of distinct Solexa tags against reference sequences ofB.
rapa andB. oleracea, respectively
__________________________________________________________________
Ste
__________________________________________________________________
Fer
__________________________________________________________________
Reference sequences B.oleracea B.rapa B.oleracea B.rapa
Clean distinct tag
__________________________________________________________________
242163
__________________________________________________________________
__________________________________________________________________
242163
__________________________________________________________________
__________________________________________________________________
253507
__________________________________________________________________
__________________________________________________________________
253507
__________________________________________________________________
__________________________________________________________________
Perfect match(Sense)
__________________________________________________________________
1 tag->1 gene
__________________________________________________________________
67633
__________________________________________________________________
27.93%
__________________________________________________________________
60153
__________________________________________________________________
24.84%
__________________________________________________________________
72088
__________________________________________________________________
28.44%
__________________________________________________________________
64735
__________________________________________________________________
25.54%
__________________________________________________________________
1 tag->n gene
__________________________________________________________________
9238
__________________________________________________________________
3.81%
__________________________________________________________________
7449
__________________________________________________________________
3.08%
__________________________________________________________________
9908
__________________________________________________________________
3.91%
__________________________________________________________________
7975
__________________________________________________________________
3.15%
__________________________________________________________________
1 bp Mismatch(Sense)
__________________________________________________________________
1 tag->1 gene
__________________________________________________________________
58766
__________________________________________________________________
24.27%
__________________________________________________________________
55215
__________________________________________________________________
22.80%
__________________________________________________________________
58677
__________________________________________________________________
23.15%
__________________________________________________________________
55444
__________________________________________________________________
21.87%
__________________________________________________________________
1 tag->n gene
__________________________________________________________________
18495
__________________________________________________________________
7.64%
__________________________________________________________________
16709
__________________________________________________________________
6.90%
__________________________________________________________________
17955
__________________________________________________________________
7.08%
__________________________________________________________________
16434
__________________________________________________________________
6.48%
__________________________________________________________________
Perfect match(Antisense)
__________________________________________________________________
1 tag->1 gene
__________________________________________________________________
4560
__________________________________________________________________
1.88%
__________________________________________________________________
4714
__________________________________________________________________
1.95%
__________________________________________________________________
4400
__________________________________________________________________
1.74%
__________________________________________________________________
4581
__________________________________________________________________
1.81%
__________________________________________________________________
1 tag->n gene
__________________________________________________________________
935
__________________________________________________________________
0.39%
__________________________________________________________________
809
__________________________________________________________________
0.33%
__________________________________________________________________
839
__________________________________________________________________
0.33%
__________________________________________________________________
728
__________________________________________________________________
0.29%
__________________________________________________________________
1 bp Mismatch(Antisense)
__________________________________________________________________
1 tag->1 gene
__________________________________________________________________
1770
__________________________________________________________________
0.73%
__________________________________________________________________
2395
__________________________________________________________________
0.99%
__________________________________________________________________
1809
__________________________________________________________________
0.71%
__________________________________________________________________
2530
__________________________________________________________________
1.00%
__________________________________________________________________
1 tag->n gene
__________________________________________________________________
342
__________________________________________________________________
0.14%
__________________________________________________________________
323
__________________________________________________________________
0.13%
__________________________________________________________________
368
__________________________________________________________________
0.15%
__________________________________________________________________
290
__________________________________________________________________
0.11%
__________________________________________________________________
Mapping to genome
__________________________________________________________________
Perfect match
__________________________________________________________________
1 tag->1 position
__________________________________________________________________
9375
__________________________________________________________________
3.87%
__________________________________________________________________
10515
__________________________________________________________________
4.34%
__________________________________________________________________
9904
__________________________________________________________________
3.91%
__________________________________________________________________
11025
__________________________________________________________________
4.35%
__________________________________________________________________
1 tag->n position
__________________________________________________________________
933
__________________________________________________________________
0.39%
__________________________________________________________________
512
__________________________________________________________________
0.21%
__________________________________________________________________
1045
__________________________________________________________________
0.41%
__________________________________________________________________
582
__________________________________________________________________
0.23%
__________________________________________________________________
1 bp Mismatch
__________________________________________________________________
1 tag->1 position
__________________________________________________________________
10448
__________________________________________________________________
4.31%
__________________________________________________________________
13951
__________________________________________________________________
5.76%
__________________________________________________________________
11326
__________________________________________________________________
4.47%
__________________________________________________________________
14545
__________________________________________________________________
5.74%
__________________________________________________________________
1 tag->n position
__________________________________________________________________
2140
__________________________________________________________________
0.88%
__________________________________________________________________
1560
__________________________________________________________________
0.64%
__________________________________________________________________
2327
__________________________________________________________________
0.92%
__________________________________________________________________
1817
__________________________________________________________________
0.72%
__________________________________________________________________
Reference genes
__________________________________________________________________
45758
__________________________________________________________________
__________________________________________________________________
41174
__________________________________________________________________
__________________________________________________________________
45758
__________________________________________________________________
__________________________________________________________________
41174
__________________________________________________________________
__________________________________________________________________
Unambiguously matched genes
__________________________________________________________________
23909
__________________________________________________________________
52.25%
__________________________________________________________________
23206
__________________________________________________________________
56.36%
__________________________________________________________________
25495
__________________________________________________________________
55.72%
__________________________________________________________________
24819
__________________________________________________________________
60.28%
__________________________________________________________________
Unknown distinct tag 57528 23.76% 67858 28.02% 62861 24.80% 72821
28.73%
[80]Open in a new tab
To estimate whether or not the sequencing depth was sufficient for the
transcriptome coverage, the sequencing saturation was analyzed in two
libraries. The genes that were mapped by all clean tags and unambiguous
clean tags increased with the total number of tags. However, when the
sequencing counts reached 2 million tags or higher, the number of
detected genes was saturated (Additional file [81]1: Figure S1).
Differential gene expression between the Ste and Fer libraries
Based on “The significance of digital gene expression profiles”
[[82]28], a rigorous algorithm was developed to identify the
differentially expressed genes (DEGs) in the two samples. The
expression abundance of tag mapped genes in the data sets was analyzed
by counting the number of transcripts per million (TPM) clean tags.
First, the read density measurement was normalized as described in
detail by Benjamini and Yekutieli [[83]29]. FDR ≤ 0.001 and the
absolute value of |log2Ratio|≥ 1 were as thresholds to judge the
significance of differences in transcript abundance. Differences of tag
frequencies that appeared in the Ste and Fer libraries were used for
estimating gene expression levels. The distribution of fold-changes in
tag number between the two libraries was shown in Figure [84]2A. The
great majority of tags were observed at similar levels, showing a <
5-fold difference in the two libraries. Tags with five folds or greater
differences in accumulation accounted for 2.31%, including 2.07% tags
which increased by at least five folds and 0.24% tags which were
decreased by at least five folds in the Fer library (Figure [85]2A).
The transcripts detected with at least two-fold differences in the two
libraries were shown in Figure [86]2B (FDR <0.001). The red dots and
green dots represented transcripts higher or lower in abundance for
more than two folds in Fer library, respectively. The blue dots
represented transcripts that differed less than two folds between the
two libraries, which were arbitrarily designated as “no difference in
expression” (Figure [87]2B). By comparing our two Solexa libraries, a
great number of differentially expressed transcripts were identified.
And based on above a rigorous algorithm, there were 3231 genes of B.
rapa (Additional file [88]2) and 3371 genes of B. oleracea (Additional
file [89]3) which were detected with significant differential
expression levels. These included both up-regulated and down-regulated
genes in Fer. Of the 1855 up-regulated B. rapa genes, 760 genes were
uniquely expressed in Fer, and 44 of the 1376 down-regulated genes were
only expressed in Ste, whereas there were 717 genes of the 1886
up-regulated B. oleracea genes, which were specially expressed in Fer,
and 54 of the 1485 down-regulated genes were only expressed in Ste
(Figure [90]3). In addition, there were 1089 specially expressed tags
in Fer, which were neither mapped to B. oleracea nor mapped to B. rapa,
and these unique tags were presumed to localize on the alien chromosome
(Additional file [91]4).
Figure 2.
Figure 2
[92]Open in a new tab
Differentially expressed tags and corresponding genes in Ste and Fer.A,
Differentially expressed tags in Ste and Fer. The “x” axis represents
fold-change of differentially expressed unique tags in two libraries.
The “y” axis represents the number of unique tags (log10).
Differentially accumulating unique tags with a 5-fold difference
between libraries are shown in the red region (97.66%). The blue
(0.24%) and green (2.07%) regions represent unique tags that are up-
and down-regulated for more than 5 folds in the Fer and Ste libraries,
respectively. B, Comparision of gene expression levels between the two
libraries. For comparing gene expression levels between the two
libraries, each library was normalized to 1 million tags. Red dots
represent transcripts more prevalent in Fer library, green dots show
those present at a lower frequency in Fer and blue dots indicate
transcripts that did not change significantly. The parameters “FDR
<0.001” and “log2 Ratio ≥ 1” were used as the threshold to judge the
significance of gene expression difference.
Figure 3.
Figure 3
[93]Open in a new tab
Changes in gene expression in Fer and Ste libraries. Numbers of
up-regulated, down-regulated, specific to Fer and specific to Ste genes
were summarized.
Gene Ontology functional analysis of DEGs
Gene Ontology (GO) is an international standardized gene functional
classification system that describes properties of genes and their
products in any organism, containing three ontologies: molecular
function, cellular component and biological process. A total of 3,231
DEGs of B. rapa and 3371 DEGs of B. oleracea that could be categorized
into 70 categories were found (Figure [94]4). These DEGs were involved
in biochemistry, metabolism, growth, development, and apoptosis. Among
the biological process category, metabolic process (about 45%) was the
most dominant group, followed by cellular process (about 41%), response
to stimulus (about 23%), and developmental process (about 12%). There
were two special processes, including cell killing which was unique to
B. rapa, and locomotion which was unique to B. oleracea. Regarding
molecular functions, 43.3% of the B. rapa unigenes and 42% of the B.
oleracea unigenes were assigned to binding, followed by catalytic
activity (42.1% for B. rapa, 41.3% for B. oleracea), transporter
activity (5.3% for both), and structural molecule activity (2.1% for B.
rapa, 2.3% for B. oleracea). And metallochaperone function was unique
to B. rapa. In the cellular compo-nent category, cell and cell part
(62.4% for both) were the dominant groups, followed by intracellular
(45.9% for B. rapa, 46.7% for B. oleracea) and intracellular part
(45.8% for B. rapa, 46.5% for B. oleracea).
Figure 4.
[95]Figure 4
[96]Open in a new tab
Histogram showing Gene Ontology functional analysis of DEGs. The
frequency of GO terms was analyzed using GO Slim Assignment. The y-axis
and x-axis indicate the names of clusters and the ratio of each
cluster, respectively.
To reveal significantly enriched GO terms in DEGs comparing to the
genome background, GO enrichment analysis of functional significance
was performed. The GO term with P ≤ 0.05 was defined as significantly
DEGs enriched GO term. This analysis allowed us to determine the major
biological functions, biological processes and cellular components with
which DEGs were associated. For molecular function, significantly
enriched GO terms of DEGs in B. rapa (Additional file [97]5) and B.
oleracea (Additional file [98]6) included transferase activity
(transferring acyl groups), aldehyde-lyase activity and oxidoreductase
activity, and zinc ion transmembrane transporter activity was specific
to B. rapa. For enriched biological processes, there were 19 GO terms
of DEGs from B. oleracea (Additional file [99]7) and 28 GO terms of
DEGs from B. rapa (Additional file [100]8), which participated in
morphogenesis (pollen wall assembly, pollen exine formation, cellular
component assembly involved in morphogenesis, anatomical structure
formation involved in morphogenesis), development (pollen development,
gametophyte development), metabolic process (small molecule metabolism,
fatty acid biosynthetic metabolism, alcohol metabolic metabolism), ion
transport (cadmium ion transport, di-, tri-valent inorganic cation
transport), lipid localization and response to stimulus (response to
metal ion, response to inorganic substance).
Pathway analysis for DEGs
Different genes cooperate to achieve their biological functions.
Pathway-based analysis helps to further understand the biological
functions of DEGs. Biological pathways, including metabolic pathways,
signal transduction pathways, protein transport and degradation
pathways, and genetic information processing pathways, were identified
by KEGG pathway analysis of the DEGs, using the major public
pathway-related database [[101]30]. Obtained pathways corresponding to
DEGs from B. oleracea and B. rapa were listed in Additional files
[102]9 and [103]10, respectively. Significantly enriched pathways that
up-regulated and down-regulated genes were involved in were identified
by the same statistical algorithms used in the GO analysis. In this
study, the enriched pathways were ‘metabolic pathways’ (1888 members
for B. rapa, 1,886 for B. oleracea) which were large complexes
comprising several metabolism patterns, such as ‘amino acid
metabolism’, ‘carbohydrate metabolism’, ‘nitrogen metabolism’ and
‘biosynthesis of other secondary metabolites’, ‘cellular process
pathways’ (79 members for B. oleracea, 22 for B. rapa), ‘genetic
information processing pathways’ (51 members for B. oleracea) and
‘organismal systems pathways’ (9 members for B. rapa) (Figure [104]5).
And genomic manipulation of these genes which were involved in pathways
might be important for elucidate the mechanisms of cytoplasmic male
sterility, fertility restoration and improved resistance.
Figure 5.
Figure 5
[105]Open in a new tab
The summary enriched pathways of DEGs in B. rapa and B. oleracea.
Expressed genes specific to sterile and fertile plants
Fertile plants carry one or two members of an added homologous
chromosome pair from S. arvenis besides the complete set of chromosomes
of B. napus compared with sterile plants. It was proposed that DEGs
specific to fertile plants comprised genes associated with floral
development and fertility restoration, whereas DEGs specific to sterile
plants may be involved in CMS. Using B. rapa as reference, a total of
760 DEGs were only expressed in fertile plants, and 44 DEGs were
specific to sterile plants, while using B. oleracea as reference, a
total of 717 DEGs were only expressed in fertile plants, and 54 DEGs
were specific to sterile plants (Figure [106]3). By comparing the GO
terms of these DEGs, for cellular component, GO terms which were
exclusive to fertile plants contained cell envelope, cell projection,
endomembrane system, extracellular region part, membrane-enclosed
lumen, non-membrane-bounded organelle, organelle lumen and organelle
membrane, whereas only cell fraction GO term was specific to sterile
plants. For molecular function, GO terms only belonging to fertile
plants were enzyme regulator, structural molecule, transcription
regulator and translation regulator. For biological process, four GO
terms were specific to fertile plants, including anatomical structure
formation, cellular component biogenesis, cellular component
organization and rhythmic process. No GO terms were specific to sterile
plants for molecular function and biological process. The above results
were shown in Figure [107]6.
Figure 6.
[108]Figure 6
[109]Open in a new tab
Histogram showing GO analysis of DEGs which specific to Fer or Ste in
B. rapa.
DEGs encoding transcription factors
Transcription factors are essential for the regulation of gene
expression. Changes in gene transcription are associated with changes
in expression of transcription factors. Our DGE results showed that
among the significantly DEGs, there were forty-five transcription
factors for B. rapa, including fifteen up-regulated and thirty
down-regulated genes (Additional file [110]2). Among up-regulated B.
rapa transcriptional factors, six ones were specific to Fer plants,
including calmodulin-binding protein, mutant TFIIF-alpha, TBP1 (TATA
binding protein 1), TBP1 (Telomere-binding protein 1, DNA binding
factor (AT-HSFB2B, ATHSFA2) (Additional file [111]2). For B. oleracea,
there were fifty-seven transcription factors whose expression were
dramatically altered, including thirty-nine down-regulated and eighteen
up-regulated genes among which six ones were only expressed in Fer. And
these transcription factors were ATNAC2, ATHSFA2, transcription factor
IID-1 and three unknown proteins (Additional file [112]2). Further
study would be needed to reveal the molecular basis of gene
expressional regulation.
Genes involved in carbon metabolism
Many genes involved in carbon metabolism were differentially expressed
between Fer and Ste. Altered expressions of numerous genes involved in
pentose phosphate pathway, starch and sucrose metabolism, fructose and
mannose metabolism and carbon fixation in photosynthetic organisms were
observed. For example, in B. rapa, twenty genes involved in pentose
phosphate pathway showed changed expression. Those genes with increased
transcript abundance in Fer encoded glucose-6-phosphate dehydrogenase
(Bra028474), three proteins transferring phosphorus-containing groups
(Bra025751, Bra025751 and Bra025751), transaldolase-like pro-tein
(Bra006197), ribose-phosphate pyrophosphokinase 1/phosphoribosyl
diphosphate synthetase 1 (PRSI) (Bra005344), members performing
aldehyde-lyase activity (Bra028543, Bra017233, Bra009517, Bra019639 and
Bra006960), and the TPMs for these transcripts were up-regulated by 1.0
to 2.1-fold. The eighty-four genes involved in starch and sucrose
metabolism were differentially expressed including two genes encoding
UDP-glucosyltransferase proteins (Bra009830, Bra015364) and six genes
encoding carbohydrate phosphatase proteins (Bra019370, Bra031879,
Bra015497, Bra022549, Bra038548, Bra028254). Their expressions were
increased by 1.4 to 10.2-fold in Fer. In fructose and mannose
metabolism, there were five down-regulated genes, including
6-phosphofructokinase protein family (Bra030394, Bra038519, Bra011089,
Bra012940) and phosphoglucomutase-like protein (Bra016184 ), and five
up-regulated gene including pfkB-type carbohydrate kinase family
protein (Bra014540, Bra007452, Bra029158, Bra028279) and
fructose-1,6-bisphosphatase (Bra014841).
DEGs involved in citrate cycle (TCA cycle) and oxidative phosphorylation
Mitochondria are the cellular site of numerous metabolic pathways
including the tricarboxylic acid cycle (TCA cycle), respiratory
electron transfer and ATP synthesis [[113]31-[114]33]. The
amplification, recombination and rearrangement of mitochondrial genomes
could cause dysfunction of mitochondria leading to CMS. Altered flower
phenotype or defects in pollen formation are presumed to be secondary
effects of the mitochondrial mutation, and the primary defect may be a
reduction in the efficiency of respiration or the impairment of other
mitochondrial function. Using B. rapa sequences as reference, altered
expressions of numerous genes involved in the citrate cycle and
oxidative phosphorylation were observed. For example, a total of 12
DEGs were involved in TCA cycle and 11 of which showed decreased
transcript abundance in Ste. Down-regulated genes in Ste encoded Class
II fumarase (Bra021413), succinate dehydrogenase (Bra033006), PLASTID
E2 SUBUNIT OF PYRUVATE DECARBOXYLASE (Bra017345, Bra028057, Bra025167),
citrate-synthase (Bra014513, Bra007417), dihydrolipoamide dehydrogenase
(Bra001645), IAA-conjugate-resistant 4 (Bra012418), and NADP-specific
isocitrate dehydrogenase (Bra004134, Bra038201), whereas malate
dehydrogenase 2 (Bra028624) was up-regulated. In addition, twenty DEGs
were involved in the oxidative phosphorylation pathway. Nine genes were
associated with NADH dehydrogenase, one gene was associated with
succinate dehydrogenase/fumarate redutase and ten genes were involved
in ATP synthase. Using B. oleracea sequences as reference, 11 DEGs were
involved in TCA cycle and 7 of which performed the same biological
function as DEGs from B. rapa in TCA cycle. According the above
results, we then provided an overview about CMS in sterile plants and
fertility restoration in fertile plants in Figure [115]7.
Figure 7.
Figure 7
[116]Open in a new tab
Overview of CMS in sterile plants and fertility restoration in fertile
plants. Expression of mitochondrial CMS genes led to dysfunction of
mitochondria. Products of nuclear restorer genes suppressed the effects
of CMS genes. The state of the mitochondria was signaled to nuclear
target genes and resulted in male sterile or fertile flower.
Genes involved in oxidoreductase activity
In enriched GO terms, oxidoreductase activity was the most enriched
term. In B. rapa, 311 DEGs were involved in oxidoreductase activity,
including 26 members acting on the aldehyde or oxo group of donors, NAD
or NADP as acceptor. For instance, fatty acyl-CoA reductase
(alcohol-forming)/ oxidoreductase (Bra034793, Bra038691), also named
MS2 (male sterility 2), acting on the CH-CH group of donors, NAD or
NADP as acceptor, was specific to Fer. Others were oxidoreductase,
acting on the CH-CH group of donors (Bra006625), cinnamoyl-CoA
reductase family (Bra021863, Bra035150), 3-chloroallyl aldehyde
dehydrogenase/ aldehyde dehydrogenase (Bra010553, Bra017753, Bra011674,
Bra018090, Bra024619 and Bra016330), NAD dependent
epimerase/dehydratase family (Bra024073), aldehyde oxidase (Bra002347),
NAD or NADH binding/catalytic/ glyceraldehyde-3-phosphate dehydrogenase
(Bra026068, Bra026904, Bra040146 and Bra019797), cytosolic factor
family protein/phosphoglyceride transfer family protein (Bra026185),
cobalamin-independent methionine synthase (Bra023645), and unnamed
oxidoreductase members (Bra032939, Bra011470, Bra011869, Bra005012,
Bra008743, Bra040146 and Bra023487).
Differential expressed PPR proteins
Proteins which encode pentatricopeptide repeat (PPR) motif are proposed
to function as site-specific, RNA-binding adaptor proteins that mediate
interactions between RNA substrates and the enzymes that act on them
[[117]34-[118]36]. PPR proteins have a central role in plant
mitochondrial biogenesis [[119]37]. Most cloned restorer genes encode
mitochondria-targeted PPR proteins [[120]38-[121]44]. In this study,
when used B. rapa as reference sequences, there were nine PPR proteins
showed differential expression, including two up-regulated PPR proteins
(Bra011721 and Bra017411) and seven down-regulated PPR proteins
(Bra033892, Bra008587, Bra025761, Bra036629, Bra031484, Bra033866 and
Bra017881) in Fer. These PPR proteins were good candidates for studying
the link between nuclear and mitochondrial gene expression.
Confirmation of tag-mapped genes by qRT-PCR
To confirm the reliability of Solexa/Illumina sequencing technology,
fifteen genes were randomly selected for quantitative RT-PCR assays.
The results showed that expressions of fourteen genes were consistent
between the qRT-PCR and the DGE analyses (Figure [122]8). And the
corresponding primers were listed in Table [123]3. For the gene which
showed the inconsistency between qPCR and DGE, it was likely
attributable to the fact that DGE was more sensitive in the detection
of low-abundant transcripts and small changes in gene expression than
qPCR.
Figure 8.
Figure 8
[124]Open in a new tab
Real-time PCR validations of tag-mapped genes. Relative level,
(2−^△CT△CT); TPM, log2 Ratio (Fer/Ste).
Table 3.
The corresponding primers of qRT-PCR
Primer sequences Amplified genes
ACTAATTTGCACCGTGTT (S)
__________________________________________________________________
Bra020899
__________________________________________________________________
GGTAGCGTTTCTGTAAGC (A)
__________________________________________________________________
GAAATCTTTGCTTATCCGCTAT (S)
__________________________________________________________________
Bra011351
__________________________________________________________________
AAACGCTGAGGCACCTTG (A)
__________________________________________________________________
GTCAGATATTTCGGAAGC (S)
__________________________________________________________________
Bra002935
__________________________________________________________________
CAAGAGTTGAGAACACCC (A)
__________________________________________________________________
GGCGATATTCTCAGTCCA (S)
__________________________________________________________________
Bra018342
__________________________________________________________________
CACCTCCAAGTGAAACCC (A)
__________________________________________________________________
GCAGGAGGTGCATTAGCC (S)
__________________________________________________________________
Bra040150
__________________________________________________________________
ACCCAGCAACAACGAAGC (A)
__________________________________________________________________
TGGCTGGGAACAGAACTT(S)
__________________________________________________________________
Bra012531
__________________________________________________________________
AGGCTTTCCAGACTACGG (A)
__________________________________________________________________
GAAGTCAATGGGAAAGATG (S)
__________________________________________________________________
Bra034363
__________________________________________________________________
ACTCGAACAGGCTGAAAC (A)
__________________________________________________________________
CTGTCTATTCTCCCTTCCC (S)
__________________________________________________________________
Bra031708
__________________________________________________________________
AACGGTTTCCCATTGCTG (A)
__________________________________________________________________
AGTCATCGTCGCCGTTAT (S)
__________________________________________________________________
Bra031854
__________________________________________________________________
AGGATTTGGTCGCAGTCA (A)
__________________________________________________________________
CGACGGATACAGCACAGG (S)
__________________________________________________________________
Bra023103
__________________________________________________________________
ACGGCTTGGACTCGTTGTAGAT (A)
__________________________________________________________________
AAGCCTCGCCAAAGAAAC (S)
__________________________________________________________________
Bra036557
__________________________________________________________________
TCAACGGCATGTGAAAGC (A)
__________________________________________________________________
TCTCCGTTACCATCAATA (S)
__________________________________________________________________
Bra012583
__________________________________________________________________
TGTAGCCATGTCATCTTT (A)
__________________________________________________________________
GAATACGAACGGAGCAAGA (S)
__________________________________________________________________
Bra039837
__________________________________________________________________
TGGTTTCCCACCATAAGC (A)
__________________________________________________________________
GCTACATACAAGGTCCAGTT (S)
__________________________________________________________________
Bra040469
__________________________________________________________________
TAAGATAAGCCGCACAAG (A)
__________________________________________________________________
TGCTATCCAGGCTGTTCT (S)
__________________________________________________________________
Actin of B. napus
__________________________________________________________________
TTAATGTCACGGACGATTT (A)
[125]Open in a new tab
Discussion
Global gene transcription changes in Fer and Ste plants
A global analysis of transcriptome could facilitate the identification
of systemic gene expression and regulatory mechanisms. In the present
study, a transcriptome profiling of young buds was performed to
identify genes that are differentially expressed in Fer and Ste plants.
A sequencing depth of 4.5 million tags per library was reached
(Table [126]1), and more than 71% unique tags were matched with the
genome and transcriptome of B. rapa and B. oleracea, respectively,
suggesting that the database selected was relatively complete. Because
fertile plants carried one chromosome from S. arvensis besides 19 B.
napus homologous chromosome pairs, exhibiting normal flower morphology
and male fertility, more distinct tags were detected in the Fer library
than in the Ste library. In addition, for distinct tags, tags with five
folds or greater differences in accumulation accounted for 2.31%,
however, among which 2.07% tags increased by at least five folds in
Fer. This suggested that more genes were involved in gene expression
and gene regulation in Fer. Through annotation of distinct tags, there
were 1089 specially expressed unknown tags in Fer, which were neither
mapped to B. rapa nor mapped to B. oleracea (Additional file [127]4).
We presumed that these tags arose basically from the added alien
chromosome of S. arvensis. Further study of these tags would help to
dig genes with desirable agronomic traits from wild species. To
identify corresponding genes that were associated with the development
of stamens (including anthers) and to elucidate the molecular basis of
cytoplasmic male sterility, fertility restoration and improved
resistance, we analyzed the most differentially regulated tags with a
log2 ratio > 1 or < − 1 using a greater statistically significant value
(P < 0.001) as well as false discovery rates (FDR < 0.001). And based
on the rigorous algorithm, there were 3231 genes of B. rapa (Additional
file [128]2) and 3371 genes of B. oleracea (Additional file [129]3)
which were detected with significant differential expression levels.
These DEGs were candidates for further research.
DEGs specific to sterile and fertile plants
Because fertile plants had normal flower morphology and male fertility
versus sterile plants, it was proposed that DEGs specific to fertile
plants were associated with stamen development and male fertility,
whereas DEGs specific to sterile plants may be involved in CMS. Using
B. rapa as reference, a total of 760 DEGs were only expressed in
fertile plants, and 44 DEGs were specific to sterile plants. The
specific DEGs could be categorized into 56 categories through GO
functional analysis, which were involved in biochemistry, metabolism,
growth, development, apoptosis and so on (Figure [130]6). For fertile
specific DEGs, 11 DEGs were involved in pollen wall assembly
(GO:0010208), including four chalcone synthase genes (Bra000559,
Bra034658, Bra011566, Bra017681), three beta-1,3-glucanase genes
(Bra028343, Bra037057, Bra038969), two male sterility 2 (MS2) members
(Bra034793, Bra038691) and two gycosyl hydrolases family proteins
(Bra014979, Bra001918). Chalcone synthase was the key enzyme in the
flavonoid biosynthesis pathway, which played an important role in many
biological functions in plants, including male fertility [[131]45].
Beta-1,3-glucanase genes were involved in male gametophyte development
and pollination. Silencing the expression of beta-1,3-glucanase gene by
RNA interference in transgenic rice resulted in male sterility due to
disruption of degradation of callose in the anther locules at the early
microspore stage [[132]46]. For sterile specific DEGs, 40% of the DEGs
were assigned to binding, followed by catalytic activity (33.3%),
transporter activity (4.4%), antioxidant activity and molecular
transducer activity (2.2% for both) with respect to molecule function
category.
DEGs involved in carbon metabolism and energy metabolism
Mitochondrial gene/genome rearrangements may alter the expression of
common mitochondrial genes coding for proteins involved in
respiration/ATP synthesis, affecting ATP production and/or other
processes within mitochondria. A lowered ATP production [[133]47] or a
reduced carbohydrate accumulation [[134]48] was observed in ‘late
stage’ CMS flowers. In this study, 11 of 12 DEGs involved in TCA cycle
were down-regulated in Ste (Additional file [135]10). DEGs were also
involved in respiration/ATP synthesis and oxidoreductase activity. A
total of 20 genes were involved in the oxidative phosphorylation
pathway, having effects on NADH dehydrogenase, succinate
dehydrogenase/fumarate redutase and ATP synthase (Additional file
[136]10). Oxidoreductase activity was the most enriched GO term
(Additional file [137]5), whereas oxidative stress during
microsporogenesis was thought to induce a premature abortion of tapetal
cells due to programmed cell death (PCD) in CMS sunflower [[138]49] and
rice [[139]50]. In addition, altered expressions of numerous genes
involved in pentose phosphate pathway, starch and sucrose metabolism,
fructose and mannose metabolism and carbon fixation in photosynthetic
organisms were also observed (Additional file [140]10). The nature of
the mitochondrial genes that influence the expression of nuclear genes
were unclear. And thus further investigations on these candidate genes
help to illuminate the primary targets and downstream components of
CMS-associated mitochondrial signaling pathways in Nsa CMS line.
Differentially expressed PPR proteins
Nuclear genomes have crucial roles in mitochondrial biogenesis and
function [[141]31,[142]51]. An estimated 10% of eukaryotic nuclear
genes encode proteins that are targeted to mitochondria following
synthesis on cytosolic ribosomes [[143]32,[144]52]. It has been
proposed that the majority of PPR proteins are targeted to plastid or
mitochondria [[145]34], and many genetic and biochemical studies
conclude that PPRs directly bind to a specific RNA sequence and promote
anterograde regulation such as post-transcriptional splicing,
processing, editing or regulating mRNA stability [[146]53-[147]60].
From about 450 PPR proteins in Arabidopsis, more than 60% are predicted
to be targeted to mitochondria [[148]34]. PPR motifs have been
suggested to possess binding properties to proteins [[149]61] as well
as to RNA [[150]62]. In humans, a PPR related protein with RNA-binding
activity is associated with both nuclear and mitochondrial mRNAs and
discussed as a potential candidate for a factor that may link nuclear
and mitochondrial gene expression [[151]63]. Mutant versions of this
protein were shown to be associated with cytochrome c oxidase
deficiency in humans [[152]64,[153]65]. Hence, PPR proteins are good
candidates for nuclear-encoded factors controlling distinct steps of
transcript maturation in mitochondria. Many of them may turn out to be
essential players in CMS systems. In addition, PPR proteins have been
identified as targets of different miRNAs [[154]66,[155]67]. In this
study, when the genome and transcriptome of B. rapa were used as
reference sequences, there were nine PPR proteins showed differential
expression, including two up-regulated PPR proteins and seven
down-regulated PPR proteins in Fer. These PPR proteins were good
candidates for studying the link between nuclear and mitochondrial gene
expression.
Other DEGs
In addition to the genes described above, several other transcript
profiles were altered between Ste and Fer. Transcriptional factors play
important roles in plant growth and development as well as in hormone
and stress responses. A single transcription factor can regulate
expression of multiple genes. Therefore, alteration in the expression
of transcription factor genes normally results in dramatic changes to a
plant [[156]68-[157]70]. As a practical consequence, engineering of
transcription factor genes provides a valuable means for manipulation
of plants [[158]69]. Our DGE results showed that among the
significantly DEGs, there were forty-five transcription factors for B.
rapa, these transcription factors might therefore be potent tools to
engineer enhanced stress tolerance in oil crops [[159]71,[160]72].
Two male sterility 2 (MS2) members (Bra034793,Bra038691) were found in
Fer, using B. rapa as referece (Additional file [161]2). MS2 gene
determines a nuclear male sterile mutant (monogenic recessive)
phenotype in Arabidopsis and was isolated and characterized using the
En/Spm-I/dSpm transposon-tagging system [[162]73]. The possible
function of the MS2 protein was proposed as a fatty acyl reductase in
the formation of pollen wall substances [[163]74]. The homologue of MS2
in B. napus, BnapMS2, was isolated by cold plague screening from B.
napus anther specific cDNA library [[164]75]. Li et al. [[165]76]
isolated a fragment homologous to BnapMS2 gene from digenic recessive
GMS line S45AB of B. napus using RT-PCR technique, and found that there
existed an amino acid divergence between fertile S45B and sterile S45A,
which may be the putative male sterility site in S45A. Male sterility
gene homolog, designated BcMS2, expressed only in stage III flower buds
of male fertile Chinese cabbage-pak-choi “ZUBajh97-01B” and there were
no detection in all organs of Polima cytoplasmic male sterility (CMS)
line ‘Bpol97-05A’ and Ogura CMS line ‘Bogu97-06A’ [[166]77]. In this
study, MS2 members (Bra034793, Bra038691) were only expressed in Fer,
whose TPM was 1766.48 and 480.76, respectively, and then it was
suggested that the two genes were involved in the development of Fer
pollens.
Three up-regulated aldehyde dehydrogenase members (Bra024619,
Bra018090, Bra016330) were found in Fer, using B. rapa as referece
(Additional file [167]2). Aldehyde dehydrogenases (ALDHs) have been
considered as general detoxifying enzymes which eliminate biogenic and
xenobiotic aldehydes in an NAD(P)-dependent manner [[168]78]. Maize
nuclear fertility restorer gene Rf2 as “biochemical restorer,” encoding
an aldehyde dehydrogenase, plays an important role in oxidizing both
aliphatic and aromatic aldehydes, reducing the amount of a toxic
aldehyde and rescuing the developing pollen through a metabolic effect
[[169]79,[170]80].
Conclusions
This study has demonstrated the usefulness of the digital gene
expression (DGE) approach to identify the differentially expressed
genes (DEGs) between the young floral buds of sterile and fertile
plants. In total, there were 3231 genes of B. rapa and 3371 genes of B.
oleracea which were detected with significant differential expression
levels. GO and pathway-based analyses indicate that these DEGs were
related to many kinds of molecular functions. In addition, there were
1089 specially expressed unknown tags in Fer, which were neither mapped
to B. oleracea nor mapped to B. rapa, and these unique tags were
presumed to arise basically from the added alien chromosome of S.
arvensis. The results provide a strong basis for future research on CMS
in Nsa line, fertility restoration and improved agronomic traits in NR1
line. Further work should focus on characterizing these candidate
target genes.
Methods
Plant materials
The sterile and fertile individual plants were derived from the
self-pollinated offspring of the F[1] hybrid between restorer line NR1
and Nsa CMS line, which were grown at the experimental station of the
Oil Crops Research Institute, Chinese Academy of Agricultural Sciences,
Wuhan, China. For Solexa sequencing, young floral buds (less than 2 mm
in diameter) were pooled. A total of 0.5 g materials was snap-frozen in
liquid nitrogen and stored at −80°C.
Frozen section
Stamens were fixed with FAA (10% formalin / 5% acetic acid/45%
ethanol/0.01% Triton X-100) for 45 min, and rehydrated through an
ethanol series. Stamens were then embedded in OCT compound and frozen,
sectioned at 10-μm thickness with frozen section machine Leica CM1850
(Germany). Observations were made with an Olympus microscope and
photographed.
RNA extraction and Solexa/Illumina sequencing
Total RNA from five sterile and fertile individual plants, respectively
was extracted using the TRIzol reagent (Invitrogen). After
precipitation, RNA was purified with Qiagen’s RNeasy Kit with on-column
DNase digestion according to the manufacturer’s instructions. Purified
RNA samples were dissolved in diethylpyrocarbonate-treated H[2]O, and
the concentration determined spectroscopycally. The quality of the RNA
was assessed on 1.0% denaturing agarose gel in combination with the
Bioanalyzer 2100 (Agilent). For Solexa sequencing, at least 6 μg of
total RNA (≥400 ng/μL) was sent to BGI-Shenzhen, China, following the
protocol offered by Illumina for sequencing. The main reagents and
supplies were the Illumina Gene Expression Sample Prep Kit and Illumina
Sequencing Chip (flow cell), and the main instruments were the Illumina
Cluster Station and the Illumina HiSeq™ 2000 System. In details,
initially, poly-(A) mRNA was isolated from total RNA sample by poly-(T)
oligo-attached magnetic beads. Then, the first- and second-strand cDNAs
were synthesized and digested with restriction enzyme Nla III, which
recognized the CATG sites. Following be washed, the Illumina adaptor 1
was ligated to the 5’ ends of cDNA fragments through the sticky CATG
site. Subsequently, cDNA fragments were digested by Mme I, as a type of
endonuclease with separated recognition sites and digestion sites,
which cut at 17 bp downstream of the CATG site, producing tags with
adaptor 1. After removing 3'- fragments with magnetic beads
precipitation, Illumina adaptor 2 was ligated to the 3'- ends of tags,
acquiring tags with different adaptors of both ends to form a tag
library. After PCR amplification, purification and denaturation, the
single-chain molecules were fixed onto the Illumina Sequencing Chip
(Flowcell). Then, four types of nucleotides labeled by four colors were
added in, and sequencing was performed via the sequencing by synthesis
(SBS) method.
Processing of sequencing tags and gene expression annotation
Sequencing-received raw image data was transformed by base calling into
sequence data, (raw data or raw reads), and was stored in FASTQ format.
This type of files stored information about read sequences and quality.
Each read was described in four lines in FASTQ files. Raw sequences had
3’- adaptor fragments as well as a few low-quality sequences and
several types of impurities. Raw sequences were transformed into clean
tags after certain data-processing steps. All clean sequencing tags
were annotated using the reference sequences which covered all possible
CATG+17-nt tag sequences of the genome and transcriptome of B. rapa and
those of B. oleracea, respectively using blastn, allowing only a 1-bp
mismatch. Clean tags mapped to reference sequences from multiple genes
were filtered, and the remaining clean tags were designated as
unambiguous clean tags. The number of unambiguous clean tags for each
gene was calculated and then normalized to TPM (number of transcripts
per million clean tags) [[171]81,[172]82].
Analysis and screening of DEGs
A rigorous algorithm [[173]28] was used to identify the differentially
expressed genes (DEGs) between the two samples. The P-value corresponds
to the differential gene expression test. The FDR (False Discovery
Rate) is used to determine the threshold of P-value in multiple tests
and analyses by manipulating the FDR value. Assume that R
differentially expressed genes have been selected, among which S genes
truly show differential expression and V genes are false positives. If
we decide that the error ratio “Q = V/R” must stay below a cutoff (e.g.
1%), we should preset the FDR to a number no larger than 0.01. FDR ≤
0.001 and the absolute value of | log2Ratio |≥ 1 were used as
thresholds to judge the significance of differences in transcript
abundance [[174]29]. More stringent criteria with smaller FDR and
greater fold-change value can be used to identify DEGs.
Real-time quantitative RT-PCR (qRT-PCR) analysis
Real-time quantitative RT-PCR (qRT-PCR) analysis was used to verify the
DGE results. The RNA samples used for the qRT-PCR assays were the same
as for the DGE experiments. Gene-specific primers were designed
according to the reference unigene sequences using the Primer Premier
5.0 (Table S4). qRT-PCR was performed according to the TaKaRa
manufacturer specifications (TaKaRa SYBR® PrimeScript™ RT-PCR Kit,
Dalian, China). SYBR Green PCR cycling was denatured using a program of
95°C for 10 s, and 40 cycles of 95°C for 5 s and 55°C for 30 s and
performed on an iQ™ 5 Multicolor Real-time PCR Detection System (Bio-
RAD, USA). B. rapa actin gene was used as a normalizer, and the
relative expression levels of genes were presented by 2^-ΔΔCT.
GO and pathway enrichment analysis
GO or pathway enrichment analysis applies a hypergeometric test to
identify significantly enriched GO terms or pathways in DEGs in
comparison to the whole genome background. The formula used is as
follows:
[MATH: P=1−∑i=0m−1<
mrow>MiN−M<
/mtr>n−i<
/mtr>Nn
mfrac> :MATH]
(1)
For GO enrichment analysis, N is the number of all genes with GO
annotation; n is the number of DEGs in N; M is the number of all genes
that are annotated to the certain GO terms; and m is the number of DEGs
in M. The p value is corrected by Bonferroni, and we chose a
corrected-p value ≤ 0.05 as the threshold value. The GO term (P ≤ 0.05)
is defined as significantly DEGs enriched GO term. For pathway
enrichment analysis, N is the number of genes with a KEGG annotation, n
is the number of DEGs in N, M is the number of genes annotated to
specific pathways, and m is the number of DEGs in M. The pathways with
a Q value of ≤ 0.05 are defined as those with significantly
differentially expressed (enriched) genes.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
XHY contributed to the experimental design and management, data
analysis, and manuscript preparation. CHD and JYY contributed to tissue
collection, RNA extraction and quantitative RT-PCR, data analysis,
manuscript organization and revision. WHL, CHJ, JL and XPF prepared the
plant materials. QH provided the NR1 and Nsa CMS materials. WHW
designed and managed the experiments, organized and reviewed the
manuscript. All authors have read and approved the final manuscript.
Supplementary Material
Additional file 1
Figure S1. Sequencing saturation analysis of the two libraries of Ste
and Fer. The number of detected genes was enhanced as the sequencing
amount (total tag number) increased. A, Ste tags mapped to B. oleracea;
B, Fer tags mapped to B. oleracea; C, Ste tags mapped to B. rapa; D,
Fer tags mapped to B. rapa.
[175]Click here for file^ (1.2MB, tiff)
Additional file 2
DGEs of B. rapa.
[176]Click here for file^ (1,010.6KB, xls)
Additional file 3
DGEs of B. oleracea.
[177]Click here for file^ (995.7KB, xls)
Additional file 4
Specially expressed Tags in Fer, which were neither mapped to B.
oleracea nor mapped to B. rapa.
[178]Click here for file^ (209.5KB, xls)
Additional file 5
Significantly enriched GO terms of DEGs in B. rapa, associated with
molecular function.
[179]Click here for file^ (15.9KB, xlsx)
Additional file 6
Significantly enriched GO terms of DEGs in B. oleracea, associated with
molecular function.
[180]Click here for file^ (11.4KB, xlsx)
Additional file 7
Significantly enriched GO terms of DEGs in B. oleracea, associated with
biological processes.
[181]Click here for file^ (14.3KB, xlsx)
Additional file 8
Significantly enriched GO terms of DEGs in B. rapa, associated with
biological processes.
[182]Click here for file^ (22.1KB, xlsx)
Additional file 9
Pathways corresponding to DEGs from B. oleracea.
[183]Click here for file^ (48KB, path)
Additional file 10
Pathways corresponding to DEGs from B. rapa.
[184]Click here for file^ (46.4KB, path)
Contributor Information
Xiaohong Yan, Email: yxhong216@yahoo.com.cn.
Caihua Dong, Email: dongch@oilcrops.cn.
Jingyin Yu, Email: yujyinfor@gmail.com.
Wanghui Liu, Email: 502973522@qq.com.
Chenghong Jiang, Email: 1148236339@qq.com.
Jia Liu, Email: liusino@gmail.com.
Qiong Hu, Email: huqiong@oilcrops.cn.
Xiaoping Fang, Email: xpfang2008@163.com.
Wenhui Wei, Email: whwei@oilcrops.cn.
Acknowledgements