Abstract
Background
Although rapeseed (Brassica napus L.) mutant forming multiple siliques
was morphologically described and considered to increase the silique
number per plant, an important agronomic trait in this crop, the
molecular mechanism underlying this beneficial trait remains unclear.
Here, we combined bulked-segregant analysis (BSA) and whole genome
re-sequencing (WGR) to map the genomic regions responsible for the
multi-silique trait using two pools of DNA from the near-isogenic lines
(NILs) zws-ms (multi-silique) and zws-217 (single-silique). We used the
Euclidean Distance (ED) to identify genomic regions associated with
this trait based on both SNPs and InDels. We also conducted
transcriptome sequencing to identify differentially expressed genes
(DEGs) between zws-ms and zws-217.
Results
Genetic analysis using the ED algorithm identified three SNP- and two
InDel-associated regions for the multi-silique trait. Two highly
overlapped parts of the SNP- and InDel-associated regions were
identified as important intersecting regions, which are located on
chromosomes A09 and C08, respectively, including 2044 genes in 10.20-MB
length totally. Transcriptome sequencing revealed 129 DEGs between
zws-ms and zws-217 in buds, including 39 DEGs located in the two
abovementioned associated regions. We identified candidate genes
involved in multi-silique formation in rapeseed based on the results of
functional annotation.
Conclusions
This study identified the genomic regions and candidate genes related
to the multi-silique trait in rapeseed.
Electronic supplementary material
The online version of this article (10.1186/s12864-019-5675-4) contains
supplementary material, which is available to authorized users.
Keywords: Association analysis, Brassica napus L., Multi-silique,
Near-isogenic line, Transcriptome sequencing, Whole genome
re-sequencing
Background
In flowering plants, the pistil develops into a fruit, a process
crucial for both yield and seed quality. Many recent studies have
focused on abnormal pistil development, such as the formation of
pistillodes (sterile pistils), pistil number variation, and mutations
affecting pistil length [[43]1–[44]6]. Of these, pistil number
variation, or the multi-pistil phenomenon, has also been reported in
other plants, such as alfalfa (Medicago sativa) [[45]1, [46]7], wheat
(Triticum aestivum) [[47]8, [48]9], sweet cherry (Prunus avium)
[[49]10], and rye (Secale cereale) [[50]11].
Brassica crops are important sources of edible oil, vegetables,
industrial erucic acid, animal feed (seed meal), and ornamental plants.
Like most Brassicaceae family members, Brassica species have
monoclinous flowers, with one pistil and six stamens per normal flower.
Morphological variation in siliques (pods) has been reported in several
Brassica species. In the 1940s, Pathak [[51]12] reported an abnormal
plant of Brassica campestris L. var. sarson prain, which had one to two
extra pods telescoped into the outermost pods. The inner pods was
basically normal but were contorted and smaller than those of the wild
type. Multi-locular Brassica juncea [[52]13], trilocular B. juncea
[[53]14, [54]15] and Brassica rapa [[55]16] were obtained from
interspecific hybridization. Unlike previously reported fruit
variations, zws-ms flower has three completely independent and
apocarpous pistils and nine to ten stamens sharing the same receptacle
in each floral organ, resulting in the formation of three independent
siliques sharing the same carpopodium. Thus, zws-ms has an increased
number of pod on each carpopodium rather than an increased number of
loculi per pod. Similar phenotypes have also been observed previously:
Guan and Li [[56]17] obtained a line with double siliques through
treatment of B. napus with ^12C heavy ion beam irradiation. Hu et al.
[[57]18] identified a multi-pistil plant in a B. campestris L.
cytoplasmic male sterile line, different from our zws-ms plants which
were obtained by crossing of B. napus and B. rapa [[58]19]. Though the
agronomic and physiological traits, and genetic mechanism of zws-ms
plants have been analyzed, the molecular mechanism underlying
multi-pistil formation, which increased the silique number per plant in
rapeseed, remains elusive.
Bulked segregant analysis (BSA) [[59]20] is a simple but effective
strategy for identifying molecular markers linked to target genes by
genotyping a pair of bulked DNA samples from two populations with
distinct phenotypes. BSA was usually combined with traditional markers,
such as simple sequence repeat (SSR) [[60]21], sequence related
amplified polymorphism (SRAP) [[61]22] or random amplified polymorphic
DNA (RAPD) [[62]23] markers to map traits. Recent years, combination of
BSA and next-generation sequencing (NGS) have been successfully used in
the identification of key genes/loci related to regular agronomic
traits in B. napus. For example, Wang et al. combined BSA with NGS to
map quantitative trait loci (QTL) for the branch angle and revealed
BnaYUCCA6 was candidate gene controlling branch angle [[63]24]. Geng et
al. combined specific length amplified fragment sequencing (SLAF-seq)
with BSA and identified four candidate genes related to seed weight
[[64]25]. Yao et al. mapped the orange petal color gene (Bnpc1) in
spring B. napus to a 151-kb region via BSA with whole genome
re-sequencing (WGR) [[65]26]. Hence, we believed that combining BSA and
WGR would be a convenient way to map the genomic regions responsible
for the multi-silique trait.
In this study, we performed association analysis based on SNP and InDel
data using a Euclidean Distance (ED) algorithm to identify the gene
underlying the multi-silique trait in rapeseed. Though the SNP-index
method is also a classical method for association region
identification, and has been employed in many studies [[66]25, [67]26].
However, it is suitable in the case when four DNA samples are
sequenced: two parental plants and two extreme pools from segregated
population. In this study, only two pools from NILs multi-silique pool
and single-silique pool were sequenced, therefore, only ED algorithm
was useful in this situation. We also performed transcriptome
sequencing to identify differentially expressed genes (DEGs). Based on
functional annotation of the genes with SNPs or InDels and DEGs in the
associated regions, we identified candidate genes that might be closely
related to the multi-silique trait.
Results
Morphological analysis of the multi-silique trait in zws-ms
Under normal growth conditions, the NILs (Fig. [68]1) zws-ms
(multi-silique) and zws-217 (single silique) showed no phenotypic
differences (e.g., plant height, leaf shape, and number of branches) at
the vegetative stage. However, zws-ms had shorter and rounder buds than
zws-217 (Fig. [69]2a). At the full-bloom stage, zws-ms showed distinct
floral structures with zws-217. In contrast to normal Brassicaceae
flowers, which typically contain a single pistil and six stamens (four
tetradynamous stamens and two shorter stamens), zws-ms flowers had
three completely independent apocarpous pistils sharing the same
receptacle in each floral organ (Fig. [70]2b). The zws-ms flower had
nine to ten stamens, six of which were more prominent than the others
(Fig. [71]2b). In zws-ms flowers, all pistils and stamens shared the
same receptacle, but the petals and calyxes appeared normal. Roughly
32~53% of flowers in a zws-ms plant had this mutant multi-pistil and
stamen configuration, whereas the rest flowers in the same plant were
normal. The abnormal zws-ms flowers developed into three independent
siliques sharing the same carpopodium as the flowers (Fig. [72]2c). The
triple siliques were normal in length, but occasionally one became
degenerated and curved.
Fig. 1.
[73]Fig. 1
[74]Open in a new tab
Scheme for constructing the multi-silique population (zws-ms) and the
corresponding single-silique NIL population (zws-217)
Fig. 2.
[75]Fig. 2
[76]Open in a new tab
Morphological differences between zws-217and zws-ms. a: at the budding
stage, zws-ms has inflated, round buds; b: at the full-bloom stage, a
normal flower has one pistil and six stamens, while zws-ms has three
pistils and nine stamens per floral organ; c: a normal silique in
zws-217 and the three-silique trait in zws-ms. These images were taken
originally in 2017
Association analysis
High-throughput whole genome re-sequencing on the Illumina HiSeq
platform produced a total of 89.56 Gb of raw data. The Q30 ratio
averaged 90.52%. After alignment to the reference genome, the average
coverage was 91.78% and the average sequencing depth was 40×. When the
clean reads were mapped to the B. napus reference genome, each of them
was aligned in both forward and reverse directions; thus, the number of
total reads was twice of the clean reads: zws-ms and zws-217 contained
328,507,566 and 263,588,948 reads, with 89.89 and 89.10% properly
mapped to the reference genome, respectively.
After filtration, 1,135,690 SNPs and 289,832 small InDels of high
quality (HQ) were used for association analysis. We used the ED
algorithm due to only two pools from NILs zws-ms and zws-217 were
sequenced. According to the ED algorithm [[77]27], the association
threshold was 0.26. Based on this information, three SNP-associated
regions were identified, including one on chromosome A09 (Fig. [78]3a)
and two on chromosome C08 (Fig. [79]3b). One smaller associated region
on the chromosome C08 that close to another larger one (Fig. [80]3b)
was also detected. These regions had a size of 10.45 MB and contained
2095 genes. Similarly, two InDel-associated regions were identified on
chromosomes A09 (Fig. [81]3c) and C08 (Fig. [82]3d) totaling 10.20 MB
and containing 2044 genes, with an association threshold value of 0.30.
Combining the SNP-associated regions and InDel-associated regions, two
merged intersecting regions (Table [83]1, Additional file [84]1: Figure
S1) were obtained on chromosomes A09 and C08, with 2044 genes in this
10.20 MB region, including 126 genes with non-synonymous mutations and
23 with frame-shift mutations.
Fig. 3.
[85]Fig. 3
[86]Open in a new tab
Associated regions calculated using the ED algorithm. Black solid lines
represent the fitted values; red dotted lines represent the associated
threshold. a: SNP-associated region on ChrA09; b: two SNP-associated
regions on ChrC08; the smaller region is indicated by a red arrow; c:
InDel-associated region on ChrA09; d: InDel-associated region on ChrC08
Table 1.
Information about the two intersecting regions
Chromosome Start position End position Size (Mb) Number of genes
A09 28,990,000 33,790,000 4.8 998
C08 32,340,000 37,740,000 5.4 1046
Total – – 10.2 2044
[87]Open in a new tab
Transcriptome sequencing, data mapping, and annotation
We performed sequencing saturation and cluster analysis of the samples
to ensure the validity of the data. In total, 61 Gb of raw data were
generated, with an average Q30 value of 90.54%. For each sample,
approximately 36.7 M clean reads were generated, with an average GC
content of 47.23% (Additional file [88]2: Table S1). The average
proportion of total reads mapped to the reference genome of samples T01
to T06 was 73.729% (Additional file [89]3: Table S2), ranging from
77.81 to 76.03%.
Annotation of genes containing SNPs and InDels
To reveal the biological functions of the potential candidate genes, we
performed GO and KEGG pathway enrichment analyses. Within the SNP- and
InDel-associated regions, 2041 of 2044 genes could be annotated
(Additional file [90]4: Table S3). GO annotations included 2217 terms
involved in biological processes (BP, Fig. [91]4,
Additional file [92]5: Table S4), 326 terms in the cellular component
(CC, Fig. [93]4, Additional file [94]6: Table S5), and 799 terms in the
molecular function (MF, Fig. [95]4, Additional file [96]7: Table S6).
The BP terms with the highest levels of enrichment included response to
nitrate (GO: 0010167), nitrate transport (GO: 0015706), negative
regulation of flower development GO: 0009910), transition metal ion
transport (GO: 0000041), inositol biosynthetic process (GO: 0006021),
and organ development (GO: 0048513) (Additional file [97]5: Table S4).
The CC category includes cell wall (GO: 0005618), vacuole (GO:
0005773), cytosolic small ribosomal subunit (GO: 0022627), and small
ribosomal subunit (GO: 0015935) (Additional file [98]6: Table S5). In
the MF category, the most enriched terms were sinapoyl spermidine:
sinapoyl CoA N-acyltransferase activity (GO: 0080089),
2-isopropylmalate synthase activity (GO: 0003852), methyl salicylate
esterase activity (GO: 0080031), nucleocytoplasmic transporter activity
(GO: 0005487), and calmodulin binding (GO: 0005516) (Additional file
[99]7: Table S6).
Fig. 4.
[100]Fig. 4
[101]Open in a new tab
Gene ontology (GO) terms of the genes from the intersecting regions. A
total of 2041 genes were divided into three categories: biological
processes, cellular components, and molecular functions
Moreover, 126 genes with non-synonymous mutations were identified,
including 104 genes with GO annotations (Additional file [102]8: Table
S7). Additionally, 23 genes with frame-shift mutations were identified,
including 19 with GO annotations (Additional file [103]9: Table S8).
Analysis of these annotations produced some interesting results:
BnaA09g43100D (Bna.A09SFH3) and BnaC08g38270D (Bna.C08HAC12) were
annotated as “flower development (GO:0009908)”; BnaC08g36330D
(Bna.C08GPRI1) is related to “negative regulation of flower development
(GO:0009910)”; BnaA09g44210D (Bna.A09BES1) and BnaA09g48960D are
associated with “ovule development (GO:0048481)” and “carpel
development (GO:0048440)”, respectively; BnaA09g42920D (Bna.A09EC1.3)
was annotated as “regulation of double fertilization forming a zygote
and endosperm (GO:0080155)”; and BnaA09g42700D was annotated as
“specification of floral organ number (GO:0048833)”.
KEGG pathway enrichment analysis revealed the four most significantly
enriched pathways involving the DEGs: stilbenoid, diarylheptanoid and
gingerol biosynthesis (ko00945), limonene and pinene degradation
(ko00903), glycosyl phosphatidyl inositol (GPI)-anchor biosynthesis
(ko00563), and isoquinoline alkaloid biosynthesis (ko00950)
(Additional file [104]10: Table S9). These pathways were further
classified into five major groups: cellular processes, environmental
information processing, genetic information processing, metabolism and
organismal systems (Fig. [105]5). Of these, the sub-group plant hormone
signal transduction had the highest number of annotated genes.
Fig. 5.
[106]Fig. 5
[107]Open in a new tab
Classified KEGG pathways of the genes from the intersecting regions.
The pathways were further classified into five major groups: cellular
processes, environmental information processing, genetic information
processing, metabolism, and organismal systems
Analysis of differentially expressed genes and potential candidate gene
identification
To identify DEGs related to the multi-silique trait, we examined the
expression levels of transcripts from zws-ms and compared them with
those of normal plants (zms-217). A total of 129 DEGs were identified
in buds, with 52.94% (67 genes) upregulated and 48.06% (62 genes)
downregulated (Additional file [108]11: Table S10), 39 of which
(Table [109]2) were located in the two associated regions mentioned
above. Thirteen genes were located on chromosome A09, while 26 were
located on chromosome C08. These 39 genes were clustered into two
groups based on expression level (Fig. [110]6, Table [111]2): the first
group contained 14 genes that were upregulated in multi-silique zws-ms
plants, while the second group contained 25 genes that were
downregulated in these plants.
Table 2.
GO annotations of the 39 DEGs located in the two associated regions
Gene ID Chromosome Expression Level GO Annotation
1 BnaC08g40740D C08 down Molecular Function: translation initiation
factor activity (GO:0003743); Cellular Component: cytoplasm
(GO:0005737);
2 BnaA09g45890D A09 down Cellular Component: plasma membrane
(GO:0005886); Biological Process: phosphate ion transport (GO:0006817);
Cellular Component: integral component of membrane (GO:0016021);
Biological Process: cellular response to phosphate starvation
(GO:0016036);
3 BnaC08g42080D C08 down Cellular Component: plasma membrane
(GO:0005886); Molecular Function: transferase activity, transferring
phosphorus-containing groups (GO:0016772);
4 BnaA09g43250D A09 up Cellular Component: mitochondrion (GO:0005739);
5 BnaC08g36200D C08 down Cellular Component: chloroplast (GO:0009507);
Biological Process: photorespiration (GO:0009853);
6 BnaA09g44650D A09 down Cellular Component: plasma membrane
(GO:0005886); Biological Process: proteolysis (GO:0006508); Biological
Process: lipid transport (GO:0006869); Molecular Function: peptidase
activity (GO:0008233); Molecular Function: lipid binding (GO:0008289);
Cellular Component: anchored component of membrane (GO:0031225);
7 BnaA09g46080D A09 down Cellular Component: nucleus (GO:0005634);
8 BnaC08g37340D C08 up Cellular Component: plasma membrane
(GO:0005886); Biological Process: proteolysis (GO:0006508); Biological
Process: lipid transport (GO:0006869); Molecular Function: peptidase
activity (GO:0008233); Molecular Function: lipid binding (GO:0008289);
Cellular Component: anchored component of membrane (GO:0031225);
9 BnaC08g41780D C08 down Biological Process: sulfur amino acid
metabolic process (GO:0000096); Molecular Function: serine-tRNA ligase
activity (GO:0004828); Molecular Function: ATP binding (GO:0005524);
Cellular Component: mitochondrion (GO:0005739); Biological Process:
rRNA processing (GO:0006364); Biological Process: seryl-tRNA
aminoacylation (GO:0006434); Biological Process: mitochondrion
organization (GO:0007005); Biological Process: cellular amino acid
biosynthetic process (GO:0008652); Biological Process: serine family
amino acid metabolic process (GO:0009069); Cellular Component:
chloroplast (GO:0009507); Biological Process: embryo development ending
in seed dormancy (GO:0009793); Biological Process: chloroplast
relocation (GO:0009902); Biological Process: leaf morphogenesis
(GO:0009965); Biological Process: thylakoid membrane organization
(GO:0010027); Biological Process: photosystem II assembly (GO:0010207);
Biological Process: vegetative to reproductive phase transition of
meristem (GO:0010228); Biological Process: iron-sulfur cluster assembly
(GO:0016226); Biological Process: cell differentiation (GO:0030154);
Biological Process: regulation of protein dephosphorylation
(GO:0035304); Biological Process: cell wall modification (GO:0042545);
Biological Process: transcription from plastid promoter (GO:0042793);
Biological Process: positive regulation of transcription, DNA-templated
(GO:0045893); Biological Process: ovule development (GO:0048481);
10 BnaC08g40810D C08 up Molecular Function: protein serine/threonine
kinase activity (GO:0004674); Biological Process: protein
autophosphorylation (GO:0046777);
11 BnaC08g41540D C08 down Molecular Function: N,N-dimethylaniline
monooxygenase activity (GO:0004499); Cellular Component: nucleus
(GO:0005634); Biological Process: glucosinolate biosynthetic process
(GO:0019761); Molecular Function: flavin adenine dinucleotide binding
(GO:0050660); Molecular Function: NADP binding (GO:0050661); Biological
Process: oxidation-reduction process (GO:0055114); Molecular Function:
8-methylthiopropyl glucosinolate S-oxygenase activity (GO:0080107);
12 BnaC08g40410D C08 up Molecular Function: Ran GTPase activator
activity (GO:0005098); Cellular Component: nuclear envelope
(GO:0005635); Cellular Component: vacuolar membrane (GO:0005774);
Cellular Component: endoplasmic reticulum (GO:0005783); Biological
Process: nucleocytoplasmic transport (GO:0006913); Biological Process:
toxin catabolic process (GO:0009407); Cellular Component: chloroplast
(GO:0009507); Biological Process: photomorphogenesis (GO:0009640);
Biological Process: response to salt stress (GO:0009651); Biological
Process: cullin deneddylation (GO:0010388); Biological Process: lateral
root development (GO:0048527);
13 BnaC08g35720D C08 up Cellular Component: vacuolar
proton-transporting V-type ATPase, V0 domain (GO:0000220); Cellular
Component: mitochondrion (GO:0005739); Cellular Component: Golgi
apparatus (GO:0005794); Biological Process: ATP catabolic process
(GO:0006200); Cellular Component: chloroplast (GO:0009507); Molecular
Function: hydrogen-translocating pyrophosphatase activity (GO:0009678);
Cellular Component: plant-type vacuole membrane (GO:0009705); Molecular
Function: hydrogen ion transmembrane transporter activity (GO:0015078);
Biological Process: ATP synthesis coupled proton transport
(GO:0015986); Biological Process: ATP hydrolysis coupled proton
transport (GO:0015991); Molecular Function: ATPase activity
(GO:0016887); Biological Process: cellular response to nutrient levels
(GO:0031669); Biological Process: sequestering of zinc ion
(GO:0032119); Biological Process: vacuolar sequestering (GO:0043181);
Molecular Function: nutrient reservoir activity (GO:0045735);
Biological Process: vacuolar proton-transporting V-type ATPase complex
assembly (GO:0070072); Biological Process: cellular response to salt
stress (GO:0071472);
14 BnaC08g42280D C08 down Biological Process: telomere maintenance
(GO:0000723); Biological Process: double-strand break repair via
homologous recombination (GO:0000724); Molecular Function: nucleic acid
binding (GO:0003676); Molecular Function: ATP binding (GO:0005524);
Cellular Component: nucleus (GO:0005634); Biological Process: DNA
replication (GO:0006260); Cellular Component: plasmodesma (GO:0009506);
Biological Process: vegetative to reproductive phase transition of
meristem (GO:0010228); Molecular Function: ATP-dependent 3′-5′ DNA
helicase activity (GO:0043140); Biological Process: cellular response
to cold (GO:0070417); Biological Process: cellular response to abscisic
acid stimulus (GO:0071215);
15 BnaC08g38300D C08 down Molecular Function: nucleotide binding
(GO:0000166); Biological Process: mRNA splicing, via spliceosome
(GO:0000398); Molecular Function: RNA binding (GO:0003723); Molecular
Function: protein binding (GO:0005515); Cellular Component: nucleolus
(GO:0005730); Biological Process: sugar mediated signaling pathway
(GO:0010182); Cellular Component: nuclear speck (GO:0016607);
16 BnaC08g39990D C08 up Biological Process: MAPK cascade (GO:0000165);
Molecular Function: protein serine/threonine kinase activity
(GO:0004674); Molecular Function: protein serine/threonine/tyrosine
kinase activity (GO:0004712); Molecular Function: ATP binding
(GO:0005524); Cellular Component: nucleus (GO:0005634); Cellular
Component: cytosol (GO:0005829); Biological Process: protein
phosphorylation (GO:0006468); Biological Process: protein targeting to
membrane (GO:0006612); Biological Process: response to cold
(GO:0009409); Biological Process: response to water deprivation
(GO:0009414); Biological Process: response to ethylene (GO:0009723);
Biological Process: auxin-activated signaling pathway (GO:0009734);
Biological Process: abscisic acid-activated signaling pathway
(GO:0009738); Biological Process: brassinosteroid mediated signaling
pathway (GO:0009742); Biological Process: systemic acquired resistance,
salicylic acid mediated signaling pathway (GO:0009862); Biological
Process: jasmonic acid mediated signaling pathway (GO:0009867);
Biological Process: regulation of signal transduction (GO:0009966);
Biological Process: leaf vascular tissue pattern formation
(GO:0010305); Biological Process: regulation of plant-type
hypersensitive response (GO:0010363); Biological Process: endoplasmic
reticulum unfolded protein response (GO:0030968); Biological Process:
negative regulation of defense response (GO:0031348); Biological
Process: hyperosmotic salinity response (GO:0042538); Biological
Process: negative regulation of programmed cell death (GO:0043069);
Biological Process: defense response to fungus (GO:0050832);
17 BnaC08g36100D C08 down Cellular Component: cytoplasm (GO:0005737);
Biological Process: ER to Golgi vesicle-mediated transport
(GO:0006888);
18 BnaA09g48320D A09 down Molecular Function: structural constituent of
ribosome (GO:0003735); Cellular Component: nucleolus (GO:0005730);
Biological Process: translation (GO:0006412); Cellular Component:
chloroplast (GO:0009507); Cellular Component: cytosolic large ribosomal
subunit (GO:0022625);
19 BnaA09g45000D A09 down Biological Process: RNA splicing, via
endonucleolytic cleavage and ligation (GO:0000394); Molecular Function:
DNA binding (GO:0003677); Cellular Component: transcription factor
TFIID complex (GO:0005669); Biological Process: DNA-templated
transcription, initiation (GO:0006352); Biological Process:
transcription from RNA polymerase II promoter (GO:0006366); Biological
Process: cytokinin-activated signaling pathway (GO:0009736); Biological
Process: jasmonic acid mediated signaling pathway (GO:0009867);
Biological Process: regulation of ethylene-activated signaling pathway
(GO:0010104); Molecular Function: protein heterodimerization activity
(GO:0046982);
20 BnaC08g39130D C08 up Molecular Function: copper ion binding
(GO:0005507); Molecular Function: calmodulin binding (GO:0005516);
Molecular Function: ATP binding (GO:0005524); Cellular Component:
mitochondrion (GO:0005739); Cellular Component: cytosol (GO:0005829);
Biological Process: gluconeogenesis (GO:0006094); Biological Process:
glycolytic process (GO:0006096); Biological Process: protein folding
(GO:0006457); Biological Process: tryptophan catabolic process
(GO:0006569); Biological Process: response to heat (GO:0009408);
Biological Process: response to cold (GO:0009409); Cellular Component:
chloroplast thylakoid membrane (GO:0009535); Cellular Component:
chloroplast stroma (GO:0009570); Biological Process: response to high
light intensity (GO:0009644); Biological Process: response to salt
stress (GO:0009651); Biological Process: chloroplast organization
(GO:0009658); Biological Process: indoleacetic acid biosynthetic
process (GO:0009684); Cellular Component: chloroplast envelope
(GO:0009941); Biological Process: isopentenyl diphosphate biosynthetic
process, methylerythritol 4-phosphate pathway (GO:0019288); Biological
Process: cysteine biosynthetic process (GO:0019344); Biological
Process: response to endoplasmic reticulum stress (GO:0034976);
Biological Process: response to hydrogen peroxide (GO:0042542);
Biological Process: response to cadmium ion (GO:0046686); Cellular
Component: apoplast (GO:0048046); Biological Process: ovule development
(GO:0048481); Molecular Function: chaperone binding (GO:0051087);
Biological Process: positive regulation of superoxide dismutase
activity (GO:1901671);
21 BnaA09g45260D A09 down Cellular Component: chloroplast (GO:0009507);
22 BnaA09g45300D A09 down Molecular Function: serine-type
carboxypeptidase activity (GO:0004185); Cellular Component:
extracellular region (GO:0005576); Cellular Component: vacuole
(GO:0005773); Biological Process: proteolysis (GO:0006508);
23 BnaC08g39020D C08 up Cellular Component: cytosol (GO:0005829);
Cellular Component: plasmodesma (GO:0009506);
24 BnaC08g38200D C08 down Cellular Component: nucleus (GO:0005634);
Molecular Function: oxidoreductase activity, acting on the CH-CH group
of donors, NAD or NADP as acceptor (GO:0016628);
25 BnaC08g41720D C08 down Molecular Function: aspartic-type
endopeptidase activity (GO:0004190); Cellular Component: extracellular
region (GO:0005576); Cellular Component: vacuole (GO:0005773); Cellular
Component: cytosol (GO:0005829); Biological Process: glycolytic process
(GO:0006096); Biological Process: proteolysis (GO:0006508); Biological
Process: protein targeting to vacuole (GO:0006623); Biological Process:
lipid metabolic process (GO:0006629); Biological Process: water
transport (GO:0006833); Biological Process: hyperosmotic response
(GO:0006972); Biological Process: Golgi organization (GO:0007030);
Biological Process: response to temperature stimulus (GO:0009266);
Cellular Component: plasmodesma (GO:0009506); Biological Process:
response to salt stress (GO:0009651); Biological Process: response to
cadmium ion (GO:0046686); Biological Process: organ development
(GO:0048513);
26 BnaC08g42450D C08 down Biological Process: response to molecule of
bacterial origin (GO:0002237); Molecular Function: protein
serine/threonine kinase activity (GO:0004674); Molecular Function: ATP
binding (GO:0005524); Cellular Component: plasma membrane (GO:0005886);
Biological Process: N-terminal protein myristoylation (GO:0006499);
Biological Process: protein targeting to membrane (GO:0006612);
Biological Process: membrane fusion (GO:0006944); Biological Process:
response to oxidative stress (GO:0006979); Biological Process:
transmembrane receptor protein tyrosine kinase signaling pathway
(GO:0007169); Biological Process: systemic acquired resistance
(GO:0009627); Biological Process: seed germination (GO:0009845);
Biological Process: stomatal complex morphogenesis (GO:0010103);
Biological Process: regulation of plant-type hypersensitive response
(GO:0010363); Cellular Component: integral component of membrane
(GO:0016021); Biological Process: negative regulation of programmed
cell death (GO:0043069); Biological Process: protein
autophosphorylation (GO:0046777); Biological Process: stamen
development (GO:0048443); Cellular Component: micropyle (GO:0070825);
27 BnaC08g35850D C08 up Molecular Function: microtubule motor activity
(GO:0003777); Molecular Function: ATP binding (GO:0005524); Cellular
Component: cytoplasm (GO:0005737); Cellular Component: kinesin complex
(GO:0005871); Cellular Component: microtubule (GO:0005874); Cellular
Component: plasma membrane (GO:0005886); Biological Process:
microtubule-based movement (GO:0007018); Molecular Function:
microtubule binding (GO:0008017); Cellular Component: plasmodesma
(GO:0009506);
28 BnaC08g40320D C08 up Molecular Function: chromatin binding
(GO:0003682); Molecular Function: sequence-specific DNA binding
transcription factor activity (GO:0003700); Cellular Component: nucleus
(GO:0005634); Biological Process: regulation of transcription,
DNA-templated (GO:0006355); Biological Process: membrane fusion
(GO:0006944); Molecular Function: identical protein binding
(GO:0042802); Molecular Function: sequence-specific DNA binding
(GO:0043565); Biological Process: Golgi vesicle transport (GO:0048193);
29 BnaA09g45310D A09 up –
30 BnaC08g41180D C08 down Molecular Function: DNA binding (GO:0003677);
Cellular Component: nucleus (GO:0005634); Molecular Function: zinc ion
binding (GO:0008270);
31 BnaA09g44370D A09 down Molecular Function: DNA binding (GO:0003677);
Molecular Function: chromatin binding (GO:0003682); Molecular Function:
sequence-specific DNA binding transcription factor activity
(GO:0003700); Cellular Component: nucleus (GO:0005634); Biological
Process: regulation of transcription, DNA-templated (GO:0006355);
Biological Process: protein targeting to membrane (GO:0006612);
Biological Process: response to salt stress (GO:0009651); Biological
Process: response to ethylene (GO:0009723); Biological Process:
response to auxin (GO:0009733); Biological Process: response to
abscisic acid (GO:0009737); Biological Process: response to gibberellin
(GO:0009739); Biological Process: response to salicylic acid
(GO:0009751); Biological Process: response to jasmonic acid
(GO:0009753); Biological Process: positive regulation of flavonoid
biosynthetic process (GO:0009963); Biological Process: regulation of
plant-type hypersensitive response (GO:0010363); Biological Process:
response to cadmium ion (GO:0046686);
32 BnaA09g47900D A09 down Molecular Function: zinc ion binding
(GO:0008270);
33 BnaC08g41390D C08 down Cellular Component: plant-type vacuole
(GO:0000325); Molecular Function: sucrose alpha-glucosidase activity
(GO:0004575); Biological Process: carbohydrate metabolic process
(GO:0005975); Biological Process: polyamine catabolic process
(GO:0006598); Biological Process: calcium ion transport (GO:0006816);
Biological Process: iron ion transport (GO:0006826); Biological
Process: Golgi organization (GO:0007030); Cellular Component:
plant-type cell wall (GO:0009505); Biological Process: response to
wounding (GO:0009611); Biological Process: response to bacterium
(GO:0009617); Biological Process: response to salt stress (GO:0009651);
Biological Process: coumarin biosynthetic process (GO:0009805);
Biological Process: cellular response to iron ion starvation
(GO:0010106); Biological Process: response to nitrate (GO:0010167);
Biological Process: nitrate transport (GO:0015706); Biological Process:
brassinosteroid biosynthetic process (GO:0016132); Biological Process:
cellular modified amino acid biosynthetic process (GO:0042398);
Biological Process: cellular response to gibberellin stimulus
(GO:0071370); Biological Process: primary root development
(GO:0080022);
34 BnaA09g45610D A09 up Cellular Component: nucleus (GO:0005634);
35 BnaC08g35880D C08 down Cellular Component: cytosol (GO:0005829);
Biological Process: ubiquitin-dependent protein catabolic process
(GO:0006511); Cellular Component: chloroplast (GO:0009507);
36 BnaC08g36360D C08 up Molecular Function: nucleotide binding
(GO:0000166); Molecular Function: catalytic activity (GO:0003824);
Cellular Component: mitochondrial respiratory chain complex I
(GO:0005747); Biological Process: ubiquitin-dependent protein catabolic
process (GO:0006511); Biological Process: response to salt stress
(GO:0009651); Biological Process: photorespiration (GO:0009853);
Molecular Function: coenzyme binding (GO:0050662); Biological Process:
response to misfolded protein (GO:0051788); Biological Process:
proteasome core complex assembly (GO:0080129);
37 BnaC08g37460D C08 down Biological Process: mitotic cell cycle
(GO:0000278); Molecular Function: RNA binding (GO:0003723); Molecular
Function: polynucleotide adenylyltransferase activity (GO:0004652);
Molecular Function: protein binding (GO:0005515); Cellular Component:
nucleus (GO:0005634); Biological Process: transcription, DNA-templated
(GO:0006351); Biological Process: RNA polyadenylation (GO:0043631);
38 BnaC08g39120D C08 up –
39 BnaA09g45320D A09 down Molecular Function: copper ion binding
(GO:0005507); Molecular Function: calmodulin binding (GO:0005516);
Molecular Function: ATP binding (GO:0005524); Cellular Component:
mitochondrion (GO:0005739); Cellular Component: cytosol (GO:0005829);
Biological Process: gluconeogenesis (GO:0006094); Biological Process:
glycolytic process (GO:0006096); Biological Process: protein folding
(GO:0006457); Biological Process: tryptophan catabolic process
(GO:0006569); Biological Process: response to heat (GO:0009408);
Biological Process: response to cold (GO:0009409); Cellular Component:
chloroplast thylakoid membrane (GO:0009535); Cellular Component:
chloroplast stroma (GO:0009570); Biological Process: response to high
light intensity (GO:0009644); Biological Process: response to salt
stress (GO:0009651); Biological Process: chloroplast organization
(GO:0009658); Biological Process: indoleacetic acid biosynthetic
process (GO:0009684); Cellular Component: chloroplast envelope
(GO:0009941); Biological Process: isopentenyl diphosphate biosynthetic
process, methylerythritol 4-phosphate pathway (GO:0019288); Biological
Process: cysteine biosynthetic process (GO:0019344); Biological
Process: response to endoplasmic reticulum stress (GO:0034976);
Biological Process: response to hydrogen peroxide (GO:0042542);
Biological Process: response to cadmium ion (GO:0046686); Cellular
Component: apoplast (GO:0048046); Biological Process: ovule development
(GO:0048481); Molecular Function: chaperone binding (GO:0051087);
Biological Process: positive regulation of superoxide dismutase
activity (GO:1901671)
[112]Open in a new tab
Fig. 6.
[113]Fig. 6
[114]Open in a new tab
Expression patterns of 39 DEGs located in the two associated regions.
The expression levels are given in log[2](FPKM+ 1). FPKM: Fragments per
Kilobase Million. T01, T02, and T03: three random samples from zws-ms
at the budding stage; T04, T05, and T06: three random samples from
zws-217 at the budding stage
Of the 39 DEGs, three (BnaA09g45320D (Bna.A09CPN10), BnaC08g39130D
(Bna.C08CPN10), and BnaC08g41780D (Bna.C08SRS)) were assigned to GO
categories involved in ovule development (GO: 0048481), BnaC08g42450D
(Bna.C08RLK7) in stamen development (GO: 0048443) and BnaC08g41720D
(Bna.C08APA1) in organ development (GO: 0048513).
Validation of transcriptome sequencing data by qPCR
We validated the transcriptome sequencing data by comparing the
relative transcript levels of these DEGs in zws-ms and zws-217 by qPCR.
qPCR analysis of plants in buds showed that all genes except
BnaC08g39120D had similar trends in expression as those observed by
transcriptome sequencing, especially BnaA09g45320D and BnaC08g40410D
(Figs. [115]6 and [116]7). These results further confirm the
reliability of the transcriptome sequencing data described above.
Fig. 7.
[117]Fig. 7
[118]Open in a new tab
Relative expression levels (folds) of 10 selected genes determined by
qPCR. Note: a: BnaA09g45320D; b: BnaC08g40410D; c: BnaA09g45890D; d:
BnaC08g41720D; e: BnaC08g42080D; f: BnaC08g40740D; g: BnaA09g45310D; h:
BnaA09g47900D; i: BnaC08g39120D; j: BnaC08g41780D
Orthologs of these 12 candidate genes in Arabidopsis
These 12 candidate genes mentioned above, including seven with
non-synonymous mutations or frame-shift mutations, as well as the five
DEGs, were then aligned with Arabidopsis and their orthologs
(Table [119]3) in this well-studied model plant were identified based
on The Arabidopsis Information Resource (TAIR,
[120]https://www.arabidopsis.org): (1) AT2G22370 (ortholog of
BnaA09g42700D), identified as MED18, encodes a subunit of the mediator
complex that affects flowering time and floral organ formation through
FLOWERING LOCUS C (FLC) and AGAMOUS (AG); (2) AT2G21750, identified as
EGG CELL 1.3 (EC1.3), encodes a small cysteine-rich protein secreted by
the egg cell during gamete interactions; (3) AT2G21540 is an SEC14-LIKE
3 (ATSFH3) protein; (4) AT1G19350 (BZR2 or BES1) encodes
brassinosteroid signalling protein that accumulates in the nucleus as
dephosphorylated form in response to brassinosteroid. (5) BnaA09g45320D
and BnaC08g39130D share the same ortholog, AT1G14980, known as CPN10,
encoding mitochondrial-localized chaperonin 10; (6) AT1G08260, known as
TIL1 gene, encodes the catalytic subunit of DNA polymerase ε (abo4–1);
(7) AT2G20570 (GPRI1 or GLK1) encodes one of a pair of partially
redundant nuclear transcription factors that regulate chloroplast
development in a cell-autonomous manner; (8) AT1G11910 (APA1) encodes
an aspartic proteinase that forms a heterodimer and is stable over a
broad pH range; (9) AT1G11870 is identified as OVA7 (SRS), of which
expression products Seryl-tRNA synthetase targeted to chloroplasts and
mitochondria; (10) AT1G09970 (RLK7) belongs to a leucine-rich repeat
class of receptor-likekinase (LRR-RLKs); (11) AT1G16710 (HAC12) encodes
an enzyme with histone acetyltransferase activity that can use both H3
and H4 histones as substrates.
Table 3.
The 14 candidate genes and their orthologs in Arabidopsis
Gene in B. napus orthologs in Arabidopsis
Gene ID Description
BnaA09g42700D AT2G22370 MEDIATOR 18 (MED18)
BnaA09g42920D AT2G21750 EGG CELL 1.3
BnaA09g43100D AT2G21540 SEC14-like 3, ATSFH3
BnaA09g44210D AT1G19350 BRASSINAZOLE-RESISTANT 2 (BZR2),
BRI1-EMS-SUPPRESSOR 1 (BES1)
BnaA09g45320D AT1G14980 chaperonin 10 (CPN10)
BnaA09g48960D AT1G08260 TILTED 1 (TIL1)
BnaC08g36330D AT2G20570 GBF’s pro-rich region-interacting factor 1
(GPRI1), GOLDEN2-LIKE 1 (GLK1)
BnaC08g38270D AT1G16710 histone acetyltransferase of the CBP family 12
(HAC12)
BnaC08g39130D AT1G14980 chaperonin 10 (CPN10)
BnaC08g41720D AT1G11910 aspartic proteinase A1 (APA1)
BnaC08g41780D AT1G11870 Seryl-tRNA synthetase (SRS), OVULE ABORTION 7
(OVA7)
BnaC08g42450D AT1G09970 LRR XI-23, RECEPTOR-LIKE KINASE 7 (RLK7)
[121]Open in a new tab
Discussion
In the current study, we constructed NILs to establish the two extreme
DNA pools (multi-silique pool and single-silique pool), instead of the
frequently used F[2] [[122]28–[123]30], doubled haploid (DH) [[124]25,
[125]31], or recombinant inbred line (RIL) [[126]32, [127]33]
populations. BSA [[128]20] combined with NGS [[129]34], which has been
successfully used in various crops [[130]29, [131]30, [132]32, [133]35,
[134]36]. Here, the WGR produced abundant, high-quality data. After
filtration, 1,135,690 HQ SNPs and 289,832 InDels were identified and
used to conduct association analysis. These markers were clearly
superior to traditional markers in terms of number, efficiency, and
quality: as we expected, these markers were much denser than SSR or
other markers and the combination of BSA and WGR saved the labors and
shortened the whole process. As we were unable to use the SNP-index
method in this study, we used the ED algorithm instead, followed by the
identification of three SNP-associated regions, including one on
chromosome A09 and two on chromosome C08. The two regions on chromosome
C08 were very close to each other (only 0.14 Mb apart), and the smaller
region was only 0.07 Mb long, suggesting that it is likely a minor
site. Moreover, we identified two InDel-associated regions. Taken
together, the SNP-associated region data were highly consistent in
terms of position with those of the InDel-associated regions,
confirming the reliability of both data sets. Therefore, the two
intersecting regions are considered to be highly associated with the
multi-silique trait, representing the only difference between the
zws-ms and zws-217 pools. Preliminary analysis indicated that these two
regions contain 2044 genes, including 126 with non-synonymous mutations
and 23 with frame-shift mutations.
Since these genes were identified by WGR, they had small differences in
sequence, which consequently led to changes in the amino acid
sequences, structures, and final functions of the proteins. Since
zws-ms and zws-217 are NILs that only differ from each other in the
multi−/single-silique trait, we reasoned that these genes are
associated with this trait and were therefore selected as candidate
genes. However, some genes might have differed between the two pools in
terms expression levels rather than nucleotide sequence. Therefore, we
performed transcriptome sequencing from buds in plants to uncover the
DEGs between lines, which we then confirmed by qPCR. We identified 129
DEGs between zws-ms and zws-217, which are distributed on all
chromosomes except chromosome A06. Among the 129 DEGs, 39 are located
in the two important trait-associated regions identified in this study
and are therefore considered highly related to the multi-silique trait.
These DEGs were classified into two groups based on expression level,
including one containing upregulated genes and the other containing
downregulated genes.
A comprehensive understanding of the functions of the 2044 genes in the
two intersecting regions is crucial for candidate gene selection.
Unlike standard agronomic traits (such as plant height and flowering
time), which are relatively well-studied and for which some candidate
genes are available for reference, the multi-silique trait analyzed in
this study has not been studied on genomic or molecular level in detail
to date. Thus, GO annotation and KEGG pathway analysis provided
important clues for candidate gene identification. Among the 2044 genes
within the two associated regions, 2041 were successfully annotated.
KEGG pathway analysis provided a general view of the genes in these
regions. Most enriched KEGG pathways were involved in the biosynthesis
or degradation of some secondary metabolites.
As we were unable to relate this information to the multi-silique trait
using established knowledge, we conducted a GO analysis. Eight of these
DEGs were annotated in the category “negative regulation of flower
development (GO: 0009910)” and 221 genes were involved in “organ
development (GO: 0048513)”. This information is too general to reveal
an association with flower/silique development. In addition, most genes
within the two regions were the same between the two DNA pools. Thus,
more attention should be paid to genes showing polymorphism between
zws-ms and zws-217 plants, specifically genes harboring non-synonymous
mutations or frame-shift mutations. Furthermore, 104 of the 126 genes
with non-synonymous mutations and 19 of the 23 genes with frame-shift
mutations were represented in the GO database. We examined each of
their annotations and identified seven interesting genes, which were
annotated to ovule, carpel or stamen development. The NILs zws-ms and
zws-217 differ only in terms of flower and silique morphology and
structure. Therefore, genes that differ in zws-ms versus zws-217 at the
genomic level are thought to encode diverse proteins, suggesting they
are excellent candidate genes for the multi-silique trait.
At the transcriptome level, we focused on DEGs, especially those
located in the associated regions mentioned above. Thirty-nine out of
129 DEGs were found in the two intersecting regions, among which five
genes were annotated to the ovule or stamen development. This
annotation information strongly suggests that these DEGs are involved
in the multi-silique trait, as zws-ms flowers have obviously abnormal
morphology, consequently leading to the formation of abnormal fruits.
Then, we aligned these selected genes to the Arabidopsis, screening for
homologous genes sharing the highest sequence identity with them, but
did not identify more potential candidate genes other than these 12
genes already mentioned (BnaA09g43100D, BnaC08g38270D, BnaC08g36330D,
BnaA09g44210D, BnaA09g48960D, BnaA09g42920D and BnaA09g42700D with
non-synonymous mutations or frame-shift mutations, as well as
BnaA09g45320D, BnaC08g39130D, BnaC08g41780D, BnaC08g42450D and
BnaC08g41720D with different expression levels). Based on analysis of
these orthologs in Arabidopsis and their roles in plants, the annotated
information of these candidate genes are confirmed: (1) BnaA09g42700D
is found much shorter than its Arabidopsis ortholog AT2G22370 (MED18)
in length, but in their overlapping region, they share 100% identity of
nucleotide sequence. Mutation of MED18 in Arabidopsis significantly
delayed its flowering time [[135]37]. FLC encodes an MADS-box protein
and is a repressor for flowering [[136]38]; AG is a floral meristem
identity gene, which specifies carpel and stamen identity in the flower
[[137]39, [138]40]. Thus, BnaA09g42700D could probably act on flowering
indirectly. (2) Disruption of AT1G11870 results in ovule abortion in
Arabidopsis [[139]41]. So the different expression of BnaC08g41780D
between zws-ms and zws-217 may result some unknown process in fruit
development. (3) The knockdown of HAC1 causes disturbances in flower
morphology in the Arabidopsis, such as a modified petal shape or even
absence of petals [[140]42]. HAC12 has high identity with HAC1, but
what morphology changes it can make are still unknown. (4) SEC14-LIKE 3
(ATSFH3) is predominantly expressed in the flowers and involved in the
transfer of phosphatidylinositol or phosphatidylcholine phospholipids
during flower development in Arabidopsis [[141]43, [142]44]. In
chickpea (Cicer arietinum L.), SEC14-LIKE3 is also considered a
potential candidate flowering time-regulating genes by mapping
[[143]45]. (5) EC1.3 triggers sperm cell activation during double
fertilization and knockdown of it reduces seed number in siliques
[[144]46]. (6) The brassinosteroid signal is found involved in ovule
initiation and development, by regulating the expression level of BZR1,
of which dephosphorylation increases ovule and seed number in
Arabidopsis [[145]47]. In this study, BnaA09g44210D has non-synonymous
mutation between zws-ms and zws-217, probably causes multi-silique
trait by some undiscovered pathways. (7) ATGLK1 (GPRI1) can regulate
the silique number in Arabidopsis: the glk1/glk2 double mutant realized
only about 1/3 of the mean silique number per plant found in the wild
type, due to changes in silique wall and leaf photosynthesis [[146]48].
(8) The Arabidopsis mutant til1–4/til1–4 has altered floral
phyllotaxis, reduced ovule number, abnormally developed ovules, and
reduced fertility [[147]49]. (9) APA1: In Cynara cardunculus, the
aspartic proteinase cardosin B is found playing an important role in
ovule function [[148]50]. (10) RLK7: Stamen development depends on a
teamwork of receptor-like kinases [[149]51, [150]52]. (11) CPN10 is
involved in response to many physiological process, such as seed
abortion in plants [[151]53]. Although no report directly leads to
multi-silique trait so far, these orthologs are all related to flower
development, flower morphology change, fruit development and even the
number of ovules/seeds. It is due to the scarcity of the multi-silique
trait to some extent. Moreover, this multi-silique trait is
theoretically more complex, since it is controlled by multi-loci.
Anyhow, these clues validated the 12 candidate genes.
In summary, the unique multi-silique rapeseed zws-ms, examined in this
study represents an enriched rapeseed germplasm resource, which is
considered the basis of crop breeding. Thus far, 12 candidate genes in
two genomic regions associated with the multi-silique trait have been
identified, including seven genes with sequence changes and five genes
with expression level differences between the multi-silique and
single-silique NILs. This information lays the foundation for future
research, such as candidate genes isolation, their functional
verification and so on. Several unanswered questions remain. A previous
study suggested that the multi-silique is likely controlled by three
allelic genes [[152]19], but in the current study, we only identified
two associated regions, which harbor no more than two independent
genes. This finding suggests that one more gene/factor remains
unidentified. We also noted that the multi-silique trait is highly
sensitive to the environment. For example, when zws-ms plants were
planted in Xindu (altitude of 472 m, 30°47′10″N 104°12′12″E), the
multi-silique trait was continuously stable for years; but when they
were grown in Ma’erkang (altitude of 2540 m, 31°94′15″N 102°14′29″E),
the multi-silique trait disappeared, and all plants had normal siliques
(Additional file [153]12: Fig. S2). Interestingly, once we transferred
zws-ms seeds harvested in Ma’erkang back to Xindu, the multi-silique
trait was recovered. By contrast, zws-217 maintained the single-silique
phenotype at both locations.
Thus, there might be some epigenetic reason for this trait, such as
methylation of a base that does not alter the DNA sequence of this
underlying locus but is strongly affected by environmental factors. We
noticed that there were several genes in the environmental response
category in the associated regions with SNP/InDel variations between
the two DNA pools: BnaA09g45600D was annotated as “cellular response to
water deprivation (GO: 0042631)” and BnaA09g44900D as “response to cold
(GO: 0009409)”. Moreover, several DEGs were also annotated to
environment response, like BnaA09g45320D was also annotated as
“response to cold” and so on. Since these genes are responsive to
environmental factors, such as cold, water, and light intensity, their
functions vary when their sequences or expression levels change. Thus,
they may have different effects under different environmental
conditions. Furthermore, Xindu is located in the Sichuan Basin, which
has a humid subtropical monsoon climate (with an annual average
temperature of 16.2 °C), whereas Ma’erkang is located in a mountainous
area in western Sichuan, where the annual average temperature is only
8.6 °C. The environmental factor with the greatest difference between
the two locations is temperature. Thus, we propose that temperature is
the most important environmental factor affecting the multi-silique
trait in zws-ms, although other factors such as light intensity are
also likely involved in regulating this trait.
Conclusions
A novel trait in rapeseed, multi-siliques, was investigated at the
genome level by association analysis based on WGR, followed by analysis
at the transcriptome level. The two regions associated with this trait
contain seven genes with non-synonymous mutations or frame-shift
mutations annotated to floral organ-related GO terms, as well as five
other vital DEGs. These genes are interesting candidate genes for the
multi-silique trait.
Methods
Plant materials and population development
The original multi-silique rapeseed line was discovered at the Crop
Research Institute, Sichuan Academy of Agricultural Sciences from a
population derived from a cross between B. napus and B. rapa.
Multi-silique plants were grown and self-pollinated for six successive
generations to obtain the homozygous multi-silique zws-ms plants.
Single-silique offspring were continuously backcrossed for six
generations with zws-ms as the recurrent parent, followed by six
continuous generations of self-pollination to obtain the near-isogenic
line zws-217 with the single-silique phenotype (Fig. [154]1). The NILs
zws-217 and zws-ms were grown in the field under standard conditions at
the Sichuan Academy of Agricultural Sciences in the Xindu District of
Sichuan Province, China.
DNA preparation and pool construction
The rate of multi-silique formation per zws-ms plant was determined by
phenotyping at the full-bloom stage (BBCH 67). The multi-silique rate
per plant was calculated as:
[MATH: multi−silique
rate=number of
three−pistiled flowersperplanttotal number
of flowers in this plant :MATH]
Thirty plants with high multi-silique rates were selected for DNA
isolation, while 30 zws-217 plants were randomly selected. Fresh leaves
were sampled from these plants (zws-ms and zws-217) and subjected to
DNA extraction by the cetyltrimethyl ammonium bromide (CTAB) method
[[155]54]. RNA contamination was removed from each sample using RNase
A. The DNA was quantified using a NanoDrop 2000 Spectrophotometer
(Thermo Scientific, USA). Samples with optical density 260/280 ratios
ranging from 1.8 to 2.2 were considered adequate. Equal quantities of
DNA were pooled by line (multi-silique and single-silique), with 30
samples per line, to a final concentration of 40 ng/μl.
Whole genome re-sequencing
Pooled DNA samples were sheared into ~ 350 bp fragments using a Covaris
S2/E210 DNA Shearing kit. The sheared DNA was end-repaired and a single
nucleotide (A) overhang was subsequently added to the repaired
fragments by adding a Klenow DNA polymerase Fragment (3′ → 5′ exo–)
(New England Biolabs, Ipswich, MA, USA) and dATP at 37 °C. Barcodes and
Illumina sequencing adapters were ligated to the A-tailed fragments
using T4 DNA Ligase (Takara, Dalian, China). PCR was performed on both
pooled samples using diluted prepared (sheared and ligated) DNA,
deoxyribonucleotide triphosphate, Q5® High-Fidelity DNA Polymerase, and
PCR primers. The PCR products were purified using Agencourt AMPure XP
Beads (Beckman Coulter, High Wycombe, UK). Fragments 300 to 500 bp in
length (including barcodes and adaptors) were excised and purified
using a QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany).
Gel-purified products were diluted for pair-end sequencing (150 bp on
each end) on an Illumina HiSeq system following the standard protocol
(Illumina, Inc.; San Diego, CA, USA) at the Biomarker Technologies
Corporation (Beijing, China). The sequencing depth was roughly 30 × .
SNP calling
Low-quality reads with quality scores <20e were filtered out and raw
reads were sorted according to their barcode sequences. After the
barcodes were trimmed, clean HQ reads were mapped to the
Brassica_napus.v4.1.fa genome
([156]http://www.genoscope.cns.fr/brassicanapus/data/) using the
Burrows-Wheeler Aligner software [[157]55]. SAMtools [[158]56] was used
to mark duplicates, and GATK [[159]57] was used for local realignment
and base recalibration. An SNP set was formed by combining the results
of analysis with GATK and SAMtools via SNP calling using default
parameters. SNPs identified between the two lines were regarded as
polymorphic for association analysis.
Association analysis
Euclidean distance (ED) association analysis is a method that
calculates ED values (quadratic sum root of differences between bulks
from the depth of four types of bases) and is satisfied by ED. In
theory, the higher the ED value is, the closer the object site
[[160]25]. The ED values were calculated as follows [[161]27]:
[MATH: ED=Ams−A2172+Cms−C2172+Gms−G2172+Tms−T2172 :MATH]
In this formula, A[ms], C[ms], G[ms], and T[ms] represent the depth of
bases A, C, T, and G on a site in the multi-silique pool, and A[217],
C[217], G[217], and T[217] represent the depth of bases A, C, T, and G
on a site in the single-silique pool, respectively. To remove
background noise, the original ED^5 value was used, and the adjusted
values were fit using the DISTANCE method. Regions over a threshold
value were considered to be trait-related candidate regions. The
associated threshold value was determined based on ED + 3SD (standard
deviation) [[162]27]. InDel-associated regions were obtained via a
similar method.
RNA isolation for transcriptome analysis
Three individual plants of the zws-217 (T04, T05, and T06) line were
randomly selected for RNA isolation. Eight to ten buds (BBCH 57) per
plant were randomly selected and sampled. For the zws-ms, three plants
(T01, T02, and T03) were selected, and buds were sampled from each
plant. All buds were quick-frozen and stored in liquid nitrogen. Before
RNA isolation, the buds from three zws-ms plants were sliced with
tweezers and observed. Only buds containing three pistils were
considered to be correct multi-silique buds and used for RNA isolation.
Buds from each plant were mixed to isolate the RNA, yielding three
samples from zws-217 (T04, T05, and T06) and three samples from zws-ms
(T01, T02, and T03). Since zws-217 and zws-ms were NILs, and these
plants differ from each other only in the multi-silique trait, we
reasoned that DEGs between these lines might be related to this trait.
Total RNA was isolated using an RNA Isolation Kit (Tiangen, Beijing,
China) and the concentration measured using a NanoDrop 2000 (Thermo).
RNA integrity was assessed using an RNA Nano 6000 Assay Kit and the
Agilent Bioanalyzer 2100 system (Agilent Technologies, CA, USA).
RNA library preparation and transcriptome sequencing
For each tissue sample, 1 μg RNA was used as input material for RNA
sample preparation. Sequencing libraries were generated using a NEBNext
Ultra™ RNA Library Prep Kit for Illumina (NEB, USA) following the
manufacturer’s instructions with index codes added to associate
sequences to each sample. Briefly, mRNA was purified from total RNA
using poly-T oligo-attached magnetic beads. Fragmentation was carried
out using divalent cations under elevated temperature in NEBNext First
Strand Synthesis Reaction Buffer (5×). First-strand cDNA was
synthesized using random hexamer primer and M-MuLV Reverse
Transcriptase. Second-strand cDNA synthesis was subsequently performed
using DNA Polymerase I and RNase H. The remaining overhangs were
converted into blunt ends using exonuclease/polymerase. After
adenylation of the 3′ ends of DNA fragments, a NEBNext Adaptor with a
hairpin loop structure was ligated to prepare for hybridization. To
select 240 bp cDNA fragments, the library fragments were purified using
the AMPure XP system (Beckman Coulter, Danvers, MA USA). The
size-selected, adaptor-ligated cDNA was incubated with 3 μl USER Enzyme
(NEB, USA) at 37 °C for 15 min, followed by 5 min at 95 °C. PCR was
then performed with Phusion High-Fidelity DNA polymerase, universal PCR
primers, and an Index (X) Primer. The PCR products were purified
(AMPure XP system) and library quality assessed on the Agilent
Bioanalyzer 2100 system.
Clustering of index-coded samples was performed on a cBot Cluster
Generation System using a TruSeq PE Cluster Kit v4-cBot-HS (Illumina)
according to the manufacturer’s instructions. After cluster generation,
the libraries were sequenced on the Illumina HiSeq X-ten platform and
paired-end reads were generated.
Raw reads in fastq format were initially processed using in-house Perl
scripts. During this step, clean reads were obtained by removing
adapter sequences, reads containing poly-N, and low-quality reads. At
the same time, the Q20, Q30, GC-content, and sequence duplication
levels of the clean reads were calculated. All downstream analyses were
based on clean HQ reads.
Clean reads were mapped to the reference genome sequence, and only
reads with a perfect match or one mismatch were further analyzed and
annotated based on the reference genome
([163]http://www.genoscope.cns.fr/brassicanapus/data/). Tophat2 tools
(parameters: --read-mismatches 2 --read-edit-dist 2 --max-intron-length
5,000,000 --library-type fr-unstranded --mate-inner-dist 40) were used
to map clean reads to the reference genome.
Differential expression analysis
Differential expression analysis of the two lines was performed using
the DESeq R package [[164]58]. DESeq provides statistical models based
on the negative binomial distribution to identify DEGs using gene
expression data. The resulting P values were adjusted using the
Benjamini and Hochberg’s approach for controlling the false discovery
rate. Genes with an adjusted fold change of 4 and a p-value ≤0.01 were
considered differentially expressed.
Reverse-transcription quantitative PCR (qPCR) validation
To confirm the validity of the DEGs identified by transcriptome
sequencing, qPCR was performed. Reverse transcription to cDNA was
conducted using a PrimeScript First-strand cDNA Synthesis Kit (Takara,
Dalian, China). The relative transcript levels of 10 selected DEGs (at
the budding stage BBCH 57) were re-examined by qPCR, with BnACTIN used
as the internal control. The primer sequences (Additional file [165]13:
Table S11) for the genes were designed with Premier Primer 3.0
software. All qPCR was performed on a Roche480 instrument (Roche,
Basel, Switzerland). Amplification reactions were performed as follows:
an initial denaturation step at 95 °C for 3 min, 39 cycles at 95 °C for
10 s, 58 °C for 30 s, and 72 °C for 30 s, a final extension at 72 °C
for 10 min and hold at 15 °C.
Additional files
[166]Additional file 1:^ (209.1KB, docx)
Figure S1. Whole genome with associated regions. (DOCX 209 kb)
[167]Additional file 2:^ (15KB, docx)
Table S1. Summary of the transcriptome sequencing data. (DOCX 15 kb)
[168]Additional file 3:^ (15.7KB, docx)
Table S2. Information about the mapped reads based on the transcriptome
sequencing data. (DOCX 15 kb)
[169]Additional file 4:^ (340.7KB, xlsx)
Table S3. Annotations of the 2041 genes in the two intersection
associated regions. (XLSX 340 kb)
[170]Additional file 5:^ (123.4KB, xlsx)
Table S4. GO enrichment in biological processes (BP) category of the
genes. (XLSX 123 kb)
[171]Additional file 6:^ (26.6KB, xlsx)
Table S5. GO enrichment in cellular components (CC) category of the
genes. (XLSX 26 kb)
[172]Additional file 7:^ (50.7KB, xlsx)
Table S6. GO enrichment in molecular function (MF) category of the
genes. (XLSX 50 kb)
[173]Additional file 8:^ (34.3KB, docx)
Table S7. GO annotations of the 104 genes with non-synonymous mutation
SNPs in the associated regions. (DOCX 34 kb)
[174]Additional file 9:^ (17.5KB, docx)
Table S8. GO annotations of the 19 genes with frame-shift mutation
InDels in the associated regions. (DOCX 17 kb)
[175]Additional file 10:^ (12.2KB, xlsx)
Table S9. KEGG enrichment of the 2041 genes in the intersection
associated regions. (XLSX 12 kb)
[176]Additional file 11:^ (41KB, docx)
Table S10. Information about the DEGs between zws-ms and zws-217. (DOCX
40 kb)
[177]Additional file 12:^ (1.8MB, docx)
Figure S2. Stability of the multi-silique trait under one environment
and variation of this trait across different environments. (DOCX 1840
kb)
[178]Additional file 13:^ (15.5KB, docx)
Table S11. Primers used for qPCR. (DOCX 15 kb)
Acknowledgements