Abstract

Background

   Although rapeseed (Brassica napus L.) mutant forming multiple siliques
   was morphologically described and considered to increase the silique
   number per plant, an important agronomic trait in this crop, the
   molecular mechanism underlying this beneficial trait remains unclear.
   Here, we combined bulked-segregant analysis (BSA) and whole genome
   re-sequencing (WGR) to map the genomic regions responsible for the
   multi-silique trait using two pools of DNA from the near-isogenic lines
   (NILs) zws-ms (multi-silique) and zws-217 (single-silique). We used the
   Euclidean Distance (ED) to identify genomic regions associated with
   this trait based on both SNPs and InDels. We also conducted
   transcriptome sequencing to identify differentially expressed genes
   (DEGs) between zws-ms and zws-217.

Results

   Genetic analysis using the ED algorithm identified three SNP- and two
   InDel-associated regions for the multi-silique trait. Two highly
   overlapped parts of the SNP- and InDel-associated regions were
   identified as important intersecting regions, which are located on
   chromosomes A09 and C08, respectively, including 2044 genes in 10.20-MB
   length totally. Transcriptome sequencing revealed 129 DEGs between
   zws-ms and zws-217 in buds, including 39 DEGs located in the two
   abovementioned associated regions. We identified candidate genes
   involved in multi-silique formation in rapeseed based on the results of
   functional annotation.

Conclusions

   This study identified the genomic regions and candidate genes related
   to the multi-silique trait in rapeseed.

Electronic supplementary material

   The online version of this article (10.1186/s12864-019-5675-4) contains
   supplementary material, which is available to authorized users.

   Keywords: Association analysis, Brassica napus L., Multi-silique,
   Near-isogenic line, Transcriptome sequencing, Whole genome
   re-sequencing

Background

   In flowering plants, the pistil develops into a fruit, a process
   crucial for both yield and seed quality. Many recent studies have
   focused on abnormal pistil development, such as the formation of
   pistillodes (sterile pistils), pistil number variation, and mutations
   affecting pistil length [[43]1–[44]6]. Of these, pistil number
   variation, or the multi-pistil phenomenon, has also been reported in
   other plants, such as alfalfa (Medicago sativa) [[45]1, [46]7], wheat
   (Triticum aestivum) [[47]8, [48]9], sweet cherry (Prunus avium)
   [[49]10], and rye (Secale cereale) [[50]11].

   Brassica crops are important sources of edible oil, vegetables,
   industrial erucic acid, animal feed (seed meal), and ornamental plants.
   Like most Brassicaceae family members, Brassica species have
   monoclinous flowers, with one pistil and six stamens per normal flower.

   Morphological variation in siliques (pods) has been reported in several
   Brassica species. In the 1940s, Pathak [[51]12] reported an abnormal
   plant of Brassica campestris L. var. sarson prain, which had one to two
   extra pods telescoped into the outermost pods. The inner pods was
   basically normal but were contorted and smaller than those of the wild
   type. Multi-locular Brassica juncea [[52]13], trilocular B. juncea
   [[53]14, [54]15] and Brassica rapa [[55]16] were obtained from
   interspecific hybridization. Unlike previously reported fruit
   variations, zws-ms flower has three completely independent and
   apocarpous pistils and nine to ten stamens sharing the same receptacle
   in each floral organ, resulting in the formation of three independent
   siliques sharing the same carpopodium. Thus, zws-ms has an increased
   number of pod on each carpopodium rather than an increased number of
   loculi per pod. Similar phenotypes have also been observed previously:
   Guan and Li [[56]17] obtained a line with double siliques through
   treatment of B. napus with ^12C heavy ion beam irradiation. Hu et al.
   [[57]18] identified a multi-pistil plant in a B. campestris L.
   cytoplasmic male sterile line, different from our zws-ms plants which
   were obtained by crossing of B. napus and B. rapa [[58]19]. Though the
   agronomic and physiological traits, and genetic mechanism of zws-ms
   plants have been analyzed, the molecular mechanism underlying
   multi-pistil formation, which increased the silique number per plant in
   rapeseed, remains elusive.

   Bulked segregant analysis (BSA) [[59]20] is a simple but effective
   strategy for identifying molecular markers linked to target genes by
   genotyping a pair of bulked DNA samples from two populations with
   distinct phenotypes. BSA was usually combined with traditional markers,
   such as simple sequence repeat (SSR) [[60]21], sequence related
   amplified polymorphism (SRAP) [[61]22] or random amplified polymorphic
   DNA (RAPD) [[62]23] markers to map traits. Recent years, combination of
   BSA and next-generation sequencing (NGS) have been successfully used in
   the identification of key genes/loci related to regular agronomic
   traits in B. napus. For example, Wang et al. combined BSA with NGS to
   map quantitative trait loci (QTL) for the branch angle and revealed
   BnaYUCCA6 was candidate gene controlling branch angle [[63]24]. Geng et
   al. combined specific length amplified fragment sequencing (SLAF-seq)
   with BSA and identified four candidate genes related to seed weight
   [[64]25]. Yao et al. mapped the orange petal color gene (Bnpc1) in
   spring B. napus to a 151-kb region via BSA with whole genome
   re-sequencing (WGR) [[65]26]. Hence, we believed that combining BSA and
   WGR would be a convenient way to map the genomic regions responsible
   for the multi-silique trait.

   In this study, we performed association analysis based on SNP and InDel
   data using a Euclidean Distance (ED) algorithm to identify the gene
   underlying the multi-silique trait in rapeseed. Though the SNP-index
   method is also a classical method for association region
   identification, and has been employed in many studies [[66]25, [67]26].
   However, it is suitable in the case when four DNA samples are
   sequenced: two parental plants and two extreme pools from segregated
   population. In this study, only two pools from NILs multi-silique pool
   and single-silique pool were sequenced, therefore, only ED algorithm
   was useful in this situation. We also performed transcriptome
   sequencing to identify differentially expressed genes (DEGs). Based on
   functional annotation of the genes with SNPs or InDels and DEGs in the
   associated regions, we identified candidate genes that might be closely
   related to the multi-silique trait.

Results

Morphological analysis of the multi-silique trait in zws-ms

   Under normal growth conditions, the NILs (Fig. [68]1) zws-ms
   (multi-silique) and zws-217 (single silique) showed no phenotypic
   differences (e.g., plant height, leaf shape, and number of branches) at
   the vegetative stage. However, zws-ms had shorter and rounder buds than
   zws-217 (Fig. [69]2a). At the full-bloom stage, zws-ms showed distinct
   floral structures with zws-217. In contrast to normal Brassicaceae
   flowers, which typically contain a single pistil and six stamens (four
   tetradynamous stamens and two shorter stamens), zws-ms flowers had
   three completely independent apocarpous pistils sharing the same
   receptacle in each floral organ (Fig. [70]2b). The zws-ms flower had
   nine to ten stamens, six of which were more prominent than the others
   (Fig. [71]2b). In zws-ms flowers, all pistils and stamens shared the
   same receptacle, but the petals and calyxes appeared normal. Roughly
   32~53% of flowers in a zws-ms plant had this mutant multi-pistil and
   stamen configuration, whereas the rest flowers in the same plant were
   normal. The abnormal zws-ms flowers developed into three independent
   siliques sharing the same carpopodium as the flowers (Fig. [72]2c). The
   triple siliques were normal in length, but occasionally one became
   degenerated and curved.

Fig. 1.

   [73]Fig. 1
   [74]Open in a new tab

   Scheme for constructing the multi-silique population (zws-ms) and the
   corresponding single-silique NIL population (zws-217)

Fig. 2.

   [75]Fig. 2
   [76]Open in a new tab

   Morphological differences between zws-217and zws-ms. a: at the budding
   stage, zws-ms has inflated, round buds; b: at the full-bloom stage, a
   normal flower has one pistil and six stamens, while zws-ms has three
   pistils and nine stamens per floral organ; c: a normal silique in
   zws-217 and the three-silique trait in zws-ms. These images were taken
   originally in 2017

Association analysis

   High-throughput whole genome re-sequencing on the Illumina HiSeq
   platform produced a total of 89.56 Gb of raw data. The Q30 ratio
   averaged 90.52%. After alignment to the reference genome, the average
   coverage was 91.78% and the average sequencing depth was 40×. When the
   clean reads were mapped to the B. napus reference genome, each of them
   was aligned in both forward and reverse directions; thus, the number of
   total reads was twice of the clean reads: zws-ms and zws-217 contained
   328,507,566 and 263,588,948 reads, with 89.89 and 89.10% properly
   mapped to the reference genome, respectively.

   After filtration, 1,135,690 SNPs and 289,832 small InDels of high
   quality (HQ) were used for association analysis. We used the ED
   algorithm due to only two pools from NILs zws-ms and zws-217 were
   sequenced. According to the ED algorithm [[77]27], the association
   threshold was 0.26. Based on this information, three SNP-associated
   regions were identified, including one on chromosome A09 (Fig. [78]3a)
   and two on chromosome C08 (Fig. [79]3b). One smaller associated region
   on the chromosome C08 that close to another larger one (Fig. [80]3b)
   was also detected. These regions had a size of 10.45 MB and contained
   2095 genes. Similarly, two InDel-associated regions were identified on
   chromosomes A09 (Fig. [81]3c) and C08 (Fig. [82]3d) totaling 10.20 MB
   and containing 2044 genes, with an association threshold value of 0.30.
   Combining the SNP-associated regions and InDel-associated regions, two
   merged intersecting regions (Table [83]1, Additional file [84]1: Figure
   S1) were obtained on chromosomes A09 and C08, with 2044 genes in this
   10.20 MB region, including 126 genes with non-synonymous mutations and
   23 with frame-shift mutations.

Fig. 3.

   [85]Fig. 3
   [86]Open in a new tab

   Associated regions calculated using the ED algorithm. Black solid lines
   represent the fitted values; red dotted lines represent the associated
   threshold. a: SNP-associated region on ChrA09; b: two SNP-associated
   regions on ChrC08; the smaller region is indicated by a red arrow; c:
   InDel-associated region on ChrA09; d: InDel-associated region on ChrC08

Table 1.

   Information about the two intersecting regions
   Chromosome Start position End position Size (Mb) Number of genes
   A09        28,990,000     33,790,000   4.8       998
   C08        32,340,000     37,740,000   5.4       1046
   Total      –              –            10.2      2044
   [87]Open in a new tab

Transcriptome sequencing, data mapping, and annotation

   We performed sequencing saturation and cluster analysis of the samples
   to ensure the validity of the data. In total, 61 Gb of raw data were
   generated, with an average Q30 value of 90.54%. For each sample,
   approximately 36.7 M clean reads were generated, with an average GC
   content of 47.23% (Additional file [88]2: Table S1). The average
   proportion of total reads mapped to the reference genome of samples T01
   to T06 was 73.729% (Additional file [89]3: Table S2), ranging from
   77.81 to 76.03%.

Annotation of genes containing SNPs and InDels

   To reveal the biological functions of the potential candidate genes, we
   performed GO and KEGG pathway enrichment analyses. Within the SNP- and
   InDel-associated regions, 2041 of 2044 genes could be annotated
   (Additional file [90]4: Table S3). GO annotations included 2217 terms
   involved in biological processes (BP, Fig. [91]4,
   Additional file [92]5: Table S4), 326 terms in the cellular component
   (CC, Fig. [93]4, Additional file [94]6: Table S5), and 799 terms in the
   molecular function (MF, Fig. [95]4, Additional file [96]7: Table S6).
   The BP terms with the highest levels of enrichment included response to
   nitrate (GO: 0010167), nitrate transport (GO: 0015706), negative
   regulation of flower development GO: 0009910), transition metal ion
   transport (GO: 0000041), inositol biosynthetic process (GO: 0006021),
   and organ development (GO: 0048513) (Additional file [97]5: Table S4).
   The CC category includes cell wall (GO: 0005618), vacuole (GO:
   0005773), cytosolic small ribosomal subunit (GO: 0022627), and small
   ribosomal subunit (GO: 0015935) (Additional file [98]6: Table S5). In
   the MF category, the most enriched terms were sinapoyl spermidine:
   sinapoyl CoA N-acyltransferase activity (GO: 0080089),
   2-isopropylmalate synthase activity (GO: 0003852), methyl salicylate
   esterase activity (GO: 0080031), nucleocytoplasmic transporter activity
   (GO: 0005487), and calmodulin binding (GO: 0005516) (Additional file
   [99]7: Table S6).

Fig. 4.

   [100]Fig. 4
   [101]Open in a new tab

   Gene ontology (GO) terms of the genes from the intersecting regions. A
   total of 2041 genes were divided into three categories: biological
   processes, cellular components, and molecular functions

   Moreover, 126 genes with non-synonymous mutations were identified,
   including 104 genes with GO annotations (Additional file [102]8: Table
   S7). Additionally, 23 genes with frame-shift mutations were identified,
   including 19 with GO annotations (Additional file [103]9: Table S8).
   Analysis of these annotations produced some interesting results:
   BnaA09g43100D (Bna.A09SFH3) and BnaC08g38270D (Bna.C08HAC12) were
   annotated as “flower development (GO:0009908)”; BnaC08g36330D
   (Bna.C08GPRI1) is related to “negative regulation of flower development
   (GO:0009910)”; BnaA09g44210D (Bna.A09BES1) and BnaA09g48960D are
   associated with “ovule development (GO:0048481)” and “carpel
   development (GO:0048440)”, respectively; BnaA09g42920D (Bna.A09EC1.3)
   was annotated as “regulation of double fertilization forming a zygote
   and endosperm (GO:0080155)”; and BnaA09g42700D was annotated as
   “specification of floral organ number (GO:0048833)”.

   KEGG pathway enrichment analysis revealed the four most significantly
   enriched pathways involving the DEGs: stilbenoid, diarylheptanoid and
   gingerol biosynthesis (ko00945), limonene and pinene degradation
   (ko00903), glycosyl phosphatidyl inositol (GPI)-anchor biosynthesis
   (ko00563), and isoquinoline alkaloid biosynthesis (ko00950)
   (Additional file [104]10: Table S9). These pathways were further
   classified into five major groups: cellular processes, environmental
   information processing, genetic information processing, metabolism and
   organismal systems (Fig. [105]5). Of these, the sub-group plant hormone
   signal transduction had the highest number of annotated genes.

Fig. 5.

   [106]Fig. 5
   [107]Open in a new tab

   Classified KEGG pathways of the genes from the intersecting regions.
   The pathways were further classified into five major groups: cellular
   processes, environmental information processing, genetic information
   processing, metabolism, and organismal systems

Analysis of differentially expressed genes and potential candidate gene
identification

   To identify DEGs related to the multi-silique trait, we examined the
   expression levels of transcripts from zws-ms and compared them with
   those of normal plants (zms-217). A total of 129 DEGs were identified
   in buds, with 52.94% (67 genes) upregulated and 48.06% (62 genes)
   downregulated (Additional file [108]11: Table S10), 39 of which
   (Table [109]2) were located in the two associated regions mentioned
   above. Thirteen genes were located on chromosome A09, while 26 were
   located on chromosome C08. These 39 genes were clustered into two
   groups based on expression level (Fig. [110]6, Table [111]2): the first
   group contained 14 genes that were upregulated in multi-silique zws-ms
   plants, while the second group contained 25 genes that were
   downregulated in these plants.

Table 2.

   GO annotations of the 39 DEGs located in the two associated regions
   Gene ID Chromosome Expression Level GO Annotation
   1 BnaC08g40740D C08 down Molecular Function: translation initiation
   factor activity (GO:0003743); Cellular Component: cytoplasm
   (GO:0005737);
   2 BnaA09g45890D A09 down Cellular Component: plasma membrane
   (GO:0005886); Biological Process: phosphate ion transport (GO:0006817);
   Cellular Component: integral component of membrane (GO:0016021);
   Biological Process: cellular response to phosphate starvation
   (GO:0016036);
   3 BnaC08g42080D C08 down Cellular Component: plasma membrane
   (GO:0005886); Molecular Function: transferase activity, transferring
   phosphorus-containing groups (GO:0016772);
   4 BnaA09g43250D A09 up Cellular Component: mitochondrion (GO:0005739);
   5 BnaC08g36200D C08 down Cellular Component: chloroplast (GO:0009507);
   Biological Process: photorespiration (GO:0009853);
   6 BnaA09g44650D A09 down Cellular Component: plasma membrane
   (GO:0005886); Biological Process: proteolysis (GO:0006508); Biological
   Process: lipid transport (GO:0006869); Molecular Function: peptidase
   activity (GO:0008233); Molecular Function: lipid binding (GO:0008289);
   Cellular Component: anchored component of membrane (GO:0031225);
   7 BnaA09g46080D A09 down Cellular Component: nucleus (GO:0005634);
   8 BnaC08g37340D C08 up Cellular Component: plasma membrane
   (GO:0005886); Biological Process: proteolysis (GO:0006508); Biological
   Process: lipid transport (GO:0006869); Molecular Function: peptidase
   activity (GO:0008233); Molecular Function: lipid binding (GO:0008289);
   Cellular Component: anchored component of membrane (GO:0031225);
   9 BnaC08g41780D C08 down Biological Process: sulfur amino acid
   metabolic process (GO:0000096); Molecular Function: serine-tRNA ligase
   activity (GO:0004828); Molecular Function: ATP binding (GO:0005524);
   Cellular Component: mitochondrion (GO:0005739); Biological Process:
   rRNA processing (GO:0006364); Biological Process: seryl-tRNA
   aminoacylation (GO:0006434); Biological Process: mitochondrion
   organization (GO:0007005); Biological Process: cellular amino acid
   biosynthetic process (GO:0008652); Biological Process: serine family
   amino acid metabolic process (GO:0009069); Cellular Component:
   chloroplast (GO:0009507); Biological Process: embryo development ending
   in seed dormancy (GO:0009793); Biological Process: chloroplast
   relocation (GO:0009902); Biological Process: leaf morphogenesis
   (GO:0009965); Biological Process: thylakoid membrane organization
   (GO:0010027); Biological Process: photosystem II assembly (GO:0010207);
   Biological Process: vegetative to reproductive phase transition of
   meristem (GO:0010228); Biological Process: iron-sulfur cluster assembly
   (GO:0016226); Biological Process: cell differentiation (GO:0030154);
   Biological Process: regulation of protein dephosphorylation
   (GO:0035304); Biological Process: cell wall modification (GO:0042545);
   Biological Process: transcription from plastid promoter (GO:0042793);
   Biological Process: positive regulation of transcription, DNA-templated
   (GO:0045893); Biological Process: ovule development (GO:0048481);
   10 BnaC08g40810D C08 up Molecular Function: protein serine/threonine
   kinase activity (GO:0004674); Biological Process: protein
   autophosphorylation (GO:0046777);
   11 BnaC08g41540D C08 down Molecular Function: N,N-dimethylaniline
   monooxygenase activity (GO:0004499); Cellular Component: nucleus
   (GO:0005634); Biological Process: glucosinolate biosynthetic process
   (GO:0019761); Molecular Function: flavin adenine dinucleotide binding
   (GO:0050660); Molecular Function: NADP binding (GO:0050661); Biological
   Process: oxidation-reduction process (GO:0055114); Molecular Function:
   8-methylthiopropyl glucosinolate S-oxygenase activity (GO:0080107);
   12 BnaC08g40410D C08 up Molecular Function: Ran GTPase activator
   activity (GO:0005098); Cellular Component: nuclear envelope
   (GO:0005635); Cellular Component: vacuolar membrane (GO:0005774);
   Cellular Component: endoplasmic reticulum (GO:0005783); Biological
   Process: nucleocytoplasmic transport (GO:0006913); Biological Process:
   toxin catabolic process (GO:0009407); Cellular Component: chloroplast
   (GO:0009507); Biological Process: photomorphogenesis (GO:0009640);
   Biological Process: response to salt stress (GO:0009651); Biological
   Process: cullin deneddylation (GO:0010388); Biological Process: lateral
   root development (GO:0048527);
   13 BnaC08g35720D C08 up Cellular Component: vacuolar
   proton-transporting V-type ATPase, V0 domain (GO:0000220); Cellular
   Component: mitochondrion (GO:0005739); Cellular Component: Golgi
   apparatus (GO:0005794); Biological Process: ATP catabolic process
   (GO:0006200); Cellular Component: chloroplast (GO:0009507); Molecular
   Function: hydrogen-translocating pyrophosphatase activity (GO:0009678);
   Cellular Component: plant-type vacuole membrane (GO:0009705); Molecular
   Function: hydrogen ion transmembrane transporter activity (GO:0015078);
   Biological Process: ATP synthesis coupled proton transport
   (GO:0015986); Biological Process: ATP hydrolysis coupled proton
   transport (GO:0015991); Molecular Function: ATPase activity
   (GO:0016887); Biological Process: cellular response to nutrient levels
   (GO:0031669); Biological Process: sequestering of zinc ion
   (GO:0032119); Biological Process: vacuolar sequestering (GO:0043181);
   Molecular Function: nutrient reservoir activity (GO:0045735);
   Biological Process: vacuolar proton-transporting V-type ATPase complex
   assembly (GO:0070072); Biological Process: cellular response to salt
   stress (GO:0071472);
   14 BnaC08g42280D C08 down Biological Process: telomere maintenance
   (GO:0000723); Biological Process: double-strand break repair via
   homologous recombination (GO:0000724); Molecular Function: nucleic acid
   binding (GO:0003676); Molecular Function: ATP binding (GO:0005524);
   Cellular Component: nucleus (GO:0005634); Biological Process: DNA
   replication (GO:0006260); Cellular Component: plasmodesma (GO:0009506);
   Biological Process: vegetative to reproductive phase transition of
   meristem (GO:0010228); Molecular Function: ATP-dependent 3′-5′ DNA
   helicase activity (GO:0043140); Biological Process: cellular response
   to cold (GO:0070417); Biological Process: cellular response to abscisic
   acid stimulus (GO:0071215);
   15 BnaC08g38300D C08 down Molecular Function: nucleotide binding
   (GO:0000166); Biological Process: mRNA splicing, via spliceosome
   (GO:0000398); Molecular Function: RNA binding (GO:0003723); Molecular
   Function: protein binding (GO:0005515); Cellular Component: nucleolus
   (GO:0005730); Biological Process: sugar mediated signaling pathway
   (GO:0010182); Cellular Component: nuclear speck (GO:0016607);
   16 BnaC08g39990D C08 up Biological Process: MAPK cascade (GO:0000165);
   Molecular Function: protein serine/threonine kinase activity
   (GO:0004674); Molecular Function: protein serine/threonine/tyrosine
   kinase activity (GO:0004712); Molecular Function: ATP binding
   (GO:0005524); Cellular Component: nucleus (GO:0005634); Cellular
   Component: cytosol (GO:0005829); Biological Process: protein
   phosphorylation (GO:0006468); Biological Process: protein targeting to
   membrane (GO:0006612); Biological Process: response to cold
   (GO:0009409); Biological Process: response to water deprivation
   (GO:0009414); Biological Process: response to ethylene (GO:0009723);
   Biological Process: auxin-activated signaling pathway (GO:0009734);
   Biological Process: abscisic acid-activated signaling pathway
   (GO:0009738); Biological Process: brassinosteroid mediated signaling
   pathway (GO:0009742); Biological Process: systemic acquired resistance,
   salicylic acid mediated signaling pathway (GO:0009862); Biological
   Process: jasmonic acid mediated signaling pathway (GO:0009867);
   Biological Process: regulation of signal transduction (GO:0009966);
   Biological Process: leaf vascular tissue pattern formation
   (GO:0010305); Biological Process: regulation of plant-type
   hypersensitive response (GO:0010363); Biological Process: endoplasmic
   reticulum unfolded protein response (GO:0030968); Biological Process:
   negative regulation of defense response (GO:0031348); Biological
   Process: hyperosmotic salinity response (GO:0042538); Biological
   Process: negative regulation of programmed cell death (GO:0043069);
   Biological Process: defense response to fungus (GO:0050832);
   17 BnaC08g36100D C08 down Cellular Component: cytoplasm (GO:0005737);
   Biological Process: ER to Golgi vesicle-mediated transport
   (GO:0006888);
   18 BnaA09g48320D A09 down Molecular Function: structural constituent of
   ribosome (GO:0003735); Cellular Component: nucleolus (GO:0005730);
   Biological Process: translation (GO:0006412); Cellular Component:
   chloroplast (GO:0009507); Cellular Component: cytosolic large ribosomal
   subunit (GO:0022625);
   19 BnaA09g45000D A09 down Biological Process: RNA splicing, via
   endonucleolytic cleavage and ligation (GO:0000394); Molecular Function:
   DNA binding (GO:0003677); Cellular Component: transcription factor
   TFIID complex (GO:0005669); Biological Process: DNA-templated
   transcription, initiation (GO:0006352); Biological Process:
   transcription from RNA polymerase II promoter (GO:0006366); Biological
   Process: cytokinin-activated signaling pathway (GO:0009736); Biological
   Process: jasmonic acid mediated signaling pathway (GO:0009867);
   Biological Process: regulation of ethylene-activated signaling pathway
   (GO:0010104); Molecular Function: protein heterodimerization activity
   (GO:0046982);
   20 BnaC08g39130D C08 up Molecular Function: copper ion binding
   (GO:0005507); Molecular Function: calmodulin binding (GO:0005516);
   Molecular Function: ATP binding (GO:0005524); Cellular Component:
   mitochondrion (GO:0005739); Cellular Component: cytosol (GO:0005829);
   Biological Process: gluconeogenesis (GO:0006094); Biological Process:
   glycolytic process (GO:0006096); Biological Process: protein folding
   (GO:0006457); Biological Process: tryptophan catabolic process
   (GO:0006569); Biological Process: response to heat (GO:0009408);
   Biological Process: response to cold (GO:0009409); Cellular Component:
   chloroplast thylakoid membrane (GO:0009535); Cellular Component:
   chloroplast stroma (GO:0009570); Biological Process: response to high
   light intensity (GO:0009644); Biological Process: response to salt
   stress (GO:0009651); Biological Process: chloroplast organization
   (GO:0009658); Biological Process: indoleacetic acid biosynthetic
   process (GO:0009684); Cellular Component: chloroplast envelope
   (GO:0009941); Biological Process: isopentenyl diphosphate biosynthetic
   process, methylerythritol 4-phosphate pathway (GO:0019288); Biological
   Process: cysteine biosynthetic process (GO:0019344); Biological
   Process: response to endoplasmic reticulum stress (GO:0034976);
   Biological Process: response to hydrogen peroxide (GO:0042542);
   Biological Process: response to cadmium ion (GO:0046686); Cellular
   Component: apoplast (GO:0048046); Biological Process: ovule development
   (GO:0048481); Molecular Function: chaperone binding (GO:0051087);
   Biological Process: positive regulation of superoxide dismutase
   activity (GO:1901671);
   21 BnaA09g45260D A09 down Cellular Component: chloroplast (GO:0009507);
   22 BnaA09g45300D A09 down Molecular Function: serine-type
   carboxypeptidase activity (GO:0004185); Cellular Component:
   extracellular region (GO:0005576); Cellular Component: vacuole
   (GO:0005773); Biological Process: proteolysis (GO:0006508);
   23 BnaC08g39020D C08 up Cellular Component: cytosol (GO:0005829);
   Cellular Component: plasmodesma (GO:0009506);
   24 BnaC08g38200D C08 down Cellular Component: nucleus (GO:0005634);
   Molecular Function: oxidoreductase activity, acting on the CH-CH group
   of donors, NAD or NADP as acceptor (GO:0016628);
   25 BnaC08g41720D C08 down Molecular Function: aspartic-type
   endopeptidase activity (GO:0004190); Cellular Component: extracellular
   region (GO:0005576); Cellular Component: vacuole (GO:0005773); Cellular
   Component: cytosol (GO:0005829); Biological Process: glycolytic process
   (GO:0006096); Biological Process: proteolysis (GO:0006508); Biological
   Process: protein targeting to vacuole (GO:0006623); Biological Process:
   lipid metabolic process (GO:0006629); Biological Process: water
   transport (GO:0006833); Biological Process: hyperosmotic response
   (GO:0006972); Biological Process: Golgi organization (GO:0007030);
   Biological Process: response to temperature stimulus (GO:0009266);
   Cellular Component: plasmodesma (GO:0009506); Biological Process:
   response to salt stress (GO:0009651); Biological Process: response to
   cadmium ion (GO:0046686); Biological Process: organ development
   (GO:0048513);
   26 BnaC08g42450D C08 down Biological Process: response to molecule of
   bacterial origin (GO:0002237); Molecular Function: protein
   serine/threonine kinase activity (GO:0004674); Molecular Function: ATP
   binding (GO:0005524); Cellular Component: plasma membrane (GO:0005886);
   Biological Process: N-terminal protein myristoylation (GO:0006499);
   Biological Process: protein targeting to membrane (GO:0006612);
   Biological Process: membrane fusion (GO:0006944); Biological Process:
   response to oxidative stress (GO:0006979); Biological Process:
   transmembrane receptor protein tyrosine kinase signaling pathway
   (GO:0007169); Biological Process: systemic acquired resistance
   (GO:0009627); Biological Process: seed germination (GO:0009845);
   Biological Process: stomatal complex morphogenesis (GO:0010103);
   Biological Process: regulation of plant-type hypersensitive response
   (GO:0010363); Cellular Component: integral component of membrane
   (GO:0016021); Biological Process: negative regulation of programmed
   cell death (GO:0043069); Biological Process: protein
   autophosphorylation (GO:0046777); Biological Process: stamen
   development (GO:0048443); Cellular Component: micropyle (GO:0070825);
   27 BnaC08g35850D C08 up Molecular Function: microtubule motor activity
   (GO:0003777); Molecular Function: ATP binding (GO:0005524); Cellular
   Component: cytoplasm (GO:0005737); Cellular Component: kinesin complex
   (GO:0005871); Cellular Component: microtubule (GO:0005874); Cellular
   Component: plasma membrane (GO:0005886); Biological Process:
   microtubule-based movement (GO:0007018); Molecular Function:
   microtubule binding (GO:0008017); Cellular Component: plasmodesma
   (GO:0009506);
   28 BnaC08g40320D C08 up Molecular Function: chromatin binding
   (GO:0003682); Molecular Function: sequence-specific DNA binding
   transcription factor activity (GO:0003700); Cellular Component: nucleus
   (GO:0005634); Biological Process: regulation of transcription,
   DNA-templated (GO:0006355); Biological Process: membrane fusion
   (GO:0006944); Molecular Function: identical protein binding
   (GO:0042802); Molecular Function: sequence-specific DNA binding
   (GO:0043565); Biological Process: Golgi vesicle transport (GO:0048193);
   29 BnaA09g45310D A09 up –
   30 BnaC08g41180D C08 down Molecular Function: DNA binding (GO:0003677);
   Cellular Component: nucleus (GO:0005634); Molecular Function: zinc ion
   binding (GO:0008270);
   31 BnaA09g44370D A09 down Molecular Function: DNA binding (GO:0003677);
   Molecular Function: chromatin binding (GO:0003682); Molecular Function:
   sequence-specific DNA binding transcription factor activity
   (GO:0003700); Cellular Component: nucleus (GO:0005634); Biological
   Process: regulation of transcription, DNA-templated (GO:0006355);
   Biological Process: protein targeting to membrane (GO:0006612);
   Biological Process: response to salt stress (GO:0009651); Biological
   Process: response to ethylene (GO:0009723); Biological Process:
   response to auxin (GO:0009733); Biological Process: response to
   abscisic acid (GO:0009737); Biological Process: response to gibberellin
   (GO:0009739); Biological Process: response to salicylic acid
   (GO:0009751); Biological Process: response to jasmonic acid
   (GO:0009753); Biological Process: positive regulation of flavonoid
   biosynthetic process (GO:0009963); Biological Process: regulation of
   plant-type hypersensitive response (GO:0010363); Biological Process:
   response to cadmium ion (GO:0046686);
   32 BnaA09g47900D A09 down Molecular Function: zinc ion binding
   (GO:0008270);
   33 BnaC08g41390D C08 down Cellular Component: plant-type vacuole
   (GO:0000325); Molecular Function: sucrose alpha-glucosidase activity
   (GO:0004575); Biological Process: carbohydrate metabolic process
   (GO:0005975); Biological Process: polyamine catabolic process
   (GO:0006598); Biological Process: calcium ion transport (GO:0006816);
   Biological Process: iron ion transport (GO:0006826); Biological
   Process: Golgi organization (GO:0007030); Cellular Component:
   plant-type cell wall (GO:0009505); Biological Process: response to
   wounding (GO:0009611); Biological Process: response to bacterium
   (GO:0009617); Biological Process: response to salt stress (GO:0009651);
   Biological Process: coumarin biosynthetic process (GO:0009805);
   Biological Process: cellular response to iron ion starvation
   (GO:0010106); Biological Process: response to nitrate (GO:0010167);
   Biological Process: nitrate transport (GO:0015706); Biological Process:
   brassinosteroid biosynthetic process (GO:0016132); Biological Process:
   cellular modified amino acid biosynthetic process (GO:0042398);
   Biological Process: cellular response to gibberellin stimulus
   (GO:0071370); Biological Process: primary root development
   (GO:0080022);
   34 BnaA09g45610D A09 up Cellular Component: nucleus (GO:0005634);
   35 BnaC08g35880D C08 down Cellular Component: cytosol (GO:0005829);
   Biological Process: ubiquitin-dependent protein catabolic process
   (GO:0006511); Cellular Component: chloroplast (GO:0009507);
   36 BnaC08g36360D C08 up Molecular Function: nucleotide binding
   (GO:0000166); Molecular Function: catalytic activity (GO:0003824);
   Cellular Component: mitochondrial respiratory chain complex I
   (GO:0005747); Biological Process: ubiquitin-dependent protein catabolic
   process (GO:0006511); Biological Process: response to salt stress
   (GO:0009651); Biological Process: photorespiration (GO:0009853);
   Molecular Function: coenzyme binding (GO:0050662); Biological Process:
   response to misfolded protein (GO:0051788); Biological Process:
   proteasome core complex assembly (GO:0080129);
   37 BnaC08g37460D C08 down Biological Process: mitotic cell cycle
   (GO:0000278); Molecular Function: RNA binding (GO:0003723); Molecular
   Function: polynucleotide adenylyltransferase activity (GO:0004652);
   Molecular Function: protein binding (GO:0005515); Cellular Component:
   nucleus (GO:0005634); Biological Process: transcription, DNA-templated
   (GO:0006351); Biological Process: RNA polyadenylation (GO:0043631);
   38 BnaC08g39120D C08 up –
   39 BnaA09g45320D A09 down Molecular Function: copper ion binding
   (GO:0005507); Molecular Function: calmodulin binding (GO:0005516);
   Molecular Function: ATP binding (GO:0005524); Cellular Component:
   mitochondrion (GO:0005739); Cellular Component: cytosol (GO:0005829);
   Biological Process: gluconeogenesis (GO:0006094); Biological Process:
   glycolytic process (GO:0006096); Biological Process: protein folding
   (GO:0006457); Biological Process: tryptophan catabolic process
   (GO:0006569); Biological Process: response to heat (GO:0009408);
   Biological Process: response to cold (GO:0009409); Cellular Component:
   chloroplast thylakoid membrane (GO:0009535); Cellular Component:
   chloroplast stroma (GO:0009570); Biological Process: response to high
   light intensity (GO:0009644); Biological Process: response to salt
   stress (GO:0009651); Biological Process: chloroplast organization
   (GO:0009658); Biological Process: indoleacetic acid biosynthetic
   process (GO:0009684); Cellular Component: chloroplast envelope
   (GO:0009941); Biological Process: isopentenyl diphosphate biosynthetic
   process, methylerythritol 4-phosphate pathway (GO:0019288); Biological
   Process: cysteine biosynthetic process (GO:0019344); Biological
   Process: response to endoplasmic reticulum stress (GO:0034976);
   Biological Process: response to hydrogen peroxide (GO:0042542);
   Biological Process: response to cadmium ion (GO:0046686); Cellular
   Component: apoplast (GO:0048046); Biological Process: ovule development
   (GO:0048481); Molecular Function: chaperone binding (GO:0051087);
   Biological Process: positive regulation of superoxide dismutase
   activity (GO:1901671)
   [112]Open in a new tab

Fig. 6.

   [113]Fig. 6
   [114]Open in a new tab

   Expression patterns of 39 DEGs located in the two associated regions.
   The expression levels are given in log[2](FPKM+ 1). FPKM: Fragments per
   Kilobase Million. T01, T02, and T03: three random samples from zws-ms
   at the budding stage; T04, T05, and T06: three random samples from
   zws-217 at the budding stage

   Of the 39 DEGs, three (BnaA09g45320D (Bna.A09CPN10), BnaC08g39130D
   (Bna.C08CPN10), and BnaC08g41780D (Bna.C08SRS)) were assigned to GO
   categories involved in ovule development (GO: 0048481), BnaC08g42450D
   (Bna.C08RLK7) in stamen development (GO: 0048443) and BnaC08g41720D
   (Bna.C08APA1) in organ development (GO: 0048513).

Validation of transcriptome sequencing data by qPCR

   We validated the transcriptome sequencing data by comparing the
   relative transcript levels of these DEGs in zws-ms and zws-217 by qPCR.
   qPCR analysis of plants in buds showed that all genes except
   BnaC08g39120D had similar trends in expression as those observed by
   transcriptome sequencing, especially BnaA09g45320D and BnaC08g40410D
   (Figs. [115]6 and [116]7). These results further confirm the
   reliability of the transcriptome sequencing data described above.

Fig. 7.

   [117]Fig. 7
   [118]Open in a new tab

   Relative expression levels (folds) of 10 selected genes determined by
   qPCR. Note: a: BnaA09g45320D; b: BnaC08g40410D; c: BnaA09g45890D; d:
   BnaC08g41720D; e: BnaC08g42080D; f: BnaC08g40740D; g: BnaA09g45310D; h:
   BnaA09g47900D; i: BnaC08g39120D; j: BnaC08g41780D

Orthologs of these 12 candidate genes in Arabidopsis

   These 12 candidate genes mentioned above, including seven with
   non-synonymous mutations or frame-shift mutations, as well as the five
   DEGs, were then aligned with Arabidopsis and their orthologs
   (Table [119]3) in this well-studied model plant were identified based
   on The Arabidopsis Information Resource (TAIR,
   [120]https://www.arabidopsis.org): (1) AT2G22370 (ortholog of
   BnaA09g42700D), identified as MED18, encodes a subunit of the mediator
   complex that affects flowering time and floral organ formation through
   FLOWERING LOCUS C (FLC) and AGAMOUS (AG); (2) AT2G21750, identified as
   EGG CELL 1.3 (EC1.3), encodes a small cysteine-rich protein secreted by
   the egg cell during gamete interactions; (3) AT2G21540 is an SEC14-LIKE
   3 (ATSFH3) protein; (4) AT1G19350 (BZR2 or BES1) encodes
   brassinosteroid signalling protein that accumulates in the nucleus as
   dephosphorylated form in response to brassinosteroid. (5) BnaA09g45320D
   and BnaC08g39130D share the same ortholog, AT1G14980, known as CPN10,
   encoding mitochondrial-localized chaperonin 10; (6) AT1G08260, known as
   TIL1 gene, encodes the catalytic subunit of DNA polymerase ε (abo4–1);
   (7) AT2G20570 (GPRI1 or GLK1) encodes one of a pair of partially
   redundant nuclear transcription factors that regulate chloroplast
   development in a cell-autonomous manner; (8) AT1G11910 (APA1) encodes
   an aspartic proteinase that forms a heterodimer and is stable over a
   broad pH range; (9) AT1G11870 is identified as OVA7 (SRS), of which
   expression products Seryl-tRNA synthetase targeted to chloroplasts and
   mitochondria; (10) AT1G09970 (RLK7) belongs to a leucine-rich repeat
   class of receptor-likekinase (LRR-RLKs); (11) AT1G16710 (HAC12) encodes
   an enzyme with histone acetyltransferase activity that can use both H3
   and H4 histones as substrates.

Table 3.

   The 14 candidate genes and their orthologs in Arabidopsis
   Gene in B. napus orthologs in Arabidopsis
   Gene ID Description
   BnaA09g42700D AT2G22370 MEDIATOR 18 (MED18)
   BnaA09g42920D AT2G21750 EGG CELL 1.3
   BnaA09g43100D AT2G21540 SEC14-like 3, ATSFH3
   BnaA09g44210D AT1G19350 BRASSINAZOLE-RESISTANT 2 (BZR2),
   BRI1-EMS-SUPPRESSOR 1 (BES1)
   BnaA09g45320D AT1G14980 chaperonin 10 (CPN10)
   BnaA09g48960D AT1G08260 TILTED 1 (TIL1)
   BnaC08g36330D AT2G20570 GBF’s pro-rich region-interacting factor 1
   (GPRI1), GOLDEN2-LIKE 1 (GLK1)
   BnaC08g38270D AT1G16710 histone acetyltransferase of the CBP family 12
   (HAC12)
   BnaC08g39130D AT1G14980 chaperonin 10 (CPN10)
   BnaC08g41720D AT1G11910 aspartic proteinase A1 (APA1)
   BnaC08g41780D AT1G11870 Seryl-tRNA synthetase (SRS), OVULE ABORTION 7
   (OVA7)
   BnaC08g42450D AT1G09970 LRR XI-23, RECEPTOR-LIKE KINASE 7 (RLK7)
   [121]Open in a new tab

Discussion

   In the current study, we constructed NILs to establish the two extreme
   DNA pools (multi-silique pool and single-silique pool), instead of the
   frequently used F[2] [[122]28–[123]30], doubled haploid (DH) [[124]25,
   [125]31], or recombinant inbred line (RIL) [[126]32, [127]33]
   populations. BSA [[128]20] combined with NGS [[129]34], which has been
   successfully used in various crops [[130]29, [131]30, [132]32, [133]35,
   [134]36]. Here, the WGR produced abundant, high-quality data. After
   filtration, 1,135,690 HQ SNPs and 289,832 InDels were identified and
   used to conduct association analysis. These markers were clearly
   superior to traditional markers in terms of number, efficiency, and
   quality: as we expected, these markers were much denser than SSR or
   other markers and the combination of BSA and WGR saved the labors and
   shortened the whole process. As we were unable to use the SNP-index
   method in this study, we used the ED algorithm instead, followed by the
   identification of three SNP-associated regions, including one on
   chromosome A09 and two on chromosome C08. The two regions on chromosome
   C08 were very close to each other (only 0.14 Mb apart), and the smaller
   region was only 0.07 Mb long, suggesting that it is likely a minor
   site. Moreover, we identified two InDel-associated regions. Taken
   together, the SNP-associated region data were highly consistent in
   terms of position with those of the InDel-associated regions,
   confirming the reliability of both data sets. Therefore, the two
   intersecting regions are considered to be highly associated with the
   multi-silique trait, representing the only difference between the
   zws-ms and zws-217 pools. Preliminary analysis indicated that these two
   regions contain 2044 genes, including 126 with non-synonymous mutations
   and 23 with frame-shift mutations.

   Since these genes were identified by WGR, they had small differences in
   sequence, which consequently led to changes in the amino acid
   sequences, structures, and final functions of the proteins. Since
   zws-ms and zws-217 are NILs that only differ from each other in the
   multi−/single-silique trait, we reasoned that these genes are
   associated with this trait and were therefore selected as candidate
   genes. However, some genes might have differed between the two pools in
   terms expression levels rather than nucleotide sequence. Therefore, we
   performed transcriptome sequencing from buds in plants to uncover the
   DEGs between lines, which we then confirmed by qPCR. We identified 129
   DEGs between zws-ms and zws-217, which are distributed on all
   chromosomes except chromosome A06. Among the 129 DEGs, 39 are located
   in the two important trait-associated regions identified in this study
   and are therefore considered highly related to the multi-silique trait.
   These DEGs were classified into two groups based on expression level,
   including one containing upregulated genes and the other containing
   downregulated genes.

   A comprehensive understanding of the functions of the 2044 genes in the
   two intersecting regions is crucial for candidate gene selection.
   Unlike standard agronomic traits (such as plant height and flowering
   time), which are relatively well-studied and for which some candidate
   genes are available for reference, the multi-silique trait analyzed in
   this study has not been studied on genomic or molecular level in detail
   to date. Thus, GO annotation and KEGG pathway analysis provided
   important clues for candidate gene identification. Among the 2044 genes
   within the two associated regions, 2041 were successfully annotated.
   KEGG pathway analysis provided a general view of the genes in these
   regions. Most enriched KEGG pathways were involved in the biosynthesis
   or degradation of some secondary metabolites.

   As we were unable to relate this information to the multi-silique trait
   using established knowledge, we conducted a GO analysis. Eight of these
   DEGs were annotated in the category “negative regulation of flower
   development (GO: 0009910)” and 221 genes were involved in “organ
   development (GO: 0048513)”. This information is too general to reveal
   an association with flower/silique development. In addition, most genes
   within the two regions were the same between the two DNA pools. Thus,
   more attention should be paid to genes showing polymorphism between
   zws-ms and zws-217 plants, specifically genes harboring non-synonymous
   mutations or frame-shift mutations. Furthermore, 104 of the 126 genes
   with non-synonymous mutations and 19 of the 23 genes with frame-shift
   mutations were represented in the GO database. We examined each of
   their annotations and identified seven interesting genes, which were
   annotated to ovule, carpel or stamen development. The NILs zws-ms and
   zws-217 differ only in terms of flower and silique morphology and
   structure. Therefore, genes that differ in zws-ms versus zws-217 at the
   genomic level are thought to encode diverse proteins, suggesting they
   are excellent candidate genes for the multi-silique trait.

   At the transcriptome level, we focused on DEGs, especially those
   located in the associated regions mentioned above. Thirty-nine out of
   129 DEGs were found in the two intersecting regions, among which five
   genes were annotated to the ovule or stamen development. This
   annotation information strongly suggests that these DEGs are involved
   in the multi-silique trait, as zws-ms flowers have obviously abnormal
   morphology, consequently leading to the formation of abnormal fruits.

   Then, we aligned these selected genes to the Arabidopsis, screening for
   homologous genes sharing the highest sequence identity with them, but
   did not identify more potential candidate genes other than these 12
   genes already mentioned (BnaA09g43100D, BnaC08g38270D, BnaC08g36330D,
   BnaA09g44210D, BnaA09g48960D, BnaA09g42920D and BnaA09g42700D with
   non-synonymous mutations or frame-shift mutations, as well as
   BnaA09g45320D, BnaC08g39130D, BnaC08g41780D, BnaC08g42450D and
   BnaC08g41720D with different expression levels). Based on analysis of
   these orthologs in Arabidopsis and their roles in plants, the annotated
   information of these candidate genes are confirmed: (1) BnaA09g42700D
   is found much shorter than its Arabidopsis ortholog AT2G22370 (MED18)
   in length, but in their overlapping region, they share 100% identity of
   nucleotide sequence. Mutation of MED18 in Arabidopsis significantly
   delayed its flowering time [[135]37]. FLC encodes an MADS-box protein
   and is a repressor for flowering [[136]38]; AG is a floral meristem
   identity gene, which specifies carpel and stamen identity in the flower
   [[137]39, [138]40]. Thus, BnaA09g42700D could probably act on flowering
   indirectly. (2) Disruption of AT1G11870 results in ovule abortion in
   Arabidopsis [[139]41]. So the different expression of BnaC08g41780D
   between zws-ms and zws-217 may result some unknown process in fruit
   development. (3) The knockdown of HAC1 causes disturbances in flower
   morphology in the Arabidopsis, such as a modified petal shape or even
   absence of petals [[140]42]. HAC12 has high identity with HAC1, but
   what morphology changes it can make are still unknown. (4) SEC14-LIKE 3
   (ATSFH3) is predominantly expressed in the flowers and involved in the
   transfer of phosphatidylinositol or phosphatidylcholine phospholipids
   during flower development in Arabidopsis [[141]43, [142]44]. In
   chickpea (Cicer arietinum L.), SEC14-LIKE3 is also considered a
   potential candidate flowering time-regulating genes by mapping
   [[143]45]. (5) EC1.3 triggers sperm cell activation during double
   fertilization and knockdown of it reduces seed number in siliques
   [[144]46]. (6) The brassinosteroid signal is found involved in ovule
   initiation and development, by regulating the expression level of BZR1,
   of which dephosphorylation increases ovule and seed number in
   Arabidopsis [[145]47]. In this study, BnaA09g44210D has non-synonymous
   mutation between zws-ms and zws-217, probably causes multi-silique
   trait by some undiscovered pathways. (7) ATGLK1 (GPRI1) can regulate
   the silique number in Arabidopsis: the glk1/glk2 double mutant realized
   only about 1/3 of the mean silique number per plant found in the wild
   type, due to changes in silique wall and leaf photosynthesis [[146]48].
   (8) The Arabidopsis mutant til1–4/til1–4 has altered floral
   phyllotaxis, reduced ovule number, abnormally developed ovules, and
   reduced fertility [[147]49]. (9) APA1: In Cynara cardunculus, the
   aspartic proteinase cardosin B is found playing an important role in
   ovule function [[148]50]. (10) RLK7: Stamen development depends on a
   teamwork of receptor-like kinases [[149]51, [150]52]. (11) CPN10 is
   involved in response to many physiological process, such as seed
   abortion in plants [[151]53]. Although no report directly leads to
   multi-silique trait so far, these orthologs are all related to flower
   development, flower morphology change, fruit development and even the
   number of ovules/seeds. It is due to the scarcity of the multi-silique
   trait to some extent. Moreover, this multi-silique trait is
   theoretically more complex, since it is controlled by multi-loci.
   Anyhow, these clues validated the 12 candidate genes.

   In summary, the unique multi-silique rapeseed zws-ms, examined in this
   study represents an enriched rapeseed germplasm resource, which is
   considered the basis of crop breeding. Thus far, 12 candidate genes in
   two genomic regions associated with the multi-silique trait have been
   identified, including seven genes with sequence changes and five genes
   with expression level differences between the multi-silique and
   single-silique NILs. This information lays the foundation for future
   research, such as candidate genes isolation, their functional
   verification and so on. Several unanswered questions remain. A previous
   study suggested that the multi-silique is likely controlled by three
   allelic genes [[152]19], but in the current study, we only identified
   two associated regions, which harbor no more than two independent
   genes. This finding suggests that one more gene/factor remains
   unidentified. We also noted that the multi-silique trait is highly
   sensitive to the environment. For example, when zws-ms plants were
   planted in Xindu (altitude of 472 m, 30°47′10″N 104°12′12″E), the
   multi-silique trait was continuously stable for years; but when they
   were grown in Ma’erkang (altitude of 2540 m, 31°94′15″N 102°14′29″E),
   the multi-silique trait disappeared, and all plants had normal siliques
   (Additional file [153]12: Fig. S2). Interestingly, once we transferred
   zws-ms seeds harvested in Ma’erkang back to Xindu, the multi-silique
   trait was recovered. By contrast, zws-217 maintained the single-silique
   phenotype at both locations.

   Thus, there might be some epigenetic reason for this trait, such as
   methylation of a base that does not alter the DNA sequence of this
   underlying locus but is strongly affected by environmental factors. We
   noticed that there were several genes in the environmental response
   category in the associated regions with SNP/InDel variations between
   the two DNA pools: BnaA09g45600D was annotated as “cellular response to
   water deprivation (GO: 0042631)” and BnaA09g44900D as “response to cold
   (GO: 0009409)”. Moreover, several DEGs were also annotated to
   environment response, like BnaA09g45320D was also annotated as
   “response to cold” and so on. Since these genes are responsive to
   environmental factors, such as cold, water, and light intensity, their
   functions vary when their sequences or expression levels change. Thus,
   they may have different effects under different environmental
   conditions. Furthermore, Xindu is located in the Sichuan Basin, which
   has a humid subtropical monsoon climate (with an annual average
   temperature of 16.2 °C), whereas Ma’erkang is located in a mountainous
   area in western Sichuan, where the annual average temperature is only
   8.6 °C. The environmental factor with the greatest difference between
   the two locations is temperature. Thus, we propose that temperature is
   the most important environmental factor affecting the multi-silique
   trait in zws-ms, although other factors such as light intensity are
   also likely involved in regulating this trait.

Conclusions

   A novel trait in rapeseed, multi-siliques, was investigated at the
   genome level by association analysis based on WGR, followed by analysis
   at the transcriptome level. The two regions associated with this trait
   contain seven genes with non-synonymous mutations or frame-shift
   mutations annotated to floral organ-related GO terms, as well as five
   other vital DEGs. These genes are interesting candidate genes for the
   multi-silique trait.

Methods

Plant materials and population development

   The original multi-silique rapeseed line was discovered at the Crop
   Research Institute, Sichuan Academy of Agricultural Sciences from a
   population derived from a cross between B. napus and B. rapa.
   Multi-silique plants were grown and self-pollinated for six successive
   generations to obtain the homozygous multi-silique zws-ms plants.
   Single-silique offspring were continuously backcrossed for six
   generations with zws-ms as the recurrent parent, followed by six
   continuous generations of self-pollination to obtain the near-isogenic
   line zws-217 with the single-silique phenotype (Fig. [154]1). The NILs
   zws-217 and zws-ms were grown in the field under standard conditions at
   the Sichuan Academy of Agricultural Sciences in the Xindu District of
   Sichuan Province, China.

DNA preparation and pool construction

   The rate of multi-silique formation per zws-ms plant was determined by
   phenotyping at the full-bloom stage (BBCH 67). The multi-silique rate
   per plant was calculated as:
   [MATH: <mspace
   width="0.25em"></mspace><mtext>multi</mtext><mo>−</mo><mtext>silique
   rate</mtext><mo>=</mo><mfrac><mrow><mtext>number of
   three</mtext><mo>−</mo><mtext>pistiled flowers</mtext><mspace
   width="0.25em"></mspace><mi>per</mi><mspace
   width="0.25em"></mspace><mtext>plant</mtext></mrow><mtext>total number
   of flowers in this plant</mtext></mfrac> :MATH]

   Thirty plants with high multi-silique rates were selected for DNA
   isolation, while 30 zws-217 plants were randomly selected. Fresh leaves
   were sampled from these plants (zws-ms and zws-217) and subjected to
   DNA extraction by the cetyltrimethyl ammonium bromide (CTAB) method
   [[155]54]. RNA contamination was removed from each sample using RNase
   A. The DNA was quantified using a NanoDrop 2000 Spectrophotometer
   (Thermo Scientific, USA). Samples with optical density 260/280 ratios
   ranging from 1.8 to 2.2 were considered adequate. Equal quantities of
   DNA were pooled by line (multi-silique and single-silique), with 30
   samples per line, to a final concentration of 40 ng/μl.

Whole genome re-sequencing

   Pooled DNA samples were sheared into ~ 350 bp fragments using a Covaris
   S2/E210 DNA Shearing kit. The sheared DNA was end-repaired and a single
   nucleotide (A) overhang was subsequently added to the repaired
   fragments by adding a Klenow DNA polymerase Fragment (3′ → 5′ exo–)
   (New England Biolabs, Ipswich, MA, USA) and dATP at 37 °C. Barcodes and
   Illumina sequencing adapters were ligated to the A-tailed fragments
   using T4 DNA Ligase (Takara, Dalian, China). PCR was performed on both
   pooled samples using diluted prepared (sheared and ligated) DNA,
   deoxyribonucleotide triphosphate, Q5® High-Fidelity DNA Polymerase, and
   PCR primers. The PCR products were purified using Agencourt AMPure XP
   Beads (Beckman Coulter, High Wycombe, UK). Fragments 300 to 500 bp in
   length (including barcodes and adaptors) were excised and purified
   using a QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany).
   Gel-purified products were diluted for pair-end sequencing (150 bp on
   each end) on an Illumina HiSeq system following the standard protocol
   (Illumina, Inc.; San Diego, CA, USA) at the Biomarker Technologies
   Corporation (Beijing, China). The sequencing depth was roughly 30 × .

SNP calling

   Low-quality reads with quality scores <20e were filtered out and raw
   reads were sorted according to their barcode sequences. After the
   barcodes were trimmed, clean HQ reads were mapped to the
   Brassica_napus.v4.1.fa genome
   ([156]http://www.genoscope.cns.fr/brassicanapus/data/) using the
   Burrows-Wheeler Aligner software [[157]55]. SAMtools [[158]56] was used
   to mark duplicates, and GATK [[159]57] was used for local realignment
   and base recalibration. An SNP set was formed by combining the results
   of analysis with GATK and SAMtools via SNP calling using default
   parameters. SNPs identified between the two lines were regarded as
   polymorphic for association analysis.

Association analysis

   Euclidean distance (ED) association analysis is a method that
   calculates ED values (quadratic sum root of differences between bulks
   from the depth of four types of bases) and is satisfied by ED. In
   theory, the higher the ED value is, the closer the object site
   [[160]25]. The ED values were calculated as follows [[161]27]:
   [MATH: <mi>ED</mi><mo>=</mo><msqrt><mrow><msup><mfenced close=")"
   open="("><mrow><msub><mi
   mathvariant="normal">A</mi><mi>ms</mi></msub><mo>−</mo><msub><mi
   mathvariant="normal">A</mi><mn>217</mn></msub></mrow></mfenced><mn>2</m
   n></msup><mo>+</mo><msup><mfenced close=")" open="("><mrow><msub><mi
   mathvariant="normal">C</mi><mi>ms</mi></msub><mo>−</mo><msub><mi
   mathvariant="normal">C</mi><mn>217</mn></msub></mrow></mfenced><mn>2</m
   n></msup><mo>+</mo><msup><mfenced close=")" open="("><mrow><msub><mi
   mathvariant="normal">G</mi><mi>ms</mi></msub><mo>−</mo><msub><mi
   mathvariant="normal">G</mi><mn>217</mn></msub></mrow></mfenced><mn>2</m
   n></msup><mo>+</mo><msup><mfenced close=")" open="("><mrow><msub><mi
   mathvariant="normal">T</mi><mi>ms</mi></msub><mo>−</mo><msub><mi
   mathvariant="normal">T</mi><mn>217</mn></msub></mrow></mfenced><mn>2</m
   n></msup></mrow></msqrt> :MATH]

   In this formula, A[ms], C[ms], G[ms], and T[ms] represent the depth of
   bases A, C, T, and G on a site in the multi-silique pool, and A[217],
   C[217], G[217], and T[217] represent the depth of bases A, C, T, and G
   on a site in the single-silique pool, respectively. To remove
   background noise, the original ED^5 value was used, and the adjusted
   values were fit using the DISTANCE method. Regions over a threshold
   value were considered to be trait-related candidate regions. The
   associated threshold value was determined based on ED + 3SD (standard
   deviation) [[162]27]. InDel-associated regions were obtained via a
   similar method.

RNA isolation for transcriptome analysis

   Three individual plants of the zws-217 (T04, T05, and T06) line were
   randomly selected for RNA isolation. Eight to ten buds (BBCH 57) per
   plant were randomly selected and sampled. For the zws-ms, three plants
   (T01, T02, and T03) were selected, and buds were sampled from each
   plant. All buds were quick-frozen and stored in liquid nitrogen. Before
   RNA isolation, the buds from three zws-ms plants were sliced with
   tweezers and observed. Only buds containing three pistils were
   considered to be correct multi-silique buds and used for RNA isolation.
   Buds from each plant were mixed to isolate the RNA, yielding three
   samples from zws-217 (T04, T05, and T06) and three samples from zws-ms
   (T01, T02, and T03). Since zws-217 and zws-ms were NILs, and these
   plants differ from each other only in the multi-silique trait, we
   reasoned that DEGs between these lines might be related to this trait.
   Total RNA was isolated using an RNA Isolation Kit (Tiangen, Beijing,
   China) and the concentration measured using a NanoDrop 2000 (Thermo).
   RNA integrity was assessed using an RNA Nano 6000 Assay Kit and the
   Agilent Bioanalyzer 2100 system (Agilent Technologies, CA, USA).

RNA library preparation and transcriptome sequencing

   For each tissue sample, 1 μg RNA was used as input material for RNA
   sample preparation. Sequencing libraries were generated using a NEBNext
   Ultra™ RNA Library Prep Kit for Illumina (NEB, USA) following the
   manufacturer’s instructions with index codes added to associate
   sequences to each sample. Briefly, mRNA was purified from total RNA
   using poly-T oligo-attached magnetic beads. Fragmentation was carried
   out using divalent cations under elevated temperature in NEBNext First
   Strand Synthesis Reaction Buffer (5×). First-strand cDNA was
   synthesized using random hexamer primer and M-MuLV Reverse
   Transcriptase. Second-strand cDNA synthesis was subsequently performed
   using DNA Polymerase I and RNase H. The remaining overhangs were
   converted into blunt ends using exonuclease/polymerase. After
   adenylation of the 3′ ends of DNA fragments, a NEBNext Adaptor with a
   hairpin loop structure was ligated to prepare for hybridization. To
   select 240 bp cDNA fragments, the library fragments were purified using
   the AMPure XP system (Beckman Coulter, Danvers, MA USA). The
   size-selected, adaptor-ligated cDNA was incubated with 3 μl USER Enzyme
   (NEB, USA) at 37 °C for 15 min, followed by 5 min at 95 °C. PCR was
   then performed with Phusion High-Fidelity DNA polymerase, universal PCR
   primers, and an Index (X) Primer. The PCR products were purified
   (AMPure XP system) and library quality assessed on the Agilent
   Bioanalyzer 2100 system.

   Clustering of index-coded samples was performed on a cBot Cluster
   Generation System using a TruSeq PE Cluster Kit v4-cBot-HS (Illumina)
   according to the manufacturer’s instructions. After cluster generation,
   the libraries were sequenced on the Illumina HiSeq X-ten platform and
   paired-end reads were generated.

   Raw reads in fastq format were initially processed using in-house Perl
   scripts. During this step, clean reads were obtained by removing
   adapter sequences, reads containing poly-N, and low-quality reads. At
   the same time, the Q20, Q30, GC-content, and sequence duplication
   levels of the clean reads were calculated. All downstream analyses were
   based on clean HQ reads.

   Clean reads were mapped to the reference genome sequence, and only
   reads with a perfect match or one mismatch were further analyzed and
   annotated based on the reference genome
   ([163]http://www.genoscope.cns.fr/brassicanapus/data/). Tophat2 tools
   (parameters: --read-mismatches 2 --read-edit-dist 2 --max-intron-length
   5,000,000 --library-type fr-unstranded --mate-inner-dist 40) were used
   to map clean reads to the reference genome.

Differential expression analysis

   Differential expression analysis of the two lines was performed using
   the DESeq R package [[164]58]. DESeq provides statistical models based
   on the negative binomial distribution to identify DEGs using gene
   expression data. The resulting P values were adjusted using the
   Benjamini and Hochberg’s approach for controlling the false discovery
   rate. Genes with an adjusted fold change of 4 and a p-value ≤0.01 were
   considered differentially expressed.

Reverse-transcription quantitative PCR (qPCR) validation

   To confirm the validity of the DEGs identified by transcriptome
   sequencing, qPCR was performed. Reverse transcription to cDNA was
   conducted using a PrimeScript First-strand cDNA Synthesis Kit (Takara,
   Dalian, China). The relative transcript levels of 10 selected DEGs (at
   the budding stage BBCH 57) were re-examined by qPCR, with BnACTIN used
   as the internal control. The primer sequences (Additional file [165]13:
   Table S11) for the genes were designed with Premier Primer 3.0
   software. All qPCR was performed on a Roche480 instrument (Roche,
   Basel, Switzerland). Amplification reactions were performed as follows:
   an initial denaturation step at 95 °C for 3 min, 39 cycles at 95 °C for
   10 s, 58 °C for 30 s, and 72 °C for 30 s, a final extension at 72 °C
   for 10 min and hold at 15 °C.

Additional files

   [166]Additional file 1:^ (209.1KB, docx)

   Figure S1. Whole genome with associated regions. (DOCX 209 kb)
   [167]Additional file 2:^ (15KB, docx)

   Table S1. Summary of the transcriptome sequencing data. (DOCX 15 kb)
   [168]Additional file 3:^ (15.7KB, docx)

   Table S2. Information about the mapped reads based on the transcriptome
   sequencing data. (DOCX 15 kb)
   [169]Additional file 4:^ (340.7KB, xlsx)

   Table S3. Annotations of the 2041 genes in the two intersection
   associated regions. (XLSX 340 kb)
   [170]Additional file 5:^ (123.4KB, xlsx)

   Table S4. GO enrichment in biological processes (BP) category of the
   genes. (XLSX 123 kb)
   [171]Additional file 6:^ (26.6KB, xlsx)

   Table S5. GO enrichment in cellular components (CC) category of the
   genes. (XLSX 26 kb)
   [172]Additional file 7:^ (50.7KB, xlsx)

   Table S6. GO enrichment in molecular function (MF) category of the
   genes. (XLSX 50 kb)
   [173]Additional file 8:^ (34.3KB, docx)

   Table S7. GO annotations of the 104 genes with non-synonymous mutation
   SNPs in the associated regions. (DOCX 34 kb)
   [174]Additional file 9:^ (17.5KB, docx)

   Table S8. GO annotations of the 19 genes with frame-shift mutation
   InDels in the associated regions. (DOCX 17 kb)
   [175]Additional file 10:^ (12.2KB, xlsx)

   Table S9. KEGG enrichment of the 2041 genes in the intersection
   associated regions. (XLSX 12 kb)
   [176]Additional file 11:^ (41KB, docx)

   Table S10. Information about the DEGs between zws-ms and zws-217. (DOCX
   40 kb)
   [177]Additional file 12:^ (1.8MB, docx)

   Figure S2. Stability of the multi-silique trait under one environment
   and variation of this trait across different environments. (DOCX 1840
   kb)
   [178]Additional file 13:^ (15.5KB, docx)

   Table S11. Primers used for qPCR. (DOCX 15 kb)

Acknowledgements