Abstract

Background

   Fruit color is one of the most important economic traits of the sweet
   cherry (Prunus avium L.). The red coloration of sweet cherry fruit is
   mainly attributed to anthocyanins. However, limited information is
   available regarding the molecular mechanisms underlying anthocyanin
   biosynthesis and its regulation in sweet cherry.

Methodology/Principal Findings

   In this study, a reference transcriptome of P. avium L. was sequenced
   and annotated to identify the transcriptional determinants of fruit
   color. Normalized cDNA libraries from red and yellow fruits were
   sequenced using the next-generation Illumina/Solexa sequencing platform
   and de novo assembly. Over 66 million high-quality reads were assembled
   into 43,128 unigenes using a combined assembly strategy. Then a total
   of 22,452 unigenes were compared to public databases using homology
   searches, and 20,095 of these unigenes were annotated in the Nr protein
   database. Furthermore, transcriptome differences between the four
   stages of fruit ripening were analyzed using Illumina digital gene
   expression (DGE) profiling. Biological pathway analysis revealed that
   72 unigenes were involved in anthocyanin biosynthesis. The expression
   patterns of unigenes encoding phenylalanine ammonia-lyase (PAL),
   4-coumarate-CoA ligase (4CL), chalcone synthase (CHS), chalcone
   isomerase (CHI), flavanone 3-hydroxylase (F3H), flavanone
   3’-hydroxylase (F3’H), dihydroflavonol 4-reductase (DFR), anthocyanidin
   synthase (ANS) and UDP glucose: flavonol 3-O-glucosyltransferase (UFGT)
   during fruit ripening differed between red and yellow fruit. In
   addition, we identified some transcription factor families (such as
   MYB, bHLH and WD40) that may control anthocyanin biosynthesis. We
   confirmed the altered expression levels of eighteen unigenes that
   encode anthocyanin biosynthetic enzymes and transcription factors using
   quantitative real-time PCR (qRT-PCR).

Conclusions/Significance

   The obtained sweet cherry transcriptome and DGE profiling data provide
   comprehensive gene expression information that lends insights into the
   molecular mechanisms underlying anthocyanin biosynthesis. These results
   will provide a platform for further functional genomic research on this
   fruit crop.

Introduction

   Sweet cherry (P. avium L.) is one of the most popular fresh fruits
   grown in temperate regions worldwide because of its appealing color,
   delicious taste and nutritional value. The color of the fruit ranges
   from dark red to pale yellow and is one of the primary exterior quality
   and economic characteristics, along with fruit weight and sweetness,
   that influences consumers’ buying decisions [[33]1]. Fruit color is
   also the most important indicator of the quality and maturity of fresh
   cherries [[34]2]. The color of cherries has been attributed to the
   accumulation of anthocyanins, and differences in the types and levels
   of anthocyanins vary between cherries of different colors [[35]3,
   [36]4, [37]5].

   Anthocyanins are secondary metabolites that not only play a significant
   role in pigmentation but also have antioxidant [[38]6, [39]7] and
   anti-tumor functions [[40]8, [41]9], protect against coronary heart
   disease and help defend against pathogens and ultraviolet radiation
   [[42]6]. The accumulation and distribution of anthocyanins are governed
   by metabolic networks that are regulated by genetic and environmental
   conditions; they are also strongly correlated with the expression of
   structural and regulatory genes [[43]10]. In model plants such as
   Arabidopsis thaliana, the structural genes in the anthocyanin
   biosynthetic pathway are divided into two groups designated as “early”
   or “late”. The “early” biosynthetic genes appear to be coordinately
   regulated and encode enzymes that function in the beginning steps of
   the biosynthetic pathway, such as CHS and CHI. The “late” biosynthetic
   genes are expressed during the later stages of the biosynthetic
   pathway, such as DFR, ANS and others [[44]11]. During blooming and
   fruit ripening, these genes exhibit different expression patterns in
   different species [[45]12]. The transcriptional regulation of
   structural genes plays an important role in the anthocyanin
   biosynthetic pathway. The transcriptional regulators for anthocyanin
   biosynthesis include MYB proteins, bHLH (basic helix-loop-helix)
   proteins and WD40 proteins [[46]13, [47]14]. The combinations of and
   interactions between the MYB, bHLH and WD40 transcription factors
   mediate the regulation of the anthocyanin biosynthetic pathway
   [[48]15]. The structural genes and transcription factors that are
   involved in the anthocyanin biosynthetic pathway have been identified
   through biochemical and genetic analyses in several fruit trees,
   including strawberry, apple, and grape [[49]16, [50]17, [51]18,
   [52]19].

   In recent years, more and more researchers have focused on the genetic
   control of skin color in sweet cherry. For example, Suneth et al.
   reported a QTL analysis and candidate gene mapping for skin and flesh
   color in sweet cherry fruit [[53]20]. Wang et al. analyzed ANS and CHS
   expression levels in the red fruit cultivar ‘Stella’ and the bicolored
   cultivar ‘Rainier’ [[54]21]. The expression levels of six genes
   involved in anthocyanin biosynthesis have previously been studied in
   the red fruit cultivar ‘Hongdeng’ and the bicolored cultivar ‘Caihong’
   using qRT-PCR analysis [[55]22]. However, the precise hierarchical
   organization of the global network that is involved in anthocyanin
   biosynthesis has not been defined.

   More recently, transcriptome analyses based on deep sequencing have
   been used for gene discovery, the analysis of specific transcripts, and
   the estimation of overall gene expression at different developmental
   stages and in different tissues. Next-generation sequencing and DGE
   profiling are cost-effective choices for characterizing non-model
   organisms without a reference genome [[56]23, [57]24]. The anthocyanin
   biosynthetic pathway and the expression of related genes have been
   studied through global transcriptome methods, uncovering a wealth of
   anthocyanin biosynthesis regulatory genes in important fruit crops such
   as grape [[58]25], blood orange [[59]26], blueberry [[60]27], Chinese
   bayberry [[61]28], pomegranate [[62]29], and black raspberry [[63]30].
   A transcriptome analysis of sweet cherry has been reported, but that
   study was focused on cuticle deposition [[64]31].

   Pure yellow cultivars are rare germplasm in sweet cherry breeding, and
   they are precious materials for the study of anthocyanin biosynthesis.
   In the present study, the pure yellow cultivar ‘13–33’ and the red
   cultivar ‘Tieton’ were used as the experimental material to establish a
   database of transcriptome sequences of sweet cherry fruit using
   Illumina transcriptome sequencing. The transcriptome database was used
   as reference data to identify candidate genes involved in the
   biosynthesis of anthocyanins. Based on this, we used DGE profile
   analysis to compare the transcripts involved in the biosynthesis of the
   anthocyanins that exist in the red and yellow cultivars. The
   transcriptome sequences and gene expression profiles provide a solid
   foundation for functional genomic studies on sweet cherry in the future
   and will facilitate a better understanding of the molecular mechanisms
   of anthocyanin biosynthesis.

Materials and Methods

Plant material and determination of total anthocyanins

   The pure yellow sweet cherry cultivar P. avium L. (‘13–33’) and the red
   sweet cherry cultivar P. avium L. (‘Tieton’) were used in this study
   [[65]32]. The plants were grown in Shandong Institute of Pomology,
   Tai’an, Shandong Province, China. According to the data from our
   laboratory, the fruit development periods of the two cultivars were
   similar. The total sugar accumulation pattern was the same in the two
   cultivars, beginning at the early stages and accelerating during
   ripening. The total acid content first increased and then decreased
   until the fruit was ripe in both cultivars ([66]S1 Table). Fruit
   samples were collected at four different ripening stages: 20 days after
   flowering (DAF) (stage 1), 35 DAF (stage 2), 45 DAF (stage 3) and 55
   DAF (stage 4). [67]Fig. 1 shows samples of fruits at the four different
   ripening stages. After collection, samples were flash-frozen in liquid
   nitrogen and stored at -80°C until further processing.

Fig 1. Fruits of P. avium L. ‘13–33’ and ‘Tieton’ used in deep sequencing.

   Fig 1
   [68]Open in a new tab

   (A) ‘13–33’ fruit at 20 DAF (stage 1). (B) ‘13–33’ fruit at 35 DAF
   (stage 2). (C) ‘13–33’ fruit at 45 DAF (stage 3). (D) ‘13–33’ fruit at
   55 DAF (stage 4). (E) ‘Tieton’ fruit at 20 DAF (stage 1). (F) ‘Tieton’
   fruit at 35 DAF (stage 2). (G) ‘Tieton’ fruit at 45 DAF (stage 3). (H)
   ‘Tieton’ fruit at 55 DAF (stage 4).

   The total anthocyanin content in the sweet cherry cultivars was assayed
   using the method presented by Wang et al [[69]33]. Fruits were crushed
   and pooled to obtain three replicates. Anthocyanins were extracted in
   10 ml of methanol (containing 1% hydrochloric acid) for 2 h at 4°C in
   darkness. Anthocyanin content was measured at 553 and 600 nm.

Total RNA extraction

   Total RNA was extracted using a modified CTAB method [[70]34]. RNA
   quantity and quality (purity and integrity) were analyzed using a
   NanoPhotometer spectrophotometer (IMPLEN, Westlake Village, CA, USA)
   and an Agilent Bioanalyzer 2100 system (Agilent Technologies, CA, USA),
   respectively.

Library construction and transcriptome sequencing

   Total RNA from 4 different fruit ripening stages of the yellow (YTR)
   and red (RTR) cultivars was pooled prior to library preparation in the
   two experimental groups. Equimolar quantities of total RNA from samples
   at each stage were combined into one pool. Prior to cDNA library
   construction, poly-T oligo-attached magnetic beads were used to purify
   the mRNA, which was then broken into short fragments of approximately
   200 bp. The fragments were used to synthesize first-strand cDNA using
   random oligonucleotides and SuperScript II. Second-strand cDNA was then
   synthesized using DNA polymerase I and RNase H. The double-stranded
   cDNA fragments were subjected to end repair, and sequencing adapters
   were ligated to both ends. The final cDNA library was selectively
   enriched by PCR and purified using the AMPure XP system (Beckman
   Coulter, Beverly, USA). The library preparations were sequenced by
   Novogene Bioinformatics Technology Co., Ltd (Beijing, China) on an
   Illumina HiSeq 2000 platform. Then, 100-bp paired-end reads were
   generated, and all raw sequence read data were deposited in the NCBI
   Short Read Archive (SRA) database under accession number SRP044388.

De novo transcriptome assembly

   The raw reads were first processed through in-house Perl scripts, which
   removed reads containing the sequencing adapters, reads with over 10%
   ambiguous ‘N’ nucleotides, and reads with over 50% of bases with a
   quality score lower than 5. The remaining clean reads from each library
   were assembled using a combined de novo transcriptome assembly
   strategy. De novo assembly of transcripts was performed using Trinity
   software as previously described for de novo transcriptome assembly
   without a reference genome [[71]35].

Functional annotation and classification

   Unigene function was annotated using NCBI BLAST 2.2.28^+ with an
   E-value threshold of 10^-5 for the NCBI non-redundant protein (Nr)
   database, the NCBI non-redundant nucleotide sequences database (Nt),
   the Swiss-Prot protein database, and the euKaryotic Ortholog Groups of
   proteins (KOG) database. Protein family (Pfam) was assigned using the
   HMMER 3.0 package. The Kyoto Encyclopedia of Genes and Genomes (KEGG)
   Ortholog (KO) database categories were assigned to the unigene
   sequences using the KEGG Automatic Annotation Server (KAAS) online
   [[72]36]. Blast2GO v2.5 was used to obtain Gene Ontology (GO)
   annotation of unigenes based on BLASTX hits against the Nr database
   with a cut-off E-value of 10^-5 [[73]37].

Digital gene expression sequencing and mapping to reference transcripts

   The eight DGE libraries prepared from the four different ripening stage
   samples (R1 and Y1: 20 DAF fruits, R2 and Y2: 35 DAF fruits, R3 and Y3:
   45 DAF fruits and R4 and Y4: 55 DAF fruits) were constructed using the
   Illumina TruSeq RNA Sample Preparation Kit. The library preparations
   were sequenced on the Illumina HiSeq 2000 platform. The raw reads were
   deposited in the NCBI SRA database (accession number SRP044388).

   To map the DGE reads, clean reads were obtained by removing low-quality
   reads and reads containing adapters or poly-N stretches from the raw
   data. The clean reads were mapped back onto the assembled transcriptome
   for each sample using RSEM [[74]38].

Analysis of differential gene expression

   For gene expression analysis, the read numbers mapped to each gene were
   counted using HTSeq v0.5.4p3 and then normalized to RPKM (Reads per
   Kilobase per Million mapped reads) [[75]39]. The DESeq R package
   (1.10.1) was used for differential expression analysis of the unigenes
   between two samples. The differentially expressed genes (DEGs) were
   filtered for a corrected P-value threshold of 0.005 (P values were
   adjusted using the Benjamini & Hochberg method) and log2 (fold change)
   of 1. We used GOseq (1.10.0) for GO enrichment analysis of all
   differentially expressed genes [[76]40]. KEGG enrichment analysis was
   performed using the KOBAS software [[77]41].

qRT-PCR analysis

   Eighteen unigenes were chosen for validation using qRT-PCR. Specific
   primer pairs for selected genes used in qRT-PCR were designed as shown
   in [78]S2 Table. The cDNA was transcribed from 1 μg of total RNA using
   the Thermo Scientific Revertaid First Strand cDNA Synthesis Kit
   (Thermo, USA) in 20 μL of reaction mixture. The qRT-PCR was performed
   with the ABI 7500 Fast Real-Time Detection System (Applied Biosystems)
   with the Ultra SYBR Mix (with ROX) (CWBIO, Beijing, China). The thermal
   profile for SYBR Green I RT-PCR was 95°C for 10 min, followed by 40
   cycles of 95°C for 15 s and 55°C for 1 min. Each plate was repeated
   three times in independent runs for all reference and selected genes.
   The reference gene (β-ACTIN) was used for normalization. The
   comparative CT method (2^-ΔΔCT method) was used to analyze the
   expression levels of the different genes [[79]42].

Statistical analysis

   All of the experiments analyzed using data comparisons were repeated
   three times. Statistical analyses were performed using variance (ANOVA)
   followed by Duncan’s new multiple range tests with SPSS version 16.0
   (SPSS, Chicago, IL, USA). A significance level of p < 0.05 was applied.

Results and Discussion

RNA-Seq and de novo transcriptome assembly

   Two cDNA libraries, YTR and RTR, were constructed from the total RNA of
   the yellow (YTR) and red (RTR) cultivars’ fruit. Fruit were collected
   at 20 DAF (stage 1, [80]Fig. 1A, [81]1E), 35 DAF (stage 2, [82]Fig. 1B,
   [83]1F), 45 DAF (stage 3, [84]Fig. 1C, [85]1G), and 55 DAF (stage 4,
   [86]Fig. 1D, [87]1H). These cDNA libraries were subjected to pair-end
   reading with the Illumina HiSeq 2000 platform, generating 33,566,702
   and 35,223,154 paired-end raw reads of 100 bp in length, respectively
   ([88]Table 1). All of the raw reads are available in the NCBI SRA
   database (accession number SRP044388). After removing the low-quality
   reads and trimming the adapter sequences, 32,398,690 and 33,993,880
   clean reads were obtained for the YTR and RTR libraries, respectively.

Table 1. Summary of the sequencing and de novo assembly.

   Sequences                                  YTR           RTR
   Before trimming
   Total nucleotides (bp)                     6,713,340,400 7,044,750,800
   Number of raw reads                        33,566,702    35,223,154
   After trimming
   Number of clean reads                      32,398,690    33,993,880
   GC content (%)                             45.73         45.99
   Q20 percentage (%)                         95.78         95.82
   After assembly
   Number of transcripts of combined data     81,670
   Number of unigenes of combined data        43,128
   Total nucleotides (nt) of transcripts (bp) 110,084,099
   Total nucleotides (nt) of unigenes (bp)    40,691,890
   Mean length of transcripts (bp)            1,348
   Mean length of unigenes (bp)               944
   N50 of unigenes (bp)                       1,781
   N90 of unigenes (bp)                       344
   [89]Open in a new tab

   The clean reads from the two libraries were assembled into 81,670
   transcripts with an average length of 1348 bp using the Trinity
   software, and 43,128 unigenes were obtained ([90]Table 1). Assembled
   unigenes ranged from 201 bp to 15,614 bp, and approximately half of
   them (22,281; 51.67%) were 200–500 bp in length ([91]S1 Fig.). Compared
   to the sweet cherry transcriptome analysis published in 2014, this
   study was superior based on the N50 value (1781 bp), the average length
   of unigene (944 bp) obtained and the accuracy of the assembled results
   using paired-end sequencing [[92]31]. The sequencing and assembly
   results suggest that the unigene data were highly reliable for further
   analysis.

Functional annotation and classification

   For functional annotation of the sweet cherry fruit transcriptome,
   22,452 unigenes were BLASTed against seven public databases ([93]Table
   2). Approximately 20,095 (46.59%) and 14,154 (32.81%) unigenes showed
   significant BLAST hits against known sequences in the Nr and Nt
   databases, respectively. A total of 70.22% (14,111) of the matched
   sequences showed high homology, with an E-value < IE-50, and 63.7%
   (12,801) of the matched sequences with a similarity value of 80% in the
   Nr database ([94]Fig. 2). Additionally, 15,394 (35.69%), 8,645
   (20.04%), 15,345 (35.58%), 4,035 (9.35%) and 16,853 (39.07%) unigenes
   were annotated in the Swiss-Prot, KOG, Pfam, KO and GO databases,
   respectively ([95]Table 2). Due to limitations in the genomic and
   expressed sequence tag (EST) information available for P. avium L.,
   only 52.05% of the total unigenes were annotated from at least one
   database.

Table 2. Summary of the annotations for the assembled sweet cherry unigenes
in public databases.

   Database Number of annotated unigenes Percentage of annotated unigenes
   (%)
   Nr 20,095 46.59
   Nt 14,154 32.81
   KO 4,035 9.35
   Swiss-Prot 15,394 35.69
   PFAM 15,345 35.58
   GO 16,853 39.07
   KOG 8,645 20.04
   Annotated in all databases 2,246 5.2
   Annotated in at least one database 22,452 52.05
   [96]Open in a new tab

Fig 2. E-value and similarity distribution of sweet cherry fruit tissue
transcriptome unigenes with annotation to the Nr database.

   [97]Fig 2
   [98]Open in a new tab

   (A) E-value distribution of annotated unigenes; (B) Similarity
   distribution of annotated unigenes.

   Gene Ontology (GO), an international standardized gene function
   classification system, was used to classify the function of the
   predicted P. avium L. genes. Sequence homology revealed that 16,853 of
   the assembled unigenes were assigned at least one GO term ([99]Table
   2). Furthermore, 46,748 unigenes were assigned to the Biological
   Process category, 34,527 unigenes were assigned to the Cellular
   Component category, and 22,307 unigenes were assigned to the Molecular
   Function category. These unigenes were further classified into 55
   functional subcategories ([100]Fig. 3). The most common assignments in
   the Biological Process category were cellular process (10,654 unigenes,
   22.79%), metabolic process (9,974 unigenes, 21.34%) and single-organism
   process (5,180 unigenes, 11.08%). In the Cellular Component category,
   the majority of unigenes were grouped into the cell (7,049 unigenes,
   20.42%), cell part (7,037 unigenes, 20.38%) and organelle (4,945
   unigenes, 14.32%) subcategories. Genes in the Molecular Function
   category were primarily sorted into the binding (10134 unigenes,
   45.43%), catalytic activity (8513 unigenes, 38.16%) and macromolecular
   complex (3,709 unigenes, 10.74%) subcategories.

Fig 3. GO classification of unigenes of P. avium L fruit.

   [101]Fig 3
   [102]Open in a new tab

   The results are summarized in three main GO categories: Biological
   process, Cellular Component and Molecular Function.

   To further evaluate the completeness of our transcriptome library and
   the effectiveness of the annotation process, KOG annotations were
   selected, and 8,645 unigenes were assigned to 26 KOG categories
   ([103]Fig. 4). Among the 26 KOG categories, the cluster for General
   Functional Prediction (1,461 unigenes, 16.90%) represented the largest
   group, followed by Post-translational modification, protein turnover,
   chaperone (1,149 unigenes, 13.29%) and Translation (809 unigenes,
   9.36%). Cell motility (4 unigenes, 0.05%), Nuclear structure (34
   unigenes, 0.39%) and Extracellular structures (35 unigenes, 0.40%)
   represented the smallest groups.

Fig 4. Histogram of KOG classification.

   [104]Fig 4
   [105]Open in a new tab

   The 8,645 unigenes were grouped into 25 KOG categories.

   To identify the biological pathways activated in sweet cherry fruit, we
   mapped the annotated sequences to the canonical reference pathways in
   the KEGG database [[106]36]. Significant matches were found for 4,035
   unigenes, which were assigned to 242 KEGG pathways ([107]S2 Fig.;
   [108]S3 Table). In total, 2,748 unigenes were identified in metabolism,
   and the main metabolic pathways were Carbohydrate metabolism (564
   unigenes), Energy metabolism (511 unigenes) and Amino acid metabolism
   (392 unigenes). From these pathways, information on sweet cherry
   metabolism can be obtained. We concentrated on the ‘Biosynthesis of
   other secondary metabolites’ category in relation to fruit
   pigmentation. The KEGG pathway analysis also revealed that 149 unigenes
   were classified into 12 subcategories within the ‘Biosynthesis of other
   secondary metabolites’ category. Among these, the cluster for
   ‘Phenylpropanoid biosynthesis’ represented the largest group, followed
   by ‘Flavonoid biosynthesis’ and ‘Flavone and flavonol biosynthesis’.

Digital gene expression library sequencing and mapping

   To investigate the gene expression patterns in red and yellow sweet
   cherry fruits during the ripening process, eight DGE libraries were
   constructed and sequenced using Illumina deep sequencing technology.
   Using Illumina sequencing, we obtained over 7.0 million raw reads in
   each library ([109]Table 3). The raw reads are available at the NCBI
   SRA database (accession number SRP044388). Following the transformation
   of raw sequences into clean reads, the total number of reads per
   library ranged from 7,248,161 to 11,442,383, and more than 90% of the
   total reads in each library were clean reads. We then mapped the clean
   reads in each library to our transcriptome reference database, which
   contained 43,128 unigenes. Approximately 94.56% to 95.95% of the clean
   reads in each sample were mapped to our transcriptome reference
   database ([110]Table 3).

Table 3. Statistics of DGE library sequencing and read mapping.

   Library Raw reads Clean reads Clean bases Clean reads /Raw reads (%)
   Total mapped reads Total mapped reads /Clean reads (%)
   Y1 7633776 7,568,503 0.76G 99.14% 7,156,815 94.56%
   Y2 9016540 8,929,925 0.89G 99.04% 8,522,912 95.44%
   Y3 8899691 8,821,425 0.88G 99.12% 8,453,179 95.83%
   Y4 7449167 7,381,098 0.74G 99.09% 7,016,375 95.06%
   R1 7311958 7,248,161 0.72G 99.13% 6,880,047 94.92%
   R2 9964196 9,879,901 0.99G 99.15% 9,466,562 95.82%
   R3 11541278 11,442,383 1.14G 99.14% 10,978,727 95.95%
   R4 10465601 10,369,923 1.04G 99.09% 9,836,442 94.86%
   [111]Open in a new tab

   Notes: Y1: ‘13–33’ fruit at 20 DAF (stage 1); Y2: ‘13–33’ fruit at 35
   DAF (stage 2); Y3: ‘13–33’ fruit at 45 DAF (stage 3); Y4: ‘13–33’ fruit
   at 55 DAF (stage 4); R1: ‘Tieton’ fruit at 20 DAF (stage 1); R2:
   ‘Tieton’ fruit at 35 DAF (stage 2); R3: ‘Tieton’ fruit at 45 DAF (stage
   3); R4: ‘Tieton’ fruit at 55 DAF (stage 4).

Analysis of differential gene expression

   To compare the changes in gene expression between different cultivars
   and different stages of ripening, we normalized the gene expression
   levels to the RPKM. All uniquely mapped reads were used to calculate
   the genes’ RPKM values. The differentially expressed genes were
   hierarchically clustered based on the log[10]RPKM of the eight samples,
   allowing us to observe the overall gene expression pattern. The red
   bands indicate high gene expression, and the blue bands indicate low
   gene expression ([112]S3 Fig.).

   The genes whose expression differed in the two samples were identified
   and filtered for corrected P values < 0.005 and log2 (fold change)
   values > 1. We compared the DEGs between different stages within a
   cultivar and between cultivars within a specific stage. The number of
   DEGs among these comparisons varied; approximately 250–2797 unigenes
   displayed significant changes in expression, and the average number was
   1136 ([113]S4 Table). The number of upregulated and downregulated
   unigenes are shown in [114]Fig. 5.

Fig 5. Unigenes differentially expressed between different stages of fruit
development and between cultivars at the same stage.

   [115]Fig 5
   [116]Open in a new tab

   Upregulated (red) and downregulated (green) unigenes were quantified.
   The results of 14 comparisons between each two samples are shown.

   To illustrate the DEGs detected in the different developmental stages
   of fruits from the yellow and red cultivars, we conducted GO functional
   enrichment and KEGG pathway analyses. The DEGs were clustered into
   three main categories of the GO classification: Biological Process,
   Cellular Function and Molecular Function ([117]S5 Table). The genes
   from the different expression clusters associated with the different
   functional categories clearly indicate the molecular and cellular
   events involved in sweet cherry fruit ripening. KEGG pathway enrichment
   analysis of the differentially expressed genes revealed effects on
   various metabolic processes, particularly in starch and sucrose
   metabolism, the biosynthesis of secondary metabolites, and amino acid
   metabolism ([118]S6 Table). The predicted biosynthesis pathways for the
   unigenes differed between the libraries, and most of the unigenes
   involved in these pathways were differentially expressed. In
   particular, we found that the unigenes predicted to be involved in
   anthocyanin biosynthesis were significantly increased or decreased in
   different DGE libraries.

Genes from the sweet cherry fruit transcriptome are involved in anthocyanin
biosynthesis

   In the later ripening stages, sweet cherry fruit undergoes a rapid
   color change. The red color of sweet cherry fruit is a result of the
   increase in total anthocyanins, consisting mostly of
   cyanidin-3-rutinoside and cyanidin-3-glucoside [[119]43, [120]44]. In
   this study, the anthocyanin content in the red sweet cherry ‘Tieton’
   fruit increased from 0.67 U·g^-1 FW to 197.4 U·g^-1 FW as the color
   changed from green to dark red during ripening ([121]Fig. 6). This
   accumulation pattern is comparable to previously reported trends for
   the red color of sweet cherry [[122]22]. However, the anthocyanin
   content of the yellow sweet cherry ‘13–33’ fruit was almost unchanged
   and remained at a very low level during fruit ripening in our study
   ([123]Fig. 6), which, to our knowledge, is the first time these data
   have been assessed in sweet cherry. Previous studies have shown that
   fruit anthocyanin content is correlated with the expression of
   anthocyanin biosynthetic genes in many crops, including apple, pear,
   and grape [[124]17, [125]45, [126]46, [127]47]. In this study, unigenes
   participating in the anthocyanin biosynthetic pathway were selected and
   studied. A total of 72 unigenes encoding 11 enzymes were assigned to
   the anthocyanin biosynthetic pathway based on a KEGG pathway assignment
   ([128]Fig. 7). More than one unigene sequence was annotated as the same
   enzyme, indicating that these unigenes represent different fragments of
   a single transcript or different members of a gene family [[129]48].
   For example, CHS is the first committed enzyme in this pathway
   [[130]49]; 6 unigene sequences (including unigene comp27880_c0,
   comp29351_c0 and comp16012_c1, among others) from the transcriptome
   database were annotated as CHS.

Fig 6. Comparison of anthocyanin content in different ripening stages of the
red sweet cherry cultivar ‘Tieton’ and the yellow sweet cherry cultivar
‘13–33’.

   [131]Fig 6
   [132]Open in a new tab

Fig 7. Unigenes involved in the anthocyanin biosynthesis of sweet cherry.

   [133]Fig 7
   [134]Open in a new tab

   The numbers in parentheses indicate the number of unigenes for each
   gene in the transcriptome library.

   In the green apple varieties ‘Crown’ and ‘Granny Smith’, CHS, F3H, DFR
   and UFGT are not expressed or are expressed at low levels; however, in
   the red apple varieties ‘Gala’ and ‘Cripps Red’, the expression levels
   of these genes are very high [[135]50]. Previous qRT-PCR analyses have
   shown that the expression levels of CHS, CHI, F3H, DFR and ANS
   increased along with the anthocyanin synthesis process in a red sweet
   cherry cultivar but were expressed at low levels and were almost
   unchanged during fruit ripening in a bicolored cultivar [[136]21,
   [137]22]. In this study, 13 candidate unigenes that are responsible for
   anthocyanin synthesis in fruit were identified from the DEG data
   including PAL (1 unigene), 4CL (2 unigenes), CHS (2 unigenes), CHI (1
   unigene), F3H (1 unigene), F3’H (1 unigene), DFR (1 unigene), ANS (1
   unigene) and UFGT (3 unigenes) ([138]Table 4). The expression of these
   13 unigenes exhibited distinctive patterns in the two varieties. In
   fruit of the red cultivar ‘Tieton’, all of the DEGs encoding
   anthocyanin biosynthesis showed significantly upregulated expression
   during the ripening process, particularly when the fruit turned red
   (stage 3 and stage 4) ([139]Table 4). This result is in agreement with
   previously reported qRT-PCR results [[140]22]. In contrast, all of the
   DEGs encoding anthocyanin biosynthesis in fruit of the yellow cultivar
   ‘13–33’ showed significantly downregulated expression during stage 1
   and stage 2 and remained at a low expression level in stage 3 and stage
   4. These results are consistent with the accumulation of anthocyanin in
   these cultivars ([141]Fig. 6, [142]Table 4). The expression levels of
   PAL, CHS and F3’H in the red fruit were lower than those of the yellow
   fruit at stage 1. However, in stages 3 and 4, the expression levels of
   PAL, 4CL, CHS, CHI, F3H, DFR, F3’H, ANS, and UFGT in the red fruit were
   higher than those of the yellow fruit. This result indicates that the
   biosynthesis of anthocyanin compounds are maintained at high levels in
   the red cultivar ‘Tieton’ when the fruit is turning red. The expression
   of these genes might be required for sweet cherry coloration because
   anthocyanin composition is primarily responsible for alterations in
   fruit color. Previous studies have shown that the expression of UFGT is
   critical for fruit coloration in many plants, such as grape, strawberry
   and lychee [[143]51, [144]52, [145]53]. There is also a high
   correlation for anthocyanin accumulation in the bicolored cultivar
   ‘Caihong’ and the red cultivar ‘Hongdeng’ of sweet cherry [[146]22].
   However in the present study, there were no significant changes in the
   expression of 3 unigenes encoding UFGT in the yellow cultivar ‘13–33’,
   but a gradual increase in UFGT was correlated with the accumulation of
   anthocyanin in the red cultivar ‘Tieton’. This result indicates that
   UFGT may play a key role in sweet cherry coloration.

Table 4. Expression profiles of anthocyanin biosynthesis genes in sweet
cherry fruit.

   Gene name Unigene ID Gene length RPKM
   Stage 1 Stage 2 Stage 3 Stage 4
   R1 Y1 R2 Y2 R3 Y3 R4 Y4
   PAL comp26141_c0 2,717 250.35 454.14 178.27 43.20 268.74 26.59 642.21
   23.76
   4CL comp21677_c0 2,223 67.60 114.47 36.64 10.63 81.08 8.52 181.34 9.01
   comp24772_c0 2,065 54.80 56.05 27.27 25.67 42.20 18.88 28.01 16.60
   CHS comp27880_c0 1,684 257.11 864.24 124.09 7.166 431.97 18.00 1,001.81
   5.155
   comp29351_c0 282 54.14 546.25 22.49 1.296 86.44 8.67 298.98 1.57
   CHI comp24444_c0 1,090 205.26 210.68 262.66 18.11 518.95 15.02 548.06
   8.98
   F3H comp22775_c0 1,539 623.61 909.37 423.48 75.80 768.49 82.69 880.70
   53.34
   DFR comp14360_c0 1,680 246.19 489.64 246.57 36.13 585.27 35.51 1,579.94
   28.53
   F3’H comp19284_c0 2,120 120.93 279.77 125.91 21.38 265.99 19.77 545.25
   15.75
   ANS comp28941_c0 1,680 226.22 293.52 755.24 147.66 1,507.65 245.56
   1,876.1 151.16
   UFGT comp22538_c0 4837 25.51 21.60 160.48 18.87 246.31 16.33 734.84
   19.58
   comp13406_c0 1826 28.31 24.44 25.93 7.16 35.72 6.75 66.37 28.31
   comp26856_c0 1859 18.70 25.97 21.83 42.22 36.59 35.77 82.13 18.70
   comp26856_c0 1859 18.70 25.97 21.83 42.22 36.59 35.77 82.13 18.70
   MYB comp26801_c1 1805 1.84 0.32 89.48 0.34 153.39 4.47 742.37 0.08
   comp33353_c0 1258 15.74 27.03 25.27 8.53 17.36 0.96 55.80 5.07
   comp20483_c0 1019 2.82 0.57 5.04 2.39 14.87 0.25 20.48 0.87
   comp2616_c0 1256 1.08 0.81 1.13 0 0.75 29.11 3.17 0
   bHLH comp24411_c0 2985 34.04 62.75 39.42 31.77 43.88 11.85 63.09 37.79
   comp21245_c0 2838 16.25 32.11 46.55 28.39 32.23 28.41 41.44 26.93
   WD40 comp19899_c0 1915 46.62 28.46 38.83 41.56 37.70 0 55.46 38.98
   [147]Open in a new tab

   Anthocyanin biosynthesis structural genes are controlled by
   transcription factors such as MYB and the basic-helix-loop-helix (b
   HLH) and WD40 families [[148]54, [149]55, [150]56]. To identify
   regulatory factors that potentially control anthocyanin biosynthesis,
   candidate transcription factors were chosen from the DEG data. It has
   been reported that MYB transcription factors play a key role in
   regulating anthocyanin biosynthesis in some fruit trees [[151]57,
   [152]58, [153]59]. In our study, we first identified that 4 candidate
   MYB genes among the DEGs were annotated as R2R3-MYB genes; the
   expression of these factors increased gradually with fruit ripening in
   red fruit but were almost unchanged in yellow fruit ([154]Table 4). In
   stage 3 and stage 4, the expression of these unigenes was strongly
   up-regulated in ‘Tieton’ compared to ‘13–33’. Among these unigenes,
   comp26801_c1 was highly homologous to P. avium MYB10, which was
   identified as a key positive regulator in the control of anthocyanin
   biosynthesis in sweet cherry fruit [[155]21, [156]60]. Comp2616_c0 was
   highly homologous to MYB11 [Malus x domestica], which has been reported
   to positively regulate anthocyanin synthesis in apple calluses
   [[157]61]. Comp33353_c0 and comp20483_c0 were homologous to MYB111
   [Malus x domestica] and a putative MYB transcription factor [Rosa
   rugosa], respectively, although the function of both of these genes has
   not been reported. In addition, 2 candidate bHLH genes were also
   identified from the DEGs, designated as comp24411_ c0 and comp21245_c0;
   the former was highly homologous to P. avium TT8, and the latter was
   homologous to AtbHLH13. The expression patterns of these 2 candidate
   bHLH genes in the two cultivars were the same as the patterns of the
   MYB family genes ([158]Table 4). Finally, we targeted WD40
   transcription factors; 1 candidate WD40 gene (comp19899_c0) was
   identified among the DEGs and upregulated during fruit ripening in the
   red cultivar ‘Tieton’ compared to the yellow cultivar ‘13–33’
   ([159]Table 4). Comp19899_c0 was homologous to At TTG1, which is known
   to interact with GL3, EGL3 and PAP1 to regulate anthocyanin
   biosynthesis in Arabidopsis thaliana [[160]62]. Among these candidate
   transcription factors, only one MYB transcription factor’s function has
   been studied in sweet cherry. The remaining candidates have not been
   reported to be involved in anthocyanin synthesis in sweet cherry.
   Further studies are still needed to determine whether changes in the
   transcription of these candidate genes is related to the regulation of
   anthocyanin metabolism. Furthermore, in sweet cherry, the relationship
   between these transcription factors and the differentially expressed
   structural genes remains unclear. Consequently, this issue should also
   be investigated further.

   To confirm the unigenes obtained from sequencing and to further analyze
   the differences in the expression profiles between the red and yellow
   fruits during ripening, eighteen unigenes related to anthocyanin
   biosynthesis were chosen for qRT-PCR analysis. The red fruit showed
   much higher gene expression than the yellow fruit for most of the
   selected unigenes. These qRT-PCR results were consistent with those
   obtained from the DGE expression profiling ([161]Fig. 8).

Fig 8. Expression analysis of 18 candidate DEGs related to anthocyanin
biosynthesis in sweet cherry by qRT-PCR.

   [162]Fig 8
   [163]Open in a new tab

   The Y-axis represents the relative expression, and the X-axis depicts
   the fruit ripening stages. The standard error of the mean for three
   biological replicates (nested with three technical replicates) is
   represented by the error bars. Different letters on each symbol
   indicate statistically significant differences (P < 0.05) between two
   values according to ANOVA and Duncan’s new multiple range tests.

Conclusion

   This study investigated the transcriptome profiles of the fruit from
   red and yellow sweet cherry varieties using Illumina RNA-seq and DGE
   deep-sequencing technologies. This transcriptome analysis provided a
   total of 43,128 unigenes, of which 46.59% were annotated in the Nr
   database, although there is no P. avium L. reference genome sequence
   available. The genes that encode enzymes and transcription factors
   involved in anthocyanin biosynthesis were identified using de novo
   transcriptome analysis and functional annotation of novel genes, and
   their expression patterns were explored in the present study. The
   upregulated DEGs involved in anthocyanin biosynthesis might play an
   important role in the accumulation of anthocyanin and the development
   of the red color of sweet cherry. This study provides a platform for
   further functional genomic research on this fruit crop and a reference
   for studying complicated metabolic processes in non-model perennial
   species.

Supporting Information

   S1 Fig. Histogram of sequence length distribution of the assembled
   unigenes in sweet cherry fruit transcriptomes.

   The x-axis indicates a unigene length interval from 200 bp to ≥2000 bp.
   The y-axis indicates the number of unigenes of each given sequence
   length.

   (TIF)
   [164]Click here for additional data file.^ (641.9KB, tif)
   S2 Fig. KEGG classification of non-redundant unigenes of P. avium L
   fruit.

   (A) Cellular Processes; (B) Environmental Information Processing; (C)
   Genetic Information Processing; (D) Metabolism; (E) Organismal Systems.

   (TIF)
   [165]Click here for additional data file.^ (450.9KB, tif)
   S3 Fig. Hierarchical clustering of the differentially expressed genes,
   using RNA-seq data derived from eight samples based on log[10] RPKM
   values.

   Y1: ‘13–33’ fruit at 20 DAF (stage 1). Y2: ‘13–33’ fruit at 35 DAF
   (stage 2). Y3: ‘13–33’ fruit at 45 DAF (stage 3). Y4: ‘13–33’ fruit at
   55 DAF (stage 4). R1: ‘Tieton’ fruit at 20 DAF (stage 1). R2: ‘Tieton’
   fruit at 35 DAF (stage 2). R3: ‘Tieton’ fruit at 45 DAF (stage 3). R4:
   ‘Tieton’ fruit at 55 DAF (stage 4).

   (TIF)
   [166]Click here for additional data file.^ (1.4MB, tif)
   S1 Table. Total sugar content and total acid content in the sweet
   cherry cultivars ‘Tieton’ and ‘13–33’.

   (DOC)
   [167]Click here for additional data file.^ (31KB, doc)
   S2 Table. Primer sequences for qRT-PCR analysis.

   (DOC)
   [168]Click here for additional data file.^ (42.5KB, doc)
   S3 Table. KEGG pathways and the corresponding unigene numbers and IDs
   in the transcriptome of sweet cherry fruit.

   (XLS)
   [169]Click here for additional data file.^ (168.5KB, xls)
   S4 Table. Differentially expressed genes (DEGs) and their Nr
   annotations in each comparative group.

   The DEGs were filtered with q-value < 0.005 and log2 (fold change)
   values > 1.

   (XLS)
   [170]Click here for additional data file.^ (2.6MB, xls)
   S5 Table. Summary of the GO enrichment analysis of the DEGs in each
   comparison.

   (XLS)
   [171]Click here for additional data file.^ (3.2MB, xls)
   S6 Table. Summary of the pathway enrichment analysis of the DEGs in
   each comparison.

   (XLS)
   [172]Click here for additional data file.^ (211KB, xls)

Acknowledgments