Abstract

   Ilyonectria robusta causes rusty root rot, the most devastating chronic
   disease of ginseng. Here, we for the first time report the high-quality
   genome of the I. robusta strain CD-56. Time-course (36 h, 72 h, and
   144 h) dual RNA-Seq analysis of the infection process was performed,
   and many genes, including candidate effectors, were found to be
   associated with the progression and success of infection. The gene
   expression profile of CD-56 showed a trend of initial inhibition and
   then gradually returned to a profile similar to that of the control.
   Analyses of the gene expression patterns and functions of
   pathogenicity-related genes, especially candidate effector genes,
   indicated that the stress response changed to an adaptive response
   during the infection process. For ginseng, gene expression patterns
   were highly related to physiological conditions. Specifically, the
   results showed that ginseng defenses were activated by CD-56 infection
   and persisted for at least 144 h thereafter but that the mechanisms
   invoked were not effective in preventing CD-56 growth. Moreover, CD-56
   did not appear to fully suppress plant defenses, even in late stages
   after infection. Our results provide new insight into the chronic
   pathogenesis of CD-56 and the comprehensive and complex inducible
   defense responses of ginseng root to I. robusta infection.

   Subject terms: Fungal genomics, Effectors in plant pathology

Introduction

   Rusty root rot disease, caused by the Ilyonectria
   destructans/Cylindrocarpon destructans species complex, is the most
   devastating chronic disease of ginseng (Panax ginseng C. A. Meyer) and
   occurs in all ginseng growing areas^[39]1. With a cultivation history
   of more than 4,000 years, ginseng is the most valuable plant in
   traditional Chinese medicine. Currently, the major cultivation areas of
   ginseng are in China and South Korea; it is also grown in Japan, North
   Korea, Canada, and the United States. The northeastern region is the
   most important ginseng production area in China. Furthermore, the
   phylogeny and population genetics of ginseng suggest that Fusong town
   in Northeast China might be the domestication center of Asian ginseng,
   as the variety cultivated there is closest to wild ginseng^[40]2.

   Ginseng rusty root rot is a soil fungal disease that can devastate
   ginseng crops. The pathogen exists not only in the soil and rhizosphere
   but also as an endophyte in the ginseng root^[41]3. Despite variation
   in the pathogenicity of members of the C. destructans species complex,
   I. robusta is the dominant pathogen with the highest proportion of
   ginseng rusty root pathogenicity (unpublished data), and the fungus is
   one of the main factors contributing to the difficulty in Asian ginseng
   continuous cropping. For example, the detection rate of rusty root rot
   in continuously cropped 2-year-old ginseng was reported to be
   95.8%^[42]4. I. robusta, isolated from Panax quinquefolium, was first
   reported and described by Hildebr^[43]5 as Ramularia robusta. After
   several revisions, Ramularia changed from Cylindrocarpon to
   Ilyonectria^[44]3,[45]6–[46]9. Currently, the isolates of Ilyonectria
   species that infect ginseng are divided into four species: I. robusta,
   I. mors-panacis, I. panacis, and I. crassa^[47]7.

   All parts of the ginseng root can be infected, with the infected root
   exhibiting reddish brown dry rot with deep gullies across the majority
   of the organ. However, infection stops with increased temperature; some
   parts of the ginseng root may recover, but the root retains the gullies
   and dry rot scars. Infected ginseng roots cannot be processed for sale
   because it is unclear how any fungal toxins present would be
   metabolized in humans. The I. destructans species complex can also
   infect some perennial herbs, such as Panax quinquefolius and Narcissus,
   and woody plants, such as Populus, Quercus, Tilia and
   Prunus^[48]7,[49]10. However, the pathogen is not considered aggressive
   and may be an opportunistic plant root pathogen^[50]6. Previous studies
   on I. destructans have focused on systematics, molecular detection, and
   pathogenicity differentiation^[51]6,[52]11–[53]14.

   The CD-56 strain used in this study is the first strain with a
   sequenced genome. It has been reported that the contents of iron and
   phenolic compounds are much higher in diseased tissue than in healthy
   tissue, which is believed to explain the rusty symptoms^[54]15.
   Additionally, Farh^[55]16 speculated that these symptoms develop as a
   result of the interaction between the pathogen and a non-resistant
   host. Although the genome and pathogenic mechanisms of this Ilyonectria
   have not been studied in detail, Gao Yuan^[56]17 applied transcriptome
   analyses to identify some resistance genes of P. ginseng induced by I.
   destructans in a time-course study.

   The dual RNA-Seq approach has been widely used to explore the molecular
   mechanisms of plant–pathogen interactions. Kawahara^[57]18 reported
   both rice and fungal genes to be expressed at a relatively early stage
   of infection (24 h), with large amounts of rice and fungal transcripts
   produced during infection. Analyses of the cacao and Moniliophthora
   perniciosa transcriptomes during their unique biotrophic interaction
   showed that infection with M. perniciosa triggers massive metabolic
   reprogramming in diseased tissues^[58]19. In addition, mixed
   transcriptome analysis revealed significant differences in the
   transcriptional responses of banana to two different strains of
   pathogen (Foc1 and Foc4) at 48 h after inoculation^[59]20. Another
   study identified an array of factors and various strategies to
   facilitate host colonization and mitigate host defenses during Brassica
   napus infection with Sclerotinia sclerotiorum^[60]21. Lysøe^[61]22 and
   Boddu^[62]23 examined head blight disease, the most destructive disease
   of wheat and barley, through global gene expression of Fusarium
   graminearum from 24 h to 196 h postinoculation. Transcriptome profiling
   of Fusarium solani f. sp. eumartii, which causes dry rot of potato, and
   infected potato tubers also provided evidence of an inducible defense
   response at 24 h after pathogen infection^[63]24, and for late blight
   of potato, faster and stronger activation of defense-related genes was
   found to correlate with tuber resistance to Phytophthora
   infestans^[64]25.

   There are various mechanisms of pathogen virulence and a wide range of
   plant immune responses. Several studies have reported that
   defense-related transcription factors, such as WRKY and YMB, and
   factors related to reactive oxygen species (ROS), the auxin pathway,
   and the ethylene pathway are manifestations of host plant
   resistance^[65]26–[66]28. To recognize pathogens, hosts have evolved
   the receptor protein immunity system that identifies effectors and
   their associated molecular patterns (PAMPs/MAMPs), and infected plants
   can then initiate first-line innate PAMP-triggered immunity or
   second-line effector-triggered immunity^[67]29–[68]32.

   Here, we report the high-quality whole-genome sequence, candidate
   effectors and ginseng–pathogen interaction transcriptome analysis of I.
   robusta CD-56. The results provide a foundation for elucidating the
   mechanisms of the chronic pathogenicity of this pathogen. CD-56
   initiates pathogenicity- and virulence-associated signaling pathways
   and possesses pathogenic mechanisms to suppress host defenses after
   three infection stages. Ginseng defends itself against CD-56 via
   expression of defense-related transcription factors and factors related
   to ROS, auxin, and ethylene pathways.

Results

Structure and general features of the I. robusta CD-56 genome

General features of the I. robusta CD-56 genome

   We used PacBio sequencing technology to generate long reads to assemble
   the genome and high-quality Illumina short reads to correct assembly
   errors. The draft genome of CD-56 contains 53 contigs, with a total
   size of 58,742,480 bp and a GC content of 49.32%. In total, 17,438
   coding sequences (CDSs) were predicted, of which 16,765 were validated
   by homology searches. The draft genome contains 221 tRNA genes, 57 rRNA
   genes, and 36 ncRNA genes. There are 8,699 CDSs (49.89%) that were
   assigned to one or more Gene Ontology (GO) functional classes
   (Fig. [69]S1A). In total, 8,402 CDSs (48.18%) were assigned a function
   in Kyoto Encyclopedia of Genes and Genomes (KEGG) based on KEGG GENE
   alignment. Table [70]1 summarizes the general features of the draft
   genome.

Table 1.

   General features of the CD-56 genome assembly.
                                      CD-56
   Genome assembly size           58,742,480 bp
   Contig No.                     53
   Contig N50                     3,405,880 bp
   Contig N90                     1,262,619 bp
   GC content                     49.32%
   Gene No.                       17,752
   tRNA gene No.                  221
   rRNA gene No.                  57
   ncRNA gene No.                 36
   Total number of CDSs           17,438
   Pseudo genes                   29
   Hypothetical proteins          8,730
   CDSs with assigned function    11,250
   CDSs assigned to KEGG pathways 8,402
   CDSs assigned to GO function   8,699
   [71]Open in a new tab

   According to KEGG pathway functional annotation analysis,
   ABC-transporter genes can be classified into ABCB, ABCC, ABCD and ABCG
   subfamilies, and genes involved in the two-component system are
   primarily related to osmolyte synthesis in response to hyperosmotic
   stress. The core elements of the mitogen-activated protein kinase
   (MAPK) signaling pathways are required for virulence in a wide array of
   fungal pathogens^[72]33. In CD-56, we predicted 25 homologs (62 genes
   in total) in the MAPK pathway and 11 homologs (22 genes in total) in
   the calcium pathway. Some common elements in the MAPK pathway that have
   been found in fungi pathogenic toward plants and human^[73]34 were
   identified in CD-56, as follows: (1) seven homologs of ERK1/2
   (previously named Fus3 and Kss1) in the MAPK pathway: GNG (previously
   named Ste18, AUP68_13344), Cdc24 (AUP68_02103, AUP68_17153), Cdc42
   (AUP68_01350), PAK1 (previously named Ste20, AUP68_11641, AUP68_11642),
   Ste11 (AUP68_01307), MAPK1_3 (previously named Fus3/Kss1, AUP68_03166,
   AUP68_07501, AUP68_16124), and Bni1 (AUP68_14625); (2) 11 homologs of
   p38 (previously named Hog1) in the MAPK pathway: Ypd1 (AUP68_03680),
   Cdc42 (AUP68_01350), PAK1 (previously named Ste20, AUP68_11641,
   AUP68_11642), Ssk1 (AUP68_07001), Ste11 (AUP68_01307), Ssk2/22
   (AUP68_05583, AUP68_08703, AUP68_02954, AUP68_04941, AUP68_14443,
   AUP68_04354, AUP68_09493, AUP68_02668, AUP68_12812, AUP68_04711,
   AUP68_14123, AUP68_13940, AUP68_04446, AUP68_07705, AUP68_07428,
   AUP68_05569, AUP68_13955, AUP68_04140, AUP68_07955, AUP68_07954,
   AUP68_00753, AUP68_15796), Pbs2 (AUP68_06808), p38 (previously named
   Hog1, AUP68_04006, AUP68_06362), Sko1 (AUP68_09790), Msn2/4
   (AUP68_15493), and Mcm1 (AUP68_06824); and (3) five homologs of Mpk1 in
   the MAPK pathway: Rh1 (AUP68_16522), Pkc1 (AUP68_02677, AUP68_12952),
   Bck1 (AUP68_09749, AUP68_16215), Mkk1/2 (AUP68_13277, AUP68_12069), and
   Fks2 (AUP68_05335, AUP68_14756, AUP68_16064). The majority of the core
   elements of ERK1/2 and p38 in the MAPK pathway were identified in the
   CD-56 genome, especially the p38 MAPK pathway. The pheromone/receptor
   pair genes Ste2 (AUP68_07997) and Ste3 (AUP68_13310) were also
   identified in CD-56, but genes encoding HD1/HD2 transcription factors
   were not detected.

General features of CD-56 secretome protein genes

   A subset of secretome protein genes from pathogens may determine the
   progression and success of infection^[74]35. We predicted 498 genes
   encoding secretome proteins in the CD-56 strain. A total of 451 genes
   were validated by homology search (nr database), and 207 genes were
   hypothetical genes. Genes with hits in the nr database are mainly from
   Neonectria ditissima, followed by Nectria haematococca, Fusarium
   oxysporum and several others; Neonectria and Fusarium are closely
   related to Ilyonectria. GO functional annotation analysis shows that
   most genes encode proteins related to extracellular region, cell part,
   cell and organelle functions at the cellular component level.
   Additionally, genes at the molecular function level are primarily
   related to catalytic activity, binding, nucleic acid binding
   transcription factor activity functions, and those at the biological
   process level are mainly involved in metabolic processes, cellular
   processes, single-organism process, biological regulation and cellular
   component organization or biogenesis (Fig. [75]1).

Figure 1.

   [76]Figure 1
   [77]Open in a new tab

   GO functional annotation analysis of the CD-56 predicted secretome gene
   set.

   Among the predicted secretome proteins, some may have an important role
   in pathogenesis. In total, 121 secretome protein genes in CD-56 were
   assigned in the Pathogen–Host Interaction (PHI) database. The predicted
   pathogenicity-associated genes from CD-56 were classified into six
   phenotype-associated genes on the basis of their ability to modulate
   host responses and trigger plant cell death (based on the “Definition
   of phenotypes” in the PHI database^[78]36), as follows: (1) reduced
   virulence: disruption of 40 genes (34.40%) in the CD-56 genome could
   reduce virulence; (2) loss of pathogenicity: three genes (2.40%) in the
   CD-56 genome are predicted to be pathogenicity factors, though
   disruption of these genes would fail to cause disease; (3) effector:
   seven genes (5.60%) in the CD-56 genome are plant avirulence
   determinant genes, which are required for direct or indirect
   recognition of a pathogen only in resistant host genotypes possessing
   the corresponding disease-resistance gene; (4) unaffected
   pathogenicity: disruption of 55 genes (44%) in the CD-56 genome could
   reduce or result in no virulence; (5) increased virulence: disruption
   of three genes (2.40%) in the CD-56 genome could increase virulence;
   (6) mixed outcome: 13 genes (11.20%) in the CD-56 genome with
   uncharacterized functions might have one or more of the function types
   mentioned above. Details are presented in Table [79]S1. In addition,
   proteins that are both high in cysteine content (≥6) and multiple
   tandem repeats (≥14) are predicted to be potential pathogenic effector
   proteins^[80]37, and 10 novel pathogenic effector candidate genes
   (AUP68_00306, AUP68_01779, AUP68_02663, AUP68_05479, AUP68_07712,
   AUP68_11521, AUP68_13033, AUP68_13384, AUP68_14284, and AUP68_16048)
   were identified in CD-56. These genes might be related directly to the
   infection of ginseng roots or indirectly to pathogenicity.

General features of CD-56 carbohydrate-active enzyme genes

   The repertoire of carbohydrate-active enzymes (CAZymes) in fungal
   genomes is considered to have a strong relationship with the
   saprophytic lifestyle of these organisms^[81]33. The genome of the
   CD-56 strain is predicted to encode 129 CAZymes among the predicted
   secretome proteins, including the following: (1) 18 genes of the
   carbohydrate-binding module (CBM) family, which are from 10
   subfamilies; (2) 17 genes of the carbohydrate esterase (CE) family,
   which are from six subfamilies; (3) 59 genes of the glucoside hydrolase
   (GH) family, which are from 32 subfamilies; (4) one gene from the
   glycosyl transferase (GT) family, which is from one subfamily; (5) 15
   genes from the polysaccharide lyase (PL) family, which are from four
   subfamilies; and (6) 19 genes from the auxiliary activity (AA) family,
   which are from seven subfamilies. Details are shown in Table [82]S2.

Ginseng transcriptome reconstruction

Transcript assembly of the ginseng transcriptome

   A total of 224,617 transcripts with lengths >= 200 bp were generated
   using Trinity software (Table [83]S3) to assemble the pure ginseng
   transcriptome (sample IDs, Ginseng_1 and Ginseng_2). The mean
   transcript size is 779 bp, with an N50 of 1,296 bp. A set of
   non-redundant unigenes was obtained by retaining the longest transcript
   in a cluster as the representative and assigning all transcripts to
   104,708 unigenes. Among these unigenes, the mean length is 681 bp, with
   an N50 of 1,107 bp. A total of 36,878 (36.22%) unigenes are longer than
   500 bp and 17,786 (16.99%) are longer than 1,000 bp (Fig. [84]S2). The
   sequencing reads were mapped to non-redundant unigenes, with mapping
   rates ranging from 63.45% to 65.55% (Table [85]S4).

Functional annotation of ginseng assembled transcripts

   Unigene sequences were aligned using BLAST searches against the NCBI nr
   and KEGG GENES databases, with a cutoff e-value of <1e^−5. Among all
   unigenes, 34,773 (33.21%) and 27,277 (26.05%) matched with sequences in
   the NCBI nr and KEGG GENES databases, respectively. The large
   percentage of unigenes with unknown function was expected because of
   the paucity of sequences from phylogenetically closely related species
   in public databases.

   Functional classification of the unigenes was conducted by GO analysis,
   and 19,791 (19,791/104,708) unigenes were successfully assigned to at
   least one GO term (Fig. [86]S1B). The unigenes were then classified
   into three categories: biological process, cellular component, and
   molecular function. In the cellular component category, the major
   represented subcategories were cell (GO: 0005623) and cell part (GO:
   0044464). In the biological process category, cellular process (GO:
   0009987) was the most represented GO term, followed by metabolic
   process (GO: 0008152).

Dual RNA-Seq analysis of I. robusta-infected ginseng cells

General features of dual RNA-Seq

   We conducted dual RNA-Seq to reveal the signaling events in the host
   (ginseng) directly modulated by the pathogen (CD-56). To assess the
   reliability of RNA-Seq, each biological sample had two replicates.
   Reads from CD-56 cells or ginseng cells alone were mapped to the CD-56
   assembled genome or the ginseng assembled transcript dataset. The
   levels of individual transcripts were highly consistent between the two
   biological replicates (Pearson’s correlation coefficient r ∈[0.8636,
   0.9884], Fig. [87]S3), except for the CD-56 sample at 72 h after
   inoculation (Pearson correlation coefficient r = 0.5461). We compared
   representative mappable reads to assess the coverage of the CD-56 and
   ginseng transcripts. Initially, the proportion of CD-56 reads ranged
   from 0.46% to 27.71%, which then increased rapidly to approximately 20%
   by 144 h after inoculation (Fig. [88]2A). This reflected the
   intracellular replication of the pathogen and confirmed the predicted
   excess of plant cell RNA over fungal cell RNA^[89]38. The observed
   changes in the proportions of reads were indicative of differentially
   expressed genes (DEGs) in both organisms during infection: (1) in CD-56
   cells, the expression levels of some virulence-associated genes
   increased and some decreased after inoculation (Fig. [90]2B and
   Table [91]S5); (2) in infected ginseng cells, expression of genes
   encoding cell-surface pattern-recognition receptors (PRRs) was
   activated after inoculation (Fig. [92]2C and Table [93]S5).

Figure 2.

   Figure 2
   [94]Open in a new tab

   Dual RNA-Seq captures the expressed transcript repertoire of infected
   cells. (A) Representative mapping statistics from I. destructans
   CD-56-infected ginseng root cells. (B) Infecting CD-56 cells at
   36 hours (36 h), 72 hours (72 h) and 144 hours (144 h) activated some
   effector genes and inhibited other effector genes (+
   [MATH: <mi mathvariant="normal">∞</mi> :MATH]
   denotes halted effector gene expression under the control condition, −
   [MATH: <mi mathvariant="normal">∞</mi> :MATH]
   denotes halted effector gene expression after infection). (C) Invaded
   host cells at 36 hours (36 h), 72 hours (72 h) and 144 hours (144 h)
   activated ginseng cell-surface PRRs (pattern-recognition receptors;
   note that a transcript expression level is defined as the average of
   the transcript expression values of the two replicate samples; the gene
   expression level is defined as the sum of all transcript expression
   values, which were assigned to a same ko No.). Data in panels B and C
   represent log2 fold changes calculated by Cuffdiff for 2 biological
   replicates. The asterisk (‘*’) in panels B and C indicates that the
   gene/transcript expression value after infection was significantly
   different from the control condition.

   Analyses of the CD-56 gene expression profile indicated that many genes
   initially expressed were suppressed at 36 and 72 h after inoculation,
   and the number of genes with suppressed expression increased over time
   after infection. In ginseng, many genes that were initially expressed
   were also suppressed after infection, with the number of genes with
   suppressed expression initially increasing after infection and then
   decreasing over time. The CD-56 and ginseng transcriptome samples could
   be separated into two physiological groups: normal (control group, mock
   infected), and infected (three time points during infection, each with
   two replicate samples). Cluster analysis indicated similar gene
   expression patterns between the two replicates of each sample
   (Fig. [95]3). Additionally, the CD-56 gene expression patterns in the
   control and at 144 h after inoculation were similar (Fig. [96]3A).
   Unlike CD-56, ginseng showed gene expression patterns that were
   strongly related to physiological conditions (Fig. [97]3B). The three
   infection periods of CD-56 could be separated into two stages: 36 h and
   72 h after inoculation as the initial infection stage and 144 h after
   inoculation as the adaptive infection stage. Cluster analysis indicated
   similar ginseng gene expression patterns between the 36 h and 72 h
   defense periods, consistent with the CD-56 gene expression patterns
   (Fig. [98]3). The three defense periods of ginseng could be separated
   into two stages: 36 and 72 h after inoculation represented the initial
   defense stage, and 144 h after inoculation represented the adaptive
   defense stage. The two defense stages of ginseng were similar to those
   of CD-56. Overall, the gene expression profile cluster analysis results
   showed that the infection response of CD-56 tended to return to the
   normal gene expression pattern by 144 h after inoculation and that
   ginseng showed a relatively consistent defense response (Fig. [99]3).

Figure 3.

   Figure 3
   [100]Open in a new tab

   CD-56 and ginseng gene expression profile hierarchical cluster analyses
   based on Pearson correlation. Pearson correlation of gene expression
   profiles was used to define the similarity in gene expression profiles
   among 8 different samples. CD56_1 & CD56_2 and Ginseng_1 & Ginseng_2
   are normal condition samples as the control group. (A) CD-56 gene
   expression profile hierarchical cluster analysis. (B) Ginseng gene
   expression profile hierarchical cluster analysis.

   We also used Cuffdiff^[101]39 and edgeR^[102]40 to identify genes
   exhibiting significant differences in expression levels between pairs
   of conditions, which allowed us to identify DEGs in CD-56 and ginseng
   at the three infection stages compared with their respective controls.
   In CD-56, there were 1,354 down-regulated genes and 260 up-regulated
   genes at 36 h after inoculation, 997 down-regulated genes and 241
   up-regulated genes at 72 h after inoculation, and 526 down-regulated
   genes and 243 up-regulated genes at 144 h after inoculation
   (Fig. [103]4A). In ginseng, there were 4,537 down-regulated genes and
   4,293 up-regulated genes at 36 h after inoculation, 3,537
   down-regulated genes and 3,335 up-regulated genes at 72 h after
   inoculation, and 4,448 down-regulated genes and 2,710 up-regulated
   genes at 144 h after inoculation (Fig. [104]4B). Moreover, 684 genes
   (up to 42.38% of DEGs) in CD-56 cells were specifically activated at
   36 h after inoculation, which was the highest proportion of activated
   genes during infection (Fig. [105]4A). A similar phenomenon was
   observed in the ginseng transcriptome (Fig. [106]4B). With regard to
   intersecting DEGs, the directions of regulation exhibited high
   uniformity (stably up- and down-regulated DEG proportions ranged from
   95.90% to 99.84%, except an outlier at 89.47%) both in CD-56 and
   ginseng (Fig. [107]4). The transcriptome of CD-56 displayed significant
   regulation at the initial infection stage compared with the control but
   weaker regulation at the adaptive infection stage. The transcriptome of
   ginseng initially showed significant regulation upon infection compared
   with the control, but this regulation rapidly weakened in subsequent
   defense stage. However, some recovery of this transcriptome regulation
   was found at the final defense stage. In general, the significant
   changes during the initial infection/defense stages might be related to
   the stress response of ginseng. In CD-56, there was an initial
   down-regulation of genes, but the number of down-regulated genes
   decreased over time after infection, suggesting that inhibition of
   fungal cells gradually diminished with time. In ginseng, the number of
   up-regulated genes decreased and then increased with time after
   infection, though the number of down-regulated genes only decreased.
   This suggests a relatively long process of sustained defense in host
   ginseng cells.

Figure 4.

   Figure 4
   [108]Open in a new tab

   DEG numbers in the 3 infection periods for CD-56 and ginseng compared
   with the control. Note: (1) down indicates that a gene was
   significantly down-regulated compared to the control; (2) up indicates
   that a gene was significantly up-regulated compared to the control; (3)
   two numbers, formatted as A/B, for intersections indicate gene numbers
   with uniform regulation directions (A, for example, both up- and both
   down-regulated genes between 36 h and 72 h of infection) to total gene
   number (B).

   Cluster analysis indicated similar DEG expression patterns between the
   two replicates of each sample (Fig. [109]S4). DEG expression patterns
   were also similar between the 72 h and 144 h samples in CD-56, forming
   a cluster (Fig. [110]S4A). Although DEG expression patterns were very
   similar between the two replicates of the 36 h CD-56 samples, they
   differed from those in other samples and formed a separate group
   (Fig. [111]S4A). The results of this cluster analysis differed from
   those of the full gene expression profile cluster analysis
   (Fig. [112]3A). Indeed, the gene expression profile cluster analysis
   results suggested that the gene expression pattern of CD-56 tended to
   slowly return to a normal pattern (similar to that in the control) by
   144 h after inoculation but that ginseng showed a relatively consistent
   defense response (Figs. [113]3A and [114]S4A). Unlike CD-56, ginseng
   exhibited DEG expression patterns that were strongly related to its
   physiological condition (Fig. [115]S4B), similar to the results of the
   full gene expression profile cluster analysis (Fig. [116]3B). In
   summary, the dual RNA-Seq analysis (Figs. [117]3 and [118]S4) indicated
   that fungal CD-56 cells gradually recovered to normal conditions after
   infecting ginseng cells but that ginseng cells were continuously
   inhibited during the course of infection.

Pathogenicity of I. robusta CD-56 infection-related genes

   GO annotation analysis of CD-56 genes related to infection. The GO
   annotation analysis of the CD-56 DEGs is shown in Fig. [119]S5.
   Comparison of the enriched GO classes among the three infection stages
   indicated that the stage at 72 h after inoculation represented an
   intermediate stage. Some of the proteins encoded by DEGs at this stage
   overlapped with those encoded by DEGs at 36 h and 144 h after
   inoculation, though no overlap between 36 h and 144 h after inoculation
   was observed. The GO class distribution changed from transport- to cell
   aging-associated functions during the 144 h after inoculation.
   Regarding enriched GO function of DEGs, the proportion of
   down-regulated genes in CD-56 peaked at 72 h after inoculation
   (88.27%). From a cellular function perspective, CD-56 cells displayed a
   change in function at 72 h after inoculation.

   Virulence-associated pathways in CD-56. KEGG pathway enrichment
   analysis showed that at 36 h after inoculation, DEGs were enriched in
   the ribosome pathway, pathogenic pathway, antigen processing and
   presentation pathway, alanine, aspartate, and glutamate pathways,
   nitrogen metabolism pathway, and Parkinson’s disease pathway. DEGs at
   the 72 h infection stage were enriched in disease-associated pathways,
   alanine, aspartate, and glutamate metabolism pathways, and the
   neuroactive ligand-receptor interaction pathway. At 144 h after
   inoculation, DEGs were enriched in amino sugar and nucleotide sugar
   metabolism pathways, fructose and mannose metabolism pathways, the
   pentose and glucuronate interconversion pathway, the nitrogen
   metabolism pathway, and the citrate cycle (TCA cycle) pathway. The KEGG
   pathway enrichment results are presented in Table [120]2.

Table 2.

   KEGG pathway enrichment analysis of CD-56 DEGs at three infection
   stages.
   KEGG Pathway ID Pathway name M/N^a 36 h VS control 72 h VS control
   144 h VS control
   m/n^b p-value m/n^b p-value m/n^b p-value
   ko03010 Ribosome 119/4,400 39/463 2.10E-11^* 8/353 0.75065 4/234
   0.885899
   ko05130 Pathogenic Escherichia coli infection 20/4,400 8/463 5.67E-04^*
   3/353 0.212972 1/234 0.665593
   ko04612 Antigen processing and presentation 22/4,400 8/463 0.00119^*
   3/353 0.256954 1/234 0.700374
   ko00250 Alanine, aspartate and glutamate metabolism 61/4,400 14/463
   0.003575^* 11/353 0.008216^* 6/234 0.103376
   ko00910 Nitrogen metabolism 56/4,400 13/463 0.004437^* 10/353 0.012317
   8/234 0.008793^*
   ko05012 Parkinson’s disease 47/4,400 11/463 0.008059^* 10/353
   0.003417^* 6/234 0.036624
   ko04930 Type II diabetes mellitus 15/4,400 2/463 0.478804 5/353
   0.004943^* 3/234 0.042036
   ko05016 Huntington’s disease 68/4,400 12/463 0.04901 12/353 0.007021^*
   5/234 0.293932
   ko05010 Alzheimer’s disease 60/4,400 10/463 0.093754 11/353 0.007239^*
   6/234 0.097288
   ko04080 Neuroactive ligand-receptor interaction 6/4,400 2/463 0.124591
   3/353 0.008526^* 1/234 0.279695
   ko00520 Amino sugar and nucleotide sugar metabolism 81/4,400 9/463
   0.484646 9/353 0.198799 12/234 0.001033^*
   ko00051 Fructose and mannose metabolism 180/4,400 28/463 0.02083 23/353
   0.016129 19/234 0.00292^*
   ko00040 Pentose and glucuronate interconversions 44/4,400 4/463
   0.695087 5/353 0.275757 7/234 0.007794^*
   ko00020 Citrate cycle (TCA cycle) 45/4,400 6/463 0.334514 5/353
   0.291727 7/234 0.008823^*
   [121]Open in a new tab

   ^aM, number of genes assigned to the corresponding KEGG pathway in the
   genome; N, number of genes assigned to at least one KEGG pathway in the
   genome.

   ^bm, number of DEGs assigned to the corresponding KEGG pathway in the
   pairwise comparison; n, number of DEGs assigned to at least one KEGG
   pathway in the pairwise comparison.

   ^*The asterisk indicates DEGs enriched significantly in a KEGG pathway
   corresponding to that infection stage.

   After infection of the host, the enriched pathways in CD-56 changed
   from protein synthesis and pathogenicity-related pathways to
   disease-associated pathways and finally to the regulation of sugar and
   energy metabolism. At 36 h after inoculation, all DEGs encoding
   ribosome proteins were activated except for the gene encoding the L36e
   ribosomal protein (AUP68_17121); however, most DEGs recovered to normal
   levels at later stages of infection, and some gradually became
   suppressed. DEGs in the pathogenic pathway and antigen processing and
   presentation pathway were down-regulated at 36 h after inoculation,
   though some exhibited gradual increases in expression over time. This
   indicated that the expression levels of genes encoding ribosome
   components and pathogenicity-associated proteins in CD-56 gradually
   returned to normal levels (similar to those in the control) after
   infection. At 144 h after infection, the enrichment in sugar metabolism
   and energy metabolism was mainly associated with the synthesis of cell
   wall components. Chitin is one of the main components of the fungal
   cell wall, and its synthesis is catalyzed by CHS1^[122]41,[123]42.
   Expression of CHS1 was down-regulated at 36 h after inoculation,
   recovered to normal levels at 72 h, and was up-regulated at 144 h,
   indicating that chitin synthesis was initially inhibited in CD-56 but
   activated several days after the initial infection.

   DEGs encoding proteins localized at the cytomembrane and responding to
   pheromone, cell wall stress, high osmolarity, and starvation signals
   were suppressed at 36 h after inoculation and slightly up-regulated at
   72 h after inoculation, recovering to normal levels at 144 h after
   inoculation. The suppression phenomenon shifted from the cytomembrane
   to intracellular parts after infection. Interestingly, Ste12
   (AUP68_08252, AUP68_13264), which encodes a master regulator of
   invasive growth in plant pathogenic fungi to control fungal virulence
   downstream of the pathogenic MAPK cascade^[124]43, did not show
   significantly different regulation among the three infection stages. In
   addition, DEGs involved in the calcium signaling pathway, which
   indirectly affects proliferation, fertilization, learning, and memory,
   were also suppressed at 36 h after inoculation, recovered to normal
   levels at 72 h, but tended to be suppressed at 144 h.

   I. robusta CD-56 candidate effectors. The majority of DEG secretome
   protein genes were down-regulated, and the proportion of down-regulated
   DEGs decreased gradually. However, the down-regulated DEGs that were
   identified in the PHI database were relatively stable in the three
   infection stages, whereas the number of up-regulated DEGs increased
   gradually (Table [125]3). Compared to the control, the predicted PHI
   effectors that were significantly regulated (Fig. [126]5) were as
   follows: (1) reduced virulence: AUP68_00030, which was up-regulated in
   all three infection stages, AUP68_05703 and AUP68_17846 were
   down-regulated in all three infection stages, especially AUP68_17846,
   which were not expressed in the 36 and 72 h infection periods,
   AUP68_00038, AUP68_17173, AUP68_16785 and AUP68_07699 were up-regulated
   at the 144 h infection period, and AUP68_08647 and AUP68_02790 were
   down-regulated in the 144 h infection period; (2) effector: AUP68_00344
   was not expressed at the initial infection stage, though it was
   expressed and up-regulated in the adaptive infection stage; (3)
   unaffected pathogenicity: AUP68_17266, AUP68_08648, and AUP68_07027
   were down-regulated and ceased in the 36 and 72 h infection periods,
   AUP68_17266 and AUP68_08648 remained down-regulated in the 144 h
   infection period, AUP68_04859 and AUP68_13455 were up-regulated in the
   initial infection stage, and AUP68_13455 remained up-regulated at the
   adaptive infection stage; (4) increased virulence: AUP68_16907 was
   down-regulated in the 72 h infection period but up-regulated in the
   144 h infection period; (5) mixed outcome: AUP68_14257 was up-regulated
   in all three infection periods, but AUP68_11568 and AUP68_15336 were
   up-regulated only in the adaptive infection stage. The novel predicted
   effector AUP68_13384 was the only DEG with halted expression
   (significantly down-regulated) after 72 h of infection, which might
   indicate that it is not a ginseng-specific effector. The details are
   provided in Table [127]3.

Table 3.

   The distribution of CD-56 DEGs for diverse secrectome types.
   36 h VS control 72 h VS control 144 h VS control
   up down total up down total up down total
   Genome 260 1,354 1,614 241 997 1,238 243 526 769
   Secretome 9 26 35 10 23 33 12 13 25
   Effectors hit in PHI 4 5 9 6 6 12 11 6 17
   CAZyme 4 2 6 7 2 9 12 2 14
   Novel effectors 0 0 0 0 1 1 0 0 0
   [128]Open in a new tab

Figure 5.

   [129]Figure 5
   [130]Open in a new tab

   The dynamic change tendency of DEGs with hits in the PHI database
   compared to the control. Note that (1) down indicates that a gene was
   significantly down-regulated compared to the control; (2) up indicates
   that a gene was significantly up-regulated compared to the control.

   Regarding reduced-virulence DEGs, AUP68_00030 was the predominantly
   expressed gene, and it was up-regulated in all three infection periods.
   The effector gene AUP68_00344 was activated and up-regulated in the
   adaptive infection stage, indicating that it might be recognized by the
   corresponding ginseng disease-resistance gene, a host-specific toxin
   gene. For unaffected-pathogenicity genes, although the predominant
   regulatory mechanism appears to be down-regulation, expression levels
   were predominantly up-regulated. With regard to increased-virulence
   genes, AUP68_16907 was down-regulated at 72 h after infection and
   up-regulated to a very high level at 144 h. Last, significantly
   regulated mixed-outcome genes were uniformly up-regulated. In general,
   genes with different types of genetic functions showed corresponding
   toxin peaks at different times.

   Pathogenicity genes among CAZymes. In total, 17 DEGs encoding CAZymes
   were identified in the three infection stages. Although expression of
   most of these genes was up-regulated, the AA1 subfamily gene
   AUP68_08648 (switched off after 36 and 72 h of infection), the AA5
   subfamily gene AUP68_08647, and the CBM1 subfamily gene AUP68_07027
   (switched off after 36 and 72 h of infection) exhibited down-regulated
   expression (Fig. [131]6). Moreover, lack of expression was observed for
   four and three genes at 36 and 72 h after infection, respectively. The
   following results were found for DEGs with up-regulated expression: (1)
   genes from PL3, AA9 and GH128 subfamilies at 36 h after infection; (2)
   genes from PL3, PL1, AA9, GH12, and GH131 subfamilies at 72 h after
   infection; and (3) genes from PL3, PL1, AA8, GH12, GH131, GH5, GH30,
   GH43 and GH93 subfamilies at 144 h after infection. Both the number and
   expression levels of genes indicated predominant up-regulation for
   CAZymes, with both increasing with infection time. The primarily
   associated CAZymes that might relate to the saprophytic lifestyle of
   CD-56 are PL and GH subfamily genes.

Figure 6.

   [132]Figure 6
   [133]Open in a new tab

   The dynamic change tendency of DEGs encoding CAZymes compared to the
   control. Note that (1) down indicates that a gene was significantly
   down-regulated compared to the control; (2) up indicates that a gene
   was significantly up-regulated compared to the control.

Expression of defense-related genes in ginseng

   To investigate the responses of defense-related genes in ginseng, we
   focused on six types of DEGs: those encoding components of the
   plant–pathogen interaction pathway (KEGG map04626), those related to
   ROS, auxin, and ethylene, and those encoding MYB and WRKY transcription
   factors^[134]8,[135]44–[136]47. In the plant–pathogen interaction
   pathway, cell-surface PRRs perceive pathogens and are key components of
   the primary response against invading pathogens^[137]31. Many genes
   encoding cell-surface PRRs were initially up-regulated in ginseng roots
   upon infection, but levels then gradually decreased (Fig. [138]2C). In
   the ethylene, WRKY, and plant–pathogen interaction pathways, there were
   more up-regulated genes than down-regulated genes at all three
   infection stages (Table [139]4), indicating that the ginseng root
   mounted a positive defensive response against the pathogen. In
   addition, there were more down-regulated than up-regulated
   auxin-related genes, indicating that the growth of ginseng was
   inhibited. For genes related to ROS and those encoding MYB
   transcription factors, there were more up-regulated than down-regulated
   genes at 72 h after inoculation, possibly because of a delayed response
   to the pathogen. Cluster analysis of the defense-related DEGs confirmed
   similar gene expression patterns between the two replicates of each
   sample and similar DEG expression patterns between control samples and
   samples at 144 h after inoculation (Fig. [140]7). These results
   indicate that DEGs during the defense response tended to return to
   normal expression levels by 144 h after inoculation, which was
   different from the trends observed across the entire transcriptome
   (Fig. [141]3B).

Table 4.

   Number of defense-related DEGs in various categories in ginseng.
   36 h VS control 72 h VS control 144 h VS control
   up down up down up down
   Total DEGs 4,537 4,293 3,537 3,335 4,448 2,710
   MYB 21 20 24 22 18 15
   Reactive oxygen species 1 4 2 1 3 0
   Auxin 6 34 3 31 4 23
   Ethylene 35 20 30 16 31 13
   WRKY 36 7 37 4 27 1
   Plant–pathogen interaction 202 115 159 94 194 60
   [142]Open in a new tab

Figure 7.

   Figure 7
   [143]Open in a new tab

   Ginseng defense-related DEGs (except plant–pathogen interaction pathway
   DEGs) expression profile hierarchical cluster analysis based on Pearson
   correlation. Pearson correlation of gene expression profiles was used
   to define the similarity in gene expression profiles among 8 different
   samples. Ginseng_1 and Ginseng_2 are normal condition samples as the
   control group.

Relative expression of secretome protein genes by quantitative real-time
(qRT-PCR)

   The expression levels of ten selected genes were significantly
   different at the different infection stages, with all being
   up-regulated during infection. AUP68_00344 was not detected at the 36 h
   after infection of the transcriptome but was expressed at a low level
   according to qRT-PCR, and the results were consistent with the
   transcriptome data at the 72 h and 144 h after infection. Expression of
   AUP68_13455 was highest at the 36 h after infection, decreasing with
   infection time. AUP68_00030, AUP68_07699, AUP68_04332, AUP68_16907,
   AUP68_17508, AUP68_01118 and AUP68_15336 were expressed at the lowest
   levels at the 36 h after infection but gradually increased with
   infection time; AUP68_14257 levels were high at 36 h, and 144 h but
   reduced at 72 h. The results of relative gene expression were
   consistent with the trend of the same genes in the transcriptome
   (Fig. [144]8), and thus the transcriptomic data are reliable.

Figure 8.

   Figure 8
   [145]Open in a new tab

   Confirmation of differential expression of secretome protein genes by
   qRT-PCR. Gray, green, red and purple represent infected samples at 0 h
   (control), 36 h, 72 h, 144 h; the control was normalized to “1”, and
   the ITS gene was used as a reference gene.

Discussion

   I. robusta is a soil-borne pathogenic fungus that causes the
   devastating rusty root rot disease of ginseng and is present in all
   ginseng growing areas^[146]1. This is the first report of the
   high-quality I. robusta genome sequence obtained using PacBio
   technology. We explored the molecular mechanisms of rusty root rot
   disease, which threatens ginseng production worldwide, through the
   first comprehensive genomic and transcriptomic analyses of I. robusta
   and its infection process. Our results defined three stages of I.
   robusta CD-56 infection after inoculation and one control stage. Our
   CD-56 transcriptomic analysis revealed that the number of expressed
   genes decreased at the initial infection stage and then increased
   gradually after inoculation. The CD-56 gene expression profiles
   exhibited a trend of initially suppressed expression, gradually
   returning to levels similar to those in the control (Fig. [147]3A). The
   number of significantly up- and down-regulated genes decreased
   gradually over time after inoculation (Fig. [148]4A). Moreover, global
   gene expression patterns in CD-56 changed from suppression to
   adaptation during its interaction with ginseng, consistent with its
   chronic pathogenic characteristics. In general, the gene expression
   patterns of CD-56 differed from those of members of its closest related
   genus, Fusarium, highlighting the diversity of plant pathogenic
   fungi^[149]22,[150]23,[151]48.

   Secretome proteins are considered relevant to the progression and
   success of infection^[152]35. The proportion of secretome protein genes
   in I. robusta (2.86%) is much lower than that in species in Fusarium,
   such as Fusarium fujikuroi^[153]49 (8.78%) and Fusarium
   graminearum^[154]50 (10.51%), but it is similar to that of the
   lignocellulose-degrading basidiomycete fungus Pleurotus
   ostreatus^[155]51. There are 121 secretory protein genes with hits in
   the PHI database^[156]36. The gene number in I. robusta is nearly three
   times that in the sunflower rust fungus Puccinia helianthi
   Schw^[157]52, and I. robusta contains ‘increased virulence’ class
   genes, which are absent in P. helianthi Schw^[158]52. Based on dual
   RNA-Seq analysis, only 19 PHI genes were significantly regulated after
   inoculation. In particular, only one effector gene and one increased
   virulence gene were significantly regulated at 72 h and 144 h after
   inoculation, with both up-regulated at 144 h after inoculation. Both
   the high-virulence-related gene number and significantly regulated gene
   number are small, indicating that the degree of pathogenicity may be
   more reflected in the efficacy of toxicity.

   CAZymes in fungal genomes are reported to have a strong relationship
   with the saprophytic lifestyle of fungi^[159]46. The number of CAZymes
   (129 genes) in the I. robusta genome is much lower than the number in
   most brown (~366 genes), rot (~480 genes) and soft-rot (~553 genes)
   fungi^[160]53. Indeed, only one GT family gene is identified in I.
   robusta, whereas there are many GT family genes in Fusarium
   fujikuroi^[161]49 and most brown, white and soft rot fungi^[162]53.
   Many cellulose-degrading enzymes are categorized within GH classes (59
   genes) with abundance of the GH43 (7 genes) and GH28 (6 genes)
   families, which can catalyze the degradation of hemicellulose, and
   pectin^[163]53, in the I. robusta genome. Although GH class genes show
   enrichment of GH3, GH5 class enzymes are enriched in most white, brown
   and soft rot fungi^[164]53. GHs are also reported to be abundant in F.
   fujikuroi^[165]49, F. virguliforme^[166]54 and most white, brown and
   soft rot fungi^[167]53. I. robusta has one GH67 gene (AUP68_07694) that
   is absent in most plant pathogens, whereas some common hemicellulose
   classes of plant pathogenic fungi are absent in I. robusta, such as
   GH29 and GH44. Our dual RNA-Seq analysis indicated that many of these
   CAZyme genes are up-regulated during the host–pathogen interaction
   (Table [168]3), a situation similar to that in F. fujikuroi^[169]49 and
   the brown rot fungus Postia placenta^[170]55. The down-regulated CAZyme
   genes belong to the AA1, AA5 and CBM1 families (Fig. [171]6).
   Enrichment of the GH (7 genes), PL (5 genes) and AA (4 genes) classes
   is found for the regulated CAZyme genes, suggesting that these enzymes
   play important roles in plant cell wall degradation. However, the GH67
   gene and the genes from abundant GH classes were not significantly
   regulated (Fig. [172]6). Moreover, an abundance of pectin-degrading
   enzyme families, CE8 (1 gene), GH28 (6 genes), PL1 (7 genes), PL3 (5
   genes), and PL9 (1 gene) was observed, with 2 PL1 genes and 3 PL3 genes
   significantly up-regulated, which may help I. robusta to colonize root
   tissues. These results indicate that the role and mechanism of CAZyme
   genes in the saprophytic lifestyle is much different than that in most
   rot fungi. Nonetheless, it remains unclear whether any of these enzymes
   directly attack ginseng root cells.

   The GO functional annotation analysis showed that the function of CD-56
   DEGs changed from transport to cell aging at 72 h after inoculation
   (Fig. [173]S5). In addition, our enriched pathway analysis of CD-56
   DEGs indicated that the interaction response changed from
   pathogenicity- to disease-associated pathways and finally to sugar- and
   energy-associated metabolism (Table [174]2). DEGs encoding proteins
   that are located at the cytomembrane and respond to signals from
   pheromones, cell wall stress, high osmolarity, and starvation were
   initially suppressed and then recovered, with the suppression
   phenomenon shifting from the cytomembrane to intracellular locations.
   CHS1 encodes chitin synthase, which is involved in biosynthesis of
   chitin, a major component of the cell wall^[175]38,[176]39, and CHS1
   gene expression in CD-56 changed from suppression to recovery and
   finally activation after inoculation. This pattern of expression
   suggests that CD-56 cells gradually proliferate over time after
   inoculation. During the interaction between CD-56 and ginseng, gene
   expression patterns changed from a stress response to an adaptive
   response. CD-56 might halt infection before colonizing ginseng if the
   perception of invasion initiates PAMP-triggered immunity at the ginseng
   root cell surface^[177]31, which may be one of the reasons for its
   chronic pathogenesis.

   To investigate the plant defense response, we constructed ginseng
   transcriptomes and analyzed changes after inoculation with the
   pathogen. The global gene expression patterns in ginseng root cells
   differed from those in CD-56 cells, with the former being highly
   related to physiological conditions (Fig. [178]3B). In ginseng, the
   number of up-regulated genes decreased and the number of down-regulated
   genes first increased and then decreased after inoculation with CD-56
   (Fig. [179]4B). Overall, our profile analysis indicated that plant
   defenses were activated and that the defense response persisted in the
   ginseng root, but that the defenses were not sufficient to prevent the
   growth of CD-56.

   We analyzed six types of defense-related genes in detail: genes in the
   plant–pathogen interaction pathway, those related to ROS, auxin, and
   ethylene, and those encoding MYB and WRKY transcription factors. We
   found that the gene expression patterns in ginseng tended to return to
   normal levels after inoculation (Fig. [180]7). Cell surface PRRs are
   involved in the primary response to invading pathogens, and the
   recognition of the pathogen triggers a relatively weak but
   broad-spectrum immune response^[181]32. The expression profiles of
   genes encoding PRRs suggested that ginseng root cells rapidly
   recognized the invading pathogen CD-56, though the response became
   weaker with time after infection (Fig. [182]2C). This may reflect the
   process of CD-56 colonization of ginseng roots^[183]32. Regardless,
   CD-56 does not fully suppress plant defenses, even during prolonged
   infection, as previously observed for other fungal pathogens^[184]19.
   For example, most WRKY transcription factor DEGs in ginseng were
   up-regulated during infection, similar to potato tubers infected with
   P. infestans^[185]25, though the defense mechanism of ginseng appears
   to have some unique traits. For instance, expression of genes
   associated with ROS production was inhibited in the ginseng root, yet
   such genes were shown to be strongly expressed during the compatible
   interaction between cacao and M. perniciosa^[186]19. The rapid
   production of ROS by plants is a hallmark of pathogen recognition and
   correlates with the hypersensitive response^[187]25. Our results
   provide new insight into the I. robusta infection process, and we
   identified genes related to the infection process in ginseng roots.

Materials and Methods

Isolation of the CD-56 strain and extraction of genomic DNA

   A single-spore isolate of I. robusta CD-56 was isolated from rusty
   rot-infected roots of 3-year-old ginseng (P. ginseng) plants. The
   diseased samples were collected from Jilin Province, China. Strain
   CD-56 was cultured on potato dextrose agar (PDA) for 144 h on plates
   covered with cellophane. The fungus was transferred to liquid medium
   (potato dextrose broth) and cultured for an additional 36 h at 22 °C.
   Genomic DNA was extracted using DNeasy^® Plant Mini Kit (Qiagen,
   Hilden, CA; Cat# 69104).

Genome sequencing and genome assembly

   The CD-56 cell genome was sequenced by Pacific Biosciences II (Pacific
   Biosciences, San Francisco, CA, USA) and Illumina HiSeq3000 (Illumina,
   San Francisco, CA, USA) technology. We used Pacific Biosciences SMART
   analysis software 1.2 to generate long ‘filtered subreads’, which were
   high-quality reads. Filtered subreads were generated through the
   removal of adaptors, low-quality bases, and short reads. Here, we refer
   to subreads as ‘reads’ and do not refer to raw reads. After the reads
   were filtered, 12 SMRT cell sequencing generated 545,582 long reads,
   and the sum of bases was 6,064,719,683. The read N50 length was up to
   16,517 bp. The SMRT cells yielded approximately 65.48-fold coverage.
   All PacBio reads were assembled into polished contigs using HGAP
   v3^[188]56, and PBJelly was used to fill gaps^[189]57. Assembly
   sequence errors were corrected using high-quality HiSeq reads. Two DNA
   libraries were constructed for Illumina sequencing, including a 250-bp
   pair-end library and a 3–5-kb mate-pair library. We generated
   23,322,349 pair-end read pairs (2 × 101 bp) and 62,861,581 mate-pair
   reads (2 × 101 bp). High-quality sequencing reads are a prerequisite
   for precise correction of an assembled genome. To this end, we carried
   out a filtering process to filter low-quality reads (reads with excess
   “N” and low-quality scores), duplicated reads, adaptor reads, and
   contaminated reads. After the reads were filtered, the two libraries
   yielded approximately 50-fold coverage of high-quality reads (details
   shown in Table [190]S6). Indel and base sequencing errors in the SMRT
   sequencing data were corrected using the SNP calling method^[191]58.
   The genome assembly sequences were aligned to UniVec
   (ftp://ftp.ncbi.nih.gov/pub/UniVec) to identify contaminating sequences
   that may be of adaptor, vector, and primer, among other origins using
   BLAST (filtering with the following criteria: e-value 10-5 and identity
   90.0). The reserved sequences were aligned to the nt database to
   identify contaminated sequences that may be of bacterial origin using
   BLAST (filtering with following criteria: e-value 10-5, identity 90.0,
   query coverage 90, and no hits in eukaryotic genomes). Finally, we
   manually removed contaminated contigs using in-house programs. This
   Whole-Genome Shotgun project has been deposited in Genome Warehouse in
   BIG Data Center^[192]59, Beijing Institute of Genomics (BIG), Chinese
   Academy of Sciences (CAS), under accession number GWHAAAI00000000. The
   dataset is publicly accessible at [193]https://bigd.big.ac.cn/gwh.

Cell culture and transcriptome sequencing

   Roots of 3-year-old ginseng (Damaya) were surface-sterilized with 75%
   alcohol and then rinsing with sterile distilled water. All samples were
   placed in a crisper (15 cm × 10 cm) and incubated at 22 °C for 24 h
   under a 12 h light/12 h dark photoperiod for sample culture conditions
   before inoculation using a published method^[194]11. Roots were
   inoculated by inserting two cut-off pipette tips (200 µl, BBI) with
   inoculum into the roots. Each pipette tip contained 30 µl of spore
   suspension (1 × 10^7/ml) and was inserted into the root to a depth of
   5 mm (Fig. [195]S6). Ten ginseng roots were inoculated and incubated
   for 0, 36, 72, and 144 h. The diseased part of the same site was
   selected for subsequent trials with two replicates at each time point.
   A spore suspension was used as the fungal control (sample IDs, CD56_1
   and CD56_2), and roots inoculated with sterile water for 0 h were used
   as the host control (sample IDs, Ginseng_1 and Ginseng_2). Phenotypes
   were observed at 36 h, 72 h, and 144 h after inoculation (sample IDs,
   36 h_1 and 36 h_2, 72 h_1 and 72 h_2, and 144 h_1 and 144 h_2,
   respectively). A cork borer was used for sampling to ensure uniformity
   among the different time points. All samples were frozen in liquid
   nitrogen immediately after collection and stored at −80 °C. Total RNA
   was extracted using Qiagen Kit No. 74904 and purified using Qiagen Kit
   No. 08004 (Qiagen, Hilden, Germany).

   All samples were sequenced with an Illumina sequencer. The HiSeq3000
   paired-end DNA library with a fragment size of 500 bp was constructed
   using the Kapa Stranded mRNA-Seq Kit Illumina® platform (Kapa
   Biosystems, Boston, MA, USA) according to the kit’s protocol. This
   procedure generated 2 × 101 bp paired-end reads. The dual RNA-Seq raw
   read data reported in this paper have been deposited in Genome Sequence
   Archive^[196]60 in BIG Data Center^[197]59, BIG, CAS, Chinese Academy
   of Sciences under accession number CRA000204; the data are publicly
   accessible at [198]https://bigd.big.ac.cn/gsa.

Genome annotation

   To identify repetitive sequences in the genome, RepeatMasker was used
   with the Repbase library (v20.08) and the species option ‘Fungi’.
   Repeat sequences were masked with ‘N’, and a detailed list of the
   repeat types was obtained. Open reading frames (ORFs) in the CD-56
   genome were predicted using Augustus^[199]61 and FGenenSH 3.0^[200]62.
   The RNA-Seq data were incorporated into the Augustus prediction with
   BLAT^[201]63 (including iterative mapping). The predicted ORFs were
   validated by BLAST^[202]64, which was used to search for homologous
   genes, and the predicted CD-56 protein sequences were aligned to the
   NCBI nr (v20150902) protein database and nt (v20161218) nucleotide
   database by BLAST^[203]64 (e-value <= 1e-5) (v2.2.26). The best hit in
   the nr protein alignment was used to infer the biological function of
   each predicted CD-56 protein sequence. GO annotation was performed with
   the software Blast2GO^[204]65, which assigned homologous sequences
   aligned by BLAST in the NCBI nr database to GO terms. The CD-56 protein
   sequences were also compared with the KEGG GENES database^[205]66 by
   BLAST (e-value <= 1e-5) for KEGG Orthology (KO) assignments and pathway
   mapping. Transfer RNA (tRNA) genes in the genomic sequence were
   detected by tRNAscan-SE^[206]67. Ribosomal RNAs were identified using
   RNAmmer^[207]68. Other ncRNAs were identified based on hits in the Rfam
   database^[208]69.

Transcriptome reconstruction and analysis

   After sequencing adapters were removed and consecutive low-quality
   bases (quality score < 20) were trimmed from both the 5’ and 3’ end of
   the reads, high-quality RNA-Seq reads from CD-56 were aligned to the
   CD-56 draft genome, our own assembled genome (GWHAAAI00000000), using
   TopHat2 (v2.0.14) with default parameters. Transcripts of each sample
   were assembled using Cufflinks (v2.2.1), and DEGs between samples were
   identified by Cuffdiff (p-value < 0.05), a component of the Cufflinks
   package.

   The RNA-Seq reads from the ginseng samples were also processed by
   removing sequencing adapters and trimming consecutive low-quality bases
   and then assembled into transcripts using Trinity^[209]70 (v2.2.0, -min
   200). A set of non-redundant unigenes was obtained by keeping the
   longest transcript in a cluster as representative. Gene expression was
   quantified using RSEM (v1.2.6)^[210]71. Pairwise comparisons between
   samples were performed using the edgeR (McCarthy et al., 2012) package.
   We used p < 0.001 as the threshold to screen for DEGs.

   For pathway and GO enrichment analyses of DEGs, the threshold of
   p < 0.01 was selected to identify significantly enriched KEGG and GO
   terms using the hypergeometric test.

Analysis of effector genes

   Secretome proteins of CD-56 were identified through several prediction
   software programs. SignalP v4.1 server^[211]72 was used to predict
   signal peptide cleavage sites and TMHMM v2.0
   ([212]http://www.cbs.dtu.dk/services/TMHMM/) to predict transmembrane
   helices. TargetP v1.1
   ([213]http://www.biomedsearch.com/sci/TargetP-11-server/) was employed
   to remove mitochondrial proteins, and ProtComp v9.0
   (http://www.softberry.com/berry.phtml?topic=protcompan&group=programs&s
   ubgroup=proloc) was employed to predict the subcellular localization of
   fungal proteins. Proteins containing signal peptide cleavage sites but
   no transmembrane helices were considered to be secreted
   proteins^[214]37, and genes homologous to the predicted secretome
   protein genes were searched in the nr database using blastp
   (e-value ≤ 1e^−5, identity ≥40%). GO functional annotations were
   obtained from Blast2GO^[215]73. Visualization of the annotation
   analysis results of the full-length novel protein-coding gene set was
   plotted in WEGO^[216]74. The cysteine content of the predicted
   secretome proteins was determined with Perl scripts written in-house.
   Multiple tandem repeats were predicted by T-Reks^[217]75. To classify
   associated pathogenicity genes and effector genes, the PHI database
   (Version 4.2) was searched for hits for the predicted secretome protein
   genes.

   We used the Hmmscan program in the HMMER 3.0 package (with default
   parameters) to search each of the fungal predicted secretome proteins
   with the family-specific HMM profiles of CAZymes downloaded from the
   dbCAN database as queries^[218]76. The primary results were processed
   by the hmmscan-parser script supplied by dbCAN.

qRT-PCR confirmation

   The relative expression levels of selected secreted proteins were
   confirmed using qRT-PCR. cDNA was synthesized using Transcriptor
   First-Strand cDNA Synthesis Kit (B532445, Sangon, Shanghai, China), and
   500 ng of total RNA was reverse transcribed in a 20-µl reaction mixture
   following DNase I (2270a, Takara, Shiga, Japan) treatment, with three
   replicates for each gene. Each reaction contained 2X SG Fast qPCR
   Master Mix (High Rox, B639273, BBI) and 1 µl of 1:10 diluted cDNA
   template with gene-specific primers (Table [219]S7), and a StepOne Plus
   (ABI, Foster, CA, USA) was used. Relative quantitative analysis was
   performed using the 2^−ΔΔCt method with ITS as the reference gene.

Supplementary information

   [220]Supplementary file.^ (5MB, docx)

Acknowledgements