Abstract Nearly one-third of the world's population is latently infected with Mycobacterium tuberculosis (M. tb), which represents a huge disease reservoir for reactivation and a major obstacle for effective control of tuberculosis. During latent infection, M. tb is thought to enter nonreplicative dormant states by virtue of its response to hypoxia and nutrient-deprived conditions. Knowledge of the genetic programs used to facilitate entry into and exit from the nonreplicative dormant states remains incomplete. In this study, we examined the transcriptional changes of Mycobacterium marinum (M. marinum), a pathogenic mycobacterial species closely related to M. tb, at different stages of resuscitation from hypoxia-induced dormancy. RNA-seq analyses were performed on M. marinum cultures recovered at multiple time points after resuscitation. Differentially expressed genes (DEGs) at each time period were identified and analyzed. Co-expression networks of transcription factors and DEGs in each period were constructed. In addition, we performed a weighted gene co-expression network analysis (WGCNA) on all genes and obtained 12 distinct gene modules. Collectively, these data provided valuable insight into the transcriptome changes of M. marinum upon resuscitation as well as gene module function of the bacteria during active metabolism and growth. Keywords: transcriptional regulation, resuscitation, M. marinum, hypoxia, latency Introduction Mycobacterium tuberculosis (M. tb), the causative agent of tuberculosis (TB), is the leading cause of death due to an infectious disease globally, with an estimated 10 million new cases and 1.3 million deaths in 2017. There were an additional 300,000 deaths from TB among HIV-positive people. The success of M. tb as a leading pathogen is associated with its ability to infect and persist in the host. About 1.7 billion people, 23% of the world's population, are estimated to have a latent TB infection (LTBI), which is asymptomatic but can persist for decades ([41]Stewart et al., 2003; [42]North and Jung, 2004). About 5–10% of LTBI will eventually develop active disease, and host immunosuppression (e.g., HIV coinfection) markedly increases the risk of reactivation ([43]Corbett et al., 2007). LTBI poses a major challenge to the effective control of TB because of the difficulty in treatment and the fact that LTBI represents a huge disease reservoir. During LTBI, M. tb is thought to enter nonreplicative ‘dormant' states by virtue of its lowered or altered metabolism in response to hypoxia, nitrosative stress, and/or nutrient deprivation ([44]Boshoff and Barry, 2005). Accordingly, much research has been focused on environmental conditions and genetic programs that induce bacteriostasis, and the most extensively studied culture condition is hypoxia ([45]Wayne and Sohaskey, 2001; [46]Rustad et al., 2009). It was shown that an immediate bacterial response (2 hr) was the coordinated upregulation of 47 M. tb genes under the control of the response regulator (DosR) and two sensor kinases (DosS and DosT), known as the DosR regulon ([47]Sherman et al., 2001; [48]Boon and Dick, 2002; [49]Park et al., 2003). A second set of 230 genes, induced by longer hypoxia exposure (7 days), was also identified ([50]Rustad et al., 2008). These genes, collectively known as the enduring hypoxic response (EHR), were DosR-independent genes ([51]Rustad et al., 2008). During the reactivation of LTBI, the dormant bacteria are believed to resuscitate and resume active growth and metabolism. A few recent studies have used reaeration of hypoxic cultures for in vitro modeling of reactivation or resuscitation ([52]Veatch and Kaushal, 2018). Several regulatory proteins, such as transcription factor ClgR and sigma factors SigH and SigE, were found to play a role in M. tb resuscitation from hypoxia ([53]Mcgillivray et al., 2015; [54]Iona et al., 2016; [55]Veatch et al., 2016). Despite the progress, knowledge of the genetic programs used to facilitate entry into and exit from the nonreplicative dormant states remains incomplete. In this study, we examined the transcriptional changes of Mycobacterium marinum (M. marinum) at different stages of resuscitation from hypoxia-induced dormancy. M. marinum is a pathogenic Mycobacterium and the closest genetic relative of the M. tb complex. M. marinum is an excellent model through which to understand various aspects of host–pathogen interactions in M. tb pathogenesis. For example, M. marinum and M. tb share many virulence determinants, such as the ESX-1 secretion system ([56]Tobin and Ramakrishnan, 2008) and lipid virulence factors phthiocerol dimycocerosates and phenolic glycolipids ([57]Yu et al., 2012). As such, findings from the current study of M. marinum may be applicable to M. tb. Result RNA-Seq Analysis of M. marinum Recovered From Hypoxia Larry Wayne and co-workers were the first to develop an in vitro model to mimic the hypoxic environment of the human granuloma ([58]Wayne, 1977; [59]Wayne and Hayes, 1996; [60]Wayne and Sohaskey, 2001). In the Wayne model, a sealed, standing culture is incubated over an extended period while the bacteria deplete the available oxygen. The gradual depletion of oxygen leads to nonreplicating persistence states with a concomitant shift in gene expression and metabolism. To gain insight into the genetic mechanisms that facilitate the exit of mycobacteria from the nonreplicative state, we grew M. marinum under hypoxia for 7 days using the Wayne model and then reaerated the cultures. At different time points thereafter (0, 0.5, 4, 12, 24, and 48 hr), M. marinum cultures were collected and subjected to RNA-seq analysis. The growth curve of the bacteria is shown in [61]Supplementary Figure 1 . A total of 18 samples were collected (three biological replicates at each time point) and analyzed. The RNA-seq reads showed a high mapping ratio for all samples (>96%) ([62] Table 1 ), supporting the overall sequencing accuracy. Transcripts of more than 4,900 genes were detected in each sample. We compared the RNA-seq data of cultures recovered at different time points under aerobic conditions. As expected, results showed that the correlation coefficient decreased as the interval between two samples increased ([63] Figure 1 ). This result also suggested that the recovery from hypoxia is a gradual but dynamic process. Table 1. RNA-seq reads of 18 samples (three biological replicates at each time point). Sample Total reads Mapped reads Pair mapped reads Single mapped reads Mapped ratio(%) 0hr_rep1 25667618 25206648 25071544 135104 98.2 0hr_rep2 24461374 24025951 23875630 150321 98.22 0hr_rep3 24356981 23991626 23851008 140618 98.5 0.5hr_rep1 27750348 27226045 27037288 188757 98.11 0.5hr_rep2 29227286 28693463 28508636 184827 98.17 0.5hr_rep3 30215819 29671934 29509385 162549 98.2 4hr_rep1 29552072 28996782 28795942 200840 98.12 4hr_rep2 25707932 25227133 25058798 168335 98.13 4hr_rep3 26459312 25956585 25781988 174597 98.1 12hr_rep1 20056376 19435701 19294860 140841 96.91 12hr_rep2 25849366 25305478 25143598 161880 97.9 12hr_rep3 24639687 24048335 23903266 145069 97.6 24hr_rep1 28399172 27857378 27662324 195054 98.09 24hr_rep2 31131638 30539305 30334952 204353 98.1 24hr_rep3 29686534 29152176 28950812 201364 98.2 48hr_rep1 28813928 28353514 28145540 207974 98.4 48hr_rep2 27427442 26995636 26807948 187688 98.43 48hr_rep3 28062635 27641695 27444061 197634 98.5 [64]Open in a new tab Figure 1. Figure 1 [65]Open in a new tab Calculated correlation coefficients between RNA-seq data from samples of different time points (0_hour, 0.5_hour, 4_hour, 12_hour, 24_hour, and 48_hour) during resuscitation. Dynamic Changes of Gene Expression at Different Stage of Resuscitation To analyze transcriptome changes of M. marinum, we focused on genes with RPKM ≥ 10 and compared samples from adjacent intervals: between 0.5 and 0 hr, 4 and 0.5 hr, 12 and 4 hr, 24 and 12 hr, as well as 48 and 24 hr. Differentially expressed genes (DEGs) were identified, and this was defined as a fold change greater than 2 and false discovery rate P value less than 0.05. At the earliest time point after resuscitation (0.5 hr), 136 DEGs were detected, of which 71 were upregulated and 65 were downregulated. Between 4 and 0.5 hr, most of the DGEs were downregulated (81 out of 88 total DEGs). The numbers of DGEs found between 12 and 4 hr, 24 and 12 hr, as well as 48 and 24 hr were 72, 85, and 172, respectively. The heat map of the five groups of DEGs is shown in [66]Figure 2 . Figure 2. [67]Figure 2 [68]Open in a new tab Heatmaps of DEGs between adjacent time points. At each time point, data from three biologically independent samples were included. (A) 0.5 vs. 0 hr; (B) 4 vs. 0.5 hr; (C) 12 vs. 4 hr; (D) 24 vs. 12 hr; and (E) 48 vs. 24 hr. The red color indicates upregulation. The blue color indicates downregulation. We performed a Venn analysis of the five DEG groups ([69] Figure 3A ). The proportion of unique genes in each group was high: 68.4% (93/136, between 0.5 and 0 hr), 60.2% (53/88, between 4 and 0.5 hr), 46.7% (35/72, between 12 and 4 hr), 47.1% (40/85, between 24 and 12 hr), and 74.4% (128/172, between 48 and 24 hr). This suggests that a variety of genes were involved at different stages of resuscitation. Figure 3. [70]Figure 3 [71]Open in a new tab Analysis of DEGs in five groups. (A) The Venn diagram of five DEGs. (B) The expression trend of all DEGs (440 genes) in the five periods (left). The number of DEGs in each period is shown on the right. Combing the DEGs of different stages, a total of 440 genes were identified ([72] Supplementary Table 1 ), and their expression underwent dynamic changes during resuscitation ([73] Figure 3B ). For example, the expression of MMAR_0922, MMAR_3562, and MMAR_1654 were significantly changed at the early stage of resuscitation, suggesting that they may play an important role in this period. Some genes had changed in multiple time periods. For example, the expression of MMAR_3403 was changed in last three periods, suggesting that this gene may be associated with the late stage of resuscitation. Validation of RNA-seq Results by RT-qPCR To validate the RNA-seq results, a real-time quantitative (RT-qPCR) analysis was performed. Three biologically independent samples at each time point were used for this experiment. For each DEG group, we selected the top 10 upregulated genes and 10 downregulated genes for this analysis ([74] Figure 4 ). Between 0 and 0.5 hr, MMAR_4852 and MMAR_5170 (whiB4) were significantly downregulated, and six genes, MMAR_5122 (lipX), MMAR_0548 (espG3), MMAR_0547 (esxR), MMAR_0551 (eccE3), MMAR_0546 (esxG), and MMAR_0550 (mycP3), were significantly upregulated ([75] Figure 4A ). Between 0.5 to 4 hr, six genes, including MMAR_1656, MMAR_1658 (hycQ), MMAR_1653 (Rv0081), MMAR_1655, MMAR_5122 (lipX), and MMAR_5170 (whiB4), were significantly downregulated and four genes MMAR_0845 (hemB), MMAR_5484, MMAR_1908 (ATC1), and MMAR_3776 (rpfE) were significantly upregulated ([76] Figure 4B ). Between 4 and 12 hr, MMAR_2343 (papA1), MMAR_3555, and MMAR_2320 (wecE) were significantly upregulated and MMAR_4903 were significantly downregulated ([77] Figure 4C ). Between 12 and 24 hr, six genes, MMAR_0335, MMAR_0602, MMAR_4903, MMAR_4899, MMAR_2009, and MMAR_0615 (iniA), were significantly downregulated ([78] Figure 4D ). Between 24 and 48 hr, six genes, MMAR_3465 (PPE51), MMAR_4824, MMAR_4482 (cypM), MMAR_4750, MMAR_2944, and MMAR_1790 (PPE2), were significantly downregulated, and seven genes, MMAR_2651, MMAR_2914 (katG), MMAR_2649, MMAR_5315 (lpqH), MMAR_5319, MMAR_2839 (mpt63), and MMAR_0656, were significantly upregulated ([79] Figure 4E ). Figure 4. [80]Figure 4 [81]Open in a new tab qRT-PCR results. (A–E) Approximately 20 genes from each period were selected and analyzed by qRT-PCR. *p < 0.05; **p < 0.01; and ***p < 0.001. (F) Scatter plot of RT-PCR data of all genes analyzed in (A–E), comparing them to RNA-seq data of the same genes. There is a good agreement between the RNA-seq and qPCR data, evident from the scatter plot using the expression levels of all 97 genes that were analyzed by both RNA-seq and qPCR (R^2^= 0.784) ([82] Figure 4F ). Based on this result, we consider that the RNA-seq data is reliable. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Enrichment Analysis To gain insight into the biological consequence of the observed transcriptome changes, we performed GO and KEGG pathway analyses of the five DEG groups ([83] Figure 5 ). A GO analysis was applied to identify the functional categories of DEGs. Between 0.5 and 0 hr, more than half of DEGs were involved in membrane and cell wall processes and were significantly enriched ([84] Figure 5 ). This is consistent with the notion that, upon reaeration, the bacteria resumed cell division, which involved cell wall and membrane biogenesis. A KEGG analysis consistently revealed that DEGs involved in metabolic pathways and biosynthesis were significantly enriched and accounted for the largest number. A similar trend was observed for later periods, in which genes involved in cell wall and membrane biogenesis were highly enriched and accounted for the majority of DEGs at these stages. These results provide snapshots into the recovery of the bacteria from hypoxia and active growth under aerobic conditions. Figure 5. [85]Figure 5 [86]Open in a new tab GO and KEGG analysis of DEGs in the five periods. Co-Expression Analysis Between Transcriptional Regulators and mRNAs Co-expression networks can show relationships between genes. To explore the regulatory mechanisms at different stages of resuscitation, we constructed a co-expression network between transcriptional regulators and mRNAs. For this, we selected known transcriptional regulators, including transcription factors and sigma factors from the five DEGs, and calculated the correlation coefficients between these transcription factors and the remaining DEGs in the same group. We considered that a relationship existed between a given transcriptional regulator and other genes if the absolute value of the correlation coefficient was greater than 0.9, which included both positive and negative correlation. Based on these results, we constructed five co-expression modules and integrated them into a large network ([87] Figure 6 ). The dark blue nodes in the figure represent transcription factors, and the green nodes denote DEGs in the same period. The size of the node is determined by the degree of connectivity. Greater degrees of connectivity are indicated by larger points. If there is a line between two nodes, then there is a relationship between them. Figure 6. [88]Figure 6 [89]Open in a new tab Co-expression networks in five periods. Each dot represents a gene. Transcription factors are labeled in blue, and other genes are labeled in green. Line between dots represents co-expression relationship between genes. The size of dot is proportional to the level of connectivity. In the first co-expression module (between 0 and 0.5 hr), three transcription factors, MMAR_4874 (CosR), MMAR_1653 (Rv0081), and MMAR_4852 (KmtR), formed the major regulatory hubs, and MMAR_4874 (CosR) was the largest hub and interacted with other hubs in the network ([90] Figure 6 ). The MMAR_4874 (CosR) and MMAR_1653 (Rv0081) hubs remained in the second co-expression module (between 0.5 and 4 hr), in addition to three new hubs formed by MMAR_4254, MMAR_1725 and MMAR_1132. In the third period (between 4 and 12 hr), MMAR_0229 and MMAR_4902 formed the hubs. In the fourth period (between 12 and 24 hr), MMAR_2003 (SigB) and MMAR_4219 formed the hubs. In the final period (24 to 48 hr), MMAR_2651, MMAR_1555, and MMAR_0249 formed the hubs. Weighted Gene Co-Expression Network Analysis (WGCNA) The DEG analysis focused on partial dynamic changes during resuscitation. While the co-expression network of transcription factors provides an overview of the regulatory programs enabling the resuscitation of M. marinum, our knowledge on the overall genetic changes is still missing. Therefore, in this section, we analyzed the expression of all genes from cultures at different stages of resuscitation. Weighted gene co-expression network analysis (WGCNA) ([91]Langfelder and Horvath, 2008) is a method for analyzing the gene expression patterns of multiple samples. It clusters genes into modules by similar expression trends and reveals the relationship between gene modules and specific traits or phenotypes. We applied this method to analyze the RNA-seq data of M. marinum at different stages of resuscitation. The gene modules were identified by the WGCNA package in R software. We first determined the appropriate “soft-thresholding” value, which emphasizes strong gene–gene correlations at the expense of weak correlations. An optimal parameter (power = 20) was determined by plotting the strength of correlation against a series (range 2 to 20) of soft threshold powers ([92] Figure 7A ). Figure 7. [93]Figure 7 [94]Open in a new tab WGCNA cluster analysis. (A) Plot of the strength of correlation against a series (range 2 to 20) of soft threshold powers. (B) Gene clusters and gene module fusion. An unsigned pairwise correlation matrix was calculated, and the WGCNA algorithm was used to transfer the correlation coefficient between genes into the adjacent coefficient. Then, the dissimilarity of the topological overlap matrix was calculated based on the adjacent coefficient. Using the calculated dissimilarity, we carried out a hierarchical analysis by using agglomerative hierarchical clustering, also known as the bottom-up method. Other assumptions were made: (i) distances between different classes were measured by the average connectivity, and (ii) there should be at least 30 genes in each gene module. Based on these analyses, we initially obtained 48 gene modules. The hierarchical cluster tree was then treated using the dynamic tree cut algorithm in the WGCNA package. A total of 13 gene modules were obtained. The “gray” module was the default module, which included discarded genes that could not be clustered. Thus, we focused on the analysis in the remaining 12 gene modules. The process of fusion is shown in [95]Figure 7B . The number of genes varied among these 12 modules, and the detailed information is listed in [96]Table 2 and [97]Supplementary Table 2 . Table 2. Information of gene modules identified by WGCNA. Module_ID Number of genes module_34 1711 module_26 919 module_43 424 module_14 320 module_20 298 module_3 268 module_35 142 module_29 136 module_12 127 module_22 89 module_47 89 module_45 46 [98]Open in a new tab The first principal component analysis (PCA) was performed on the 12 gene modules ([99] Figure 8 ). The PCA results reflected the main trend of gene expression in the modules. Module 20 played an important role in the early stage (0 to 0.5 hr) of recovery, module 35 played a role mostly in the middle stage (4 to 12 hr), and module 12 was only involved in the last stage (24 to 48 hr). Other modules appeared to play roles in more extended periods. Figure 8. [100]Figure 8 [101]Open in a new tab PCA analysis of 12 gene modules. The X axis represents time period, and the Y axis represents expression of first principal component. Identification of Key Gene Modules Associated With Different Stages of Resuscitation Thus far, we have identified 5 DEGs and 12 gene modules. We then performed an enrichment analysis between them. When the P value of Fisher's exact test was less than 0.001, we considered that these gene modules were significantly enriched in the DEG sets. The results are shown in [102]Figure 9 . Interestingly, module 20 was significantly enriched in DEGs of the early stage (0 to 0.5 hr) and module 12 was significantly enriched in DEGs of the last stage (24 to 48 hr) of the recovery. This is consistent with the result that these two modules were only involved in the early and last stages of resuscitation, respectively ([103] Figure 8 ). Figure 9. [104]Figure 9 [105]Open in a new tab Enrichment analysis between 12 gene modules and 5 DEGs. The above bar chart represents the number of DEGs in each period, and the bar chart on the right represents the number of genes in each gene module. Blue squares represent a significant enrichment between the row set and the column set. Discussion In this study, we examined the transcriptome changes of M. marinum recovered from hypoxia-induced dormancy. To gain a comprehensive view, multiple time points, including shortly after resuscitation (0.5 hr) to more extended periods up to 48 hr, were included. For each time point, three biologically independent samples were analyzed. Transcripts of the whole genome were analyzed by RNA-seq, and the quality of the RNA-seq data was reflected by the high genome mapping ratio and further validated by qPCR analysis of close to 100 genes. With these high-quality sequence data, we performed in-depth analyses, which included the identification of DEGs and the construction of co-expression network of transcription factors in each period. The availability of transcriptomes of independent samples at multiple time points also allowed us to employ a weighted gene co-expression network analysis to identify gene modules of M. marinum. Collectively, these data provide valuable insight into not only the genetic changes of the bacteria upon resuscitation but also the gene module function of M. marinum during active metabolism and growth. A total of 136 DEGs were identified in M. marinum upon resuscitation from dormancy (0 to 0.5 hr), including eight transcription factors ([106] Figure 6 ). Interestingly, all of these transcription factors were significantly downregulated. Among them, MMAR_1653 is a homolog of Rv0081 in M. tb. Previous studies have shown that Rv0081 is a member of the DosR regulon and is induced at the early stage of hypoxia ([107]Sherman et al., 2001). Rv0081 is a major regulator of M. tb response to hypoxia and forms a large regulatory hub ([108]Galagan et al., 2013; [109]Chawla et al., 2018; [110]Sun et al., 2018). Rv0081 plays an important role connecting the early and enduring hypoxic responses ([111]Sun et al., 2018). WhiB4 (MMAR_5170) is an oxygen-sensitive transcription factor and has been shown to regulate PE/PPE family proteins, and it plays a role in M. marinum virulence ([112]Chawla et al., 2012; [113]Wu et al., 2017). CosR (MMAR_4874) is a copper-inducible transcriptional regulator, and the loss of cosR resulted in a hypoxia-type response with the induction of the DosR regulon ([114]Talaat et al., 2004; [115]Ward et al., 2008; [116]Marcus et al., 2016). Given their roles in the hypoxic response, it is not surprising that Rv0081, WhiB4, and CosR underwent dynamic changes in expression upon resuscitation by reaeration. Two other transcription factors that were downregulated at this stage, MMAR_5405 (EthR) and MMAR_1394 (Rv3176c), belong to the TetR family transcription factors ([117]Leiba et al., 2014; [118]Sharma et al., 2017). We also found that multiple members of the ESAT-6 family proteins were upregulated upon resuscitation ([119]Harboe et al., 1996; [120]Priscille et al., 2004), including EsxA ([121]Sandra et al., 2010; [122]Zhang et al., 2016), EsxB ([123]Sandra et al., 2010), EsxG ([124]Sweeney et al., 2011), EsxH ([125]Alka et al., 2013; [126]Portal-Celhay et al., 2016), EsxK, and EsxN ([127]Zhigang et al., 2017). These proteins are components of the Type VII secretion systems, and many of them are important T cell antigens and play a critical role modulating the host–pathogen interactions ([128]Abdallah et al., 2007). From 0.5 to 4 hr after reaeration, 11 transcription factors were downregulated ([129] Figure 6 ), including Rv0081, CsoR, and WhiB4 as mentioned above. Notably, the expression of two other WhiB family proteins ([130]Averina et al., 2012), WhiB3 and WhiB5, were also significantly altered. WhiB3 responds to dormancy signals, including hypoxia and NO, and controls redox homeostasis of the bacteria ([131]Priscille et al., 2004). WhiB5 responds to oxygen and controls the expression type VII secretion systems ([132]Priscille et al., 2004). Consistently, we found that whiB3 was downregulated while whiB5 was upregulated at this stage of resuscitation. During the period of 4 to 12 hr, we identified 72 DEGs, including 7 PPE family genes (MMAR_5121, MMAR_1095, MMAR_1235, MMAR_1847, MMAR_1905, MMAR_2669, and MMAR_3989). The PE/PPE family proteins play a critical role in mycobacterial pathogenesis ([133]Fishbein et al., 2015). In addition, several genes of the Mce family were upregulated, including MMAR_3865 (mce2B), MMAR_3868 (mce5E), MMAR_3867 (mce5D), MMAR_3866 (mce5C). The Mce family genes comprise four mammalian cell invasion factor (mce) operons (mce1-4), and some of these are involved in the invasion of host cells ([134]Zhang and Xie, 2011). A picture appears to have emerged from these analyses; in the early stage of resuscitation from the hypoxia-induced dormancy, transcription factors critical for a hypoxia-induced response are downregulated, and, as the recovery continues, genes important for virulence and host interactions are upregulated. A WGCNA analysis revealed 12 distinct gene modules. Of particular interest is gene module 20, which was involved in the early stage of resuscitation only ([135] Figure 8 ). This module comprises of ~300 genes, many of which have unknown functions or annotations. Future studies focusing on genes in this module may help to understand the molecular machinery enabling the exit of the bacteria from dormancy. Methods and Materials Bacterial Strain, Media, and Growth Conditions M. marinum 1218R (ATCC 927) was grown in Middlebrook 7H9 broth to OD600~0.5, at which point they were aliquoted and cultured in screw-capped conical flasks at 30°C without additional oxygen. The hypoxic culture conditions were described previously by Wayne and Hayes ([136]Wayne and Hayes, 1996). After 7 days in hypoxic conditions, the screw cap was replaced with a permeable membrane, and the rest of the conditions were unchanged. After aeration, samples were taken at 0 h, 0.5 h, 4 h, 12 h, 24 h, and 48 h, and an aliquot was used to measure the OD value ([137] Supplementary Figure 1 ). The remaining samples were collected and snap frozen in liquid nitrogen for RNA sequencing. RNA Extraction, Illumina Sequencing, and RT-qPCR M. marinum cultures were centrifuged at 4,500 × g for 5 min at room temperature and frozen on dry ice. The frozen cell pellets were resuspended in 1 mL TRIzol reagent (CW Bio). RNA extraction and illumina sequencing were performed as previously described ([138]Wu et al., 2017). Raw data of RNA sequencing have been uploaded to the GEO database (BioProject ID : PRJNA588556). For RT-qPCR validation of RNA-seq data, 1 µg RNA was reversed-transcribed to cDNA, which was then used as the template for RT-PCR analysis. The primers for analyzing the selected genes were listed in [139]Supplementary Table 3 . Transcriptome and Bioinformatics Analysis The RNA-seq analysis and identification of differentially expressed genes (DEGs) were performed as previously described ([140]Lee et al., 2019). GO and KEGG Analysis We used the Kyoto Encyclopedia of Genes and Genomes (KEGG) database ([141]Ogata et al., 2000) ([142]www.genome.jp/kegg/) and the Gene Ontology (GO) database ([143]Ashburner et al., 2000) ([144]www.geneontology.org/) for our data analysis. Construction of the Gene Co-Expression Network We calculated the correlation coefficient between identified transcription factors and other DEGs in each time period. The correlation coefficient value ranges from -1 to 1, representing a negative and positive correlation, respectively. We considered the expression of two genes as correlated if the absolute value of the correlation coefficient was larger than 0.8. The results were imported into Cytoscape 3.0 to generate the network map ([145]Kohl et al., 2011). Weighted Gene Co-Expression Network Analysis (WGCNA) We used the RNA-seq data from multiple time points (three biological independent samples at each time point) for WGCNA analysis. We used the WGCNA package to cluster gene modules as follows. 1. Define gene co-expression similarity: calculate the similarity between any two genes using Pearson's correlation coefficient (Sij  =  |cor(i,j)|, the correlation coefficient of gene i and gene j), which then forms the correlation matrix (S  =  [S[ij]]). 2. Define the exponential weighted value β: for any gene pair (i and j), apply the exponential adjacency function in the WGCNA algorithm to measure their relation index, namely, the exponential weighted β square of the correlation coefficient (a[ij]  =  power(S[ij], β)  =  |S[ij]|^β). Exponential weighted β is the power of the correlation coefficient. We selected β  =  5 after the analysis (fit value R^2 to approximately 0.9). 3. Define a measure of node dissimilarity: after determining the adjacency function parameter β, the correlation matrix S  =  [S[ij]] is switched into the adjacency matrix A  =  [a[ij]] and converted into the topological overlap matrix Ω  =  [ω[ij]]. k[i] or k[j] indicate the sum of one node's adjacency coefficients. The node is a gene (i or j). [MATH: ωij=lij< mo>+aijmin{ki,kj}< mo>+1 ai j :MATH] [MATH: lij=μaiμaμ< /mi>jki=μaiμki=μ aiμ :MATH] 4. Build hierarchical clustering tree to identify gene modules: the hierarchical clustering tree built using the dissimilarity coefficient [MATH: dijω :MATH] [MATH: (dijω=1ωij :MATH] ), and the different branches represent the gene modules. Enrichment Analysis and PCA Analysis To determine whether one set of genes were more enriched in another set of genes, we used the Chi-square test or Fisher's exact test ([146]Upton, 1992). First, the two sets of genes were taken and used to form a 2*2 contingency table. If there was a value less than or equal to five in the table, the Fisher's exact test was applied; otherwise, the Chi-square test was applied. When the p value was less than 0.05, the two sets were considered significantly enriched to each other. PCA analyses were performed by the princomp function in R software (version 3.5.1) Data Availability Statement Raw data of RNA sequencing have been uploaded to the GEO database (BioProject ID:PRJNA588556). Author Contributions JJ and CL performed the experiment. JJ, CL, and JL wrote the manuscript. YW and JZ prepared [147]Figures 1 –[148] 9 . LS, KY, and WX prepared [149]Tables 1 and [150]2 . LZ, YL, and JL provided guidance for experiments. All authors reviewed the manuscript. Conflict of Interest The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. Acknowledgments