Abstract The establishment of new therapeutic strategies for metabolic syndrome is urgently needed because metabolic syndrome, which is characterized by several disorders, such as hypertension, increases the risk of lifestyle-related diseases. One approach is to focus on the pre-disease state, a state with high susceptibility before the disease onset, which is considered as the best period for preventive treatment. In order to detect the pre-disease state, we recently proposed mathematical theory called the dynamical network biomarker (DNB) theory based on the critical transition paradigm. Here, we investigated time-course gene expression profiles of a mouse model of metabolic syndrome using 64 whole-genome microarrays based on the DNB theory, and showed the detection of a pre-disease state before metabolic syndrome defined by characteristic behavior of 147 DNB genes. The results of our study demonstrating the existence of a notable pre-disease state before metabolic syndrome may help to design novel and effective therapeutic strategies for preventing metabolic syndrome, enabling just-in-time preemptive interventions. Subject terms: Predictive markers, Metabolic syndrome Introduction Metabolic syndrome is a state wherein a blood glucose level, blood pressure, and/or a triglyceride level are elevated higher than those in normal ranges, mainly due to abdominal obesity^[44]1. A reduced level of high-density lipoprotein (HDL) cholesterol is also observed^[45]1. Metabolic syndrome increases the risk of lifestyle-related diseases, including type 2 diabetes, cardiovascular disease (CVD), and nonalcoholic steatohepatitis (NASH)^[46]1–[47]3. Diabetic retinopathy is a major cause of blindness in adults. Moreover, lethal heart attacks and strokes are ultimately induced from arteriosclerosis originating from metabolic syndrome^[48]3. NASH can progress to hepatocellular carcinoma (HCC)^[49]2. The incidence of metabolic syndrome is increasing because of the lack of effective treatment methods. The incidence of obesity, a major cause of metabolic syndrome, also continues to increase worldwide. According to the World Health Organization (WHO), in 2016, more than 650 million (13%) adults were obese, with body mass index (BMI) ≥30 kg/m^2^[50]4. The number of individuals with obesity has increased nearly three times since 1975^[51]4. Therefore, there is a need for the development of new therapeutic strategies for metabolic syndrome, particularly in the early disease stage, which is considered as the best period for effective treatment. One approach is to focus on the pre-disease state^[52]5 (that is, the most susceptible state to disease progression) before metabolic syndrome and elucidate the complex mechanisms underlying the transition from a relatively healthy condition or mild obesity to metabolic syndrome. It is important to note that the pre-disease state does not mean the absence of any noticeable changes at any part of the body. Instead, it is the state that is (1) not diagnosed as the disease state based on the existing criteria but (2) supposed to have high susceptibility or vulnerability to diseases. Recently, we proposed mathematical theory for the detection of the pre-disease state before complex/multifactorial diseases, called the dynamical network biomarker (DNB) theory^[53]5. The main purpose of this theory is to detect early warning signals of critical transitions^[54]6 in biological systems. Critical transitions are sudden and large-scale state transitions that occur in many complex systems, such as ecological systems^[55]7, climate systems^[56]8, financial markets^[57]9, microorganism populations^[58]10, and the human body^[59]5. It is important to note that critical transitions are different from critical phenomena and phase transitions in physics. Recent studies revealed that the early warning signals of critical transitions in various systems share some common features. For example, increases in variance and auto-correlation, and decreases in recovery rates are observed frequently^[60]6. Furthermore, increases in the strength of intervariable correlations^[61]11 have been reported. This is not necessarily because the coupling between variables becomes stronger by itself near a critical transition. A possible mechanism is that when variances of variables increase, their inherent interactions may become much apparent, resulting in stronger correlations. In the context of the DNB theory, a state transition from a healthy state to a disease state is regarded as a critical transition if (1) quick recovery to the healthy state becomes difficult or even impossible once the transition occurs, and (2) intervention and prevention as treatment before the transition, that is, at the pre-disease state or earlier, are much easier than those after the transition. The idea regarding the development of a disease as a critical transition has been widely accepted in the literature^[62]5,[63]12–[64]17. The DNB theory provides statistical methods to select relevant variables for detection of the pre-disease state. The basic assumption is that a small number of closely related variables, called DNBs, convey early warning signals for the impending critical transition. The DNB theory and its extensions have also proposed several measures for the detection of the pre-disease state^[65]5,[66]14,[67]16. For example, the average standard deviation of DNB variables and the average absolute value of correlation coefficients between DNB variables are easy to calculate and are widely applicable measures. Simultaneous changes in these measures are regarded as a reliable early warning signal, suggesting that a critical transition to a disease state will occur. On the other hand, measures based on auto-correlation and recovery rates with respect to DNB variables were not considered in previous studies mainly because these statistics require unrealizable repetitive measurements at short intervals from the same individual. The DNB theory has been applied to real data of many diseases, such as acute respiratory distress syndrome^[68]5, diabetes mellitus^[69]12, influenza^[70]14, cancer^[71]13–[72]15,[73]17, and Alzheimer’s disease^[74]16, as well as experimental data in cell biology, such as that of cell differentiation^[75]18. Moreover, researches on the improvement of the statistical methods^[76]14,[77]16 and refinement of the theory^[78]19 are in progress. Metabolic syndrome is characterized by several disorders, such as hyperglycemia, hypertension, and dyslipidemia, all of which are caused by the accumulation of visceral fat due to overeating and lack of exercise^[79]1,[80]3. Therefore, many animal models, such as ob/ob and db/db mice, have been established^[81]20 in order to elucidate the mechanisms underlying metabolic syndrome and construct new therapeutic strategies. Tsumura, Suzuki, Obese, Diabetes (TSOD) mice^[82]21,[83]22 are known to develop a wide range of disorders similar to human metabolic syndrome, including hyperglycemia^[84]22, hypertension^[85]23, dyslipidemia^[86]22, glucose intolerance^[87]21, insulin resistance^[88]24, peripheral neuropathy^[89]25, intestinal dysbiosis^[90]26, and histopathological characteristics in the liver similar to those of human nonalcoholic fatty liver disease (NAFLD)^[91]27. In the present study, we used the DNB theory to detect the pre-disease state, which is known as Mibyou^[92]28 in traditional Japanese medicine and Wei Bing^[93]29 in traditional Chinese medicine, before metabolic syndrome in TSOD mice. Results Figure [94]1A shows the body weights and blood glucose levels of TSOD mice. The numbers of analyzed samples are shown in Supplementary Table [95]S1. The average body weight and average blood glucose level did not exceed the suggested thresholds for defining obesity in TSOD mice^[96]30 (body weight ≥40 g) and prediabetes in rodents^[97]31 (blood glucose level ≥200 mg/dL) until 7 weeks of age. Our results were consistent with those of previous studies reporting that the metabolic syndrome onset in TSOD mice was seen at approximately 8–12 weeks of age^[98]20,[99]24,[100]27. Figure 1. [101]Figure 1 [102]Open in a new tab Body weights, blood sugar concentrations, and PCA of the transcriptome data. (A) Body weights and blood sugar concentrations measured from TSOD mice. Error bars show 95% confidence intervals. Red dashed lines show the suggested thresholds for defining obesity in TSOD mice (body weight ≥40 g) and prediabetes in rodents (blood glucose level ≥200 mg/dL). (B) Scree plot of the largest 15 components. The number of meaningful PCs is estimated to be the number of eigenvalues calculated from the original transcriptome data larger than the 95 percentiles of eigenvalues calculated from shuffled data. For more details, see Dimensionality reduction in Materials and Methods. (C) PCA plot of the transcriptome of TSOD mice (circles) and TSNO mice (diamonds). The numbers in parentheses denote explained variance ratios corresponding to each PC. Regarding results of other meaningful components, see Supplementary Fig. [103]S1. PC: principal component. In order to identify the pre-disease state before metabolic syndrome, we comprehensively assessed gene expression profiles in the adipose tissues of TSOD and control Tsumura, Suzuki, Non-Obesity (TSNO) mice^[104]22 at each age using DNA microarrays. A principal component analysis (PCA) was initially performed in order to reveal overall trends in transcriptome data (Fig. [105]1B,C). Eight meaningful principal components (PCs) were estimated (Fig. [106]1B). The two strains of TSOD and TSNO mice were almost completely separated in the PCA plot using PC1 and PC2 (Fig. [107]1C). The data points of individual mice were roughly aligned along age, except for the last 2 weeks in each group. The overall shifts to the right and down may reflect common developmental or age-related changes. No outlier was found, and thus, the quality and reliability of all gene expression data were satisfactory. In addition, no apparent abrupt change in TSOD mice was observed at every week of age in the PCA plot, which indicated the necessity of additional elaborated statistical analyses to reveal the pre-disease state before metabolic syndrome. The results of other meaningful PCs are included in Supplementary Fig. [108]S1. Furthermore, differentially expressed genes (DEGs) between TSOD and TSNO mice were selected. We compared each age of the two mice groups (for example, TSOD mice (n = 5) versus TSNO mice (n = 5) at 3 weeks of age). The numbers of DEGs were 889, 852, 1321, 664, and 1296 for 3, 4, 5, 6, and 7 weeks of age, respectively. The total number of genes that were expressed differentially in at least one time period was 2665. Approximately 50% genes were included in only one DEG set, and the others were shared by two or more DEG sets (Fig. [109]2A). Although Venn diagrams are commonly used to visualize the overlap pattern of a few gene sets, we selected a different method because there were five DEG sets. Figure 2. [110]Figure 2 [111]Open in a new tab DEGs between TSOD and TSNO mice. (A) Overlap pattern of the five DEG sets. Each row corresponds to a gene that was included in at least one DEG set, and black regions indicate the time periods at which the gene was differentially expressed. (B) Heatmap of the union set of the DEGs. The color scale shows the z-score (per row) of the log expression. The left color boxes next to the dendrogram denote gene clusters obtained from hierarchical clustering. For more details, see Clustering analysis in Materials and Methods. (C) The time series of the average z-score of the top five largest clusters. (D) GO terms enriched in the second cluster. Each bar shows the number of genes with the corresponding term in the analyzed gene set. (E) KEGG pathways enriched in the second cluster. The meaning of each bar is similar to that in D. Regarding other cluster results of the enrichment analyses, see Supplementary Figs [112]S2–[113]S5. The union set of the DEGs was separated into clusters (Fig. [114]1B), and the top five largest clusters were extracted (Fig. [115]2C). The sizes of the five clusters were 611, 579, 530, 457, and 179. The properties of the five clusters were analyzed using the Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. For example, the GO (Fig. [116]2D) and KEGG pathway (Fig. [117]2E) enrichment analyses revealed that the second cluster contained many genes associated with inflammation and immune responses. Other cluster results from the enrichment analyses are included in Supplementary Figs [118]S2–[119]S5. We then searched DNB genes using a simplified version of the original method^[120]5 (see DNB selection in Materials and Methods) and found a group of 147 genes (Table [121]1) that showed a marked peak at 5 weeks of age in TSOD mice in the average standard deviation I[s] (Fig. [122]3A) and average correlation strength I[r] (Fig. [123]3B). The simultaneous increases in these two statistics indicated that the DNB genes temporally showed unusually large fluctuations as well as strong correlations at the time period. The emergence of such a gene cluster has been suggested as a possible early warning signal of any abrupt change in health conditions^[124]5. Therefore, our results suggested that TSOD mice at 5 weeks of age, which were several weeks earlier than the onset of metabolic syndrome^[125]20,[126]24,[127]27, were in the pre-disease state preceding any essential change in the progression from a healthy state to metabolic syndrome. Table 1. List of 147 DNB genes obtained from the 5-week-old TSOD mouse group. The gene symbols are sorted alphabetically. List of 147 DNB genes Actl10 Adam20 Adam4 Adam5 Adam6a Als2cr11 Ankrd22 Btbd35f13 Capza3 Ccdc155 Ccdc54 Ccdc89 Cd55b Cmtm2a Cmtm2b Cox7b2 Cox8c Cst9 Dazl Ddx4 Defa29 Dmrtb1 Dnajc5b Eqtn Fam178b Fam71f2 Fbxw10 Fcrla Gata3 Gk2 Gm14725 Gm20877 Gm20878 Gm21637 Gm6309 Gm6370 Gm6583 Gm6588 Gm6890 Gm8702 Got1l1 Grxcr2 Gtsf1l Il31 Iqcf6 Kcnj3 Kcnmb4 Klk1b24 Lyzl6 March10 Mbd3l1 Morn2 Mroh8 Myc Odf3 Olfr165 Oxct2a Pcsk6 Pdha2 Pebp4 Pgc Piwil1 Plcz1 Ppp1r2-ps7 Prok2 Prps1l1 Prr22 Prr27 Prss35 Prss37 Prss38 Prss52 Pth2 Rbakdn Rbm44 Rnf17 Sf3b3 Sgms2 Sh3d21 Sh3gl3 Slc22a23 Slc2a3 Slc47a2 Slco6b1 Slx Sox3 Sox5os3 Spanxn4 Spata3 Spata31d1c Spata45 Spin2d Sstr2 Ssty1 Stpg4 Syngr4 Tbc1d21 Tex101 Tex36 Tex48 Thegl Tmem217 Tmem30c Tmem35a Trim69 Tssk2 Tssk6 Tuba3b Tubd1 Ubqln3 Vmac Vsig1 Vwa5b1 Wdcp Zfp541 Zik1 Znrf4 1700006A11Rik 1700009C05Rik 1700011M02Rik F17Rik 1700025D23Rik 1700026J12Rik 1700030L20Rik 1700031F10Rik 1700034J05Rik L16Rik 1700061N14Rik 1700092M07Rik 1700099I09Rik 1700113H08Rik 1700126A01Rik K09Rik 4732460I02Rik 4921524L21Rik 4930415O20Rik 4930449C09Rik 4930449I24Rik I11Rik 4930522O17Rik 4930529F21Rik 4930544D05Rik 4930571K23Rik 4930579F01Rik E13Rik 4933402N22Rik 4933406F09Rik [128]Open in a new tab Figure 3. [129]Figure 3 [130]Open in a new tab Characteristics of 147 DNB genes. (A) The average standard deviation I[s]. (B) The average correlation strength I[r]. (C) Enriched GO annotations. No KEGG pathway was enriched in the DNB genes. (D) A t-SNE plot of the union set of the DNB genes and 2665 DEGs. The overlapping genes were colored as DNB genes. For more details, see Dimensionality reduction in Materials and Methods. (E) Spatio-temporal fluctuation patterns of the DNB genes and DEGs. The color scale shows the standard deviation. We assessed the reproducibility of our DNB selection procedure using a leave-one-out approach. One sample was removed temporally from either 5-week-old TSOD mice data or 5-week-old TSNO mice data. DNB genes were then selected in the same manner, but using only the remaining samples. The genes obtained were compared with the original 147 genes. The overlap size was 74.6±12.0 (mean ± standard error of mean), suggesting an intermediate level of reproducibility of the DNB selection procedure. Most GO annotations enriched in the DNB genes were associated with reproduction, such as spermatogenesis and spermatid development (Fig. [131]3C), which would not be directly related to the pathogenesis of metabolic syndrome. The KEGG pathway enrichment analysis gave no significant result. A possible reason for the failure of the GO and KEGG enrichment analyses is that these analyses are based on existing knowledge, and thus they could have difficulty for characterizing the DNB genes obtained from a novel aspect related to the pre-disease state. In order to visualize the spatio-temporal fluctuation patterns of the DNB genes and other genes, we located the union set of the 147 DNB genes and 2665 DEGs on a two-dimensional plane using the t-distributed stochastic neighbor embedding (t-SNE)^[132]32 method (Fig. [133]3D). We selected this dimensionality reduction method because the DNB genes were mostly concentrated in one region. Figure [134]3E shows the spatio-temporal fluctuation patterns using the two-dimensional coordinates. Most of the DNB genes exhibited large fluctuations at 5 weeks of age in TSOD mice, whereas the majority of the other genes did not. Although a few non-DNB genes also showed large fluctuations at 5 weeks of age, they were excluded because of the lack of strong correlations with the DNB genes. Discussion The present study used the DNB theory to detect the pre-disease state before metabolic syndrome (Fig. [135]3A,B). This extends the scope of the DNB theory in two directions. The first direction is about the variety of disorders. We previously reported that the DNB theory was able to detect the pre-disease state solely before diabetes^[136]12. On the other hand, metabolic syndrome is characterized by several disorders, such as hyperglycemia, hypertension, and dyslipidemia. Therefore, the present study is important in that the applicability of the DNB theory to such complex diseases has been shown. Second, the progression of metabolic syndrome is relatively slow, and thus our results support the idea that the DNB theory is applicable to both acute and chronic diseases. The DNB theory is based on critical transitions^[137]6, which are abrupt changes between distinct states usually accompanied by hysteresis, namely healthy and disease states. Therefore, the DNB theory was mainly applied to data on acute diseases (for example, acute respiratory distress syndrome and influenza) to detect their pre-disease states^[138]5,[139]14. On the other hand, many chronic diseases, such as metabolic syndrome, do not necessarily show apparently abrupt or discontinuous changes at the disease onset, and the health condition seems to be getting worse continuously. However, even in these cases, abrupt changes may occur in the gene expression profiles. In addition, the existence of a positive feedback loop in obesity has been suggested^[140]33, indicating that moving from a disease state back to a healthy state is more difficult than vice versa. This suggests a one-directional critical transition from a healthy state to a disease state, although generally the opposite-directional recovery process can be another critical transition^[141]34. Therefore, the DNB theory has been expected to be applicable to some chronic diseases as well as acute diseases, which has been confirmed in the present study. We here explain in detail that TSOD mice at 5 weeks of age, at which we identified 147 DNB genes, did not develop metabolic syndrome. The onset of metabolic syndrome in TSOD mice is generally considered to be approximately 8–12 weeks^[142]20,[143]24,[144]27, at which many features common to human metabolic syndrome become apparent. For example, urinary glucose became detectable after 8 weeks of age^[145]22, and the enlargement of adipocytes and formation of crown-like structures with macrophage aggregation became observable in the visceral fat at approximately 12 weeks of age^[146]20. Diagnostic criteria for obesity in TSOD mice and prediabetes in rodents are suggested to be body weight ≥40 g and blood glucose level ≥200 mg/dL, respectively^[147]30,[148]31. Clearly, in the present study, TSOD mice at 5 weeks of age did not meet these criteria (Fig. [149]1A). In addition, as previously reported^[150]27,[151]35, transcriptions of proinflammatory cytokines TNF and Interleukin-6 were not largely increased at approximately 5 weeks of age (Supplementary Fig. [152]S6). These genes were known to be upregulated by 11 weeks of age^[153]23,[154]35. Expression of other representative metabolic genes^[155]36 also did not largely change (Supplementary Fig. [156]S6). Taking all things into consideration, TSOD mice at 5 weeks of age were undoubtedly several weeks before the acquisition of major metabolic syndrome phenotypes. Therefore, we believe that the pre-disease state before metabolic syndrome was detected in TSOD mice at 5 weeks of age. However, it is important to note that some other noticeable changes are known to occur in TSOD mice as early as 5 weeks of age. For example, the level of hydroxyoctadecadienoic acid, a biomarker of oxidative stress, was reported to be significantly higher in TSOD mice at 5 weeks of age than in age-matched TSNO mice^[157]35. One of the possible causes of the increase in the oxidative stress level is abnormal iron metabolism, and aberrant accumulation of iron was reported in the spleen of TSOD mice at 8 weeks of age^[158]37. If measured, splenic iron deposition is possibly observable at earlier ages. In addition, the intestinal microbiome was reported to be different between TSOD and TSNO mice at 5 weeks of age^[159]38. Further investigations from various perspectives are needed for better understanding of TSOD mice at 5 weeks of age. For example, pathological examinations^[160]37, which are equipped with the minute power of observations, as well as metagenome^[161]38 and proteome analyses will be useful in future studies. We discuss the timing of the critical transition. We think that a critical transition occurred between 5 and 6 weeks of age, and thus I[s]and I[r] dropped rather than continued to rise at 6 and 7 weeks of age. This interpretation does not necessarily contradict the general notion that the onset of metabolic syndrome in TSOD mice is approximately 8–12 weeks. If a critical transition occurs mainly in adipose tissues between 5 and 6 weeks of age, it may trigger further changes in other parts of the body of mice during 6–7 weeks of age or later, and eventually the major characteristics of metabolic syndrome are acquired at approximately 8–12 weeks of age. This putative scenario is also consistent with the widely accepted view that inflammation in adipose tissues causes secretion of various chemical signals called adipokines, which affect many organs, and then metabolic syndrome is induced^[162]1–[163]3. We also think that multiple important changes may occur in different parts of the body and at different weeks of age during the entire course of the progression from a healthy state to metabolic syndrome in TSOD mice. We consider whether the usage of TSNO mice as the control group was adequate for the present study. TSNO mice has been established as the control group against TSOD mice^[164]21,[165]22, and many studies on TSOD mice used TSNO mice as the control group^[166]23–[167]27,[168]35,[169]37,[170]38. Therefore, we compared TSOD and TSNO mice. On the other hand, the two strains were known to exhibit some differences at early ages as discussed already, and it was unknown whether TSNO mice were suitable as the control group for the study of the pre-disease state. In order to resolve this issue, we compared TSOD mice at 5 weeks of age and TSOD mice at 3 weeks of age, and confirmed that similar DNB genes were selected. The number of genes was 209, and 30 of them were shared with the original 147 DNB genes. The overlap size was significantly large (p = 9.7E-33, Fisher’s exact test). The 209 genes showed sharp peaks of I[s] and I[r] at 5 weeks of age (Supplementary Fig. [171]S7A,B), which are similar to the original results (Fig. [172]3A,B). In addition, enriched GO annotations in the 209 genes (Supplementary Fig. [173]S7C) were also similar to the original results (Fig. [174]3C). Therefore, our results of the DNB selection were not largely dependent on the choice of the control group, which justifies the validity of our main results based on the comparisons between TSOD and TSNO mice. We also discuss possible relations between the pre-disease state before metabolic syndrome and the intestinal microbiome. Intestinal dysbiosis is generally thought to be associated with metabolic syndrome. Previous studies reported that the intestinal microbiome of TSOD mice was similar to but different from that of TSNO mice^[175]26,[176]38. Our experiments also confirmed differences in intestinal microbiomes of TSOD and TSNO mice at 5 weeks of age (Intestinal microbiome analysis in Supplementary Information and Supplementary Fig. [177]S8). Especially, the abundance of Bacteroides, which is a major genus in the phylum Bacteroidetes, decreased largely in TSOD mice at 5 weeks of age. This may be associated with the imbalance between Bacteroidetes and Firmicutes, which was reported to occur in TSOD mice at 12 weeks of age^[178]38 and 24 weeks of age^[179]26. However, it is concluded that observations of the intestinal microbiome and diabetic state were independent and that their interaction and causal relationship are still unclear^[180]38. The altered intestinal microbiome of TSOD mice might have influenced the gene expression patterns in adipose tissues at or before the pre-disease state. It is our future work to clarify the complex etiology and pathology of metabolic syndrome with respect to the host-guest interactions using the DNB theory. Elevated fluctuations of the DNB genes (Fig. [181]3) should be interpreted carefully. According to previous studies^[182]5,[183]12, we calculated the DNB scores using measurements from different individuals and did not use repeated measurements from the same individual. This is because, in order to investigate the gene expression profiles in adipose tissues, a mouse was dissected for each measurement. Therefore, what we investigated were population fluctuations, a part of which would be explained by random effects (variability caused by individual-specific mean values). In order to overcome this limitation, we plan to develop a new statistical method to apply the DNB theory to peripheral blood data in clinical study. We here discuss the time interval of the I[s] and I[r] signals shown in Fig. [184]3A,B. These statistics are expected to increase as the system’s stability decreases^[185]5. This means that if the system’s stability gradually decreases for a certain time interval before a critical transition, these statistics may show an increasing trend during that time interval. It is important to note that actual values fluctuate to some extent due to noise^[186]6,[187]8. Although the time resolution was not very high, the two statistics appeared to begin to increase at 4 weeks of age, which is consistent with expectations. Additional measurements between 4 and 5 weeks of age are needed as further evidence. If the increasing trend is confirmed, we will be more convinced that the observed sharp peak at 5 weeks of age was caused by a decrease in the system’s stability rather than other reasons, such as temporal changes in noise intensity. Another viewpoint is the usefulness of the I[s] and I[r] signals as early warning signals. If a certain signal is detectable in a short time interval only, frequent measurements are generally required to avoid missing it. Our results suggest that the expressions of the DNB genes in TSOD mice need to be assessed at least once every week to avoid missing sharp increases in I[s] and I[r]. Therefore, even if similar early warning signals exist for humans, weekly measurements of gene expressions in individuals before metabolic syndrome are needed, which may be impractical due to the high associated costs. In order to resolve this issue, the development of cost-effective biomarkers of the pre-disease state, which are easy to measure and show signals for a longer time interval, will be needed. As discussed already, we plan to develop such new biomarkers in future research. We here discuss the relationship between our results and critical slowing down. Critical slowing down before a critical transition is a phenomenon that recovery from perturbation becomes increasingly slow as the system’s stability decreases. We could not measure the recovery rate directly in this study because unrealizable repetitive measurements at short intervals from the same individual is generally required. On the other hand, the relative recovery rate can be estimated because the recovery rate is known to be approximately and inversely proportional to the largest eigenvalue of the sample covariance matrix of the state variables^[188]19. By using the 147 DNB genes, we estimated the relative recovery rate for each week, and observed that the estimated relative recovery rate was smallest at 5 weeks of age (Supplementary Fig. [189]S9). This suggests critical slowing down at 5 weeks of age in TSOD mice. Finally, we discuss how our results will contribute to mitigate the prevalence of metabolic syndrome. Diet and exercise therapies are generally the first choice for prevention and treatment of metabolic syndrome, but their adherence is often poor among individuals with obesity. Pharmacotherapy is used when diet and exercise therapies alone are not sufficient, but most medicines are targeting each symptom in metabolic syndrome, such as hyperglycemia, which can lead to polypharmacy. On the other hand, we demonstrated the existence of a notable pre-disease state before metabolic syndrome defined by characteristic behavior of the DNB genes (Fig. [190]3). This suggests the possibility to design novel and effective therapeutic strategies for preventing metabolic syndrome, enabling just-in-time preemptive interventions. Materials and Methods Spontaneous mouse model of metabolic syndrome Three-week-old male TSOD (Tsumura, Suzuki, Obese, Diabetes) mice and TSNO (Tsumura, Suzuki, Non-Obesity) mice were purchased from the Institute for Animal Reproduction (Ibaraki, Japan). Mice were housed in groups of two or three per cage, maintained at 24±2 °C on a 12-hour light and 12-hour dark cycle, and given normal chow diet (MF; Oriental Yeast Co., Ltd., Tokyo, Japan) and water ad libitum. Nonfasting blood glucose concentrations in tail vein blood and body weights were measured in 3-, 4-, 5-, 6-, and 7-week-old TSOD and TSNO mice. After measurements, the mice in each group were dissected to collect epididymal white adipose tissue under anesthesia to minimize suffering. Regarding each adipose tissue sample of individual mice, the total RNA was extracted using the RNeasy Total RNA Extraction kit (Qiagen, Valencia, CA, USA). The numbers of samples taken for subsequent analyses are shown in Supplementary Table [191]S1. This animal study was performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of University of Toyama. The protocol was approved by the Committee on the Ethics of Animal Experiments of the University of Toyama. Microarray assay In order to investigate the gene expression profiles of each mouse, the Agilent SurePrint G3 Mouse Gene Expression 8 × 60 K Microarray Kit (Agilent Technologies, Santa Clara, CA, USA) was used. Cyanine-3 (Cy3)-labeled cRNA was prepared from 0.1 μg total RNA using the Low Input Quick Amp Labeling Kit (Agilent) according to the manufacturer’s instructions, followed by RNeasy column purification (QIAGEN, Valencia, CA). Dye incorporation and cRNA yield were checked with the NanoDrop ND-2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and Agilent 2100 Bioanalyzer (Agilent). A total of 0.6 μg of Cy3-labeled cRNA was fragmented at 60 °C for 30 minutes in a reaction volume of 25 μl containing 1 × Agilent fragmentation buffer and 2 × Agilent blocking agent following the manufacturer’s instructions. On completion of the fragmentation reaction, 25 μl of 2 × Agilent hybridization buffer was added to the fragmentation mixture and hybridized to SurePrint G3 Mouse Gene Expression 8 × 60 K Microarray (Agilent) at 65 °C for 17 hours in a rotating Agilent hybridization oven. After hybridization, microarrays were washed at room temperature for 1 minute with GE Wash Buffer 1 (Agilent) and at 37 °C for 1 minute with GE Wash buffer 2 (Agilent). Slides were scanned immediately after washing on the Agilent G2505C Microarray Scanner System (Agilent) using the one color scan setting for 8 × 60 k array slides (scan area 61 mm × 21.6 mm, scan resolution 3 μm, the dye channel was set to green, and photomultiplier tube (PMT) gain was set to 100 percent). Scanned images were analyzed with Feature Extraction Software 12.0.3.1 (Agilent) using default parameters to obtain background-subtracted and spatially detrended Processed Signal intensities. Microarray data preprocessing The probe names in the raw dataset were converted to gene symbols according to the latest annotation table available at [192]https://earray.chem.agilent.com/earray/catalogGeneLists.do?action= displaylist. Probe names without any gene symbol annotation were removed. When multiple probe names were assigned to a single gene symbol, the mean value was taken. Gene expression values were then divided by the 2% trimmed mean (the mean value calculated by discarding the lowest 2% and highest 2% values) in each sample in order to normalize the dataset. Normalized values were base-2 log-transformed. Dimensionality reduction Two dimensionality reduction methods were used in the present study: the principal component analysis (PCA) for the all gene expression data and t-distributed stochastic neighbor embedding (t-SNE)^[193]32 for visualizing results of the DNB analysis. PCA is based on an eigendecomposition of a sample covariance matrix, or equivalently, a singular value decomposition of a mean-centered data matrix^[194]19. The number of meaningful principal components (PCs) were determined by a variant of parallel analysis. 100 matrices with the same size as the original data were generated by shuffling the original data. PCA was then performed for each randomized matrix to calculate 95 percentiles of the eigenvalues. The number of the original eigenvalues above them was taken as the meaningful PCs. We also used t-SNE in order for nonlinear transformation of high-dimensional data points to a two-dimensional plane. It is important to note that both axes are equally relevant, and the measure for the mapping quality of t-SNE, which is of the form of Kullback-Leibler divergence, is invariant under rotations or reflections of the plane. We used a t-SNE implementation in the scikit-learn package of python, and set the perplexity to 100 and multiplier for early exaggeration to 10. Default values were used for the other parameters. The dissimilarity between genes was calculated as 1 − |r[ij]| + |r′[ij]|, where r[ij] and r′[ij] are the correlation coefficients between the ith and jth genes of 5-week-old TSOD and TSNO mice, respectively. Extraction of differentially expressed genes Differentially expressed genes (DEGs) are genes with expression values that markedly change between different conditions or groups. In the present study, DEGs were extracted based on fold-changes and hypothesis testing. In fold-change filtering, the arithmetic mean of the log-transformed values (or equivalently, the logarithm of the geometric mean in the original scale) was calculated for each gene in each group. Genes that exhibited more than one inter-group difference, which corresponded to more than a two-fold change in the original scale, were taken as the first group of DEG candidates. On the other hand, two-tailed Welch’s t-tests were performed for each gene using log-transformed values. In order to alleviate the large risk of Type-I errors in multiple hypothesis testing, we adjusted the significance level by the Benjamini-Hochberg (BH) procedure, which controls the supremum of the expected value of the false discovery rate (FDR). The genes for which the null hypothesis was rejected based on the adjusted significance level (E(FDR) ≤ 0.05) were taken as the second group of DEG candidates. We extracted the intersection of the two candidate groups as DEGs. Clustering analysis We used a hierarchical clustering method to find gene clusters that showed similar time evolutions. Since the dynamic range of the gene expression profiles was large, we initially performed z-score normalization for each gene. Dissimilarity between genes was evaluated based on 1 − r[ij], where r[ij] is the correlation coefficient between the ith and jth genes. The average linkage method was then used to calculate the dendrogram. Genes were separated into clusters with a cutoff value of 0.5 for inter-cluster dissimilarity. Enrichment analysis Enrichment analyses of GO annotations and the KEGG pathways were performed using the web tool of the DAVID (Database for Annotation, Visualization and Integrated Discovery) database^[195]39 ([196]https://david.ncifcrf.gov). The purpose of the enrichment analysis is to characterize a set of genes, such as DEGs. The target gene set is compared with many other well-characterized gene sets (for example, genes sharing the same GO annotation or those involved in the same KEGG pathway), and overlaps are calculated. Based on this information, we may reveal what types of genes are largely included in the target gene set. In order to establish whether an overlap between two gene sets is significantly large, the one-tailed Fisher’s exact test based on a hypergeometric distribution is generally used. Its p-value is defined as follows: [MATH: p=k=xmin(n1,n2)(Nn1n2k )(n1k)(N n2)=1k=0x1(Nn1n2k )(n1k)(N n2), :MATH] 1 where N is the total number of genes of the target organism, n[1] is the size of the gene set under analysis, n[2] is the size of another gene set whose characteristics are known, and x is the overlap between the two gene sets. It is important to note that the p-values provided by the DAVID’s web tool are based on a modified version of Fisher’s exact test, in which x and n[1] are substituted by x − 1 and n[1] − 1, respectively. These modifications make the test more conservative and suppress false positives. Although multiple hypothesis testing was involved, we initially picked up all gene sets with p ≤ 0.05 and x ≥ 2 for exploratory purposes. Top 10 gene sets were then selected based on the size of overlaps. DNB selection DNB genes are genes with expression values that exhibit unusually large fluctuations in a collective manner at the pre-disease state. Since no standardized method for DNB selection has been established currently, various approaches have been investigated^[197]5,[198]14,[199]16. In the present study, we used a simplified version of the original method^[200]5. We here introduce some notations. Let X = {x[ik]}(i = 1, …, N, k = 1, …, K) denote a gene expression dataset restricted to a certain condition (for example, 5-week-old TSOD mice), where N is the total number of genes and K is the number of samples in the subdata. The gene-wise mean m[i](X), standard deviation s[i](X), and correlation coefficient r[ij](X) are defined as follows: [MATH: mi(X)= k=1KxikK,(i=1,,N), :MATH] 2 [MATH: si(X)=k=1K< mspace width=".25em">(xikmi (X))2K< mo>−1,(i=1,,N), :MATH] 3 [MATH: rij(X)= k=1K(xikmi (X))(xjkmj (X))(K1)si(X)sj(X),(i,j=1,,N). :MATH] 4 It is important to note that only some of the correlation coefficients were actually calculated in the following procedure. Let Y = {y[ik]}(i = 1, …, N, k = 1, …, L) denote the data of the control group, where L is the number of samples in Y. The gene-wise mean m[i](Y), standard deviation s[i](Y), and correlation coefficient r[ij](Y) are similarly defined. Based on the statistics described above, we selected candidate genes of DNB in three steps (Fig. [201]4). First, genes that showed relatively large increases in fluctuations were selected based on the ratio of the standard deviations: [MATH: vi(1)=si(X)s i(Y),(i=1,,N). :MATH] 5 Figure 4. [202]Figure 4 [203]Open in a new tab Schematic illustration of DNB selection. S[0]: indices of all genes. S[1]: indices of genes with large increases in fluctuations, which are selected from S[0]. S[2]: indices of genes with large increases in correlation strength within S[1], which are selected from S[1]. S[3]: indices of genes with large, but non-specific increases in correlation strength against a broad range of genes, which are selected from S[1]. S^*: indices of DNB candidates with large increases in correlation strength specifically within S[1], each element of which is in S[2] but not in S[3]. The indices corresponding to the highest [MATH: θ1 :MATH] (%) values of [MATH: vi(1) :MATH] were selected and denoted as [MATH: S1 :MATH] . The selection of a parameter value for [MATH: θ1 :MATH] as well as other parameters is explained later. Second, a cluster of genes that temporally showed strong correlations was extracted based on the average change of the strength of the correlation coefficients: [MATH: vi(2)=jS1(|rij(X)||rij(Y)|),(iS1). :MATH] 6 The indices corresponding to the highest θ[2] (%) values of [MATH: vi(2) :MATH] were extracted and denoted as S[2]. We did not use here a clustering method, such as hierarchical clustering, for three reasons: (1) [MATH: vi(2) :MATH] is easy to calculate, (2) only one parameter is involved, and (3) a selection step among multiple clusters is not necessary. Third, genes that showed non-specific increases in the correlation strength against a broad range of genes were excluded based on the following statistics: [MATH: vi(3)=jS0\S1(|rij(X)||rij(Y)|),(iS1), :MATH] 7 where [MATH: S0={1,,N} :MATH] is the set of all gene indices. The indices corresponding to the highest [MATH: θ3 :MATH] (%) values of [MATH: vi(3) :MATH] were extracted and denoted as [MATH: S3 :MATH] . The genes corresponding to [MATH: S=S2\S 3 :MATH] , which showed large increases in correlation strength specifically within [MATH: S1 :MATH] , were then selected as candidate genes of DNB. We investigated various parameter values of θ[1], θ[2], and θ[3], and obtained many DNB candidate sets. In order to evaluate the performance of each candidate set as an indicator of the pre-disease state, we calculated the average standard deviation I[s] and the average correlation strength I[r] as follows: [MATH: Is=1|S|iSsi(Z), :MATH] 8 [MATH: Ir=2|S∗< /mo>|(|S∗< /mo>|1)i,jS,i<j|ri< /mi>j(Z)|c, :MATH] 9 where S^* is the indices of a DNB candidate set, |S^*| denotes the cardinality of S^*, Z = {z[ik]}(i = 1, …, N, k = 1, …, M) is a subdata of a certain condition, and c is a correction term that depends on the sample size (c = 0.50 for 4 samples and c = 0.42 for 5 samples). The correction term represents the expected correlation strength of two independent random variables both following the standard normal distribution. We introduced this term because a systematic increase in correlation strength was observed in subdata with only 4 samples. Based on the two DNB scores I[s] and I[r], we searched a gene set that showed a sharp peak both in I[s] and I[r] at a particular time period, and found a gene set consisting of 147 genes. The corresponding parameter values were θ[1] = 10%, θ[2] = 50%, and θ[3] = 80%. Although the original method^[204]5 also proposed the third statistic, the average correlation strength between the DNB variables and the others, we did not use it because the third statistic frequently does not behave as expected. Theoretically, the third statistic works only when the dominant eigenvector of the Jacobian matrix of a dynamical system contains exactly zero elements^[205]5, and even a small deviation from zero may result in wrong behavior. In fact, the third statistic for the 147 DNB genes was almost constant (Supplementary Fig. [206]S10). A possible reason is that the dominant eigenvector of the Jacobian matrix in this case did not contain exactly zero elements. This may occur if we regard the elements of the eigenvector as continuous random variables following certain continuous distributions because the conditions of real living organisms are perpetually fluctuating. In that case, the probability measure of any one of the elements being exactly zero becomes zero. Therefore, all elements take non-zero values almost surely as long as the assumptions described above hold true. Supplementary information [207]Supplementary Information^ (569.1KB, pdf) Acknowledgements