Abstract

   Quiescence, a temporary withdrawal from the cell cycle, plays a key
   role in tissue homeostasis and regeneration. Quiescence is increasingly
   viewed as a continuum between shallow and deep quiescence, reflecting
   different potentials to proliferate. The depth of quiescence is altered
   in a range of diseases and during aging. Here, we leveraged
   genome-scale metabolic modeling (GEM) to define the metabolic and
   epigenetic changes that take place with quiescence deepening. We
   discovered contrasting changes in lipid catabolism and anabolism and
   diverging trends in histone methylation and acetylation. We then built
   a multi-cell type machine learning model that accurately predicts
   quiescence depth in diverse biological contexts. Using both machine
   learning and genome-scale flux simulations, we performed
   high-throughput screening of chemical and genetic modulators of
   quiescence and identified novel small molecule and genetic modulators
   with relevance to cancer and aging.

   Keywords: quiescence deepening, genome-scale models, machine learning,
   cancer, aging
     __________________________________________________________________

Significance Statement.

   Quiescence is a sleep-like cell state important for tissue
   regeneration. Quiescence exists on a spectrum, from shallow to deep
   quiescence, based on a cell's ability to proliferate. Through
   computational modeling, we simulated metabolic changes associated with
   quiescence deepening. We also developed a machine learning model that
   accurately predicts quiescence depth in multiple cell types and
   quiescence induction methods. Our models predicted genes and small
   molecules capable of tuning the depth of quiescence. These modulating
   factors show a striking association with cancer initiation and aging.

Introduction

   Cellular quiescence is a state of reversible dormancy in which cell
   division is halted but able to resume if needed. As a common somatic
   cell state, appropriate transition into and out of quiescence is
   essential for proper tissue homeostasis ([26]1, [27]2). Quiescence is
   induced in stem cells, for instance, where it shields cells from
   environmental stress. In response to tissue injury, however, activation
   of quiescent stem cells is key to regenerating damaged tissue ([28]3,
   [29]4).

   Rather than a uniform state, quiescence is increasingly viewed as a
   continuum between shallow and deep quiescence, akin to light and deep
   sleep, requiring different potencies of growth stimulation to induce
   proliferation. Cells in shallower quiescence respond more easily to
   growth signals, whereas cells in deeper quiescence are more difficult
   to activate ([30]5). This phenomenon is observed in a wide range of
   contexts. Cells cultured under serum starvation or contact inhibition
   become progressively desensitized to growth signals ([31]6). Likewise,
   quiescent liver cells display impaired regenerative capacity with age,
   reflective of a deeper quiescence ([32]7).

   Tuning the depth of quiescence is critical in a range of physiological
   contexts. Muscle stem cells, for instance, shift to a shallower state
   of quiescence in response to tissue injury or stress ([33]8), and
   certain neural stem cells (NSCs) exit quiescence upon nutritional
   stimulus ([34]9). Additionally, quiescence depth is dysregulated in a
   range of diseases and during aging. In cancer, malignant cells can
   become quiescent, where they evade most cancer treatments ([35]10), and
   with age, quiescence deepens across various tissues and cell types,
   impairing tissue regeneration and compromising cellular resilience
   ([36]5). In light of this, better understanding the factors modulating
   quiescence would shed light on an important facet of quiescent cell
   biology and may have therapeutic relevance to the variety of disease
   states in which quiescence depth is dysregulated. These factors,
   however, remain poorly defined. In this study, we take a multifaceted
   approach to uncover modulators of quiescence depth encompassing
   genome-scale metabolic model (GEM) and machine learning predictors of
   quiescence.

   Genome-scale metabolic models are mathematical representations of
   metabolic networks encompassing an organism's entire genome ([37]11).
   By incorporating constraints, including physicochemical and
   environmental limitations, GEMs can simulate fluxes through metabolic
   reactions and accurately predict responses to genetic perturbations
   ([38]12). GEMs can also incorporate omics data to further constrain the
   set of allowable fluxes ([39]13). For instance, Frezza et al. ([40]14)
   created a tissue-specific metabolic model using transcriptomics data,
   and then accurately predicted how metabolism was rewired in the absence
   of a key citric acid cycle enzyme. GEMs have also been used to define
   the metabolic signature of quiescent cancer cells through omics data
   incorporation ([41]15).

   Here, we leveraged metabolic models to simulate the metabolic changes
   underlying quiescence deepening and perform in silico gene knockouts to
   identify genes capable of shifting quiescence depth toward a shallower
   or deeper state. In doing so, we highlighted metabolic and epigenetic
   trends associated with quiescence deepening and built a metabolic
   flux-based quiescence depth predictor that accommodates multiple types
   of input data. Metabolism and epigenetics are profoundly important to
   cellular biology, yet the metabolic and epigenetic trends associated
   with quiescence deepening remain unexplored, so this analysis provides
   a window into critical yet undefined aspects of quiescence.

   In addition, we developed a novel transcriptomics-based quiescence
   depth predictor built using multiple quiescence induction methods and
   multiple cell types and validated its accuracy in diverse contexts.
   Using this predictor, we performed high-throughput screening of
   chemical and genetic perturbations to identify interventions that
   robustly altered quiescence depth. We also explored quiescence depth in
   embryogenesis and stem cell aging at the single-cell level. We
   validated our predictions using cancer genomics data (Fig. [42]1).

Fig. 1.

   [43]Fig. 1.
   [44]Open in a new tab

   Graphical overview. In this article, we take a multifaceted approach to
   identifying modulators of quiescence depth: (i) metabolic modeling and
   flux simulations that enable in silico genetic knockouts and (ii) a
   multi-cell type transcriptomics quiescence depth predictor that
   facilitates screening chemical and genetic perturbations for their
   effect on quiescence depth. In doing so, we simulate metabolic and
   epigenetic features of quiescence deepening and explore quiescence
   depth in embryogenesis and aging. Predictions are validated using a
   cancer genomics database.

Results

Metabolic modeling simulates metabolic fluxes during quiescence deepening

   First, we simulated metabolic fluxes for progressively deeper stages of
   quiescence by incorporating transcriptomics from nine stages of
   quiescence depth into a GEM containing 2,953 genes and 8,377
   metabolites organized into 13,028 reactions ([45]16). Quiescence
   deepening was induced in rat fibroblasts through serum starvation, and
   quiescence depth was quantified as the days since serum starvation
   began, denoted as quiescence depth score (QDS) ([46]5). Each quiescence
   depth score had three replicates for a total of 27 samples. To
   incorporate transcriptomics, we used Integrative Network Inference for
   Tissues (INIT), a well-established tool for creating
   transcriptomics-constrained metabolic models ([47]17). Then we randomly
   sampled from the flux solution space (Fig. [48]2A). Similar results
   were obtained using linear Integrative Metabolic Analysis Tool (iMAT)
   ([49]18) and parsimonious flux balance analysis (see Methods).

Fig. 2.

   [50]Fig. 2.
   [51]Open in a new tab

   Metabolic pathway analysis of quiescence deepening. A) Workflow for
   simulating metabolic reaction fluxes. Transcriptomics from nine stages
   of quiescence depth were integrated into a metabolic model (Rat-GEM) to
   constrain the set of allowable fluxes (using INIT). Then, we used CHRR
   to randomly sample from the set of potential fluxes for each quiescence
   depth. B) Top subsystems associated with quiescence depth. For each
   subsystem, each reaction in the subsystem was correlated with
   quiescence depth and the average Pearson correlation was computed.
   Subsystems with the highest average Pearson correlation coefficients
   are shown here (Q-value < 1E-5, one-sample t test). A positive mean
   correlation coefficient implies a positive association with quiescence
   depth while a negative mean correlation coefficient indicates a
   negative association with quiescence depth. C) Subsystem enrichment
   analysis of reactions negatively correlating with QDS at a P-value of
   <0.15. Subsystems (pathways) ranked by fold enrichment are shown here
   along with an enrichment P-value calculated using the hypergeometric
   function and corrected with the Benjamini–Hochberg adjustment (Q-value
   < 0.01 subsystems shown here). D) Subsystem enrichment analysis of
   reactions positively correlating with QDS at a P-value of <0.15.
   Subsystems (pathways) ranked by fold enrichment are shown here along
   with an enrichment P-value calculated using the hypergeometric function
   and corrected with the Benjamini–Hochberg adjustment (Q-value < 0.01
   subsystems shown here). Top 10 subsystems meeting criteria are shown
   here, ranked by fold enrichment. E) Normalized reaction fluxes across
   quiescence depths for all reactions negatively correlating with QDS at
   a P-value of <0.15. F) Normalized reaction fluxes across quiescence
   depths for all reactions positively correlating with QDS at a P-value
   of <0.15.

   We then identified which metabolic reactions corresponded most strongly
   to quiescence deepening by correlating reaction fluxes with quiescence
   depth and highlighting the reactions with the strongest association.
   The fluxes of these reactions show a strong correspondence with
   quiescence depth (Fig. [52]2E and F). To better understand the
   metabolic pathways represented by these reactions, each reaction's
   association with a subsystem in the metabolic model was leveraged.
   Using this information, we performed an enrichment analysis to identify
   which metabolic subsystems were overrepresented in the reactions
   positively correlating to quiescence depth (increasing with quiescence
   depth) and the reactions negatively correlating to quiescence depth
   (decreasing with quiescence depth) (Fig. [53]2C and D). The basis for
   this fold enrichment is the frequency of a given subsystem in the
   differential fluxes compared to the frequency of a given subsystem in
   all reactions in the metabolic model. Glycosaminoglycan, arachidonic
   acid, and linoleate metabolism were a few of the subsystems found to
   rise with quiescence deepening, whereas fatty acid biosynthesis,
   prostaglandin biosynthesis, and omega-3 metabolism were predicted to
   fall.

   A further pathway analysis was then performed in which each metabolic
   subsystem's average correlation with quiescence depth was calculated,
   and subsystems with the highest absolute correlation were identified
   (Fig. [54]2B). This revealed many of the same trends, with lipid
   metabolism subsystems conspicuous.

   Given the enrichment of lipid metabolism pathways in these subsystems,
   we decided to delve deeper into these pathways specifically. To do so,
   we examined a comprehensive set of lipid-related metabolic subsystems
   and corroborated the trends using transcriptomics data.

Lipid anabolism and catabolism display largely opposing trends during
quiescence deepening

   The most striking metabolic pattern observed during quiescence
   deepening was the diverging trends of lipid anabolism and catabolism.
   As quiescence deepens, beta-oxidation fluxes substantially increase
   while fatty acid and cholesterol biosynthesis fluxes decline, albeit in
   a less uniform fashion (Fig. [55]3A and B), suggesting a metabolic
   shift from fatty acid biosynthesis to fatty acid oxidation takes place
   during quiescence deepening. We highlight representative metabolic
   subsystems and reactions in Fig. [56]3C and D for lipid anabolism and
   catabolism, respectively, using Escher plots ([57]19). The full-size
   metabolic network plots are shown in Figs. [58]S2 and S3.

Fig. 3.

   [59]Fig. 3.
   [60]Open in a new tab

   Lipid anabolism and lipid catabolism display largely opposing trends
   during quiescence deepening. A) Mean Pearson correlation coefficient of
   all reactions with QDS for lipid anabolism subsystems. Only subsystems
   with a mean correlation coefficient different from zero (Q-value <
   0.05, one-sided t test) are shown here. B) Mean Pearson correlation
   coefficient of all reactions with QDS for lipid catabolism subsystems.
   Only subsystems with a mean correlation coefficient different from zero
   (Q-value < 0.05, one-sided t test) are shown here. C) Metabolic pathway
   plot for example lipid anabolism subsystem: fatty acid biosynthesis
   (odd-chain). Reactions are colored by the Pearson correlation
   coefficient with quiescence depth. D) Metabolic pathway plot for
   example lipid catabolism subsystem: beta oxidation of di-unsaturated
   fatty acids (N-6) (mitochondrial). Reactions are colored by the Pearson
   correlation coefficient with quiescence depth. E) Distribution of
   Pearson correlation coefficients with quiescence depth for transcripts
   belonging to the GO biological process “lipid biosynthesis.” Only
   correlation coefficients with Q-value < 0.05 are shown here. F)
   Distribution of Pearson correlation coefficients with quiescence depth
   for transcripts belonging to the GO biological process “lipid
   catabolism.” Only correlation coefficients with Q-value < 0.05 are
   shown here.

   This metabolic shift aligns strongly with experimental metabolomics
   data from Yao et al. ([61]20) in which quiescent cells displayed higher
   rates of carnitine-shuttle-dependent fatty acid oxidation and lower
   rates of lipid biosynthesis compared to proliferating controls. It also
   aligns with transcriptomics data used for model reduction. To
   corroborate the trends with transcriptomics data, we identified genes
   related to lipid biosynthesis and lipid catabolism, and then computed
   their correlation with quiescence depth (Fig. [62]3E and F). For lipid
   biosynthesis genes, we see a modest majority decline with quiescence
   deepening whereas for lipid catabolism, a more substantial majority
   increases with quiescence depth, both aligning with the trends seen in
   the fluxes.

Epigenetic simulations reveal diverging trends in histone acetylation and
methylation as quiescence deepens

   Epigenetic modifications play a profound role in regulating a cell's
   gene expression, and their nature is highly dependent on the metabolic
   state of the cell. The metabolite acetyl-CoA, for instance, serves as
   the substrate for histone acetylation, a histone modification that
   activates gene transcription, while the metabolite S-adenosylmethionine
   (SAM) serves as the carbon donor in histone methylation ([63]21,
   [64]22) (Fig. [65]4A). Given the interplay between metabolism and
   epigenetics, metabolic models enable the simulation of histone
   modifications based on the abundance and flux of metabolic precursors
   ([66]23–25).

Fig. 4.

   [67]Fig. 4.
   [68]Open in a new tab

   Quiescence deepening is associated with diverging trends in epigenetic
   histone acetylation and methylation. A) Diagram of chromatin
   highlighting histone post-translational modifications and their
   relationship to metabolism. SAM is the substrate for histone
   methyltransferases (HMT) to methylate histone residues and convert SAM
   to S-adenosylhomocysteine (SAH). Acetyl-CoA is the substrate for
   histone acetyltransferases (HAT) to acetylate histone residues with CoA
   as a byproduct. Citrate is converted to acetyl-CoA by ATP citrate lyase
   (ACLY). These reactions are simulated in the metabolic model. B)
   Simulated mean nuclear histone acetylation flux vs. quiescence depth
   reveals a downward trend in histone acetylation as quiescence deepens.
   Pearson correlation performed between quiescence depth and mean nuclear
   histone acetylation flux. C) Top subsystems whose knockout reverse the
   acetylation trend, ranked by Pearson correlation between QDS and
   histone acetylation upon subsystem knockout. D) Simulated mean nuclear
   histone methylation flux vs. quiescence depth reveals an upward trend
   in histone methylation with increasing quiescence depth. Pearson
   correlation performed between quiescence depth and mean nuclear histone
   methylation flux. E) Top subsystems whose knockout reverses the
   methylation trend, ranked by Pearson correlation between QDS and
   histone methylation upon subsystem knockout.

   Leveraging this ability, we simulated histone acetylation and
   methylation as quiescence transitions from a shallow to a deep state.
   Remarkably, we observed diverging trends in these two histone
   modifications: histone acetylation is predicted to fall during
   quiescence deepening (Fig. [69]4B) while histone methylation is
   predicted to rise (Fig. [70]4D). In agreement with these results,
   histone acetylation is known to fall with quiescence induction
   ([71]26).

   What metabolic pathways could be driving these trends? To answer this
   question, we knocked out each subsystem in the metabolic model and
   observed the resulting epigenetic trend. For histone acetylation, the
   downward trend was abolished most strongly by knocking out fatty acid
   activation and carnitine-shuttle-related reactions—pathways known to
   impact histone acetylation in other contexts ([72]27, [73]28) (Fig.
   [74]4C). Given the steady increase in carnitine-shuttle-dependent fatty
   acid oxidation predicted during quiescence deepening, this suggests
   that increased fatty acid oxidation may be responsible for mediating
   the downward trend in histone acetylation. For histone methylation,
   knockdown of peptide metabolism and several lipid-related
   subsystems—both anabolic and catabolic—had the strongest impact in
   abolishing the upward trend (Fig. [75]4E), implying a link may exist
   between rising histone methylation and the predicted shift from fatty
   acid anabolism to catabolism during quiescence deepening.

In silico gene knockout simulations predict a range of quiescence
depth-altering genes that align with experimental data

   Metabolic models enable the large-scale simulation of genetic
   perturbations by applying known information about an organism's
   metabolic networks and their relationship to genes ([76]13). Leveraging
   this functionality, we performed a gene knockout screening to identify
   genes that may modulate quiescence depth.

   To accomplish this, we employed the algorithm robust metabolic
   transformation algorithm (rMTA), which identifies genes whose knockout
   shifts the metabolic state of a cell from a source state to a target
   state ([77]29). We simulated metabolic fluxes for shallow quiescence
   and deep quiescence through flux sampling, then performed two
   screenings: one for genes whose knockout shifts the cell from shallow
   to deep quiescence and another for genes whose knockout shifts the cell
   from deep to shallow quiescence (Fig. [78]5A).

Fig. 5.

   [79]Fig. 5.
   [80]Open in a new tab

   Gene knockout simulations predict a range of quiescence depth
   modulators in alignment with experimental data. A) Overview of gene
   knockout simulations. Transcriptomics from shallow and deep quiescence
   states were used to constrain a metabolic model and perform flux
   simulations, enabling the simulation of the metabolic state of a cell
   in shallow and deep quiescence. Then, we used the rMTA to predict gene
   knockouts that could move a cell to (i) a shallower quiescence state or
   (ii) a deeper quiescence state. B) Accuracy of gene perturbation
   simulations for top 50 hits. To validate these results experimentally,
   we used a cancer genomic database called CancerMine. The CancerMine
   database validation relied on the following principle: genes whose
   knockdown lowered quiescence depth was considered validated if they
   were annotated as tumor suppressors (genes whose downregulation
   promotes cancer growth), genes whose knockdown raised quiescence were
   considered validated if they were annotated as oncogenes (genes whose
   downregulation lowers cancer growth). C) Enrichment analysis. For
   quiescence deepeners, we calculated the fold enrichment and P-value of
   tumor suppressors among the top 50 hits compared to the background list
   of all genes knocked out with rMTA. For quiescence reducers, we
   calculated the fold enrichment of cancer oncogenes or drivers compared
   to the background set of genes. P-values are computed using the
   hypergeometric function and listed above the bar chart bars. **P-value
   < 0.01. D) Top 20 hits of the gene perturbation simulation for
   quiescence reducers: The top 20 genes whose knockout is predicted to
   move the cells toward a deeper quiescence state are shown here (ranked
   by rMTA robust transformations core). Here, the CancerMine annotation
   described in B) is shown for each prediction. E) Top 20 hits of the
   gene perturbation simulation for quiescence deepeners: The top 20 genes
   whose knockout is predicted to move the cells toward a shallower
   quiescence state are shown here (ranked by rMTA robust transformations
   core). As before, the CancerMine annotation described in B) is shown
   here for each prediction. The gene MPC1 is highlighted. This gene has
   experimental evidence for its inhibition lowering quiescence depth (as
   predicted here).

   To validate the results of these predictions, we compared the
   predictions with known oncogenes and tumor suppressors from the
   CancerMine database ([81]30). This comparison relied on the following
   principle: genes whose knockdown lowered quiescence depth (“quiescence
   deepeners”) were considered validated if they were annotated in the
   database as tumor suppressors (genes whose downregulation promotes
   cancer growth) and genes whose knockdown deepened quiescence depth
   (“quiescence reducers”) were considered validated if they were
   annotated as oncogenes or cancer drivers (genes whose downregulation
   lowers cancer growth). This validation approach was used because of the
   sparsity of studies screening genes directly for quiescence depth and
   proliferation's strong association with quiescence depth—shallower
   quiescent cells are more easily able to proliferate and several genes
   that alter quiescence depth also affect proliferation in nonquiescent
   cells, such as genes in the Rb-E2F pathway ([82]31, [83]32).

   For genes whose knockdown was predicted to shift the cell to deeper
   quiescence (“quiescence reducers”), a remarkable percentage agree with
   experimental data. Virtually, all these genes with CancerMine
   annotations are considered cancer oncogenes or drivers (19/20 of the
   top 20 predictions with annotations, 23/25 of the top 25 predictions
   with annotations), which aligns with a role in driving quiescence to a
   shallower state with more proliferative potential (Fig. [84]5B). The
   enrichment for cancer driver genes in these predictions out of all
   genes tested is shown in Fig. [85]5C along with the P-value calculated
   using the hypergeometric function. Examining genes in this category
   reveals that many are involved in lipid metabolism and cancer
   progression. The genes fatty acid synthase (FASN) and fatty acid
   elongase 1 (ELOVL1), for instance, are both involved in fatty acid
   biosynthesis and cancer initiation ([86]33, [87]34), and the genes
   cholestenol delta-isomerase (EBP), squalene epoxidase (SQLE), and
   24-dehydrocholesterol reductase (DHCR24) represent a diverse set of
   cholesterol biosynthesis-related oncogenes ([88]35).

   For genes whose knockdown is predicted to shift the cell toward
   shallower quiescence (“quiescence deepeners”), 8 of the top 21
   predictions with annotations are classified as tumor suppressors. While
   this enrichment does not reach statistical significance out of all
   tested genes, one prediction is especially noteworthy for its alignment
   with experimental data: mitochondrial pyruvate carrier 1 (MPC1), which
   is listed among the top 20 genes whose knockout reduces quiescence
   mostly strongly out of 2,953 tested genes. This gene has direct
   experimental evidence for its inhibition lowering quiescence depth.
   Both pharmacological and genetic inhibition of MPC1 activates quiescent
   NSCs by shifting metabolism away from oxidative phosphorylation and
   toward glycolysis ([89]36), in alignment with the simulated results.
   Exploring other predictions in this category shows that several genes
   are involved in bile acid synthesis, including 3 beta-hydroxysteroid
   dehydrogenase type 7 (HSD3B7) and cholesterol 7α-hydroxylase (CYP7A1)
   ([90]37), and several are involved in nucleotide metabolism such as
   purine nucleoside phosphorylase (PNP), guanylate kinase 1 (GUK1), and
   adenosine deaminase (ADA) ([91]38–40).

   The top 20 hits for both quiescence reducers and drivers along with
   their CancerMine annotation are shown in Fig. [92]5D and E. The top 50
   predictions for both quiescence reducers and deepeners are listed in
   Table [93]S1.

Flux-based quiescence depth predictor captures metabolic trends associated
with quiescence deepening for multiple types of input data

   To synthesize the metabolic variation underlying quiescence deepening
   into a single score, we built a machine-learning model to predict QDS
   using reaction fluxes. This model captures the metabolic trends behind
   quiescence deepening in an easily interpretable manner. It also
   accommodates multiple types of input data. An elastic net architecture
   was chosen to build the most parsimonious model and strike a balance
   between underfitting and overfitting the reaction coefficients (i.e.
   too many and too few model coefficients).

   The model's accuracy in predicting quiescence depth was validated both
   internally and externally. Based on internal k-fold cross-validation,
   the model accurately predicted quiescence depth across the entire range
   of values when trained on 90% of the data and tested on the remaining
   10% (Fig. [94]6C). To validate the accuracy of the model externally in
   independent datasets, we examined six datasets involving nine distinct
   experimental conditions for proliferating vs. quiescent cells
   ([95]42–45). Given that shallower quiescent cells are closer to
   proliferation than deeper quiescent cells, we reasoned that quiescent
   cells should have a higher QDS relative to proliferating cells. Here,
   we observed a trend toward higher quiescence depth in seven of the nine
   datasets and a higher quiescence depth in four of the nine datasets at
   a P-value threshold of 0.1 (Fig. [96]6D) despite vastly different
   experimental conditions than the data the model was trained on,
   including interleukin 3 (IL-3) starvation in murine lymphocytes (Fig.
   [97]6D, iii) and serum starvation in human cancer cells (Fig. [98]6D,
   iv). The model, however, does not generalize to all forms of
   quiescence, such as Mek inhibition and CDK4/6 inhibition, likely
   because these are pharmacologically induced. It also does not
   generalize to all cell types, such as liposarcoma cells (Fig. [99]6D,
   vi), likely because of the vast differences in metabolism between
   fibroblasts and adipocytes ([100]47).

Fig. 6.

   [101]Fig. 6.
   [102]Open in a new tab

   Flux-based machine learning model captures metabolic signatures of
   quiescence deepening for multiple types of input data. A) Diagram of
   the machine learning model. Reaction fluxes are inputted, and the model
   is trained to predict QDS (days since serum starvation) using elastic
   net regression. B) Top subsystems enriched in the model's positive
   coefficients (up with quiescence deepening) and negative coefficients
   (down with quiescence deepening). The top 10 subsystems with Q-value <
   0.05 for the hypergeometric fold enrichment test are shown here. C)
   K-fold cross-validation accuracy plot. The model is trained and
   optimized on 90% of the data and tested in the remaining 10% for 10
   random subsets of the data. Note: all parameter are un-optimized in
   training set except for fold-change threshold, which is set close to
   one to maximize the impact of genes on fluxes. D) External validation
   shows that the model predicts deeper quiescence (higher QDS) for cells
   in a quiescent phase vs. a proliferative phase for multiple types of
   quiescence and proliferation induction and multiple types of input
   omics data (transcriptomics, translatomics, and proteomics). Fluxes are
   generated using genes upregulated in quiescence relative to
   proliferation vs. fluxes generated using genes upregulated in
   proliferation relative to quiescence with a fold change threshold of
   1.00. (i) Proliferation is induced using epidermal growth factor (EGF),
   horse serum, insulin, and hydrocortisone. Quiescence is induced with
   serum starvation, Mek inhibition, CDK46 inhibition, and contact
   inhibition. Transcriptomics measured. Comparisons were performed with a
   two-sample t test. Source: [103]GSE122927 ([104]41). (ii) Proliferation
   is induced using EGF or tissue plasminogen activator (TPA). Quiescence
   induced with serum starvation. Transcriptomics measured. Comparisons
   were performed with a two-sample t test. Source: Ref. ([105]42). (iii)
   QDS for proliferating vs. quiescent cells. Proliferation is induced by
   IL-3. Quiescence is induced with IL-3 starvation. Proteomics measured.
   Comparisons were performed with a paired t test. Source: Ref.
   ([106]43). (iv) QDS for proliferating vs. quiescent cells.
   Proliferation is induced using EGF. Quiescence induced with serum
   starvation. Translatomics measured. Comparisons were performed with a
   paired t test. Source: [107]GSE112295 ([108]44). (v) Proliferation is
   induced using fetal bovine serum (FBS). Quiescence induced with 7-day
   contact-inhibition. Transcriptomics measured. Comparisons were
   performed with a two-sample t test. Source: [109]GSE117444 ([110]45).
   (vi) Proliferation is induced using FBS. Quiescence induced with serum
   starvation. Transcriptomics measured. Comparisons were performed with a
   two-sample t test. Source: [111]GSE74620 ([112]46).

   One of the advantages of this flux-based approach is its ability to
   accommodate multiple types of omics data. Using transcriptomics,
   translatomics, and proteomics data as inputs, external validation
   accurately predicted quiescent cells to have a deeper quiescence than
   proliferating cells (Fig. [113]6D i, iii, iv). Features in the model
   largely aligned with metabolic trends highlighted in Fig. [114]2:
   reactions corresponding to heme metabolism, carnitine shuttle, and
   fatty acid oxidation were enriched in the positive coefficients (up
   with quiescence deepening) while reactions associated with fatty acid
   biosynthesis were overrepresented in the negative coefficients (down
   with quiescence deepening) (Fig. [115]6B). The full set of features is
   listed in Table [116]S2.

Multi-cell type quiescence depth predictor accurately predicts quiescence
depth in a wide range of contexts

   While the flux-based model provides both mechanistic interpretability
   and flexibility in terms of data types that can be inputted, the large
   number of computations it requires limits its application in external
   datasets. Hence a second, complementary model that directly uses
   transcriptomics data to predict QDS was also developed to facilitate
   high-throughput screening of perturbations to modify quiescence depth
   and explore quiescence in large external datasets. To build the model,
   we compiled quiescence-deepening gene expression datasets from three
   experimental conditions including rat embryo fibroblasts (REF)
   ([117]5), human retinal pigment epithelial cells (RPEs) ([118]48), and
   human dermal fibroblasts (HFB) ([119]48) (Fig. [120]7A). Taken
   together, these datasets encompass multiple species (rat and human),
   multiple cell types (epithelial and fibroblast), and multiple
   quiescence induction methods (serum starvation and contact inhibition).
   Similar to before, quiescence depth was denoted as the days since
   quiescence induction began, and each quiescence depth was normalized to
   the maximum of each dataset. Transcriptomics gene expression was set as
   the input to the model and the output was set as the normalized
   quiescence depth. Like before, an elastic net architecture was chosen
   to balance underfitting and overfitting the model.

Fig. 7.

   [121]Fig. 7.
   [122]Open in a new tab

   Multi-cell type machine learning model accurately predicts QDS in a
   wide range of contexts. A) Diagram of the machine learning model.
   Transcriptomics from three datasets measuring quiescence deepening is
   inputted, and the model is trained to predict QDS (days since serum
   starvation, normalized to the maximum of each dataset) using elastic
   net regression. B) GO pathway enrichment analysis for positive and
   negative coefficients of model. The top eight biological processes for
   negative coefficients (down in quiescence deepening, Q-value < 0.65)
   and positive coefficients (up in quiescence deepening, Q-value < 0.05)
   as ranked by Q-value of enrichment. C) K-fold cross-validation accuracy
   plot. The model is trained and optimized on 90% of the data and tested
   in the remaining 10% for 10 random subsets of the data. D) External
   validation: model predictions for proliferating vs. quiescent cells in
   various cell types and experimental conditions. (i) Proliferation is
   induced using FBS. Quiescence induced with 7-day contact-inhibition.
   Comparisons were performed with a two-sample t test. Source:
   [123]GSE117444 ([124]45). (ii) Proliferation is induced using FBS.
   Quiescence induced with serum starvation. Comparisons were performed
   with a two-sample t test. Source: [125]GSE74620 ([126]46). (iii)
   Proliferation is induced using EGF, horse serum, insulin, and
   hydrocortisone. Quiescence is induced with serum starvation, Mek
   inhibition, CDK46 inhibition, and contact inhibition. Comparisons were
   performed with a two-sample t test. Source: [127]GSE122927 ([128]41).
   (iv) QDS for proliferating vs. quiescent cells. Proliferation is
   induced using EGF. Quiescence induced with serum starvation.
   Comparisons were performed with a two-sample t test. Source:
   [129]GSE112295 ([130]44). (v) Proliferation is induced using EGF or
   TPA. Quiescence induced with serum starvation. Comparisons were
   performed with a two-sample t test. Source: Ref. ([131]42).

   As before, the model's accuracy was validated internally and
   externally. For k-fold cross-validation, the model accurately predicted
   quiescence depth across the entire range of values when trained on 90%
   of the data and tested on the final 10% (Fig. [132]7C). To validate the
   accuracy of the model in independent datasets, we examined five
   external gene expression datasets involving eight distinct experimental
   conditions for proliferating vs. quiescent cells ([133]42–45).
   Strikingly, quiescent cells were accurately predicted to have a deeper
   quiescence than proliferating controls for six of the eight conditions
   tested. This was also observed across multiple types of growth
   stimulation and multiple types of quiescence induction, including
   contact inhibition, serum starvation, Mek inhibition, and CDK4/6
   inhibition (Fig. [134]7D).

   To gain a better understanding of the pathways captured in this model,
   we performed a gene ontology (GO) enrichment analysis for the negative
   features (down in quiescence deepening) and the positive features (up
   in quiescence deepening) (Fig. [135]7B). Various ion transport pathways
   are enriched in the positive model coefficients whereas several
   immune-related and Wnt-related pathways are enriched in the negative
   features. The full set of features is listed in Table [136]S3.

Changes in quiescence depth are predicted during stem cell aging and
embryogenesis

   Fujimaki et al. ([137]5) observed a deepening of quiescence during stem
   cell aging. Exploring this area with our multi-cell type machine
   learning model, we examined quiescence depth across age for both
   hematopoietic stem cells (HSCs) ([138]49, [139]50) and NSCs ([140]51).
   For datasets with significant missing features, we reconstructed the
   model using genes shared between that dataset and the training gene set
   (see Methods).

   For both NSCs and HSCs, we observed a trend reflecting a deepening of
   quiescence with age (Fig. [141]8A and B). For HSCs, we applied the
   model to a single-cell transcriptomics dataset measuring gene
   expression in long-term (LT) HSCs, short-term (ST) HSCs, and
   multipotent progenitor cells (MPPs). Intriguingly, we see a deepening
   of quiescence in LT and ST HSCs with age but not MPPs, suggesting aging
   is associated with cell-type-specific changes in quiescence depth. For
   NSCs, the trend of increasing quiescence depth followed by a reversal
   matches the nonmonotonic changes in NSC behavior with aging observed by
   Apostolopoulou et al. ([142]51) (Fig. [143]8B).

Fig. 8.

   [144]Fig. 8.
   [145]Open in a new tab

   Changes in quiescence depth are associated with stem cell aging and
   embryogenesis. A) HSC aging: (i) Older HSCs have a trend toward higher
   QDS than young HSCs. Source: [146]GSE109546. (ii) Older LT HSCs and ST
   HSCs have higher QDS than young cells of the same type. No aging trend
   was observed in MPPs. Single-cell RNA-seq of aging HSCs. Source:
   [147]GSE59114. B) Median QDS and age correlate in mouse subventricular
   zone (SVZ) neural stem cell (NSC) niche (Pearson correlation). Source:
   [148]GSE104651. C) During embryogenesis, quiescence depth is predicted
   to fall and then rise between the zygote and 32-cell state. Single cell
   RNA-seq measured for preimplantation mouse blastomeres. Source:
   [149]GSE136714. D) GO transcriptomic analysis of proliferation markers.
   Shown here are the mean z-score-normalized transcript levels for
   transcripts belonging to GO cell growth and GO stem cell proliferation.
   While GO cell growth-related genes show no definitive pattern, stem
   cell proliferation-related genes rise and then fall in a pattern that
   (except for the zygote stage) aligns with the fall and rise in QDS seen
   in early embryogenesis.

   Examining quiescence in a different biological context, we investigated
   quiescence depth during embryogenesis, specifically focusing on the
   zygote to 32-cell state. Here, the model predicted a fall and then a
   rise in quiescence depth between the zygote and 32-cell (Fig. [150]8C).
   This suggests that during embryogenesis, cells are most proliferative
   around the four-cell state and less so before and after. To corroborate
   these findings, we examined the normalized expression levels of genes
   belonging to two proliferation-related gene sets (Fig. [151]8D). While
   cell growth gene expression shows no definitive trends, stem cell
   proliferation gene expression (aside from the zygote stage) rises and
   then falls during early embryogenesis, which aligns with the changes in
   quiescence depth predicted by the machine learning model. Given that
   embryonic cells are a type of stem cell, the stem cell proliferation
   gene set serves as a more relevant validation of these predictions.
   Taken together, our model implies that quiescence depth regulation is a
   critical factor in both development and aging.

Various small molecules and genes predicted to alter quiescence align
strongly with experimental results

   We next applied the multi-cell type QDS predictor to screen a range of
   perturbations for their ability to modify quiescence depth. As input to
   the model, we used data measured as part of the library of integrated
   cellular signatures (LINCS) program that measures gene expression in
   response to small molecule and genetic perturbations using the L1000
   assay ([152]52) (Fig. [153]9A). Importantly, we reconstructed the model
   using genes shared between the high confidence “landmark” L1000 gene
   set and the training gene set.

Fig. 9.

   [154]Fig. 9.
   [155]Open in a new tab

   Small molecules and gene knockouts predicted to alter quiescence align
   strongly with experimental results. A) Overview of perturbation
   screening: L1000 data measured as part of the library of integrated
   network-based cellular signatures program for two classes of
   perturbations (small molecules and gene knockouts) is inputted into the
   QDS predictor to screen perturbations for their effect on quiescence
   depth. B) Top small molecules whose knockout is predicted to alter
   quiescence depth. For each of the 1,713 small molecules tested,
   conditions are controlled to a 24-h timepoint and 10 μM dose across
   multiple cell lines. Top 25 predictions with Q-value < 0.05 shown here,
   as ranked by magnitude of quiescence depth score modulation. Q-value is
   the Benjamin–Hochberg-corrected P-value of a two-sample t test between
   predicted QDS for the control condition (drug vehicle) and treatment
   condition (small molecule compound). C) Top genes whose knockout is
   predicted to alter quiescence depth (Q-value < 0.05). D) Enrichment of
   tumor suppressors among quiescence deepeners (genes whose knockout
   lowers QDS and increases proliferative potential), and enrichment of
   oncogenes/cancer drivers among quiescence reducers (genes whose
   knockout raises QDS and decreases proliferative potential). P-value
   calculated using the hypergeometric function. Results are shown at
   three Q-value thresholds for selecting top gene knockout hits: 0.05,
   0.1, and 0.85. Cancer annotations sourced from the CancerMine database.
   Genes with a majority of tumor suppressor annotations in CancerMine are
   classified as tumor suppressors, and genes with a majority of cancer
   driver or oncogene annotations are classified as cancer drivers.

   First, we screened all 1,713 small molecules in the LINCS dataset for
   their ability to alter quiescence depth. For each molecule, QDS was
   predicted and compared to QDS values predicted for a vehicle control
   with a two-sample t test. The ratio between the average QDS in the
   treatment group and the average QDS in the control group was then
   computed. At a false discovery rate (FDR) of 5%, 488 small molecules
   were predicted to raise or lower quiescence depth. The top 25
   predictions are highlighted in Fig. [156]8B, and the full list of
   compounds is listed in Table [157]S4.

   Among the top compounds predicted to deepen quiescence,
   cyclin-dependent kinase (CDK), histone deacetylase (HDAC), and
   topoisomerase II (TOPII) inhibitors are common (Fig. [158]9B). In
   alignment with their predicted role as quiescence deepeners, many of
   the identified small molecules predicted to deepen quiescence have
   direct experimental evidence for inducing quiescence through cell cycle
   arrest, including the CDK inhibitors R-547 ([159]53), AT-7519
   ([160]54), and AZD-5438 ([161]55). Others may be novel compounds to
   alter quiescence. Of the 488 small molecules predicted to alter
   quiescence, 51 are predicted to lower quiescence depth. While several
   of these compounds have antiproliferative activity and are thus likely
   false positives, the hormone estradiol is among the compounds predicted
   to lower quiescence, which aligns with the proliferation-promoting
   effects of estrogen ([162]56).

   Next, we screened all 53 gene knockouts in the LINCS dataset for their
   ability to modulate quiescence. QDS values predicted for each gene
   knockout were compared to QDS predicted for an empty gene vector
   control. As with the metabolic model knockouts, we used the CancerMine
   database to validate our predictions. Here, for both directions of QDS
   modulation, a striking percentage of genes were validated with
   experimental cancer data. At an FDR of 5%, of the genes with
   annotations, all genes whose knockout lowered QDS are annotated as
   tumor suppressors, and all genes whose knockout raised QDS are
   considered cancer drivers (Fig. [163]9C).

   For the genes whose knockout is predicted to lower quiescence depth,
   the strongest prediction was made for tumor protein p53 (TP53), a gene
   famous for its tumor-suppressive and antiproliferative properties
   ([164]57). At more lenient differential QDS thresholds, three of the
   top five genes whose knockout lowers QDS are tumor suppressors, an
   enrichment of 3.86-fold over the background gene set (P-value 0.021).

   For genes whose knockout is predicted to deepen quiescence, results are
   similarly striking. At Q-value thresholds up to 0.3, all genes whose
   knockout is predicted to raise quiescence depth are considered
   oncogenes or cancer drivers. Among the top predictions are several
   notorious oncogenes including proto-oncogene C-Myc (MYC), which is
   dysregulated in over 50% of human cancers ([165]58), and proto-oncogene
   B-Raf (BRAF), whose mutation is associated with nearly 7% of all human
   cancers ([166]59). The top gene knockout hits for both directions of
   QDS alteration are shown in Fig. [167]9C along with their CancerMine
   annotation. The full list of genes is listed in Table [168]S5.

Discussion

   Here, we employed two distinct yet complementary approaches to identify
   modulators of quiescence depth: in silico gene knockouts in metabolic
   models, and a multi-cell type quiescence depth predictor that
   facilitates perturbation screening. In doing so, we also simulated
   metabolic and epigenetic changes underlying quiescence deepening,
   revealing substantial shifts in lipid metabolism and epigenetic
   modifications as quiescence shifts from a shallow to a deep state. For
   both screening approaches, top quiescence-altering candidates align
   strongly with experimental data and suggest various pathways capable of
   shifting quiescence toward a shallower or deeper state.

   The most striking metabolic trends identified during quiescence
   deepening involve lipid metabolism, namely the fall in various kinds of
   lipid anabolism and the rise of lipid catabolism, particularly fatty
   acid oxidation. In alignment with these results, fatty acid oxidation
   is a key hallmark of quiescence across multiple cell types. Comparing
   the metabolomics of proliferating vs. quiescent fibroblasts, Yao et al.
   ([169]20) observed that quiescent fibroblasts showed higher levels of
   carnitine-shuttle-dependent fatty acid oxidation. Further, they
   observed lower rates of complex-lipid biosynthesis in quiescent cells.
   Additional studies have shown that in both fibroblasts ([170]60) and
   NSCs ([171]61), quiescent cells are so reliant upon fatty acid
   oxidation for energy production that inhibiting the pathway impairs
   cell viability. Both lend experimental evidence to a metabolic shift
   from fatty acid anabolism to catabolism as cells transition from
   proliferation to progressively deeper stages of quiescence.

   Based on epigenetic simulations, similarly strong trends were
   identified in histone modifications with acetylation falling and
   methylation rising as quiescence deepens. Quiescence is typically
   associated with reduced gene expression and more compact chromatin
   architecture. It is also associated with changes in histone
   methylation, although the direction of this change is variable and
   depends on the specific histone residue ([172]26). For acetylation, a
   clearer pattern is evident. In agreement with the simulated results,
   quiescence induction is associated with a reduction in histone
   acetylation across multiple cell types and histone residues. Since
   acetylation promotes a more accessible chromatin structure, this
   reduction—in part—mediates the transition to lower levels of gene
   expression during the transition from proliferation to quiescence
   ([173]26). An investigation into the metabolic pathways driving the
   trends in histone modifications revealed that shifts in fatty acid
   metabolism—higher fatty acid oxidation, lower fatty acid
   biosynthesis—in quiescence could explain the observed trends and
   highlight the strong interplay between metabolism and epigenetics.
   Through our modeling, we showed how quiescence-inducing perturbations
   such as serum starvation can result in metabolic changes that influence
   epigenetic modifications, which alter gene expression and ultimately
   result in withdrawal from the cell cycle.

   Given the alteration in quiescence depth observed in aging and disease
   ([174]5, [175]60, [176]62), identifying chemical and genetic modulators
   of quiescence may be key to addressing these conditions. In this study,
   we identified potential modulators of quiescence depth using two
   approaches. Using rMTA, we identified genes whose knockout could
   robustly shift the metabolic state of a cell to a deeper or shallower
   state of quiescence. Particularly for quiescence reducers, results
   align well with experimental data. Notably, several genes predicted to
   affect quiescence depth here are involved in lipid metabolism, aligning
   with the shifts in lipid metabolism predicted in quiescence deepening.

   We also developed two complementary quiescence depth predictors: a
   flux-based predictor that captures metabolic variations in quiescence
   deepening and a gene expression-based predictor based on multiple cell
   types and quiescence induction methods. Both models accurately
   predicted quiescent cells to have higher quiescence depth than
   proliferating cells for multiple types of growth induction, multiple
   types of quiescence induction, and multiple types of cells, suggesting
   they are generalizable to contexts beyond the datasets they were
   trained. Applying the multi-cell type model to screen L1000
   perturbations generated predictions that align well with experimental
   data for both small molecules and gene knockouts. It also predicted
   changes in proliferative ability during embryogenesis, suggesting
   quiescence depth is an important biological factor from the earliest
   stages of development onto later aging processes. Interestingly,
   several small molecules whose knockout is predicted to affect
   quiescence depth target histone deacetylases (HDACs), underscoring the
   link between epigenetics and quiescence deepening, and connecting the
   metabolic modeling and knockout screening.

   Several limitations to this study should be noted. Metabolic
   simulations are a powerful hypothesis-generating tool, and although
   several predictions align with experimental datasets, more
   comprehensive experimental validation is necessary to confirm their
   accuracy. Moreover, while GEMs are a well-established tool to study
   metabolism, their predictions are limited to the genes and metabolic
   pathways that are well described in literature, which represents only a
   subset of mammalian metabolism. The small molecule and genetic
   perturbations will also require further validation to confirm their
   ability to alter quiescence depth.

   In sum, we leverage two approaches to identify perturbations with
   strong potential to alter quiescence depth and with relevance to cancer
   and aging. We also show that GEMs can capture known aspects of
   quiescence deepening across metabolic and epigenetic levels and
   simulate the interplay between metabolic networks and epigenetic
   modifications. Our multi-cell type quiescence depth predictor is
   generalizable to a wide range of contexts. This study underscores the
   power of metabolic modeling and machine learning to characterize and
   predict modulators of cell states.

Methods

Differential expression analysis

   To identify differentially expressed genes found in omics datasets for
   integration into the metabolic model, we employed a fold change-based
   approach. For this approach, the ratio between two groups was computed
   and transcripts with a fold change above a certain threshold were
   considered upregulated and those with a fold change below a certain
   threshold downregulated. A fold change threshold of 1.00 was used
   unless otherwise noted.

Flux prediction

   To simulate metabolic reaction fluxes, we used two complementary
   approaches: flux sampling and a linear version of the iMAT algorithm
   ([177]18, [178]24). For both approaches, we used the Rat-GEM metabolic
   model ([179]16), which was chosen to match the biology of the rat
   embryonic fibroblast cells in which the quiescence deepening
   transcriptomics was measured. This model contains 2,953 genes, 8,377
   metabolites, and 13,028 reactions.

   For flux sampling, quiescence depth-specific metabolic models were
   created for nine stages of quiescence depth with three samples each by
   incorporating highly and lowly expressed genes to constrain the models
   using INIT ([180]17, [181]63). Highly expressed genes were defined as
   genes whose quantile-normalized, log2-normalized value was at least the
   mean plus alpha times the standard deviation in at least beta percent
   of the three quiescence depth replicates ([182]Eq. 1). Lowly expressed
   genes were defined similarly ([183]Eq. 2).
   [MATH: <mrow><mi mathvariant="normal">Gen</mi></mrow><msub><mrow><mi
   mathvariant="normal">e</mi></mrow><mrow><mrow><mi
   mathvariant="normal">up</mi></mrow></mrow></msub><mo>≥</mo><mrow><mi
   mathvariant="normal">sum</mi></mrow><mo
   stretchy="false">(</mo><mrow><mrow><mi
   mathvariant="normal">Log</mi></mrow><mn>2</mn><mo
   stretchy="false">(</mo><mrow><mrow><mi
   mathvariant="normal">Gen</mi></mrow><msub><mrow><mi
   mathvariant="normal">e</mi></mrow><mrow><mrow><mi
   mathvariant="normal">abun</mi></mrow><mo>.</mo></mrow></msub><mo>+</mo>
   <mn>1</mn></mrow><mo stretchy="false">)</mo><mo>≥</mo><mo
   stretchy="false">(</mo><mrow><mi
   mathvariant="normal">mean</mi></mrow><mo>+</mo><mrow><mi
   mathvariant="normal">alpha</mi><mo>*</mo><mi
   mathvariant="normal">st</mi></mrow><mrow><mo>.</mo><mspace
   width="0.25em"></mspace><mi
   mathvariant="normal">dev</mi></mrow><mo>.</mo></mrow><mo
   stretchy="false">)</mo><mo stretchy="false">)</mo><mo>≥</mo><mrow><mi
   mathvariant="normal">beta</mi></mrow><mo>*</mo><mn>3.</mn> :MATH]
   (1)
   [MATH: <mtable columnalign="right left" rowspacing=".5em"
   columnspacing="thickmathspace" displaystyle="true"><mtr><mtd><mspace
   width="-6.95pt"></mspace><mrow><mi
   mathvariant="normal">Gen</mi></mrow><msub><mrow><mi
   mathvariant="normal">e</mi></mrow><mrow><mrow><mi
   mathvariant="normal">down</mi></mrow></mrow></msub></mtd><mtd><mo>≥</mo
   ><mrow><mi mathvariant="normal">sum</mi></mrow><mo
   stretchy="false">(</mo><mrow><mrow><mi
   mathvariant="normal">Log</mi></mrow><mn>2</mn><mo
   stretchy="false">(</mo><mrow><mrow><mi
   mathvariant="normal">Gen</mi></mrow><msub><mrow><mi
   mathvariant="normal">e</mi></mrow><mrow><mrow><mi
   mathvariant="normal">abun</mi></mrow><mo>.</mo></mrow></msub><mo>+</mo>
   <mn>1</mn></mrow><mo stretchy="false">)</mo><mo>≤</mo><mo
   stretchy="false">(</mo><mrow><mi
   mathvariant="normal">mean</mi></mrow><mo>−</mo><mrow><mi
   mathvariant="normal">alpha</mi><mo>*</mo><mi
   mathvariant="normal">st</mi></mrow><mrow><mo>.</mo><mspace
   width="0.25em"></mspace><mi
   mathvariant="normal">dev</mi></mrow><mo>.</mo></mrow><mo
   stretchy="false">)</mo><mo
   stretchy="false">)</mo></mtd></mtr><mtr><mtd></mtd><mtd><mo>≥</mo><mrow
   ><mi
   mathvariant="normal">beta</mi></mrow><mo>*</mo><mn>3.</mn></mtd></mtr><
   /mtable> :MATH]
   (2)

   Alpha was chosen as the minimum value that generated metabolic models
   that could obtain a feasible flux solution (described in the next step)
   through flux sampling for all nine quiescence depths. Alpha was raised
   in increments of 0.1 until a value of 1.1 was determined. Beta was
   chosen as 0.6 (meaning genes had to meet the specified criteria in at
   least two of the three replicates) to balance capturing changes in
   transcripts over time and still imposing a robust threshold for
   selecting up and down genes.

   We then used coordinate hit-and-run with rounding (CHRR) ([184]64) to
   perform flux sampling on the quiescence-specific metabolic models for
   each stage of quiescence depth. 1,000 flux samples were generated for
   each quiescence depth. The Gurobi mathematical solver (version 9.5,
   Linux) was used to solve for the reaction fluxes on a high-performance
   computing cluster ([185]65).

   While flux sampling has advantages, including the absence of a
   presupposed objective function and a more comprehensive accounting of
   alternate flux solutions, it also takes considerable computational
   resources and time. Because of this, we also simulated reaction fluxes
   with a simpler approach using a linear version of the iMAT algorithm
   and compared the results to flux sampling (Fig. [186]S4). This approach
   assumes a steady-state metabolism and solves for the reaction fluxes by
   optimizing an objective function ([187]66). Here, reactions associated
   with upregulated genes are assigned higher flux values while reactions
   associated with downregulated genes are assigned lower values. Up and
   down genes are determined using the fold change approach described
   above based on differential expression compared to the lowest
   quiescence depth (QDS = 2); hence, eight quiescence depth scores were
   used with three replicates each (since the first quiescence depth was
   used as the reference). The Gurobi mathematical solver (version 9.0,
   macOS) was used here to solve for the reaction fluxes ([188]65). Based
   on a subsystem analysis, we found that the predicted metabolic changes
   during quiescence deepening were largely similar between the linear
   iMAT algorithm and flux sampling (Fig. [189]S4).

   The linear version of the iMAT algorithm included three input
   parameters: kappa (weight for downregulated genes), rho (weight for
   upregulated genes), and epsilon (minimum absolute value for upregulated
   flux). To determine which parameter values to use, different ranges of
   values were tested to determine which parameter set minimized
   three-fold cross-validated mean squared error in predicting quiescence
   depth through elastic net regression. The same approach was used to
   choose the objective function to optimize for during flux balance
   analysis (FBA) (biomass, adenosine triphosphate [ATP], nicotinamide
   adenine dinucleotide [NADH], and glutathione were tested, and biomass
   was chosen) and the optimal fold change threshold for selecting
   upregulated and downregulated genes from the quiescence deepening
   transcriptomics to input into iMAT (1.008695652). The bounds of the
   fold change threshold optimization range were set close to 1 (1 to 1.2)
   to maximize the impact of genes on the fluxes and separate conditions
   in external datasets.

Histone methylation and acetylation simulation

   To simulate histone methylation and acetylation, reactions were added
   to the rat-GEM metabolic model representing these histone modifications
   as described in Fig. [190]4A ([191]24, [192]25). This simulation
   modeled the substrate metabolite abundance to predict histone
   modification flux. Then the objective value for this reaction was
   modified to a nonzero value and flux predictions were made for each QDS
   using the linear version of the iMAT algorithm described above. Linear
   iMAT was chosen over flux sampling as quiescence-associated fluxes were
   similar between flux sampling and iMAT and because this approach
   requires an objective function value for the histone-modification
   reaction to be a nonzero value (since the histone-modification reaction
   does not contribute to biomass optimization). The exact objective
   function value for the histone-modification reaction was determined by
   identifying the range of values that separated the histone-modification
   fluxes across QDS (i.e. fluxes that were not all identical) and
   resulted in biomass flux values that were on the same order of
   magnitude or higher than the flux through the histone-modification
   reaction (meaning the histone objective was not overoptimized relative
   to biomass). Figure [193]S5 shows histone-modification fluxes and
   biomass fluxes meeting these criteria for multiple objective function
   values.

Subsystem enrichment analysis

   To perform the metabolic subsystem enrichment analysis, the proportion
   of reactions belonging to a particular subsystem within the highly
   correlating-with-quiescence depth reactions was divided by the
   proportion of reactions associated with that particular subsystem out
   of the total set of reactions. This was done separately for the
   positively correlating and negatively correlating reaction sets. The
   hypergeometric function was used to calculate the P-value for the
   enrichment of each subsystem. These P-values were corrected with the
   Benjamini–Hochberg FDR adjustment.

rMTA gene knockout screening

   For the rMTA gene knockout screening ([194]29, [195]64), we generated
   metabolic models for shallow quiescence (QDS of 2 to 3 days) and deep
   quiescence (QDS of 12 to 14 days) using INIT and selecting up and down
   genes as described above in the Flux prediction section. We performed
   two rounds of rMTA: one in which the source state was shallow
   quiescence and the target state was deep quiescence and another in
   which the source state was deep quiescence and the target state was
   shallow quiescence. For each round, the following procedure was used:
    1. For the source state, we generated fluxes through flux sampling
       with CHRR using the same parameters described in the Flux
       prediction section ([196]30, [197]64).
    2. Using differential transcriptomics, we determined the transcripts
       that changed between the source state and target state at an FDR
       value of 0.05 (Benjamini–Hochberg-corrected P-value of two-sample t
       test between the source and target state).
    3. We mapped these changes in transcripts to changes in metabolic
       reactions using gene-reaction mapping in the metabolic model.
    4. We knocked out each gene in the metabolic model and determined its
       potential to shift these metabolic reactions from the source to the
       target state.

   The rMTA screening was performed using the Gurobi mathematical solver
   (version 9.5, Linux) with default parameters on a high-performance
   computing cluster ([198]65). Gene knockout predictions were validated
   using CancerMine as described below in CancerMine validation.

CancerMine validation

   For the CancerMine database approach, gene annotation as a tumor
   suppressor, oncogene, or cancer driver was used to validate predictions
   ([199]30).

   To validate genes whose knockdown was predicted to lower quiescence
   depth (i.e. raise proliferative potential), the following criteria were
   used:
    1. In the CancerMine database, a majority of annotations for this gene
       were tumor suppressors (genes whose downregulation increases cancer
       growth)

   To validate genes whose knockdown was predicted to raise quiescence
   depth (i.e. lower proliferative potential), the following criteria were
   used:
    1. In the CancerMine database, a majority of annotations for this gene
       were either oncogenes or cancer drivers (genes whose downregulation
       lowers cancer growth)

   If no annotations were present or if annotations were split 50% tumor
   suppressor, 50% oncogene, or driver, validation for that gene was
   considered inconclusive.

Machine learning model creation and optimization

   To build the flux-based QDS model, all GEM reaction fluxes generated by
   the linear version of iMAT were set as the input, and the output was
   set as QDS (days since serum starvation) without normalization to [0,
   1]. The linear version of iMAT was chosen as the input because its
   predictions aligned well with flux sampling (see Fig. [200]S4) while
   using considerably fewer computational resources. An elastic net
   architecture was chosen to strike a balance between underfitting and
   overfitting the reaction coefficients. Underfitting would impair the
   accuracy of the model in the training dataset and overfitting would
   hinder the generalizability of the model in external datasets, both
   outcomes we sought to avoid. The elastic net architecture accomplishes
   this by minimizing model error while also minimizing both the sum of
   the magnitude of coefficients and the sum of the square of coefficients
   (so-called L1 and L2 penalties, also known as lasso and ridge
   penalties). This ensures that only the most salient and generalizable
   features are incorporated into the model.

   The model was implemented in MATLAB with the lasso function and
   accepted two key parameters: lambda, the regularization coefficients,
   and alpha, the ratio between L1 and L2 optimization (or lasso and ridge
   optimization). Both alpha and lambda were optimized to minimize
   three-fold cross-validated mean squared error in predicting QDS. The
   fold change threshold, objective function, and inputs to the linear
   version of iMAT were optimized similarly, as described in the Flux
   prediction section (Table [201]S6). In total, 838 metabolic reactions
   were selected by elastic net regression in this model (i.e. they had
   nonzero coefficient values) out of 13,028 total reactions used in the
   training set.

   To ensure the flux-based model was robust to the data integration
   method, we predicted quiescence depth across the entire range of values
   for fluxes generated via INIT and flux sampling (Fig. [202]S7) as
   described in the Flux prediction section. Here, the model accurately
   predicted a deepening of quiescence for these fluxes despite vastly
   different transcriptomics incorporation methods.

   To build the multi-cell type transcriptomics QDS predictor, we
   aggregated three transcriptomics datasets capturing quiescence
   deepening over time ([203]5, [204]48, [205]67). Each dataset was
   converted to transcripts per million (TPM) and quantile-normalized to
   align the distributions between datasets (Fig. [206]S6). A count filter
   was applied to omit any transcripts with a coverage of <10.
   Proliferating controls were also omitted. Then, principal component
   analysis was performed to identify outlier conditions, and any outliers
   were removed (Fig. [207]S6). Finally, each transcript was z-score
   normalized across conditions. This resulted in 79 total samples across
   the three datasets. Normalized gene expression was set as the input to
   the model and QDS was set as the output.

   For each sample, QDS was defined as the days since quiescence induction
   began. Quiescence depth was normalized to the maximum of each dataset,
   so all values were in the range of 0 to 1 (Fig. [208]S6).

   To construct the machine learning model, an elastic net regression
   model was built and optimized to predict QDS based on transcript
   levels. As before, an elastic net architecture was chosen to strike a
   balance between underfitting and overfitting the model coefficients.
   The model was also implemented in MATLAB with the lasso function and
   accepted two key inputs: lambda, the regularization coefficients, and
   alpha, the ratio between ridge and lasso optimization. Both alpha and
   lambda were optimized to minimize 10-fold cross-validated mean squared
   error in predicting QDS (Table [209]S7). In total, 1,306 genes were
   selected by elastic net regression in this model out of 9,304 genes in
   the training set.

   To evaluate both models internally, we performed 10-fold
   cross-validation in which a random 90% of the data was used to train
   and validate the model, and the remaining 10% of the data was used as a
   test set to evaluate the model's accuracy. The trained and optimized
   model had never seen this last 10%. This process was repeated 10 times
   until all samples had been in the test group.

Enrichment analysis of transcriptomic elastic net model coefficients

   To perform GO enrichment of the multi-cell type elastic net model QDS
   predictor, we used the R package clusterProfiler ([210]68). We searched
   for GO biological processes that were enriched in the positive
   coefficients (up with quiescence deepening) and negative coefficients
   (down with quiescence deepening). For the background gene set, we used
   the list of genes used for training the elastic net model.

Application of models to external datasets

   For external datasets, raw counts were filtered to omit genes with a
   count of <10, and each condition was size-normalized. For single-cell
   RNA-sequencing data, low coverage values were imputed using the
   algorithm Markov affinity-based graph imputation of cells (MAGIC)
   ([211]69).

   If fewer than 40% of model coefficient genes were found in a dataset
   after filtering, the elastic net model was reconstructed using genes
   shared between the training set and the external dataset using the same
   approach described in the Machine learning model creation and
   optimization section. This procedure was performed only for the dataset
   [212]GSE109546.

L1000 perturbation screening

   Level 5 (consensus z-score signature) L1000 gene expression data from
   the LINCS phase II ([213]52) dataset was read into MATLAB using
   CMAP-associated scripts ([214]70). Genes were filtered to include only
   the directly measured landmark gene set. The multi-cell type QDS
   predictor was reconstructed using only the genes common to the training
   set and the landmark gene set with the same approach described in the
   Machine learning model creation and optimization section.

Small molecule screening

   One thousand seven hundred and thirteen small molecules were tested,
   and conditions were controlled to a 24-h
   post-small-molecule-administration timepoint and 10 μM dose. Multiple
   cell lines were used for each small molecule. QDS predictions were made
   with the multi-cell type QDS predictor for each small molecule and
   compared to QDS predictions for a vehicle control with a two-sample t
   test. Small molecules with a significant difference in mean QDS between
   the treatment and control group were highlighted.

Gene knockout screening

   Fifty-three gene knockouts were tested, and all conditions were
   measured 96 h postgene-knockout. Multiple cell lines were used for each
   gene knockout. QDS predictions were made with the multi-cell type QDS
   predictor for each gene and compared to QDS predictions for an empty
   gene vector control with a two-sample t test. Gene knockouts with a
   significant difference in the mean QDS between the treatment and
   control group were highlighted and validated as described in the
   CancerMine validation section.

Quantification and statistical analysis

   Statistical tests were performed in MATLAB (version R2022A and R2022B)
   and R (version 4.3.0). All correlations were computed with the Pearson
   method. A two-tailed t test was used to determine the statistical
   significance of differences between groups based on the independence of
   observations in each sample, the normality of samples, similarity in
   sample variance, and random sampling of data. The sample size was
   determined based on the number of replicates in publicly available
   datasets and listed in figure captions for each comparison (no power
   analyses were performed). A paired t test was used to compare
   quiescence depth scores generated through the fold change approach (as
   these were paired samples). Enrichment significance was calculated
   using the hypergeometric test. P-values were corrected for multiple
   hypotheses through the Benjamini–Hochberg FDR correction. A P-value of
   0.05 was used as a threshold for significance.

Supplementary Material

   pgae013_Supplementary_Data
   [215]Click here for additional data file.^ (9.8MB, zip)

Contributor Information

   Alec Eames, Department of Biomedical Engineering, University of
   Michigan, Ann Arbor, MI 48109, USA.

   Sriram Chandrasekaran, Department of Biomedical Engineering, University
   of Michigan, Ann Arbor, MI 48109, USA; Program in Chemical Biology,
   University of Michigan, Ann Arbor, MI 48109, USA; Rogel Cancer Center,
   University of Michigan Medical School, Ann Arbor, MI 48109, USA; Center
   for Bioinformatics and Computational Medicine, University of Michigan,
   Ann Arbor, MI 48109, USA.

Supplementary Material

   [216]Supplementary material is available at PNAS Nexus online.

Funding

   This work was supported by faculty start-up funds from the University
   of Michigan (UM), Camille and Henry Dreyfus Foundation, and the
   National Institute of General Medical Sciences (NIGMS) R35 GM13779501
   to S.C.

Author Contributions

   A.E. and S.C. conceived the study and wrote the manuscript. A.E.
   performed research with supervision by S.C.

Data Availability

   All datasets used in this study were downloaded from public
   repositories. Transcriptomics data used to generate fluxes and train
   the quiescence depth predictors are available from GEO under accession
   numbers [217]GSE124109 ([218]5), [219]GSE129964 ([220]67), and
   [221]GSE179848 ([222]48). Omics data used to validate the models,
   explore quiescence in other contexts, and perform perturbation
   screening are available under the following GEO accession numbers:
   [223]GSE117444 ([224]45), [225]GSE74620 ([226]46), [227]GSE122927
   ([228]41), [229]GSE112295 ([230]44), [231]GSE109546 ([232]49),
   [233]GSE104651 ([234]51), [235]GSE136714 ([236]71), [237]GSE59114
   ([238]50), and [239]GSE70138 ([240]52). All code used to generate the
   results of this study is available at the following Github repository:
   [241]https://github.com/sriram-lab/quiescence-depth-modulation.

References