Highlights

     * A trans-omic network of insulin action in Drosophila cells was
       constructed
     * Insulin co-regulates various anabolic processes in a time-dependent
       manner
     * The trans-omic network and a CRISPR screen for cell proliferation
       were integrated
     * A Myc-mediated subnetwork promoting anabolic processes is required
       for cell growth
     __________________________________________________________________

   Systems biology; In silico biology; Omics

Introduction

   The insulin signaling pathway regulates growth and metabolism and is
   highly evolutionarily conserved from Drosophila to mammals ([50]Oldham
   and Hafen, 2003; [51]Teleman, 2009). In both Drosophila and mammals,
   mutations in components of the insulin pathway are associated with
   diabetes and growth defects ([52]Baker and Thummel, 2007; [53]Oldham
   and Hafen, 2003; [54]Teleman, 2009). Furthermore, dysregulation of
   insulin signaling causes systemic disorders such as dyslipidemia,
   hypertension, cardiovascular disease, stroke, blindness, kidney
   disease, female infertility, and neurodegeneration ([55]White, 2003).

   To regulate growth, insulin signaling coordinately promotes various
   anabolic metabolic processes such as glycogenesis, lipid synthesis,
   nucleic acid synthesis, and protein synthesis ([56]Saltiel and Kahn,
   2001; [57]Valvezan and Manning, 2019; [58]Zhu and Thompson, 2019).
   Insulin activates signaling molecules including the Akt kinase,
   extracellular-signal-regulated kinase (Erk), and mechanistic target of
   rapamycin (mTOR) by phosphorylation and protein–protein interactions
   (PPIs), and regulates downstream transcriptional and translational
   events required for anabolic metabolism. This growth program requires
   precise co-regulation of the number of metabolic enzymes through
   insulin signaling-dependent transcriptional and translational
   mechanisms.

   There have been many attempts to characterize the insulin-regulated
   network of signaling molecules, transcription factors (TFs), metabolic
   enzymes, other proteins regulating cellular functions such as protein
   synthesis, and metabolites. Various “omic” studies of insulin
   action have been reported focusing on the phosphoproteome ([59]Humphrey
   et al., 2013, [60]2015; [61]Kawata et al., 2018, [62]2019; [63]Krüger
   et al., 2008; [64]Krycer et al., 2017; [65]Matsuzaki et al., 2021;
   [66]Monetti et al., 2011; [67]Ohno et al., 2020; [68]Vinayagam et al.,
   2016; [69]Zhang et al., 2017), PPIs ([70]Friedman et al., 2011;
   [71]Glatter et al., 2011; [72]Vinayagam et al., 2016), the
   transcriptome ([73]Dupont et al., 2001; [74]Hectors et al., 2012;
   [75]Kawata et al., 2018; [76]Kim and Lee, 2014; [77]Matsuzaki et al.,
   2021; [78]Rome et al., 2003; [79]Sano et al., 2016; [80]Versteyhe
   et al., 2013), and the metabolome ([81]Everman et al., 2016; [82]Kawata
   et al., 2018; [83]Krycer et al., 2017; [84]Matsuzaki et al., 2021;
   [85]Noguchi et al., 2013; [86]Ohno et al., 2020; [87]Yugi et al.,
   2014). To provide a more comprehensive view than what can be gained
   from a single type of omic data alone, we have previously proposed
   “trans-omics” as a discipline for constructing molecular interaction
   networks from multi-omic data sets using direct molecular interactions
   rather than indirect statistical relationships ([88]Yugi and Kuroda,
   2017; [89]Yugi et al., 2014, [90]2016). For example, in our previous
   studies, we constructed trans-omic networks responding to insulin
   stimulation in mammalian cells ([91]Kawata et al., 2018; [92]Kokaji
   et al., 2020; [93]Matsuzaki et al., 2021; [94]Ohno et al., 2020;
   [95]Yugi et al., 2014), and responding to glucose administration in the
   liver of healthy and obese mice ([96]Kokaji et al., 2020). Although
   these studies have expanded our view of insulin signaling, the
   functional roles (i.e., effects on phenotypes) of a large part of the
   network components (i.e., molecules and the regulatory relationships
   between them) remain to be explored.

   Insulin signaling is required for cell growth and proliferation in both
   Drosophila and mammals in a context-dependent manner ([97]Straus, 1981;
   [98]Wu et al., 2007). For example, some of the effects of insulin on
   cell proliferation in culture require non-physiological concentrations
   of insulin ([99]Straus, 1981), and hormones and nutrients in the
   culture medium can affect the growth stimulatory effects of insulin
   ([100]Straus, 1981; [101]Wu et al., 2007). In Drosophila cells, insulin
   signaling can promote cell proliferation by inhibiting Foxo, which is a
   TF that causes cell-cycle arrest ([102]Puig et al., 2003). However,
   insulin stimulation significantly promotes cell proliferation only when
   Pvr, a receptor tyrosine kinase regulating cell proliferation, is
   deficient in Drosophila Kc cells ([103]Sopko et al., 2015).
   Furthermore, although insulin promotes cell growth ([104]March and
   Bentley, 2006; [105]Wu et al., 2007), insulin inhibits cell cycle
   progression through G2/M ([106]Wu et al., 2007), indicating that
   insulin regulates cell proliferation in a context-dependent manner.

   Here, we constructed a trans-omic network of insulin action that
   regulates gene expression and metabolism in Drosophila S2R+ cells by
   integrating PPI, phosphoproteomic, transcriptomic, and metabolomic data
   following insulin stimulation. Next, as cell growth is required for
   cell proliferation and insulin can stimulate proliferation in a
   context-dependent manner, we developed a framework for integrating the
   trans-omic network with data from a genome-scale cell proliferation
   screen using CRISPR ([107]Viswanatha et al., 2018). This analysis
   highlights, in particular, the role of the Myc TF in cell growth
   through coordinated activation of genes involved in anabolic processes.

Results

Construction of the trans-omic network regulated by insulin and the
integration with CRISPR knockout screening data

   Using time-series data of PPIs, phosphoproteome, transcriptome, and
   metabolome ([108]Figure 1A), together with various bioinformatic
   resources ([109]Figure 1B), we constructed a regulatory trans-omic
   network for insulin-responsive gene expression and metabolic reactions
   in Drosophila S2R+ cells (“trans-omic network of insulin action”;
   [110]Figure 1C). The trans-omic network is composed of five layers:
   insulin signaling molecules (insulin signal), TFs, insulin-responsive
   genes (IRGs), metabolic reactions, and insulin-responsive metabolites
   (IRMs), with intra/inter-layer regulatory connections between molecules
   in the same and different layers ([111]Figure 1C). We analyzed
   time-series multi-omic data (PPI, phosphoproteomic, transcriptomic, and
   metabolomic) measured in insulin-stimulated Drosophila S2R+ cells
   ([112]Figure 1D). The PPI, phosphoproteomic, and transcriptomic data
   sets were obtained from previous studies [PPIs of 20 insulin signaling
   molecules by affinity purification-mass spectrometry (AP-MS)
   ([113]Vinayagam et al., 2016); phosphoproteome by liquid
   chromatography-mass spectrometry (LC-MS) ([114]Vinayagam et al.,
   2016); and transcriptome ([115]Zirin et al., 2019)], whereas we newly
   generated the metabolome data set by LC-MS following insulin
   stimulation ([116]STAR Methods).
   The metabolome and transcriptome were measured up to 180 min after
   insulin stimulation and the phosphoproteome and the PPIs were measured
   up to 30 min after insulin stimulation ([117]Figure 1D). We defined
   molecules as “insulin-responsive” if they were quantitatively changed
   by insulin stimulation in the phosphoproteome, transcriptome, or
   metabolome data sets. We classified each insulin-responsive molecule
   as either increased or decreased by comparing the absolute values of
   its maximum and minimum log2 fold changes, relative to time 0, across
   the time series ([118]Figure 1E).
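
   As an illustration, the increased/decreased rule can be sketched in
   Python as follows. The function names and the example time courses are
   hypothetical and are not taken from the analysis code of this study; the
   sketch assumes positive abundances with time 0 as the first value.

```python
import math


def log2_fold_changes(series):
    """Return log2 fold changes of each later time point relative to time 0.

    `series` is a list of positive abundances; series[0] is time 0.
    """
    baseline = series[0]
    return [math.log2(value / baseline) for value in series[1:]]


def classify_direction(series):
    """Label a responsive molecule as 'increased' or 'decreased'.

    'increased' when |MAX log2 fold change| > |MIN log2 fold change|,
    'decreased' otherwise (the rule described for Figure 1E).
    """
    changes = log2_fold_changes(series)
    if abs(max(changes)) > abs(min(changes)):
        return "increased"
    return "decreased"


# Hypothetical time courses (arbitrary units), time 0 first.
print(classify_direction([100, 150, 220, 180]))  # increased
print(classify_direction([100, 110, 40, 60]))    # decreased
```

   Note that a tie between the two absolute values falls into the
   “decreased” class under this reading of the rule.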

Figure 1.

   Construction of the trans-omic network regulated by insulin and the
   integration with CRISPR knockout screen data

   (A–C) We used the time-series multi-omic data measured in
   insulin-stimulated Drosophila S2R+ cells (A) and databases and
   software (B) to construct the trans-omic network of insulin action
   through Steps I–VI (C).

   (D) Time-series multi-omic data measured in insulin-stimulated
   Drosophila S2R+ cells. PPI data were measured for 20 insulin signaling
   molecules by affinity purification-mass spectrometry ([121]Table S2)
   ([122]Vinayagam et al., 2016).

   (E) Definition of changes in the insulin-responsive molecules. We
   defined an insulin-responsive molecule as “increased” (red) when the
   absolute value of maximum (MAX) log2 fold change compared to time 0
   (without insulin stimulation) across the time-series is larger than the
   absolute value of minimum (MIN) log2 fold change, and as “decreased”
   (blue) otherwise.

   (F and G) We integrated the trans-omic network of insulin action and
   the CRISPR screen data for cell proliferation in Drosophila S2R+ cells
   ([123]Viswanatha et al., 2018) (F) to identify subnetworks required for
   cell growth (G).

   Construction of the trans-omic network involved five steps
   ([124]Figure 1C). In Step I, we identified IRGs from RNA-seq data sets.
   In Step II, we predicted TFs regulating clusters of IRGs using a TF
   binding motif database (Cis-BP) and motif scanning software (FIMO)
   ([125]Grant et al., 2011; [126]Weirauch et al., 2014). In Step III, we
   predicted upstream
   signaling pathways regulating the TFs using phosphoproteomic data
   ([127]Vinayagam et al., 2016), PPI data of 20 insulin signaling
   molecules ([128]Vinayagam et al., 2016), the MIST PPI database ([129]Hu
   et al., 2018), and the NetPhorest software for kinase prediction
   ([130]Horn et al., 2014; [131]Miller et al., 2008). In Step IV, we
   identified IRMs by metabolomic analysis. Finally, in Step V, we
   connected the IRMs/IRGs and the metabolic reactions using the KEGG
   database of metabolic pathways ([132]Kanehisa et al., 2017) and
   connected the IRMs and metabolic reactions using the BRENDA database
   for allosteric regulations ([133]Chang et al., 2021).

   Next, we integrated the trans-omic network with results from a
   genome-wide pooled CRISPR knockout screen in Drosophila S2R+ cells
   that identified 1,235 genes essential for cell proliferation
   ([134]Figure 1F) ([135]Viswanatha et al., 2018) to identify potential
   subnetworks required for cell growth ([136]Figure 1G).

Step I: identification of IRGs

   A previous RNA-seq study of insulin-stimulated Drosophila S2R+ cells
   identified IRGs that were either increased or decreased following
   insulin stimulation ([137]Table S1) ([138]Zirin et al., 2019). That
   study focused on 163 genes, a subset of the IRGs that overlapped with
   750 genes affecting nucleolar size in a previous RNAi screen
   ([139]Neumüller et al., 2013). To extend that study, we analyzed all
   1,212 IRGs ([140]Figure 2A; [141]Table S1). Based on
   our analysis ([142]Figure 1E), 58.8% of the IRGs (713 genes) are
   increased, whereas 41.2% of the IRGs (499 genes) are decreased
   ([143]Figure 2A).

Figure 2.

   Identification of IRGs and prediction of TFs that regulate the IRGs

   (A) Numbers of increased and decreased IRGs and non-responsive genes.
   We divided the IRGs into increased and decreased IRGs as shown in
   [146]Figure 1E. See also [147]Table S1.

   (B) Heatmap and hierarchical clustering of the Z-score normalized time
   course of gene expression of the IRGs. We performed hierarchical
   clustering using Euclidean distance and Ward’s method. Numbers on the
   tree diagram indicate the cluster identity. The significantly enriched
   KEGG pathways and TF binding motifs are shown on the right. See also
   [148]Table S1 and [149]STAR Methods.

   (C) Distribution of the Spearman correlation coefficients of gene pairs
   among the IRGs in the same KEGG pathway. We tested whether the average
   Spearman correlation coefficients of IRG pairs in the same KEGG pathway
   were significantly higher than randomly sampled IRG pairs. ∗p < 0.05,
   ∗∗p < 0.01, ∗∗∗p < 0.001. See also [150]Table S1 and [151]STAR Methods.

   (D) Time course of the IRGs in significantly co-regulated KEGG
   pathways. Only significantly enriched pathways for any of the IRG
   clusters are shown. Data are shown as the log2 fold change values
   compared to time 0 that are calculated from the mean of biological
   replicates. Different line colors indicate different IRGs. We tested
   whether the average Spearman correlation coefficients of the IRG pairs
   in each of the KEGG pathways were significantly higher than randomly
   sampled IRG pairs. We analyzed KEGG pathways containing 10 or more
   IRGs. See also [152]Figure S1A, [153]Table S1, and [154]STAR Methods.

   Next, we clustered the IRGs by hierarchical clustering with Euclidean
   distance and Ward’s method ([155]Figure 2B; [156]Table S1). We analyzed
   all clusters containing 100 genes or more in hierarchical clustering.
   Clusters 2 and 3 were obtained by dividing the dendrogram at the top
   level ([157]Figure 2B). The majority of the genes in cluster 2 (96%)
   are increased IRGs. By contrast, the majority of the genes in cluster 3
   (94%) are decreased IRGs. We categorized the IRG clusters into
   (i) cluster 2 and its sub-clusters as the increased clusters, and (ii)
   cluster 3 and its sub-clusters as the decreased clusters.
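
   The preprocessing behind such a clustering can be illustrated with a
   short Python sketch: each gene's time course is Z-score normalized and
   genes are compared by Euclidean distance. The function names are
   hypothetical, and the use of the population standard deviation is an
   assumption, since the normalization details are given only in the STAR
   Methods.

```python
import math
from statistics import mean, pstdev


def z_score(profile):
    """Z-score normalize one gene's time course (population SD assumed)."""
    mu, sigma = mean(profile), pstdev(profile)
    if sigma == 0:
        raise ValueError("constant profile cannot be Z-score normalized")
    return [(value - mu) / sigma for value in profile]


def euclidean(a, b):
    """Euclidean distance between two normalized time courses."""
    return math.dist(a, b)


# Hypothetical expression values at four time points.
rising_low = z_score([1, 2, 3, 4])
rising_high = z_score([10, 20, 30, 40])
falling = z_score([4, 3, 2, 1])

# After normalization, genes with the same temporal shape are close
# regardless of expression level, and opposite shapes are far apart.
print(euclidean(rising_low, rising_high) < euclidean(rising_low, falling))
```

   A distance matrix of this kind is what an agglomerative procedure such
   as Ward's method then merges step by step; in practice a library routine
   would normally be used for that step.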

   To identify the functional characteristics of each cluster, we
   performed a KEGG pathway enrichment analysis (q value <0.05, Fisher’s
   exact test; [158]Table S1). Because the clusters generated in this
   study were not mutually exclusive, a KEGG pathway can be enriched in
   overlapping clusters. Therefore, we selected non-overlapping and
   significant clusters for each KEGG pathway based on the significance (q
   value) of Fisher’s exact test and the hierarchical structure of the
   dendrogram ([159]STAR Methods) ([160]Buehler et al., 2004). Increased
   (clusters 7, 9, 15, and 16) and decreased (clusters 3, 8, and 13)
   clusters were enriched for distinct pathways (upper right in
   [161]Figure 2B). Increased clusters were enriched for multiple anabolic
   pathways such as Ribosome biogenesis in eukaryotes (cluster 7), Protein
   processing in endoplasmic reticulum (cluster 9), Spliceosome (cluster
   9), Purine metabolism (cluster 15), Pyrimidine metabolism (cluster 15),
   RNA polymerase (cluster 15), and D-Glutamine and D-glutamate metabolism
   (cluster 16). Consistent with these observations, transcriptomic
   analyses of cells or animals with mutated or overexpressed components
   of the insulin/mTOR signaling pathway (PI3K, Foxo, Myc, S6K, or Tor),
   of starved versus fed conditions, and of rapamycin-treated cells have
   shown that the insulin/mTOR pathway promotes the expression of ribosome
   biogenesis genes in both mammals and Drosophila ([162]Chauvin
   et al., 2014; [163]Guertin et al., 2006; [164]Li et al., 2010;
   [165]Teleman et al., 2008). Decreased clusters were enriched for DNA
   replication (cluster 3), Tryptophan metabolism (cluster 8), and FoxO
   signaling pathway (cluster 13). This result is consistent with a
   previous report that Foxo is an evolutionarily conserved TF that is
   negatively regulated by insulin signaling ([166]Puig et al., 2003).
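
   For readers unfamiliar with this kind of enrichment analysis, the
   one-sided Fisher's exact test for over-representation of a pathway in a
   cluster can be sketched in Python as below. This is a minimal
   illustration, not the code used in this study; the function name and the
   toy numbers are hypothetical, and the choice of background gene set is
   an assumption that would need to follow the STAR Methods. Multiple
   testing correction (q values) is not included.

```python
from math import comb


def fisher_enrichment_p(cluster_in_pathway, cluster_size,
                        pathway_size, background_size):
    """One-sided Fisher's exact test (over-representation).

    Probability of observing at least `cluster_in_pathway` pathway genes
    in a cluster of `cluster_size` genes drawn from `background_size`
    genes, of which `pathway_size` belong to the pathway.
    """
    max_overlap = min(cluster_size, pathway_size)
    total = comb(background_size, cluster_size)
    tail = sum(
        comb(pathway_size, k)
        * comb(background_size - pathway_size, cluster_size - k)
        for k in range(cluster_in_pathway, max_overlap + 1)
    )
    return tail / total


# Toy example: 10 background genes, 5 in the pathway, and a cluster of
# 5 genes that contains all 5 pathway genes.
print(fisher_enrichment_p(5, 5, 5, 10))
```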

   Interestingly, increased and decreased clusters were enriched for
   distinct, non-overlapping pathways, suggesting that IRGs belonging to the same
   pathway show a similar time course and are co-regulated by insulin
   signaling. To quantitatively evaluate whether IRGs in the same KEGG
   pathway are co-regulated in a time-dependent manner, we examined
   whether average Spearman correlation coefficients between the IRGs in
   the same KEGG pathway are higher than those of randomly sampled pairs
   from all the IRGs ([167]Table S1; [168]STAR Methods). Strikingly, the
   average Spearman correlation coefficients of the IRG pairs in the same
   KEGG pathways were significantly higher than randomly sampled pairs
   from the IRGs ([169]Figure 2C). We also examined the distributions of
   Spearman correlation coefficients between IRGs within individual KEGG
   pathways ([170]Figures 2D and [171]S1A; [172]Table S1). For most of the
   anabolic pathways enriched for increased clusters such as Purine
   metabolism, Pyrimidine metabolism, Protein processing in the
   endoplasmic reticulum, Spliceosome, Ribosome biogenesis in eukaryotes,
   and RNA polymerase, the average Spearman correlations of the IRG pairs
   within the pathways were significantly higher than those of randomly
   sampled IRG pairs (Bonferroni-adjusted p value <0.05). Although we
   identified co-regulated pathways following insulin stimulation using
   the correlation analysis of the IRG pairs within the same KEGG pathway,
   it is possible that we underestimated the average Spearman correlation
   coefficients of some pathways that contain multiple functional modules
   (e.g. modules related to synthesis and degradation of a product within
   a metabolic pathway). Thus, we analyzed the Spearman correlation
   coefficients of IRG pairs within KEGG modules. Interestingly, both the
   average and median Spearman correlation coefficients of the IRG pairs
   within some KEGG modules were significantly higher than those within
   the KEGG pathways ([173]Figure S1B; [174]Table S1), suggesting the
   existence of co-regulated modules showing distinct time-series
   expression patterns from other modules within the same KEGG pathway.
   For instance, whereas the average Spearman correlation within the
   Purine and Pyrimidine metabolism pathways was relatively low (0.13 and
   0.17, respectively; [175]Figure 2D), we observed high Spearman
   correlation coefficients of IRG pairs within the Inosine monophosphate
   biosynthesis module in Purine metabolism and the Uridine monophosphate
   biosynthesis module in Pyrimidine metabolism, which are related to de
   novo purine synthesis and de novo pyrimidine synthesis, respectively
   ([176]Figures S1C and S1D; [177]Table S1). These results suggest that
   IRGs belonging to the same pathway are temporally co-regulated by
   insulin. Altogether, insulin signaling coordinately increased IRGs
   involved in anabolic pathways and coordinately decreased IRGs involved
   in DNA replication in a time-dependent manner.
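
   The co-regulation test described above can be illustrated with the
   following Python sketch, which compares the mean pairwise Spearman
   correlation within a gene set against randomly sampled gene sets of the
   same size. The sampling scheme, the number of samples, and all names are
   assumptions made for illustration; the procedure actually used is
   described in the STAR Methods. Profiles are assumed to be non-constant.

```python
import random
from itertools import combinations


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def mean_pairwise_spearman(profiles):
    """Average Spearman correlation over all pairs of time courses."""
    pairs = list(combinations(profiles, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)


def empirical_p(pathway_genes, all_profiles, n_samples=1000, seed=0):
    """Fraction of random gene sets at least as correlated as the pathway."""
    rng = random.Random(seed)
    observed = mean_pairwise_spearman([all_profiles[g] for g in pathway_genes])
    names = sorted(all_profiles)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(names, len(pathway_genes))
        if mean_pairwise_spearman([all_profiles[g] for g in sample]) >= observed:
            hits += 1
    return (hits + 1) / (n_samples + 1)
```

   Here `all_profiles` is a hypothetical dictionary mapping gene names to
   their time courses, and `pathway_genes` lists the genes of one pathway.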

Step II: prediction of TFs that regulate IRGs

   Next, we predicted TFs regulating the IRG clusters using a motif
   enrichment analysis ([178]Figure 2B; [179]Table S1; [180]STAR Methods).
   If a TF binding motif was enriched in the regions from −1,000 bp
   to +100 bp of the transcription start site of the genes in a cluster,
   we predicted the regulatory relationships between the TF and the genes
   in that cluster. We selected non-overlapping and significant clusters
   for each TF binding motif based on the significance (q value) of
   Fisher’s exact test and the hierarchical structure of the dendrogram
   ([181]STAR Methods) ([182]Buehler et al., 2004).
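
   As a small illustration of the promoter definition used here, the region
   from −1,000 bp to +100 bp of a transcription start site can be computed
   in a strand-aware way as sketched below. The function name and the
   1-based inclusive coordinate convention are assumptions for illustration
   only.

```python
def promoter_window(tss, strand, upstream=1000, downstream=100):
    """Return (start, end) genomic coordinates of the promoter window.

    Coordinates are 1-based and inclusive; `strand` is '+' or '-'.
    On the minus strand, "upstream" lies at higher genomic coordinates.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clip at the beginning of the chromosome.
    return max(1, start), end


print(promoter_window(5000, "+"))  # (4000, 5100)
print(promoter_window(5000, "-"))  # (4900, 6000)
```

   Sequences extracted from such windows would then be passed to the motif
   scanner.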

   We predicted TFs for the IRG clusters (lower right in [183]Figure 2B).
   Interestingly, Myc was identified as a putative regulator of increased
   cluster 23 that is a sub-cluster of cluster 7 where ribosome biogenesis
   genes are enriched. This result is consistent with previous reports
   that Myc is regulated by insulin signaling in flies and ribosome
   biogenesis is one of the main cellular functions of Myc target genes
   ([184]Grewal et al., 2005; [185]Li et al., 2010; [186]Orian et al.,
   2003; [187]Teleman et al., 2008). Among the TFs predicted for any of
   the IRG clusters, gce, Myc, and tai were themselves IRGs. To
   validate whether the predicted regulatory relationships are detected in
   ChIP-seq data, we compared our findings with the TF target gene
   regulatory relationships available in the ChIP-Atlas database
   ([188]Table S1) ([189]Oki et al., 2018). Among the predicted TFs,
   ChIP-Atlas contains ChIP-seq data for Myc and Max in Drosophila cell
   lines. The regulatory relationships registered in ChIP-Atlas
   for both factors significantly overlapped with our predictions
   (Bonferroni-adjusted p value < 0.05, Fisher’s exact test;
   [190]Figure S2). We also confirmed that KEGG pathways and TF motifs
   enriched in any of the analyzed clusters remained similar regardless of
   a subtle change of the threshold of cluster size around 100 genes
   ([191]Figure S3). These results indicate that the TF prediction in our
   analysis is valid and robust.

Step III: prediction of upstream signaling pathways regulating the predicted
TFs

   We connected the TFs to insulin signaling using phosphoproteomic and
   PPI data ([192]Vinayagam et al., 2016), NetPhorest software ([193]Horn
   et al., 2014; [194]Miller et al., 2008), and MIST PPIs ([195]Hu et al.,
   2018) ([196]Figure 3). We used phosphoproteomic data to infer possible
   kinases responsible for phosphorylated proteins with NetPhorest and
   Fisher’s exact test. We used PPIs, detected by AP-MS measurement, to
   predict PPIs of TFs and insulin signaling molecules. Finally, we
   integrated the predicted kinase-substrate interactions, PPIs, and MIST
   PPIs, and constructed a merged network that allowed us to extract paths
   from the insulin receptor to the TFs.

Figure 3.

   Prediction of upstream signaling pathways regulating the predicted TFs

   (A) Number of phosphorylated peptides (left) and phosphorylated
   proteins (right) measured in the phosphoproteome obtained from
   [199]Vinayagam et al. (2016). We divided the insulin-responsive
   phosphopeptides (IRpPs) into increased (red) and decreased (blue)
   IRpPs as shown in [200]Figure 1E. See also [201]Table S2.

   (B) Enrichment analysis of possible substrates of each kinase for the
   increased and decreased IRpPs (q value <0.05, Fisher’s exact test). See
   also [202]Table S2 and [203]STAR Methods.

   (C) Number of PPIs measured at three different time points after
   insulin stimulation (0, 10, and 30 min after insulin stimulation)
   ([204]Vinayagam et al., 2016). AP-MS data were measured for the 20
   canonical insulin signaling proteins ([205]Table S2). Note that the
   number of interactions, not the number of preys, is shown. See
   also [206]Table S2.

   (D) Method used for predicting upstream signaling pathways regulating
   the predicted TFs. See also [207]STAR Methods.

   (E) Upstream signaling pathways regulating the predicted TFs. We
   predicted the signaling pathways from the insulin receptor to each of
   the TFs following the procedure shown in [208]Figure 3D and obtained
   the insulin signal layer by merging the pathways. The border colors of
   the nodes in the layer correspond to the colors shown in
   [209]Figure 3D. The pie charts on the nodes indicate the downstream TFs
   of the nodes. Data sources are shown on the right side of the nodes
   contained in the PPI and phosphorylation network, color-coded for
   increased or decreased IRpPs ([210]Figure 3A). The edges derived
   from the PPIs of [211]Vinayagam et al. (2016) are colored by the
   detected time point(s) ([212]Figure 3C), and the others are shown in
   grey. The network diagram was created using Cytoscape ([213]Shannon
   et al., 2003). See also [214]Table S2.

   (F) KEGG signaling pathway enrichment analysis of the upstream
   signaling pathways of each predicted TF. See also [215]Table S2 and
   [216]STAR Methods.

   We predicted the kinases responsible for the phosphorylated peptides.
   Specifically, we identified 268 phosphopeptides (193 proteins), whose
   phosphorylation levels were increased or decreased following insulin
   stimulation among the 3,037 phosphorylated peptides (1,225 proteins)
   measured in the phosphoproteomic data ([217]Figure 3A; [218]Table S2)
   ([219]Vinayagam et al., 2016). Following our previous study, we
   defined phosphopeptides that showed an absolute log2 fold change
   larger than 0.5 at any time point compared to time 0 as the
   insulin-responsive phosphopeptides (IRpPs). We divided the IRpPs
   selected by this criterion into increased and decreased IRpPs using the
   method shown in [220]Figure 1E, of which 177 are increased IRpPs (135
   proteins), whereas 91 are decreased IRpPs (66 proteins). Eight proteins
   contained both increased and decreased IRpPs. Based on the amino acid
   sequences of the phosphopeptides measured in the phosphoproteomic data
   set, we predicted possible kinases that regulate the phosphopeptides
   using NetPhorest, software that predicts kinase classifiers for
   input phosphopeptide sequences ([221]Table S2; [222]STAR Methods)
   ([223]Horn et al., 2014; [224]Miller et al., 2008). Using these
   predicted kinase–substrate relationships (KSRs), we tested whether the
   predicted substrates of each kinase were enriched in the increased and
   decreased IRpPs to identify significantly activated or inhibited
   kinases, respectively (q value <0.05, Fisher’s exact test). In sum, 16
   and 8 kinases were enriched in the increased and decreased IRpPs,
   respectively, including S6k, which is among the baits of the AP-MS
   measurement, and CamKII, which is a prey identified by the AP-MS
   measurement ([225]Figure 3B). If the predicted target phosphopeptides
   of a kinase were enriched in the increased or decreased IRpPs, then we
   selected the KSRs between the kinase and its predicted substrates
   contained in the increased or decreased IRpPs, respectively, and used
   them for the following analysis ([226]Table S2; [227]STAR Methods).
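
   The relationship between the peptide-level and protein-level counts
   reported above (135 proteins with increased IRpPs, 66 with decreased
   IRpPs, and 8 with both, giving 135 + 66 − 8 = 193 proteins) can be
   illustrated with the following Python sketch. The function name and the
   toy peptide list are hypothetical.

```python
from collections import defaultdict


def summarize_by_protein(peptides):
    """Count proteins carrying increased and/or decreased phosphopeptides.

    `peptides` is an iterable of (protein_id, direction) pairs, where
    direction is 'increased' or 'decreased'.
    """
    directions = defaultdict(set)
    for protein, direction in peptides:
        directions[protein].add(direction)
    increased = {p for p, d in directions.items() if "increased" in d}
    decreased = {p for p, d in directions.items() if "decreased" in d}
    return {
        "increased": len(increased),
        "decreased": len(decreased),
        "both": len(increased & decreased),
        "total": len(increased | decreased),
    }


# Toy data: protein P1 carries both an increased and a decreased peptide.
toy_peptides = [
    ("P1", "increased"), ("P1", "decreased"),
    ("P2", "increased"), ("P2", "increased"),
    ("P3", "decreased"),
]
summary = summarize_by_protein(toy_peptides)
print(summary)
# The total equals increased + decreased - both.
print(summary["total"] == summary["increased"] + summary["decreased"] - summary["both"])
```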

   Next, to infer PPIs of TFs and insulin signaling molecules, we used
   the PPIs measured in insulin-stimulated Drosophila S2R+ cells in our
   previous study ([228]Figure 3C) ([229]Vinayagam et al., 2016). PPIs
   were measured by AP-MS for the 20 insulin signaling proteins at
   baseline (without insulin treatment) and after insulin stimulation for
   10 or 30 min. The PPIs consist of 555 proteins and 1,807 interactions.
   We classified the PPIs into seven types based on the detected time
   point(s) according to our previous study ([230]Figure 3C)
   ([231]Vinayagam et al., 2016).
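
   One way to read the seven types is as the non-empty combinations of the
   three measured time points (2^3 − 1 = 7). The sketch below enumerates
   these combinations and assigns an interaction to one of them; this
   interpretation and the function names are assumptions made for
   illustration.

```python
from itertools import combinations

TIME_POINTS = (0, 10, 30)  # minutes after insulin stimulation


def detection_patterns(time_points=TIME_POINTS):
    """All non-empty subsets of time points at which a PPI can be detected."""
    patterns = []
    for size in range(1, len(time_points) + 1):
        patterns.extend(combinations(time_points, size))
    return patterns


def classify_ppi(detected_at):
    """Map the time points where a PPI was detected to its pattern."""
    pattern = tuple(t for t in TIME_POINTS if t in detected_at)
    if not pattern:
        raise ValueError("interaction was not detected at any time point")
    return pattern


print(len(detection_patterns()))  # 7
print(classify_ppi({30}))         # (30,)
```

   Under this reading, an interaction detected only at 30 min, for example,
   corresponds to the pattern `(30,)`.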

   To construct the merged network, we integrated the IRpPs
   ([232]Figure 3A), the predicted regulatory relationships from the
   kinases to the IRpPs ([233]Figure 3B), PPIs ([234]Figure 3C), and the
   MIST PPI network into a new network of insulin signaling from receptor
   to downstream TFs ([235]Figure 3D). We first merged all the data from
   these data sources (“Input data” in [236]Figure 3D) and extracted the
   subnetworks containing the signaling molecules extracted from KEGG, the
   kinases predicted by NetPhorest, or the predicted TFs to capture the
   comprehensive regulatory relationships from the insulin receptor to the
   TFs through the phosphorylation and the PPIs (“Step III-I” in
   [237]Figure 3D).

   We extracted paths from the insulin receptor to the TFs from the merged
   network (“Step III-II” and “Step III-III” in [238]Figure 3D;
   [239]Table S2; [240]STAR Methods). We considered the extracted paths
   from the insulin receptor to a TF as the upstream signaling pathways of
   the TF because the intracellular insulin signaling originates from the
   activation of the insulin receptor and is transmitted through its
   downstream signaling cascade to the TF. We denoted nodes and edges in
   the union of the upstream signaling pathways for the TFs as the insulin
   signal. Among the 14 predicted TFs, 10 TFs were connected to the
   upstream signaling pathways (“TFs” in [241]Figure 3E). The resulting
   insulin signal layer consists of 101 nodes and 357 edges
   ([242]Table S2). In the insulin signal layer, we can distinguish the
   common upstream signaling molecules for all 10 TFs and the specific
   upstream signaling molecules for a subset of the TFs by counting the
   number of downstream TFs for each of the nodes in the layer. In sum, 8
   molecules (6.9%) were found in the upstream paths of all 10 TFs,
   whereas 65 molecules (64.4%) were upstream of two or more TFs, and 36
   molecules (35.6%) were upstream of only one TF. The molecules that are
   commonly found in the upstream paths of all 10 TFs include InR, chico
   (Drosophila ortholog of insulin receptor substrates), Pi3K21B, and
   Pi3K92E. Importantly, our predictions captured the regulatory
   relationships that have been reported in previous studies. In
   particular, in Drosophila cells, Myc is stabilized by insulin/mTOR
   signaling followed by increased phosphorylation of Sgg/Gsk3-beta
   ([243]Parisi et al., 2011). Consistent with this study, Sgg is present
   in the pathway upstream of Myc ([244]Figures 3E, [245]S4A, and S4B). We
   further analyzed which upstream signaling molecules directly regulate
   the TFs, as our method involves direct and indirect associations
   (interactions) between upstream signaling molecules and the TFs
   ([246]Figure S4A). We found that RpS6, Msn, Act57B, CaMKII, CkIIalpha,
   Pont, and Vha26 directly interact with Myc only at 30 min following
   insulin stimulation ([247]Figures S4A and S4B), suggesting that Myc is
   regulated by insulin signaling through the changes in the PPIs with
   these proteins. To identify the pathways enriched in the upstream
   signaling pathways of each TF, we performed pathway enrichment analysis
   using KEGG signaling pathway annotation (Fisher’s exact test, q value
   <0.05; [248]Table S2; [249]STAR Methods). Interestingly, the Hippo
   pathway is enriched in the upstream signaling pathway of Mnt
   ([250]Figure 3F), suggesting that insulin possibly regulates TF
   activities through non-canonical insulin signaling pathways.
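
   The idea of extracting upstream signaling pathways and counting the
   downstream TFs of each node can be sketched in Python as a simple
   reachability analysis on a directed graph. This is a simplified
   illustration, not the procedure of Steps III-I to III-III, which is
   described in the STAR Methods; the function names and the toy edges
   (including the placeholder nodes `TF_B` and `X`) are hypothetical.

```python
from collections import defaultdict, deque


def reachable(adjacency, start):
    """Breadth-first search: all nodes reachable from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def upstream_nodes(edges, source, tf):
    """Nodes lying on at least one directed route from `source` to `tf`."""
    forward, backward = defaultdict(set), defaultdict(set)
    for a, b in edges:
        forward[a].add(b)
        backward[b].add(a)
    # Reachable from the source AND able to reach the TF.
    return reachable(forward, source) & reachable(backward, tf)


def downstream_tf_counts(edges, source, tfs):
    """For each upstream node, count how many TFs it lies upstream of."""
    counts = defaultdict(int)
    for tf in tfs:
        for node in upstream_nodes(edges, source, tf) - {tf}:
            counts[node] += 1
    return dict(counts)


# Toy network (hypothetical edges, for illustration only).
toy_edges = [
    ("InR", "chico"), ("chico", "Pi3K92E"), ("Pi3K92E", "Sgg"),
    ("Sgg", "Myc"), ("Pi3K92E", "TF_B"), ("X", "Myc"),
]
print(downstream_tf_counts(toy_edges, "InR", ["Myc", "TF_B"]))
```

   In this toy network, nodes shared by both routes receive a count of 2,
   whereas nodes specific to one TF receive a count of 1, and the node `X`,
   which is not reachable from the receptor, is excluded.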

Step IV: identification of IRMs

   We measured the metabolome in Drosophila S2R+ cells following insulin
   stimulation using LC-MS/MS ([251]Table S3; [252]STAR Methods). In sum,
   236 metabolites were measured in three replicates or more at all time
   points (0, 60, 120, and 180 min). We confirmed the high reproducibility
   of our metabolomic measurements (>0.98 Pearson correlation coefficient
   between any two biological replicates; [253]Figure S5), and identified
   53 IRMs, of which 14 (26.4%) increased and 39 (73.6%) decreased
   ([254]Figure 4A; [255]Table S4). Many IRMs
   showed a monotonic increase or decrease ([256]Figure 4B).

Figure 4.

   Identification of IRMs and connection of the IRMs/IRGs and metabolic
   reactions

   (A) Number of metabolites measured in the metabolomic data. Metabolites
   that showed an absolute log2 fold change larger than 0.5 and an
   FDR-adjusted p value (q value) of less than 0.1 at any time point were
   defined as IRMs. The q values were calculated by Storey’s procedure
   ([259]Storey and Tibshirani, 2003). We divided the IRMs into increased
   and decreased IRMs as shown in [260]Figure 1E. See also [261]Table S4.

   (B) Heatmap of the time courses of the IRMs, shown as log2 fold
   changes relative to time 0.

   (C) Increasing, decreasing, and non-responsive metabolites projected
   onto the KEGG metabolic pathways. Substrate-product relationships
   between the mapped metabolites (increased, decreased, and not
   responsive metabolites) were extracted from a KEGG Markup Language file
   (dme01100) downloaded from KEGG.
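   The IRM definition given in the legend of Figure 4A (absolute log2 fold
   change larger than 0.5 and q value below 0.1 at any time point) can be
   sketched as follows. This is a minimal illustration, not the authors'
   code: the function name is hypothetical, the q values are assumed to
   have been computed beforehand, and the rule used to assign a direction
   (the sign of the largest significant change) is an assumption, because
   the actual rule is the one referred to in Figure 1E.

```python
from typing import Sequence

LOG2FC_CUTOFF = 0.5  # |log2 fold change| threshold from the Figure 4A legend
Q_CUTOFF = 0.1       # q-value threshold from the Figure 4A legend


def classify_metabolite(log2fc: Sequence[float], q: Sequence[float]) -> str:
    """Label one metabolite from per-time-point log2 fold changes and q values.

    Both sequences cover the post-stimulation time points (e.g. 60, 120, 180 min).
    """
    # Keep only the time points that pass both thresholds.
    hits = [fc for fc, qv in zip(log2fc, q)
            if abs(fc) > LOG2FC_CUTOFF and qv < Q_CUTOFF]
    if not hits:
        return "non-responsive"
    # Assumption: the direction is set by the significant change of largest magnitude.
    strongest = max(hits, key=abs)
    return "increased" if strongest > 0 else "decreased"
```

   A metabolite that passes the fold-change threshold at a time point where
   its q value is 0.1 or higher is still labeled non-responsive, because
   both conditions must hold at the same time point.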

   Next, we mapped the metabolites onto the KEGG metabolic pathways
   ([262]Figure 4C). Of the 236 measured metabolites, 200 (including 45 of
   the 53 IRMs) were mapped to KEGG, and we used pathway enrichment
   analysis to test for metabolic pathways containing a significantly
   large number of IRMs ([263]Table S4). Although no pathway was
   significantly enriched in the increased or decreased IRMs (Fisher’s
   exact test, q value <0.05), metabolites in purine metabolism, a
   sub-pathway of nucleotide metabolism, showed moderate enrichment in
   the decreased IRMs (Fisher’s exact test, uncorrected p value <0.05),
   and this pathway contained 12 IRMs, the largest number among all KEGG
   pathways ([264]Figure S6). In addition, eight IRMs mapped to pyrimidine
   metabolism, another sub-pathway of nucleotide metabolism. In
   addition, IRMs contained TCA cycle intermediates (succinyl-CoA and
   2-oxoglutarate decreased, and oxaloacetate increased), urea cycle
   intermediates (L-arginine, L-argininosuccinate, and L-ornithine
   decreased), proteinogenic amino acids (L-lysine, L-phenylalanine, and
   L-arginine decreased), and lactate (increased) ([265]Figure 4C).
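   The over-representation test used for pathway enrichment can be written
   as the upper tail of a hypergeometric distribution, which is equivalent
   to a one-sided Fisher’s exact test. The sketch below uses only the
   Python standard library. The function name and argument names are
   hypothetical, and the choice of background set is an assumption; the
   study tested increased and decreased IRMs separately and corrected for
   multiple testing, so this example is not expected to reproduce the
   reported values.

```python
from math import comb


def enrichment_p(n_total: int, n_irm: int, n_pathway: int, n_overlap: int) -> float:
    """One-sided Fisher's exact test p value for over-representation.

    n_total   : metabolites in the background set
    n_irm     : IRMs in the background set
    n_pathway : background metabolites annotated to the pathway
    n_overlap : IRMs annotated to the pathway
    """
    denom = comb(n_total, n_pathway)
    upper = min(n_irm, n_pathway)
    # Sum the probabilities of observing n_overlap or more IRMs in the pathway.
    tail = sum(comb(n_irm, k) * comb(n_total - n_irm, n_pathway - k)
               for k in range(n_overlap, upper + 1))
    return tail / denom


# Illustrative call with counts quoted in the text (pooled IRMs, assumed background):
# 200 mapped metabolites, 45 mapped IRMs, 32 measured purine metabolites, 12 purine IRMs.
p_purine = enrichment_p(200, 45, 32, 12)
```

   In a full analysis, the resulting p values across pathways would then
   be adjusted for multiple testing to obtain q values.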

   IRMs in purine and pyrimidine metabolism (nucleotide metabolism)
   pathways accounted for 12/32 (the IRMs in the pathway/the measured
   metabolites in the pathway) and 8/28 of the metabolites detected in
   each pathway ([266]Figure S6). In addition, enzymes of purine and
   pyrimidine metabolism were enriched in the increased IRG clusters
   ([267]Figure 2B). The majority of the IRMs in the nucleotide metabolism
   pathway decreased (15/20 metabolites); these included nucleobases
   (guanine and cytosine), an NDP (GDP), NTPs (ATP, CTP, GTP, TTP, and
   UTP), and dNTPs (dATP, dCTP, dGTP, and dTTP). Increased IRMs in the
   nucleotide metabolism pathway included a dNMP (dTMP), an NDP (CDP), and
   N-carbamoyl-L-aspartate, an intermediate of de novo pyrimidine
   synthesis produced by carbamoyl-phosphate synthetase 2, aspartate
   transcarbamylase, and dihydroorotase (CAD). In mammalian cells,
   it has been reported that insulin increases the metabolic flux of the
   reaction of CAD and the abundance of N-carbamoyl-L-aspartate
   ([268]Ben-Sahra et al., 2013; [269]Robitaille et al., 2013). These
   results indicate that NTPs and the proteinogenic amino acids, required
   for transcription and protein synthesis, decreased, which is consistent
   with the increase in gene expression of the enzymes involved in these
   processes.

Step V: connection of the IRMs/IRGs and metabolic reactions

   We constructed the trans-omic network by connecting IRGs, metabolic
   reactions, and IRMs based on the annotation of enzymatic reactions in
   the KEGG database, and allosteric regulations from the BRENDA database
   ([270]STAR Methods) ([271]Chang et al., 2021).

   Using enzymatic reactions in the KEGG database, we connected IRGs,
   metabolic reactions, and IRMs to examine the effect of transcriptional
   regulation on metabolism ([272]Table S4). We identified 120 IRGs
   encoding metabolic enzymes. These include the increased IRG rudimentary
   (r, the Drosophila ortholog of CAD), which encodes a rate-limiting
   enzyme of de novo pyrimidine synthesis; this is consistent with the
   increase in N-carbamoyl-L-aspartate, the product of the reaction
   catalyzed by rudimentary. The phosphorylation level of the rudimentary
   enzyme was also increased by insulin stimulation ([273]Tables S2 and
   [274]S4), consistent with previous studies in mammals ([275]Ben-Sahra
   et al., 2013; [276]Robitaille et al., 2013).

   Using allosteric regulations extracted from the BRENDA database, we
   connected IRMs to metabolic reactions ([277]Table S4). In total, we
   identified 1,543 allosteric regulations from IRMs to metabolic
   reactions. Altogether, our analysis identified inter-omic
   transcriptional and allosteric regulation for metabolic reactions that
   can affect the responses of the IRMs.

Step VI: construction of the trans-omic network of insulin action

   We integrated the networks of Steps I-V and constructed a regulatory
   trans-omic network for insulin-responsive gene expression and metabolic
   reactions consisting of five layers, which are the insulin signal, TFs,
   IRGs, metabolic reactions, and IRMs ([278]Figure 5; [279]Table S5). The
   connections between the layers represent regulatory events. The
   insulin signal layer contains the signaling molecules on the predicted
   signaling pathways from the insulin receptor to the TFs
   ([280]Figure 3E). The TF layer contains the 14 predicted TFs regulating
   the IRGs ([281]Figure 2B). The IRG layer contains the 1,212 IRGs
   ([282]Figure 2A). The metabolic reaction layer contains 1,376 metabolic
   reactions regulated by the IRMs and/or the IRGs. The IRM layer contains
   53 IRMs ([283]Figures 4A and 4B). We then determined intra- and
   inter-layer regulatory connections between insulin-responsive
   molecules. The intra-layer regulatory connections of the insulin signal
   layer and the inter-layer regulatory connections from the insulin
   signal layer to the TF layer were the predicted signaling pathways from
   the insulin receptor to the candidate TFs ([284]Figure 3E). The
   inter-layer regulatory connections from the TF layer to the IRG layer
   were determined by the predicted regulatory connections between TFs and
   IRGs ([285]Figure 2B). The inter-layer regulatory connections from the
   IRG to the metabolic reaction layer were determined by matching
   metabolic reactions to the corresponding metabolic enzymes encoded
   by IRGs according to KEGG annotation. The inter-layer regulatory
   connections between the metabolic reaction layer and the IRM layer
   consisted of two types: (1) regulatory connections mediated by
   allosteric regulators assigned according to BRENDA and (2) regulatory
   connections mediated by the substrate or product of the reaction
   according to KEGG.
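   The five-layer structure described above can be represented as a
   directed graph in which every node carries a layer label. The sketch
   below is one possible data structure, not the authors' implementation.
   The class, method, and edge-type names are hypothetical, and the toy
   edges are illustrative only: in particular, the Akt-to-Myc and Myc-to-r
   edges are placeholders and are not taken from the study's tables.

```python
from collections import defaultdict

LAYERS = ("insulin signal", "TF", "IRG", "metabolic reaction", "IRM")


class TransOmicNetwork:
    """Directed multi-layer graph; each node belongs to exactly one layer."""

    def __init__(self):
        self.layer_of = {}             # node -> layer name
        self.edges = defaultdict(set)  # source -> {(target, edge_type)}

    def add_node(self, node, layer):
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self.layer_of[node] = layer

    def add_edge(self, source, target, edge_type):
        if source not in self.layer_of or target not in self.layer_of:
            raise KeyError("add both nodes before connecting them")
        self.edges[source].add((target, edge_type))

    def count_edges(self, from_layer, to_layer):
        """Number of edges running from one layer to another."""
        return sum(1 for s, targets in self.edges.items() for t, _ in targets
                   if self.layer_of[s] == from_layer and self.layer_of[t] == to_layer)


# Toy network (placeholder edges, for illustration only).
net = TransOmicNetwork()
net.add_node("Akt", "insulin signal")
net.add_node("Myc", "TF")
net.add_node("r", "IRG")
net.add_node("CAD reaction", "metabolic reaction")
net.add_node("N-carbamoyl-L-aspartate", "IRM")
net.add_edge("Akt", "Myc", "signaling")             # hypothetical edge
net.add_edge("Myc", "r", "transcription")           # hypothetical edge
net.add_edge("r", "CAD reaction", "enzyme")         # enzyme encoded by an IRG
net.add_edge("CAD reaction", "N-carbamoyl-L-aspartate", "product")
```

   Storing the layer on the node rather than on the edge keeps intra-layer
   and inter-layer connections in a single structure, so counts such as
   `net.count_edges("TF", "IRG")` can be derived directly.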

Figure 5.

   Construction of the trans-omic network of insulin action

   The trans-omic network contains five layers and the regulatory
   relationships among them. The representative insulin-responsive
   molecules or significantly co-regulated pathways enriched in the IRG
   clusters in the trans-omic network are shown. The numbers of each type
   of insulin-responsive node and edge are shown on the right of the
   network diagram. The network diagram was created using Cytoscape
   ([288]Shannon et al., 2003). See also [289]Table S5.

   In our network, insulin activates the insulin receptor in the insulin
   signal layer and regulates 10 TFs in the TF layer; among them, Myc is
   connected through signaling pathway components that include InR,
   chico, Pi3K, S6K, Akt, and Erk. In the IRG layer, the increased and
   decreased clusters of IRGs are enriched for distinct KEGG pathways, and
   functionally similar IRGs are significantly co-regulated
   ([290]Figures 2B–2D). Multiple anabolic pathways related to nucleotide
   synthesis, transcription, and translation are enriched in the
   increased clusters of IRGs, and the genes in each of those pathways
   are co-regulated. On the other hand, genes related to DNA replication
   are enriched in the decreased clusters and are co-regulated in a
   time-dependent manner. Insulin regulates
   the metabolic reactions in the metabolic reaction layer through the
   regulation of the expression levels of the metabolic enzyme genes in
   the IRG layer and changes the concentration of the metabolites in the
   IRM layer. For example, insulin increases the expression level of
   rudimentary, which encodes a rate-limiting enzyme of de novo
   pyrimidine synthesis, and this increase is consistent with the
   increase in the concentration of N-carbamoyl-L-aspartate, the product
   of the reaction catalyzed by the rudimentary enzyme
   ([291]Figure 4C). In the IRM layer, the five NTPs and the three
   proteinogenic amino acids (L-Arginine, L-Lysine, and L-Phenylalanine)
   are decreased, which is consistent with increases in the expression of
   genes related to transcription (e.g., RNA polymerase and spliceosome)
   and translation (e.g., ribosome biogenesis). It is also possible that
   the IRMs regulate various metabolic processes through allosteric
   regulations. Altogether, the construction of the trans-omic network of
   insulin action revealed the functional and temporal characteristics of
   insulin-responsive molecules and the regulatory relationships among
   them.

Step VII: integration of the trans-omic network of insulin action with a
CRISPR screen for cell proliferation

   We previously performed a genome-wide pooled CRISPR knockout screen
   for cell proliferation in Drosophila S2R+ cells and identified 1,235
   genes essential for cell proliferation (hereafter denoted as screen
   hits) ([292]Table S6) ([293]Viswanatha et al., 2018). Because cell
   growth is required for cell proliferation and insulin can stimulate
   proliferation in a context-dependent manner, we hypothesized that part
   of the trans-omic network of insulin action includes the essential
   genes identified in the CRISPR screen for cell proliferation. We
   therefore compared the screen hits with the molecules in the trans-omic
   network, specifically those in the insulin signal, TF, and IRG layers.
   As expected, the screen hits and the molecules in the trans-omic
   network overlapped significantly (249 genes; p value = 2.90e-33,
   Fisher’s exact test; [294]Figure 6A).

Figure 6.

   Integration of the trans-omic network of insulin action with a CRISPR
   screen for cell proliferation

   (A) Numbers of molecules in the trans-omic network and screen hits. The
   p value of Fisher’s exact test is shown above the Venn diagram. See
   also [297]Table S6.

   (B) KEGG pathway enrichment analysis for the insulin-responsive and
   insulin non-responsive screen hits. See also [298]Table S6.

   (C) Distribution of Spearman correlation coefficients of the pairs
   within hit-IRGs, between hit-IRGs and non-hit IRGs, and within non-hit
   IRGs. To test co-regulation of IRGs within the above categories, we
   examined whether the average Spearman correlation coefficients of the
   IRG pairs in each of the categories were significantly higher than
   randomly sampled IRG pairs. We also compared the distribution of
   Spearman correlation between the categories by the Wilcoxon rank-sum
   test. ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001. See also [299]STAR Methods
   and [300]Table S6.

   (D) Time course of the hit-IRGs in significantly co-regulated KEGG
   pathways. Data are shown as the log2 fold change values compared to
   time 0 that are calculated from the mean of biological replicates.
   Different line colors indicate different hit-IRGs. We tested the
   significance of the average Spearman correlation in each group as
   described in the [301]STAR Methods. We analyzed KEGG pathways
   containing 10 or more hit-IRGs. See also [302]Table S6.

   (E) We predicted subnetworks required for cell growth in an
   insulin-dependent manner in the following two steps (Step VII-I and
   VII-II). In Step VII-I, we extracted TF-centric subnetworks each
   containing a TF, its upstream signaling pathways, and its target IRGs
   from the trans-omic network of insulin action. We mapped the Z-score
   obtained from the CRISPR KO screen to the extracted subnetworks. In
   Step VII-II, we selected subnetworks required for cell growth by the
   following three criteria: (i) the result of Gene Set Enrichment
   Analysis (GSEA) ([303]Subramanian et al., 2005) is significant for the
   upstream signaling molecules of a TF (FDR <0.05); (ii) the result of
   GSEA is significant for the target IRGs of the TF (FDR <0.05); and
   (iii) the TF itself is a screen hit.

   (F) The scatterplot shows the Z-score of the CRISPR screen for each of
   the screened genes. A smaller Z-score indicates a lower cell
   proliferation activity. Genes are sorted by the Z-score. The barcode
   plots below the scatterplot show the distribution of the upstream
   signaling molecules (“Sig.”) of the TFs ([304]Figure 3E), and the
   target IRGs (“IRGs”) regulated by the TFs ([305]Figure 2B). Bar graphs
   indicate the normalized enrichment score (NES) of GSEA and the color of
   a bar indicates the significance of the NES (FDR< 0.05). Labels of TFs
   included in subnetworks involved in cell growth defined in
   [306]Figure 6E are shown in red.

   (G) The predicted subnetwork required for cell growth consists of a TF
   Myc, its upstream signaling molecules (insulin signal; [307]Figure 3E),
   and its target hit-IRGs (IRGs; [308]Figures 2B and [309]6A). The TF
   layer contains the Myc TF that is required for cell growth as shown in
   red in [310]Figure 6F. The pathway layer contains KEGG pathways that
   contain three or more members that overlap with the hit-IRGs regulated
   by Myc. Pathways containing two or fewer members that overlap with the
   hit-IRGs regulated by Myc were categorized as “Other functions.” See
   also [311]Table S6.

   To identify cellular functions involved in cell growth and
   proliferation that are regulated in an insulin-dependent manner in
   S2R+ cells, we analyzed the KEGG pathways enriched for the screen hits
   overlapping with the trans-omic network (hereafter denoted as
   insulin-responsive screen hits) as well as the remaining screen hits
   that are not overlapping with the trans-omic network (hereafter denoted
   as insulin non-responsive screen hits) ([312]Table S6). In sum, 35
   pathways were enriched in either the insulin-responsive screen hits or
   the insulin non-responsive screen hits (q value <0.05, Fisher’s exact
   test; [313]Figure 6B). Only three pathways (mRNA surveillance pathway,
   Protein processing in endoplasmic reticulum, and RNA degradation) were
   enriched in both insulin-responsive and non-responsive screen hits
   (shown in black in [314]Figure 6B). In sum, 16 pathways were
   specifically enriched for the insulin-responsive screen hits (shown in
   dark green in [315]Figure 6B), among which are the signaling pathways
   and multiple anabolic pathways such as Purine metabolism, Pyrimidine
   metabolism, RNA polymerase, and Ribosome biogenesis in eukaryotes. All
   of those anabolic pathways were enriched in the increased IRG clusters
   ([316]Figure 2B). This is consistent with previous studies showing
   that those anabolic pathways are activated by insulin/IGF signals to
   support cell growth and proliferation ([317]Valvezan and Manning,
   2019; [318]Wullschleger et al., 2006; [319]Zhu and Thompson, 2019). On
   the other hand, 16 pathways were specifically enriched for the insulin
   non-responsive screen hits (shown in light green in [320]Figure 6B).
   Some of these pathways are involved in glucose and energy metabolism
   (TCA cycle, and Oxidative phosphorylation), protein catabolism
   (Proteasome and Ubiquitin mediated proteolysis), DNA metabolism, RNA
   metabolism, and protein synthesis. In contrast to the pathways
   specifically enriched in the insulin-responsive screen hits, the
   pathways specifically enriched in insulin non-responsive screen hits
   are relevant to catabolic processes and DNA replication, none of which
   are signaling pathways. Some of the anabolic pathways that were
   over-represented only for insulin non-responsive screen hits were also
   enriched for the preys of the PPI data that were not used for the
   network construction (i.e., the preys that are not contained in
   signaling pathways of the KEGG database) ([321]Figure S7;
   [322]Table S6). Those pathways contain Ribosome and Spliceosome,
   suggesting that these pathways are regulated at either
   post-transcriptional or post-translational levels. The association of
   ribosomal proteins and mTORC2, which is a component of the signaling
   pathways regulated by growth factors such as insulin, has previously
   been reported ([323]Zinzalla et al., 2011). The results of the pathway
   enrichment analysis indicate that the insulin signal regulates a large
   fraction of the genes involved in signaling pathways and anabolic
   pathways that are needed for cell growth and proliferation in S2R+
   cells. In addition, regulation by the insulin signal occurs at the
   transcriptional level as well as at the post-transcriptional or
   post-translational level in S2R+ cells.

   In our transcriptomic analysis, we found that the expression levels
   for the gene pairs of the IRGs in the same KEGG pathways show
   significantly higher Spearman correlation coefficients, i.e., the
   functionally similar IRGs tend to be co-regulated by insulin signaling
   ([324]Figure 2D). We further investigated whether this trend was also
   found in the gene pairs from the screen hits. We hypothesized that the
   pairs between the screen hits should show significantly higher Spearman
   correlation coefficients if co-regulation of the functionally similar
   IRGs is important in the context of cell growth and proliferation. We
   first divided the IRGs into two groups: (1) “hit-IRGs,” which are the
   IRGs overlapping with the screen hits, and (2) “non-hit IRGs,” which
   are the IRGs not overlapping with the screen hits. We then calculated
   Spearman correlation coefficients for the gene pairs within each group,
   for the gene pairs between the hit-IRG and non-hit IRG groups, and for
   randomly sampled pairs of all IRGs. The average and median Spearman
   correlation coefficients of the pairs within the hit-IRGs (0.19 and
   0.56, respectively) were the highest among all the three groupings
   shown in [325]Figure 6C and the average Spearman correlation
   coefficients of pairs within hit-IRGs were significantly higher than
   randomly sampled IRG pairs (Bonferroni-adjusted p value <0.05;
   [326]Table S6; [327]STAR Methods). In addition, the distributions of
   the correlation coefficients were significantly different between the
   groups (Bonferroni-adjusted p value <0.05, Wilcoxon rank-sum test;
   [328]Figure 6C; [329]Table S6). These results indicate that the gene
   pairs within hit-IRGs are more likely to be co-regulated than other
   pairs, suggesting that the co-regulation of IRGs is important in the
   context of cell growth and proliferation. We examined which genes of
   the KEGG pathways were co-regulated among hit-IRGs. Only Purine
   metabolism and Ribosome biogenesis in eukaryotes contained 10 or more
   hit-IRGs. Both of these pathways showed significantly high average
   Spearman correlation coefficients among the hit-IRGs within each
   pathway (Bonferroni-adjusted p value <0.05;
   [330]Figure 6D; [331]Table S6; [332]STAR Methods).
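   The co-regulation test described in this paragraph compares the average
   Spearman correlation of gene pairs within a group with that of randomly
   sampled IRG pairs. A minimal standard-library sketch is given below.
   The exact procedure is defined in the STAR Methods; here the empirical
   p value, the number of random draws, the function names, and the toy
   expression profiles are all assumptions made for illustration, and
   every profile is assumed to be non-constant.

```python
import random
from itertools import combinations
from statistics import mean


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def coregulation_p(profiles, group, n_perm=1000, seed=0):
    """Mean within-group rho and an empirical p value against random gene pairs."""
    rng = random.Random(seed)
    pairs = list(combinations(group, 2))
    observed = mean(spearman(profiles[a], profiles[b]) for a, b in pairs)
    genes = list(profiles)
    exceed = 0
    for _ in range(n_perm):
        # Draw as many random gene pairs as there are pairs in the group.
        random_pairs = [rng.sample(genes, 2) for _ in pairs]
        null_mean = mean(spearman(profiles[a], profiles[b]) for a, b in random_pairs)
        if null_mean >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Toy time courses (hypothetical genes and values).
profiles = {
    "gene_a": [1, 2, 3, 4], "gene_b": [2, 3, 4, 5], "gene_c": [0, 1, 5, 9],
    "gene_d": [4, 3, 2, 1], "gene_e": [1, 3, 2, 4], "gene_f": [4, 1, 3, 2],
}
observed, p_value = coregulation_p(profiles, ["gene_a", "gene_b", "gene_c"], n_perm=200)
```

   When several groups or pathways are tested, the resulting p values
   would be adjusted for multiple testing, as done in the text with the
   Bonferroni correction.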

   Molecules in the trans-omic network and the screen hits significantly
   overlap ([333]Figure 6A), suggesting that at least a part of the
   trans-omic network is required for cell growth in the context of cell
   proliferation. Therefore, we aimed to identify subnetworks that are
   required for cell growth in an insulin-dependent manner. We
   hypothesized that TFs regulating cell growth in an insulin-dependent
   manner are likely both to be regulated by genes that negatively or
   positively affect cell growth and to regulate such genes
   ([334]Figure 6E). We predicted subnetworks that are potentially
   required for cell growth as shown in [335]Figure 6E ([336]STAR
   Methods). We identified the subnetwork containing the TF Myc, its
   upstream signaling pathways, and its target IRGs as a potential
   subnetwork required for cell growth in an insulin-dependent manner
   ([337]Figures 6F and 6G; [338]Table S6). Myc has been reported to
   regulate cell growth in both Drosophila and mammals ([339]Dang, 2013;
   [340]Johnston et al., 1999) and to be regulated by insulin/IGF
   signaling in Drosophila ([341]Demontis and Perrimon, 2009; [342]Parisi
   et al., 2011; [343]Teleman et al., 2008). In the predicted subnetwork
   surrounding Myc, the upstream signaling molecules contain Rheb and Tor,
   which are components of mTOR signaling regulating cell growth
   ([344]Oldham and Hafen, 2003; [345]Wullschleger et al., 2006), rl
   (Drosophila ortholog of Erk) and pdk1, which are well-known components
   of the insulin signaling pathway ([346]Teleman, 2009), and
   Sgg/Gsk3-beta, which has been reported as a potential Myc regulator
   ([347]Parisi et al., 2011). Pathways regulated by Myc contained
   anabolic pathways such as protein folding, nucleotide metabolism, and
   translation. These pathways also contain Prat, a rate-limiting enzyme
   of the purine metabolism pathway, and genes involved in ribosome
   biogenesis that were increased and highly co-regulated by insulin at
   the gene expression levels ([348]Figures 2B, 2D, and [349]6D). Thus,
   integration of the trans-omic network and the CRISPR screening data
   suggests that insulin is connected to cell growth through a subnetwork
   containing Myc, its upstream signaling pathway, and its target IRGs
   involved in anabolic processes.
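   The three selection criteria listed in the legend of Figure 6E (Step
   VII-II) amount to a simple filter over the TF-centric subnetworks. The
   sketch below assumes that the GSEA FDR values have already been
   computed for each subnetwork; the class name, field names, and example
   values are hypothetical placeholders and do not correspond to results
   in the study.

```python
from dataclasses import dataclass

FDR_CUTOFF = 0.05  # threshold given in the Figure 6E legend


@dataclass
class TFSubnetwork:
    tf: str
    upstream_gsea_fdr: float  # GSEA FDR for the TF's upstream signaling molecules
    target_gsea_fdr: float    # GSEA FDR for the TF's target IRGs
    tf_is_screen_hit: bool    # whether the TF itself is a CRISPR screen hit


def required_for_growth(sub: TFSubnetwork) -> bool:
    """Apply criteria (i)-(iii): all three must hold."""
    return (sub.upstream_gsea_fdr < FDR_CUTOFF
            and sub.target_gsea_fdr < FDR_CUTOFF
            and sub.tf_is_screen_hit)


# Placeholder subnetworks with made-up values.
candidates = [
    TFSubnetwork("TF_A", 0.01, 0.03, True),   # passes all three criteria
    TFSubnetwork("TF_B", 0.01, 0.20, True),   # fails criterion (ii)
    TFSubnetwork("TF_C", 0.01, 0.01, False),  # fails criterion (iii)
]
selected = [s.tf for s in candidates if required_for_growth(s)]
```

   Requiring all three criteria at once is what makes the selection
   stringent: a TF whose targets are enriched among screen hits is still
   excluded if the TF itself is not a hit.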

Discussion

   In this study, we constructed a trans-omic network of insulin action in
   Drosophila S2R+ cells by integrating time course PPI,
   phosphoproteomic, transcriptomic, and metabolomic data following
   insulin stimulation with bioinformatics resources. In the trans-omic
   network, 14 TFs, including Myc, and their upstream signaling pathways
   coordinately upregulated multiple anabolic processes such as nucleotide
   synthesis, transcription, and translation in a time-dependent manner.
   These responses may contribute to the decrease in metabolites such as
   NTPs and proteinogenic amino acids that are required for transcription
   and translation. We further analyzed how insulin signaling regulates
   cell growth in the context of cell proliferation by integrating the
   trans-omic network and the results from a previous CRISPR knockout
   screen for cell proliferation. Among the subnetworks surrounding
   various TFs in the trans-omic network, we identified a subnetwork
   including Myc, its upstream signaling pathways, and its downstream
   target genes as being involved in anabolic processes regulating cell
   growth in an insulin-dependent manner.

   We identified a coordinated upregulation of IRGs in multiple anabolic
   processes such as nucleotide synthesis, transcription, and translation
   in the time series. The IRGs related to these processes were also
   enriched in the screen hits of the CRISPR screen for cell
   proliferation. To precisely regulate cell growth, cells need to
   coordinately promote the synthesis of the substrates for macromolecular
   synthesis (e.g., nucleotides and amino acids) as well as of the
   components of the transcriptional and translational machinery.
   Imbalances in these processes can have detrimental effects. In our
   previous study ([350]Zirin et al., 2019), we showed that inhibition of
   aminoacyl-tRNA synthetases selectively kills Myc-overexpressing HMEC
   cells; Myc overexpression is known to increase ribosome biogenesis. In
   addition, it has been reported that PI3K activation, which is known to
   activate protein synthesis, together with knockdown of either protein
   synthesis- or protein catabolism-related genes, causes synthetic
   lethality. However, simultaneous inhibition of both pathways causes no
   lethality ([351]Davoli et al., 2016). In another study,
   MYC-overexpressing human cells have been reported to show higher
   sensitivity to splicing inhibition than normal cells ([352]Hsu et al.,
   2015; [353]Kessler et al., 2012). The increase in the IRGs in the
   multiple anabolic pathways that are essential for cell growth and
   proliferation and the co-regulation of the IRGs within each of the
   anabolic pathways are reasonable regulatory strategies to balance the
   activities between anabolic pathways and maintain stoichiometry of the
   genes within each anabolic pathway. The significant co-regulation of
   the hit-IRGs further supports this interpretation. Although this study
   alone does not provide a causal relationship between the co-regulation
   of the IRGs related to macromolecule synthesis and the capacity of cell
   growth and proliferation, our results may reflect a design principle of
   the regulation of genes related to macromolecule synthesis under the
   constraints described above.

   The trans-omic network and the CRISPR knockout screening data are
   complementary. The trans-omic network provides the mechanistic
   regulatory relationships between insulin-responsive molecules; however,
   it does not contain functional information on how each node in the
   network affects phenotypes such as cell growth and proliferation. On
   the other hand, the CRISPR screen data provide quantitative information
   about the phenotypic effects of gene knockout on cell proliferation,
   but they do not directly reveal the underlying molecular mechanisms.
   Therefore, integrative analysis of the trans-omic network and the
   CRISPR screen data can provide insights into molecular mechanisms. This
   approach is useful, for example, for narrowing down subnetworks important
   for the regulation of cell growth. Importantly, loss-of-function
   phenotypes are less likely to be buffered by functionally redundant
   paralogs in Drosophila than in mammals ([354]Ewen-Campen et al., 2017;
   [355]Viswanatha et al., 2018). In this study, the subnetwork
   surrounding Myc, which contains potential downstream effectors involved
   in anabolic pathways such as protein processing/synthesis and
   nucleotide metabolism, was predicted as key for the regulation of cell
   growth in the context of cell proliferation. Myc is an evolutionarily
   conserved TF that regulates cell growth and proliferation from
   Drosophila to mammals and is regulated by insulin/mTOR signaling
   ([356]Bellosta and Gallant, 2010). Our data-driven approach
   recapitulated the role of Myc as an important regulator of cell growth
   in an insulin-dependent manner, supporting the versatility of our
   approach for identifying evolutionarily conserved networks regulating
   phenotypes. In addition, our network provides a resource for the
   identification of novel regulatory relationships. Finally, the pipeline
   constructed in this study can be applied to other stimulatory signals
   and phenotypes.

   In our previous studies, we have analyzed the insulin action on the
   metabolism in mammalian cells ([357]Kawata et al., 2018; [358]Krycer
   et al., 2017; [359]Noguchi et al., 2013; [360]Ohno et al., 2020;
   [361]Yugi et al., 2014). In these studies, glycolytic intermediates
   such as glucose-1-phosphate, glucose-6-phosphate, fructose-6-phosphate
   (F6P), fructose-1,6-bisphosphate (F1,6BP), dihydroxyacetone phosphate
   (DHAP), 3-phosphoglycerate (3PG), and phosphoenolpyruvate (PEP), were
   reported to either increase or decrease following insulin stimulation.
   On the other hand, although all the above glycolytic intermediates were
   also measured in our metabolomic data, none of them were found to
   respond to insulin stimulation. In Drosophila, insulin, as is the case
   in mammals ([362]Saltiel and Kahn, 2001), has been shown to regulate
   glucose metabolism ([363]Teleman, 2009). Furthermore, we found in our
   metabolomic analysis that lactate, a product of glycolysis,
   increased after insulin stimulation, suggesting that the glycolytic
   flux may be increased by insulin stimulation in S2R+ cells. Thus, one
   possible reason for the small number of IRMs belonging to the
   glycolytic pathway in our study is the timing of our metabolome
   measurements. We have previously shown that, in
   rat FAO cells, some glycolytic intermediates (e.g. F1,6BP, DHAP, 2PG,
   3PG, and PEP) transiently increased following insulin stimulation and
   returned to basal levels 60 min afterward ([364]Yugi et al., 2014).
   Here, we measured the metabolome at 60, 120, and 180 min after insulin
   stimulation, which may not have captured the transient response
   occurring within 60 min. Further, the glycolytic flux can increase
   without changes in metabolite concentration. Metabolic flux analysis
   using 13C-labeled glucose would address this issue in the future. To
   elucidate the rapid responses of metabolites and regulatory mechanisms
   of glucose metabolism in Drosophila, additional data in a shorter
   timescale than 60 min will be required.

   In our transcriptomic analysis, the IRGs in the pathways related to
   nucleotide synthesis, transcription, and translation were upregulated
   by insulin stimulation. In addition, the gene expression and
   phosphorylation level of rudimentary, a rate-limiting metabolic enzyme
   of de novo pyrimidine metabolism, and the abundance of
   N-carbamoyl-L-aspartate, which is the product of the reaction catalyzed
   by rudimentary, increased after insulin stimulation. Consistent with
   our findings, in mammals, syntheses of macromolecules such as nucleic
   acids, lipids, and proteins are promoted by mTORC1 depending on growth
   factors (e.g. insulin and IGF) and nutrient and energy status (e.g.
   amino acids, glucose, and ATP) ([365]Valvezan and Manning, 2019;
   [366]Wullschleger et al., 2006; [367]Zhu and Thompson, 2019). These
   results suggest that macromolecule syntheses increased after insulin
   stimulation in S2R+ cells. NTPs and proteinogenic amino acids are
   consumed by transcription and protein synthesis, and the intermediate
   metabolites of the TCA cycle are used as substrates for nucleotide and
   amino acid synthesis. The decrease in the NTPs, the amino acids, and
   the TCA cycle intermediates observed in our metabolomic analysis may
   reflect the increased activity of transcription and protein synthesis
   through the increased gene expression of these pathways. Furthermore,
   in Drosophila KC cells, insulin stimulation has been reported to
   activate the pentose phosphate pathway, which is an important source of
   precursors for nucleotide synthesis ([368]Ceddia et al., 2003).
   Altogether, these data suggest that the trans-omic network constructed
   in this study mainly captures the anabolic effects of insulin signaling
   that are important for cell and organismal growth from Drosophila to
   mammals ([369]Oldham and Hafen, 2003).

   In this study, we extended the framework of the trans-omic analyses of
   insulin action developed in our previous studies mainly on two points
   ([370]Kawata et al., 2018; [371]Yugi et al., 2014). First, we
   previously constructed trans-omic networks involving kinase-substrate
   interactions and networks in the KEGG database without PPIs
   ([372]Kawata et al., 2018; [373]Yugi et al., 2014). Here, we newly
   integrated the PPI network built using AP-MS following insulin
   stimulation into the trans-omic network. Therefore, we expect that our
   Drosophila trans-omic study captures context-specific interactions that
   we were not able to address previously. Second, we explored the
   mechanisms regulating cell growth in an insulin-dependent manner by
   integrating the trans-omic network and the CRISPR cell proliferation
   screen, providing new insights into the functional aspects of the
   trans-omic network. Finally, as insulin signaling is evolutionarily
   conserved, we anticipate that many aspects of the Drosophila insulin
   network presented here will be relevant to the corresponding mammalian
   network.

Limitations of the study

   In this study, our integrative analysis of the trans-omic network of
   insulin action and the CRISPR screening data for cell proliferation
   identified Myc as a key regulator of cell growth in an
   insulin-dependent manner, together with potential novel upstream
   regulators and target genes of Myc. However, further experiments are
   needed to validate the causal relationships between the changes in the
   activity of Myc and those of these molecules following insulin
   stimulation. Furthermore, our study does not directly capture some
   important aspects of insulin action, such as changes in metabolic flux,
   post-transcriptional regulation, and epigenetic regulation. Future
   studies will be required to expand the trans-omic network to include
   other data types. Regardless of these limitations, we provide a
   framework for integrating a trans-omic network and CRISPR screen data.

STAR★Methods

Key resources table

   REAGENT or RESOURCE | SOURCE | IDENTIFIER
     __________________________________________________________________

   Chemicals, peptides, and recombinant proteins
   Insulin from bovine pancreas | Sigma-Aldrich | Cat#I6634
     __________________________________________________________________

   Deposited data
   Metabolomic data | This paper | [374]Table S3
   Transcriptomic data | [375]Zirin et al. (2019) | GEO: [376]GSE129292
   Phosphoproteomic data | [377]Vinayagam et al. (2016) | [378]Table S4
   Affinity purification MS data | [379]Vinayagam et al. (2016) | [380]Table S1
   CRISPR knockout screening data | [381]Viswanatha et al. (2018) | [382]https://doi.org/10.7554/eLife.36333.012
     __________________________________________________________________

   Experimental models: Cell lines
   D. melanogaster: Cell line S2R+ | Laboratory of Norbert Perrimon | N/A
     __________________________________________________________________

   Software and algorithms
   Source code | This paper | [383]https://doi.org/10.5281/zenodo.6414309
   Python version 3.8.3 | Python Software Foundation | [384]https://www.python.org; RRID:[385]SCR_008394
   R version 3.6.3 | [386]R Core Team (2021) | [387]https://www.r-project.org/
   FlyBase version FB2018_04 | [388]Larkin et al. (2021) | [389]https://flybase.org/; RRID:[390]SCR_006549
   KEGG | [391]Kanehisa et al. (2017) | [392]http://www.kegg.jp/; RRID:[393]SCR_012773
   Cis-BP version 1.02 | [394]Weirauch et al. (2014) | [395]http://cisbp.ccbr.utoronto.ca; RRID:[396]SCR_017236
   MEME suite version 4.12.0 | [397]Bailey et al. (2015) | [398]http://meme-suite.org/; RRID:[399]SCR_001783
   FIMO version 4.12.0 | [400]Grant et al. (2011) | [401]https://meme-suite.org/meme/meme_5.3.2/doc/fimo.html
   ChIP-Atlas | [402]Oki et al. (2018) | [403]https://chip-atlas.org/; RRID:[404]SCR_015511
   BedTools version 2.21.0 | [405]Quinlan and Hall (2010) | [406]https://github.com/arq5x/bedtools2; RRID:[407]SCR_006646
   PyBedTools version 0.8.1 | [408]Dale et al. (2011) | [409]https://daler.github.io/pybedtools/#; RRID:[410]SCR_021018
   NetPhorest human version 2.1 | [411]Horn et al. (2014); [412]Miller et al. (2008) | [413]https://netphorest.info/
   NetworKIN version 3.0 | Linding et al. (2008) | [414]http://networkin.info/; RRID:[415]SCR_007818
   MyGene.info version 3 | [416]Xin et al. (2016); [417]Wu et al. (2013) | [418]https://mygene.info/; RRID:[419]SCR_018660
   DIOPT Ortholog Finder version 8 | [420]Hu et al. (2011) | [421]https://www.flyrnai.org/cgi-bin/DRSC_orthologs.pl
   MIST version 4.0 | [422]Hu et al. (2018) | [423]https://fgrtools.hms.harvard.edu/MIST/
   Cytoscape version 3.8.2 | [424]Shannon et al. (2003) | [425]https://cytoscape.org/; RRID:[426]SCR_003032
   BRENDA version 2021.2 | [427]Chang et al. (2021) | [428]http://www.brenda-enzymes.org; RRID:[429]SCR_002997
   GSEA version 4.0.3 | [430]Subramanian et al. (2005) | [431]http://www.broadinstitute.org/gsea/; RRID:[432]SCR_003199

Resource availability

Lead contact

   Further information and requests for resources and reagents should be
   directed to and will be fulfilled by the lead contact, Shinya Kuroda
   ([434]skuroda@bs.s.u-tokyo.ac.jp).

Materials availability

   This study did not generate new unique reagents.

Experimental model and subject details

   Drosophila S2R+ cells (sex: male) were cultured in Schneider’s
   Drosophila medium (21720-024; Thermo Fisher Scientific) supplemented
   with 10% fetal bovine serum at 25 °C. For insulin treatment, cells were
   incubated overnight in serum-free Schneider’s Drosophila medium.

Method details

Targeted mass spectrometry and data analyses

   Drosophila S2R+ cells were incubated overnight in serum-free
   Schneider’s Drosophila medium (21720-024; Thermo Fisher Scientific).
   Cells were then treated with 25 μg/mL insulin from bovine pancreas
   (I6634; Sigma-Aldrich) for 0, 60, 120, or 180 min. For each sample,
   1 × 10^6 cells (5 biological replicates) were collected on ice and
   rapidly snap
   frozen in liquid nitrogen. The intracellular metabolites were extracted
   using 1 mL of cold (−80 °C) 80% (v/v) aqueous methanol. The insoluble
   material in the lysates was pelleted by centrifugation at 5,000 × g for
   5 min. The
   resulting supernatant was evaporated using a speed vac. Samples were
   re-suspended using 20 μL HPLC grade water for mass spectrometry. 10 μL
   were injected and analyzed using a 5500 QTRAP triple quadrupole mass
   spectrometer (AB/SCIEX) coupled to a Prominence UFLC HPLC system
   (Shimadzu) via selected reaction monitoring (SRM) of a total of 287
   endogenous water-soluble metabolites for steady-state analyses of the
   samples. Some metabolites were targeted in both the positive and
   negative ion mode for a total of 287 SRM transitions using
   positive/negative polarity switching. ESI voltage was +4900 V in the
   positive ion mode and −4500 V in the negative ion mode. The dwell time
   was 3 ms per SRM transition, and the total cycle time was 1.55 s.
   Approximately 10-14 data points were acquired per detected metabolite.
   Samples were delivered to the mass spectrometer via normal phase
   chromatography using a 4.6 mm i.d. × 10 cm Amide XBridge HILIC column
   (Waters Corp.) at 350 μL/min. Gradients were run starting from 85%
   buffer B (HPLC grade acetonitrile) to 42% B from 0-5 min; 42% B to 0% B
   from 5-16 min; 0% B was held from 16-24 min; 0% B to 85% B from
   24-25 min; 85% B was held for 7 min to re-equilibrate the column.
   Buffer A comprised 20 mM ammonium hydroxide/20 mM ammonium
   acetate (pH = 9.0) in 95:5 water:acetonitrile. Peak areas from the
   total ion current for each metabolite SRM transition were integrated
   using Multi-Quant v2.0 software (AB/SCIEX).

Quantification and statistical analysis

Step I: identification of IRGs

Identification of IRGs

   We identified IRGs from time-series RNA-seq data obtained in
   insulin-stimulated Drosophila S2R+ cells as described in our previous
   study ([435]Zirin et al., 2019). Briefly, we modeled the expression
   level of each gene as the sum of three components: (i) the time effect
   after insulin stimulation, modeled as a cubic polynomial of time t;
   (ii) potential confounding components, such as batch effects, extracted
   using surrogate variables; and (iii) random white noise, modeled as a
   normal distribution with a mean of 0. We applied an F-test statistical
   framework to test each gene for the null hypothesis that the gene is
   not differentially expressed over time vs. the alternative hypothesis
   that the gene is temporally differentially expressed.
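
   The nested-model comparison behind this test can be sketched as
   follows. This is a minimal illustration, not the authors' code: it
   compares an intercept-only model against a cubic polynomial of time
   by ordinary least squares, and it omits the surrogate-variable terms
   entirely. The function names, the toy time points, and the choice of
   hours as the time unit are assumptions made for the example.

```python
from typing import List, Sequence


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def _rss(times: Sequence[float], y: Sequence[float], degree: int) -> float:
    """Residual sum of squares of a least-squares polynomial fit of given degree."""
    design = [[t ** k for k in range(degree + 1)] for t in times]
    p = degree + 1
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = _solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def time_effect_f_statistic(times: Sequence[float], y: Sequence[float]) -> float:
    """F statistic for 'cubic time effect' versus 'no time effect'.

    Null model: intercept only (1 parameter).
    Full model: cubic polynomial of time (4 parameters).
    """
    n = len(y)
    rss_null = _rss(times, y, 0)
    rss_full = _rss(times, y, 3)
    df_num, df_den = 3, n - 4
    return ((rss_null - rss_full) / df_num) / (rss_full / df_den)


if __name__ == "__main__":
    hours = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]  # hypothetical sampling times
    expression = [1.0, 1.1, 2.0, 2.1, 3.5, 3.4, 3.9, 4.0, 4.0, 4.1]
    print(time_effect_f_statistic(hours, expression))
```

   The p value would then be read from an F distribution with
   (3, n − 4) degrees of freedom, which is left out here to keep the
   sketch free of third-party dependencies.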

Hierarchical clustering of the IRGs

   We subtracted the effects of the surrogate variables estimated by the
   statistical model used in the identification of the IRGs from the
   log[2]-scaled normalized read count for each IRG. We calculated the
   Z-score from the obtained values and performed hierarchical clustering
   using Euclidean distance and Ward’s method. We analyzed all clusters
   containing 100 genes or more in hierarchical clustering.
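
   As a small illustration of the preprocessing before clustering, the
   sketch below Z-scores one expression profile per gene and computes
   the Euclidean distance that the clustering would operate on. Ward
   linkage itself is not reimplemented here. The use of the sample
   standard deviation is an assumption; the text does not state which
   estimator was used.

```python
import math
import statistics
from typing import List, Sequence


def z_score(values: Sequence[float]) -> List[float]:
    """Center a profile on its mean and scale by its sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - mean) / sd for v in values]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Two hypothetical genes with the same temporal shape but different scale.
    gene_a = z_score([1.0, 2.0, 3.0])
    gene_b = z_score([10.0, 20.0, 30.0])
    print(euclidean(gene_a, gene_b))
```

   Z-scoring makes genes with the same temporal shape but different
   expression magnitudes close to each other, which is why it precedes
   the distance calculation.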

KEGG pathway enrichment analysis

   The enrichment of the genes in each pathway was determined using
   one-tailed Fisher exact test, and KEGG pathways with q values of less
   than 0.05 were defined as significantly enriched. The q values were
   calculated by the Benjamini-Hochberg procedure ([436]Benjamini and
   Hochberg, 1995). We used the genes measured in the RNA-seq data as a
   background. Because the clusters analyzed in this study were not
   mutually exclusive, a pathway can be enriched in overlapping clusters.
   We selected non-overlapping and significant clusters based on the
   significance (q value) of Fisher exact test and the hierarchical
   structure of the dendrogram (see “[437]Finding non-overlapping and
   statistically significant clusters of a hierarchical clustering”).
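
   The enrichment test and the multiple-testing correction described
   above can be sketched in plain Python as follows. The function names
   and argument layout are assumptions for illustration; the counts in
   the usage example are made up.

```python
from math import comb
from typing import List, Sequence


def fisher_one_tailed(k: int, cluster_size: int, pathway_size: int,
                      background_size: int) -> float:
    """One-tailed Fisher exact test for over-representation.

    Returns P(X >= k), where X is the number of pathway genes found in a
    cluster of `cluster_size` genes drawn from `background_size` genes,
    `pathway_size` of which belong to the pathway (hypergeometric tail).
    """
    upper = min(cluster_size, pathway_size)
    total = comb(background_size, cluster_size)
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, cluster_size - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p values (q values), in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        q[i] = running_min
    return q


if __name__ == "__main__":
    # Hypothetical counts: 8 of 120 cluster genes fall in a 40-gene pathway,
    # against a background of 9000 measured genes.
    p = fisher_one_tailed(8, 120, 40, 9000)
    print(p, benjamini_hochberg([p, 0.2, 0.6]))
```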

Finding non-overlapping and statistically significant clusters of a
hierarchical clustering

   We selected non-overlapping and significant clusters for each of the
   pathways or TF motifs based on the significance (q value) of Fisher
   exact test and the hierarchical structure of the dendrogram. If the
   genes included in each of the two clusters overlapped, we defined the
   pair of clusters as “overlapping clusters” (e.g. clusters 1 and 2 in
   [438]Figure 2B). If not, we defined a pair of clusters as
   “non-overlapping clusters” (e.g. clusters 2 and 3 in [439]Figure 2B).
   We used “computational recognition and analysis of statistically
   significant subtrees (CRASSS)”, which is an algorithm for finding the
   most statistically significant and non-overlapping clusters of
   hierarchical clustering ([440]Buehler et al., 2004).

Identification of significantly co-regulated groups of IRG pairs

   To identify significantly co-regulated groups of IRG pairs (e.g. pairs
   among the IRGs within a KEGG pathway), we tested whether the average
   Spearman correlation coefficient of the IRG pairs in the group was
   significantly higher than expected for randomly chosen IRG pairs. The
   same number of IRG pairs as in the tested group was randomly sampled
   1000 times from all the pairs among the IRGs. For each sampling, the
   average Spearman correlation of the sampled IRG pairs was calculated.
   We obtained a p value as the fraction of random samplings exhibiting a
   larger average Spearman correlation than that calculated from the
   tested pairs. We
   defined a group of IRG pairs with Bonferroni-corrected p value less
   than 0.05 as significantly co-regulated. A similar method has been used
   in a previous study ([441]Hansson et al., 2012). ∗p < 0.05, ∗∗p < 0.01,
   ∗∗∗p < 0.001.
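
   The resampling test described above can be sketched as follows. It
   is a minimal illustration under stated assumptions: expression
   profiles are passed as a dictionary keyed by gene, sampling is done
   without replacement, and the seed, function names, and toy data are
   hypothetical. The Bonferroni correction across groups is not shown.

```python
import random
from itertools import combinations
from statistics import fmean
from typing import Dict, Iterable, List, Sequence


def _ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1, with ties receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    rx, ry = _ranks(x), _ranks(y)
    mx, my = fmean(rx), fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def coregulation_p_value(profiles: Dict[str, Sequence[float]],
                         group: Iterable[str],
                         n_samples: int = 1000, seed: int = 0) -> float:
    """Fraction of random pair sets whose mean correlation exceeds the group's."""
    rho = {pair: spearman(profiles[pair[0]], profiles[pair[1]])
           for pair in combinations(sorted(profiles), 2)}
    tested = [rho[pair] for pair in combinations(sorted(group), 2)]
    observed = fmean(tested)
    all_rhos = list(rho.values())
    rng = random.Random(seed)
    larger = sum(
        fmean(rng.sample(all_rhos, len(tested))) > observed
        for _ in range(n_samples)
    )
    return larger / n_samples
```

   With 1000 samplings, the smallest non-zero p value this procedure
   can report is 0.001.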

Step II: prediction of TFs that regulate IRGs

Prediction of TF binding motif and inference of regulatory connections
between TFs and genes

   We predicted TFs regulating the IRG clusters based on the method
   developed in previous studies ([442]Mina et al., 2015a; [443]2015b).
   The sequences of the flanking regions of genes were downloaded from
   FlyBase (version FB2018_04) ([444]Larkin et al., 2021). The region from
   −1000 bp to +100 bp of the transcription start site was defined as the
   flanking region. Although distal regulatory DNA elements can affect
   gene expression levels, information on interaction with such regions is
   available only for a subset of the genes in the Drosophila genome.
   Furthermore, a prediction of TF binding motifs for all potential distal
   regulatory regions can increase false positives. For these reasons, we
   used the regions from −1000 bp to +100 bp of the transcription start
   site for our analysis. The TF binding motifs in each flanking region
   were predicted using Cis-BP (version 1.02), a TF binding motif
   database, and FIMO (version 4.12.0), a TF binding motif prediction tool
   ([445]Grant et al., 2011; [446]Weirauch et al., 2014). The information
   on TF binding motifs in Cis-BP was downloaded from the MEME suite
   (version 4.12.0) ([447]Bailey et al., 2015). We used FIMO with the
   default parameters. We used only the binding motifs of TFs detected in
   any of the omic data analyzed in this study (PPI, phosphoproteome, and
   transcriptome). For the prediction of regulatory connections between
   TFs and IRGs, we performed TF motif enrichment analysis of the genes in
   each cluster. The enrichment of TF binding motif in the flanking
   regions of genes in each cluster was determined by one-tailed Fisher
   exact test, and TF binding motifs with q values less than 0.05 were
   defined as significantly enriched. The q values were calculated by the
   Benjamini-Hochberg procedure ([448]Benjamini and Hochberg, 1995). We
   used the genes measured in the RNA-seq data as a background. If a TF
   binding motif was enriched in the promoter regions of the genes in a
   cluster, then we inferred the regulatory connections between the
   corresponding TF and the genes in the cluster. Because the clusters
   analyzed in this study were not mutually exclusive, a TF binding motif
   can be enriched for overlapping clusters. We selected the
   non-overlapping and significant clusters based on the significance (q
   value) of Fisher exact test and the hierarchical structure of the
   dendrogram (see “[449]Finding non-overlapping and statistically
   significant clusters of a hierarchical clustering”).
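
   To make the definition of the flanking region concrete, the sketch
   below computes the −1000 bp to +100 bp window around a transcription
   start site in a strand-aware way. The authors downloaded the
   flanking sequences from FlyBase, so this is only an illustration of
   the window itself. The 1-based inclusive coordinate convention and
   the function name are assumptions.

```python
from typing import Tuple


def flanking_region(tss: int, strand: str,
                    upstream: int = 1000, downstream: int = 100) -> Tuple[int, int]:
    """Return (start, end) of the promoter window on the reference strand.

    Coordinates are assumed to be 1-based and inclusive. On the minus
    strand, 'upstream' of the TSS lies at larger genomic coordinates.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(1, start), end  # clip at the start of the chromosome


if __name__ == "__main__":
    print(flanking_region(5000, "+"))
    print(flanking_region(5000, "-"))
```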

Confirmation of the TF predictions using data from ChIP-Atlas

   For the validation of the predicted regulatory connections, we examined
   the overlap between the predicted target genes of each TF and those
   predicted from experimental ChIP data from the ChIP-Atlas database
   ([450]Oki et al., 2018). Genes whose flanking region around the
   transcription start sites were detected in ChIP-sequencing peaks were
   obtained using BedTools (v2.21.0) with PyBedTools ([451]Dale et al.,
   2011; [452]Quinlan and Hall, 2010). We used the flanking regions from
   −1000 bp to +100 bp of the transcription start sites. The overlap
   between the predicted genes and genes from ChIP data was determined by
   one-tailed Fisher exact test, and those with Bonferroni adjusted p
   value less than 0.05 were defined as significant.
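
   The peak-to-gene assignment can be illustrated with a pure-Python
   stand-in for the interval intersection that the authors performed
   with BedTools and PyBedTools. The data layout (dictionaries keyed by
   gene and by chromosome), the closed-interval convention, and all
   names below are assumptions for illustration only.

```python
from typing import Dict, List, Set, Tuple

Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two closed intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def genes_with_peak(flanks: Dict[str, Tuple[str, int, int]],
                    peaks: Dict[str, List[Interval]]) -> Set[str]:
    """Genes whose flanking region overlaps any ChIP peak on the same chromosome."""
    hits = set()
    for gene, (chrom, start, end) in flanks.items():
        if any(overlaps((start, end), peak) for peak in peaks.get(chrom, [])):
            hits.add(gene)
    return hits


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Hypothetical flanking regions and peaks.
    flanks = {"g1": ("2L", 4000, 5100), "g2": ("2L", 9000, 10100)}
    peaks = {"2L": [(5050, 5300)]}
    print(genes_with_peak(flanks, peaks))
```

   The resulting gene set would then be compared with the motif-based
   target set of the same TF using the one-tailed Fisher exact test.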

Step III: prediction of upstream signaling pathways regulating the predicted
TFs

Identification of IRpPs

   We extracted phosphopeptides whose phosphorylation levels were
   increased or decreased by insulin stimulation from phosphoproteome data
   measured in our previous study ([453]Vinayagam et al., 2016).
   Accordingly, phosphopeptides that showed an absolute log[2] fold change
   larger than 0.5 at any time point compared to time 0 were defined as
   IRpPs.
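
   The threshold rule can be written down directly. In this sketch the
   input is assumed to hold, for each phosphopeptide, the log2 fold
   changes relative to time 0 at each later time point; the peptide
   identifiers and the function name are hypothetical. The text does
   not say how a peptide that crosses the threshold in both directions
   is handled, so this sketch simply reports it in both sets.

```python
from typing import Dict, Sequence, Set, Tuple


def classify_irpps(log2_fc: Dict[str, Sequence[float]],
                   threshold: float = 0.5) -> Tuple[Set[str], Set[str]]:
    """Split phosphopeptides into increased and decreased IRpPs.

    A peptide is 'increased' if any time point exceeds +threshold and
    'decreased' if any time point falls below -threshold.
    """
    increased = {p for p, fcs in log2_fc.items() if any(fc > threshold for fc in fcs)}
    decreased = {p for p, fcs in log2_fc.items() if any(fc < -threshold for fc in fcs)}
    return increased, decreased


if __name__ == "__main__":
    example = {"pepA": [0.1, 0.8], "pepB": [-0.7, -0.2], "pepC": [0.3, -0.4]}
    print(classify_irpps(example))
```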

Identification of PPIs from the AP-MS data

   PPIs and their detected time points were extracted from the AP-MS data
   of insulin-stimulated Drosophila S2R+ cells measured in our previous
   study ([454]Vinayagam et al., 2016).

Prediction of protein kinases for protein phosphorylation

   We predicted kinases regulating the IRpPs using the NetPhorest software
   and Fisher exact test. First, we predicted possible kinases for the
   phosphopeptides measured in the phosphoproteomic dataset based on the
   amino acid sequences of the proteins corresponding to the
   phosphopeptides using a standalone version of NetPhorest for humans
   with the default parameters ([455]Horn et al., 2014; [456]Miller
   et al., 2008). The input data for NetPhorest are fly protein sequences
   in FASTA format. The outputs for NetPhorest are posterior probabilities
   of an amino acid residue being recognized by a protein kinase
   classifier (kinases with similar substrate recognition motifs). Among
   the candidate classifiers, we selected a classifier as related to the
   amino acid sequence if its posterior probability was larger than 0.035
   and higher than its prior probability. The threshold of the posterior
   probability was adopted from previous studies ([457]Buljan et al., 2020;
   [458]Freschi et al., 2014; [459]So et al., 2015; [460]Tan et al.,
   2009). A predicted KSR is represented as an edge between a kinase
   classifier as one node and a phosphorylation site as the other node. We
   also extracted individual kinases within each classifier from a table
   provided by NetworKIN software and defined these kinases as possible
   protein kinases regulating the predicted phosphosites for the kinase
   classifier. We converted the protein IDs (Ensembl) of the responsible
   protein kinases to the gene IDs (HGNC) using MyGene.info (version 3)
   ([461]Wu et al., 2013; [462]Xin et al., 2016)
   ([463]http://mygene.info/). We converted the obtained human gene IDs of
   the responsible protein kinases to the fly gene IDs by DIOPT (DIOPT
   score >2) ([464]Hu et al., 2011). Next, we performed Fisher exact test
   to assess the enrichment of the predicted target phosphopeptides of
   each kinase among the increased and decreased IRpPs. If the predicted
   target phosphopeptides
   of a kinase were enriched in the increased or decreased IRpPs (q value
   <0.05), then we selected the KSRs between the kinase and its predicted
   substrates contained in the increased or decreased IRpPs, respectively.
   The q values were calculated by the Benjamini-Hochberg procedure
   ([465]Benjamini and Hochberg, 1995).
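
   The classifier selection rule (posterior above 0.035 and above the
   prior) can be sketched as a simple filter. The record layout below
   is a hypothetical parsed form, not the actual NetPhorest output
   format, and the classifier names and values are made up.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, float]]


def select_classifiers(predictions: List[Record],
                       min_posterior: float = 0.035) -> List[Record]:
    """Keep predictions whose posterior exceeds both the cutoff and the prior."""
    return [
        p for p in predictions
        if p["posterior"] > min_posterior and p["posterior"] > p["prior"]
    ]


if __name__ == "__main__":
    # Hypothetical predictions for a single phosphosite.
    site_predictions = [
        {"classifier": "group_1", "posterior": 0.20, "prior": 0.05},
        {"classifier": "group_2", "posterior": 0.04, "prior": 0.06},
        {"classifier": "group_3", "posterior": 0.01, "prior": 0.005},
    ]
    print([p["classifier"] for p in select_classifiers(site_predictions)])
```

   Each retained record corresponds to one candidate KSR edge between a
   kinase classifier and a phosphorylation site, before the enrichment
   step described above.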

Prediction of upstream signaling pathways regulating the predicted TFs

   We predicted upstream insulin signaling pathways regulating the
   predicted TFs using the IRpPs, the PPI network following insulin
   stimulation extracted from our previous study ([466]Vinayagam et al.,
   2016), the kinase-IRpP relationships predicted by NetPhorest
   ([467]Horn et al., 2014; [468]Miller et al., 2008) and Fisher’s exact
   test, and the MIST PPI network ([469]Hu et al., 2018). We performed the
   prediction in the following three steps as shown in [470]Figure 3D. In
   Step III-I, we constructed a network by merging the IRpPs
   ([471]Figure 3A), the PPI network following insulin stimulation
   ([472]Figure 3C), the predicted KSRs ([473]Figure 3B), and the
   predicted TFs ([474]Figure 2B). Hereafter, we denote the obtained
   network as the “PPI and phosphorylation network”. We also merged the
   MIST PPI network and the PPI and phosphorylation network. To obtain the
   sub-network involved in signal transduction, we extracted the
   subnetworks that consisted only of the proteins in signaling pathways
   in the KEGG database, the predicted kinases, or the predicted TFs from
   the obtained network. The signaling pathways in the KEGG database were
   defined as pathways including the character string of “signaling
   pathway” in their names. We denoted this subnetwork as the “merged
   network”. In Step III-II, we extracted the shortest paths from the
   nodes in the PPI and phosphorylation network to each of the TFs from
   the merged network. Then we merged the PPI and phosphorylation network
   with the network consisting of the extracted shortest paths. In Step
   III-III, to predict the signaling pathways activated/inhibited in an
   insulin-dependent manner, we extracted the paths from the insulin
   receptor to each of the TFs in the network obtained in Step III-II. We
   defined the paths from the insulin receptor to a TF as the insulin
   signaling pathway regulating the TF. In this study, the maximum
   shortest path length from the insulin receptor to each TF was 5. We
   extracted the paths whose lengths were 5 or less to reduce the false
   positives while maximizing the number of the TFs with predicted
   upstream signaling pathways.
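
   Step III-III can be illustrated with a depth-limited path search. In
   this sketch the network is assumed to be a directed adjacency
   dictionary, path length is counted in edges, and only simple paths
   (no repeated nodes) are returned. The node names in the example are
   placeholders rather than edges from the actual network.

```python
from typing import Dict, List, Sequence


def paths_up_to(graph: Dict[str, Sequence[str]], source: str, target: str,
                max_len: int = 5) -> List[List[str]]:
    """All simple paths from source to target with at most max_len edges."""
    found: List[List[str]] = []

    def extend(path: List[str]) -> None:
        node = path[-1]
        if node == target:
            found.append(path)
            return
        if len(path) - 1 == max_len:  # edge budget exhausted
            return
        for nxt in graph.get(node, ()):
            if nxt not in path:  # keep the path simple
                extend(path + [nxt])

    extend([source])
    return found


if __name__ == "__main__":
    # Hypothetical toy network.
    toy = {"InR": ["A", "B"], "A": ["TF1"], "B": ["C"], "C": ["TF1"]}
    for path in paths_up_to(toy, "InR", "TF1"):
        print(" -> ".join(path))
```

   Lowering max_len trades coverage (fewer TFs with any upstream path)
   for fewer long, less plausible paths, which is the trade-off the
   paragraph above describes.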

KEGG signaling pathway enrichment analysis for the upstream signaling
pathways of each TF

   The significance of the overlap between proteins in the signaling
   pathways of the KEGG database and the upstream signaling pathways of
   each TF was determined by one-tailed Fisher exact test, and signaling
   pathways with q values less than 0.05 were defined as significantly
   enriched. The q values were calculated by the Benjamini-Hochberg
   procedure ([475]Benjamini and Hochberg, 1995). We used the genes in any
   of the signaling pathways in the KEGG database as a background.

Step IV: identification of IRMs

   Metabolites that were detected in fewer than 60% of the replicates (3
   of 5 replicates) at any time point following insulin stimulation (0, 60,
   120, and 180 min) were removed from the analysis. To correct the bias
   among samples, the area under the peak obtained by the LC-MS
   measurement of each metabolite was normalized by the median of the area
   under the peak of all metabolites for each sample. The significance of
   the change at each time point compared to time 0 was tested by
   two-tailed Welch’s t-test for each metabolite. Metabolites that showed
   an absolute log[2] fold change larger than 0.5 and an FDR-adjusted p
   value (q value) less than 0.1 at any time point were defined as IRMs.
   The q values were calculated by Storey’s procedure ([476]Storey and
   Tibshirani, 2003).
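
   The per-metabolite calculations in this step can be sketched as
   follows. Several details are assumptions: the fold change is taken
   as the ratio of replicate means, the detection filter is expressed
   as a minimum count of 3 detected replicates per time point, and
   Storey's q value estimation is not reimplemented, so the q value is
   passed in. Only the Welch t statistic and its degrees of freedom are
   computed; the p value would come from a t distribution.

```python
import math
from statistics import fmean, median, variance
from typing import Dict, Sequence, Tuple


def passes_detection_filter(detected_per_timepoint: Sequence[int],
                            min_detected: int = 3) -> bool:
    """Keep a metabolite only if enough replicates detect it at every time point."""
    return all(count >= min_detected for count in detected_per_timepoint)


def median_normalize(sample: Dict[str, float]) -> Dict[str, float]:
    """Divide each peak area by the median peak area of the same sample."""
    m = median(sample.values())
    return {name: area / m for name, area in sample.items()}


def log2_fold_change(treated: Sequence[float], control: Sequence[float]) -> float:
    """log2 ratio of replicate means (assumed definition of fold change)."""
    return math.log2(fmean(treated) / fmean(control))


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (fmean(a) - fmean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def is_irm(log2_fc: float, q_value: float) -> bool:
    """IRM criterion at one time point: |log2 FC| > 0.5 and q < 0.1."""
    return abs(log2_fc) > 0.5 and q_value < 0.1
```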

Step V: connection of the IRMs/IRGs and metabolic reactions

Identification of allosteric regulation

   We identified IRMs that function as allosteric regulators for metabolic
   enzymes using the BRENDA database, which is a database with information
   regarding allosteric effectors and their target enzymes ([477]Chang
   et al., 2021). A metabolite can operate as an activator for some
   enzymes and as an inhibitor for others. We obtained the entries for
   metabolic enzymes from the KEGG database and extracted their allosteric
   effector (activator and inhibitor) information, as reported for at
   least one organism in BRENDA. Then, we associated the standard compound
   names of allosteric effectors used in BRENDA with metabolite names that
   were used in KEGG to obtain the KEGG compound ID related to each
   allosteric effector.

Identification of substrates, products, and metabolic enzyme genes involved
in metabolic reactions

   We extracted the substrates, products, and metabolic enzymes related to
   each metabolic reaction from the KEGG database. Because the
   reversibility of metabolic reactions was not available in a
   comprehensive manner, metabolic reactions were presumed to be regulated
   by both the substrate and product.
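
   Because reversibility is not assumed to be known, every IRM that
   appears on either side of a reaction is connected to that reaction.
   The sketch below shows this rule on a hypothetical reaction table;
   the dictionary layout, identifiers, and function name are
   assumptions and do not reflect the KEGG data format.

```python
from typing import Dict, List, Sequence, Set, Tuple

Reaction = Dict[str, Sequence[str]]


def irm_reaction_edges(reactions: Dict[str, Reaction],
                       irms: Set[str]) -> List[Tuple[str, str, str]]:
    """Edges (metabolite, reaction, role) for IRMs acting as substrate or product."""
    edges = []
    for reaction_id, reaction in reactions.items():
        for role in ("substrates", "products"):
            for metabolite in reaction[role]:
                if metabolite in irms:
                    edges.append((metabolite, reaction_id, role[:-1]))
    return edges


if __name__ == "__main__":
    # Hypothetical reactions and IRMs.
    reactions = {
        "R1": {"substrates": ["m1"], "products": ["m2"], "enzymes": ["e1"]},
        "R2": {"substrates": ["m2"], "products": ["m3"], "enzymes": ["e2"]},
    }
    print(irm_reaction_edges(reactions, {"m1", "m2"}))
```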

Step VII: integration of the trans-omic network of insulin action with a
CRISPR screen for cell proliferation

Identification of screen hits of the CRISPR knockout screen for cell
proliferation

   We identified screen hits that significantly affect cell proliferation
   from the CRISPR screen for cell proliferation measured in Drosophila
   S2R+ cells as described in our previous study ([478]Viswanatha et al.,
   2018).

Identification of differences in the distributions of Spearman correlation
coefficients between groups of IRG pairs

   Differences in the distribution of Spearman correlation coefficients
   between groups of IRG pairs were tested using the Wilcoxon rank-sum
   test. We defined distributions between groups with Bonferroni-corrected
   p values less than 0.05 as significantly different. ∗p < 0.05,
   ∗∗p < 0.01, ∗∗∗p < 0.001.
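
   A minimal sketch of the rank-sum comparison is given below. It uses
   the large-sample normal approximation without tie or continuity
   correction and reports a two-sided p value; whether the authors used
   an exact or approximate, one- or two-sided test is not stated, so
   these choices are assumptions, as are the function names.

```python
from statistics import NormalDist
from typing import Sequence


def rank_sum_p_value(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sided Wilcoxon rank-sum p value (normal approximation)."""
    combined = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average rank for ties
        i = j + 1
    n1, n2 = len(a), len(b)
    rank_sum_a = sum(r for r, (_, label) in zip(ranks, combined) if label == 0)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u - mean_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Hypothetical correlation coefficients for two groups of IRG pairs.
    group_1 = [0.82, 0.75, 0.91, 0.66, 0.88]
    group_2 = [0.10, -0.05, 0.32, 0.21, 0.02]
    print(bonferroni(rank_sum_p_value(group_1, group_2), n_tests=3))
```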

Prediction of subnetworks required for cell growth in an insulin-dependent
manner

   We aimed to identify subnetworks that are required for cell growth in
   an insulin-dependent manner by using the trans-omic network of insulin
   action and the CRISPR screen data for cell proliferation
   ([479]Viswanatha et al., 2018). We hypothesized that, for a TF that is
   required for cell growth in an insulin-dependent manner, both its
   upstream signaling molecules and its target IRGs are likely to affect
   cell growth, either negatively or positively. We predicted such TFs by
   GSEA
   ([480]Subramanian et al., 2005) using the Z-scores of the CRISPR
   screening and the information of the upstream signaling molecules and
   the target IRGs of each TF in the trans-omic network. We first
   extracted TF-centric subnetworks each containing a TF, its upstream
   signaling molecules ([481]Figure 3E), and its target IRGs
   ([482]Figure 2B). Next, we performed GSEA using the Z-score of the
   CRISPR screening for both the upstream signaling molecules and the
   target IRGs of the TF in each TF-centric subnetwork. When the results
   of GSEA were significant for both the upstream signaling molecules and
   the target IRGs of a TF (FDR < 0.05), and the TF itself was a screen
   hit, we considered the subnetwork to be potentially involved in cell
   growth.
   It has been proposed that the activity of a regulatory protein (e.g. a
   TF and a kinase) can be estimated based on the statistics (e.g.
   abundance and fold change) of their target molecules (e.g. target genes
   regulated by a TF and phosphopeptides regulated by a kinase)
   ([483]Alvarez et al., 2016; [484]Dugourd and Saez-Rodriguez, 2019),
   which is the foundation for the method used in this study to estimate
   the importance of each TF to cell growth by using the Z-scores of the
   CRISPR screening of the upstream molecules and downstream target genes
   of the TFs.
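
   The two ingredients of this step can be sketched as follows: the
   weighted running-sum enrichment score defined by Subramanian et al.
   (2005), and the decision rule applied to each TF-centric subnetwork.
   The analysis itself was run with the GSEA software; this sketch
   omits the permutation-based significance and FDR estimation, and the
   function names and toy scores are assumptions.

```python
from typing import List, Set, Tuple


def enrichment_score(ranked: List[Tuple[str, float]], gene_set: Set[str],
                     weight: float = 1.0) -> float:
    """Running-sum enrichment score of gene_set in a ranked gene list.

    `ranked` must be sorted by descending score (e.g. CRISPR Z-scores).
    Hits step up in proportion to |score|**weight; misses step down by
    a constant. The score is the largest deviation from zero. The gene
    set must contain at least one ranked gene with a non-zero score and
    must not contain every ranked gene.
    """
    hit_total = sum(abs(s) ** weight for g, s in ranked if g in gene_set)
    n_miss = sum(1 for g, _ in ranked if g not in gene_set)
    running, best = 0.0, 0.0
    for gene, score in ranked:
        if gene in gene_set:
            running += abs(score) ** weight / hit_total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def subnetwork_required(fdr_upstream: float, fdr_targets: float,
                        tf_is_screen_hit: bool, alpha: float = 0.05) -> bool:
    """Both gene sets significant by GSEA and the TF itself a screen hit."""
    return fdr_upstream < alpha and fdr_targets < alpha and tf_is_screen_hit


if __name__ == "__main__":
    # Hypothetical Z-scores, sorted from highest to lowest.
    ranked = [("a", 3.0), ("b", 2.0), ("c", 1.0),
              ("d", -1.0), ("e", -2.0), ("f", -3.0)]
    print(enrichment_score(ranked, {"e", "f"}))
```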

Implementation

   Statistical analyses and trans-omic network analysis were performed
   using Python 3.8.3 ([485]https://www.python.org) or R 3.6.3 ([486]R
   Core Team, 2021). Network diagrams were visualized using Cytoscape
   3.8.2 ([487]Shannon et al., 2003).

Acknowledgments