Abstract The study of systems genetics is changing the way the genetic and molecular basis of phenotypic variation, such as disease susceptibility and drug response, is being analyzed. Moreover, systems genetics aids in the translation of insights from systems biology into genetics. The use of systems genetics enables greater attention to be focused on the potential impact of genetic perturbations on the molecular states of networks that in turn affects complex traits. In this study, we developed models to detect allele-specific perturbations on interactions, in which a genetic locus with alternative alleles exerted a differing influence on an interaction. We utilized the models to investigate the dynamic behavior of an integrated molecular network undergoing genetic perturbations in yeast. Our results revealed the complexity of regulatory relationships between genetic loci and networks, in which different genetic loci perturb specific network modules. In addition, significant within-module functional coherence was found. We then used the network perturbation model to elucidate the underlying molecular mechanisms of individual differences in response to 100 diverse small molecule drugs. As a result, we identified sub-networks in the integrated network that responded to variations in DNA associated with response to diverse compounds and were significantly enriched for known drug targets. Literature mining results provided strong independent evidence for the effectiveness of these genetic perturbing networks in the elucidation of small-molecule responses in yeast. Introduction Elucidation of the molecular and genetic basis of phenotypic variation has been a longstanding goal in genetics, including all aspects of morphology, physiology, behavior, disease susceptibility, and drug response. Genome-wide association studies (GWAS) mapping quantitative trait loci (QTLs) have enabled the identification of regions of the genome in which genetic variations are associated with phenotypic variation. However, the computational understanding of the biological mechanisms underlying genetic variations remains unclear. Numerous previous studies have dissected the downstream effects of genetic perturbations on RNA intermediates, proteins, metabolites and other molecular endophenotypes [43][1], [44][2], [45][3], [46][4], [47][5]. Whole-genome expression QTL (eQTL) analysis in yeast, mice, and humans has demonstrated that gene expression traits are highly inheritable and exhibit surprisingly complex underlying genetic architecture [48][6], [49][7], [50][8]. By layering gene expression phenotypes as intermediate phenotypes, many studies have combined eQTL and disease GWAS to identify causal relationships between genes and disease (reviewed by Ertekin-Taner [51][5]). Therefore, further elucidation of changes in molecular states that directly respond to changes in DNA could potentially fill in the information gaps left by GWAS, and serves as an excellent first step to understanding the drivers of a complex phenotype. However, most proteins perform their functions through interactions with other proteins, or as part of biochemical pathways and networks. Thus, some genes may respond as groups due to their membership in networks. As an alternative to the studies assuming that genes act independently, certain previous studies have utilized an effective approach that assessed higher-order a priori defined gene network responses caused by genetic variation [52][9], [53][10], [54][11], [55][12], [56][13], [57][14], [58][15], [59][16]. However, these studies only considered the variation of the synthetic expression of genes in a priori defined networks, and failed to model how genetic variations are mediated by a network of molecular interactions in the cell. In addition, the association of these networks that are affected by genetic variations with the resultant phenotypic variation remains poorly understood. The use of systems genetics is changing the future of genetics. Moreover, it is a corollary to the basic insight of systems biology in that most complex traits of living things are properties generated by dynamic networks of interacting genes and molecules [60][17], [61][18]. In recent years, our ability to interpret the phenotypic variation in model systems, and ultimately in humans, has benefited from systems genetics approaches [62][17], [63][18], [64][19], [65][20], [66][21], [67][22], [68][23]. Chen et al. [69][21] constructed co-expression networks through the combination of gene expression and genotype data, and uncovered components of the co-expression networks that respond to genetic variations associated with disease-associated traits. This study confirmed that complex traits, such as obesity, are potentially emergent properties of molecular networks modulated by complex genetic loci and environmental factors. Zhong et al. [70][23] proposed a model to describe the effect of disease-causing mutations in Mendelian disorders on systems or interactome properties. Equipped with the tools emerging from the genomics revolution, the study of the effects of genetic perturbations on molecular networks will serve as an important aspect of systems genetics research. Various biochemical or biophysical interaction(s) are the building blocks of biological functions and processes and are also the basic units of molecular networks; thus, it is important to investigate the effects of genetic variations on molecular interactions. Recently, several reports have explored the effects of genetic variations on protein-DNA interactions [71][24] or protein-protein interactions [72][23] from the perspective of the structural properties of physical binding. In this study, through integrating gene expression data, genotype data, and molecular interactions, we developed models to detect two types of allele-specific perturbations on interactions, termed allele-specific co-perturbation (ASCP) and allele-specific dys-perturbation (ASDP). In both cases, a genetic locus exerted a different influence on the transcriptional program of interactions for cases in which the locus exhibited distinct (alternative) alleles. In this study, using a well-controlled system, we aimed to investigate how molecular networks respond to DNA perturbations based on the interaction perturbation models (ASCP and ASDP), and thus, drive variations in physiological states associated with phenotypic variation. Therefore, we used a panel of 112 genotyped and expression-profiled yeast strains [73][6] which were also treated with a collection of 100 diverse compounds, termed small-molecule perturbagens (SMPs) [74][25], to investigate the effect of genetic variations (such as Single Nucleotide Polymorphisms, SNPs) on the dynamic behavior of an integrated interactome of yeast, and to elucidate the molecular mechanisms underlying individual differences in response to small-molecule drugs in yeast based on networks that exhibit genetic perturbation. Results EQTL Mapping We performed eQTL mapping using gene expression and SNP genotyping data on 112 segregants generated in a cross between laboratory (BY) and wild (RM) strains of Saccharomyces cerevisiae [75][6]. We sought to identify an initial set of potential associations between eQTLs and their target genes for further analysis. In order to determine the statistical significance more accurately, we merged adjacent markers that exhibited the same genotype profile to obtain a total of 1118 representative markers and selected 4500 transcripts with significantly high heritability ( Inline graphic >0.669 at a false discovery rate, FDR of 0.05) (see [76]Materials and Methods). Finally, we performed linkage calculations between 1118 genetic markers and 4500 transcript levels using the Student’s t-test, and assessed significance via permutations (see eQTL mapping section in [77]Materials and Methods for algorithmic details). As a result, we detected 30,793 associated transcript-locus pairs (FDR<0.05). There were 3175 transcripts (3141 ORFs) linked to at least one QTL and an average of 9.7 QTLs per transcript. A total of 1112 markers were linked to at least one target transcript. To determine whether the loci identified by linkage act in cis or in trans, we investigated transcripts whose levels were linked to markers within 10 kb of their own gene. We found that 656 (21%) of the 3175 transcripts fell into the cis-acting category and most expression differences mapped to trans-acting loci. Our findings are in concordance with other reports in which trans-acting loci appear to be responsible for most differences in gene expression between yeast strains [78][26], [79][27]. Determining Allele-specific Perturbation of Interactions To assess how variations at the genetic level that affected gene expression levels (eQTLs) would alter molecular interaction states in the cell, we introduced allele-specific perturbation of interaction; we defined this as: a genetic locus with alternative alleles that exerts a different influence on an interaction. Two models were developed to detect the allele-specific behavior of interactions: allele-specific co-perturbation (ASCP) and allele-specific dys-perturbation (ASDP) (a detailed description can be found in the [80]Materials and Methods section). ASCP denotes that an eQTL synchronously regulated the expression of two interacting genes. ASDP denotes that an eQTL triggered significant changes in expression correlation between two interacting genes. Each interaction in an integrated interactome, containing protein-protein interactions (PPI), protein-DNA interactions (PDI), kinase-protein interactions (KPI), and enzyme-enzyme interactions (EEI) (a detailed description of the interaction types can be found in the [81]Materials and Methods section), was then analyzed to determine which demonstrated allele-specific behavior based on the two models. As a result, we identified 10440 ASCP relationships between 408 eQTLs and 2184 interactions, and 5774 ASDP relationships between 758 eQTLs and 2416 interactions (P<0.001, FDR<0.082). A total of 70.86% (788/1112) of the eQTLs perturbed at least one interaction. In addition, we analyzed the cis-acting or trans-acting regulation between the genetic loci and their perturbed ASCP and ASDP interactions. Five different conditions were investigated and detailed information is shown in [82]Table S1. Interestingly, we found that many interacting genes in ASCP and ASDP interactions are located on the same chromosomes, even located adjacent to their regulators (eQTLs). This phenomenon may reflect the linear arrangement and aggregation of yeast genes on the same chromosomes which are required for coordinate expression of genes involved in related metabolic or regulatory pathways [83][28], [84][29], [85][30]. In addition, among the 3141 target genes associated with at least one eQTL, approximately 65% exhibited allele-specific perturbation of interactions with their partners in the interactome. It has been suggested that for a considerable portion of target genes, the eQTLs that disturbed their expression levels would also cause a degree of dynamic variation in their local network. Various biochemical or biophysical interaction(s) are the basic units of a molecular network and also the building blocks of biological functions and processes; thus, exploration of the allele-specific behavior of interactions makes it possible to test the downstream effects of genetic perturbations on molecular networks. Comparison of the Effect of ASCP and ASDP on Networks [86]Table 1 shows the overall statistics of perturbed interactions detected by ASCP and ASDP models. Firstly, we found that 13.53% (4443/32831) of the interactions between target genes (of eQTLs) and their partners in the integrated network are perturbed by at least one eQTL. This finding indicates that many of the interactions (∼86%) may behave consistently across individuals with different genotypes and represent a cellular network ‘backbone’. Secondly, we found that PPIs in complexes exhibited ASCP more frequently. When annotated to 305 protein complexes obtained from the CYC database [87][31], 162 of the 939 (17.3%) ASCP PPIs were found to be in the same complexes, whereas only 56 of the 819 (6.8%) ASDP PPIs were located in the same complexes (Fisher’s exact test, P = 1.184e–11). For PDIs, regulatory relationships between TFs (TF-TF) were also observed more frequently in ASCP (Fisher’s exact test, P = 0.03809), whereas phosphorylation events between two kinases (K-K) were observed more frequently in ASDP (P = 0.03245) ([88]Figure S1A). Table 1. Summary of the global effect of ASCP and ASDP on the four studied molecular interaction networks and the integrated interactome. NETWORK Edges ASCP ASDP Both Percentage PPI 12067 939 819 1703 14.11% PDI 6815 524 616 1056 15.50% KPI 13532 550 842 1336 9.87% EEI 1912 261 237 421 22.02% COMBINED 32831 2184 2416 4443 13.53% [89]Open in a new tab The second column denotes the total number of interactions detected for allele-specific behavior. The third and the fourth column represent the number of ASCP and ASDP interactions detected respectively. The fifth column represents the total number of perturbed interactions when considering both ASCP and ASDP, while the proportion they occupy in all interactions tested is listed in the sixth column. In order to facilitate analysis, we constructed two networks called CPN and DPN that contained all of the ASCP and ASDP interactions, respectively. As shown in [90]Figure S1B, most interactions in CPN are connected and form a large connecting component that contains ∼90% of all nodes; the same is observed in DPN (∼93% of all nodes). These results indicate that the transmission of the effect of genetic variations is mediated by network continuity. By merging these two types of perturbed interactions, a larger system is formed, thus, the coordination of these two disturbance effects is evident. We evaluated the degree distribution of genes and observed a power-law in both CPN and DPN ([91]Figure S1C). This suggests that both models of genetic perturbing networks display scale-free characteristics [92][32]. CPN and DPN shared approximately 39.8% (910/2286) proteins but only approximately 3.53% edges, indicating that a large quantity of interactions is rewired between CPN and DPN ([93]Figure S1D). Thus, these two models represent various genetic perturbing actions on a molecular network. Hence, ASCP and ASDP could potentially reveal the condition-specific dynamic information hidden among otherwise common static interactions. The Interactome Exhibits a Modular Changing Pattern Under Allele-specific Context As discussed above, when genetic variations occurred on DNA sequences, some gene expression levels are affected (eQTL mapping), and cause the alteration of interaction states (allele-specific perturbation on interactions). Next, dynamic changes in networks would be triggered. In this study, we introduced an allele-specific context to capture the changing patterns of a network from one allele state to another. Due to the linkage disequilibrium (LD) of adjacent eQTLs, we identified 263 haplotype LD blocks (B1–B263) defined by the non-random association of genetic variants at two or more loci using Haploview software [94][33]. We then assembled all ASCP and ASDP interactions associated with the eQTLs located in the same block region. As a result, we found 246 blocks associated with at least one interaction; of these, 180 were associated with at least three edges. Next, we built an association matrix that represented the allele-specific perturbation relationships among all of the blocks and their perturbed interactions, and performed hierarchical clustering on the perturbed interactions. [95]Figure 1A shows the clustering results of four sub-matrices extracted from the overall association matrix. These sub-matrices represent the associations between 32 blocks and 1446 PPIs, 30 blocks and 937 PDIs, 23 blocks and 1078 KPIs, and 14 blocks and 366 EEIs, respectively. Our findings indicated that for all four types of networks, each block primarily regulated a small portion of the network and that distinct parts of the network are primarily modulated by a few particular blocks. Thus, a complicated combination is evident, in terms of both target specificity and cooperativity of blocks. Moreover, the overall integrated network presented the same changing pattern under an allele-specific context ([96]Figure S2). Figure 1. The changing pattern of molecular networks under allele-specific context determined by LD blocks. [97]Figure 1 [98]Open in a new tab (A) Hierarchical clustering on the perturbed interactions based on the association matrix between blocks and their perturbing interactions in PPI, PDI, KPI, and EEI networks, respectively, in which rows represent blocks and columns represent interactions. In each of the four sub-matrices, blocks associated with less than ten interactions are filtered out. The columns are reordered according to hierarchical clustering. The rows colored correspond to B20. (B) The allele-specific sub-network associated with B20. The sub-network is composed of 142 PPIs, 54 PDIs, 93 KPIs, and 26 EEIs, which are marked with green, red, blue and orange, respectively. Nodes marked with red are proteins constituting a GO cellular component (nuclear lumen), and PPIs among these genes are regulated by B20. Nodes marked with green are enzymes from a KEGG pathway (DNA-directed RNA polymerase activity), and their EEIs are regulated by B20. To further analyze the role that individual blocks play in the regulation of the integrated network, we mapped an allele-specific sub-network for each block. Each sub-network corresponded to a minimal component in the original integrated network that contained all the genes consisting of the interactions associated with a block. It is guaranteed that all of the interactions perturbed by the block are located in this network component. We investigated the sub-graph properties of these sub-networks to determine their topological features in the integrated network. A summary of the results is listed in [99]Table 2 and [100]Figure 2A. Computer simulation results demonstrated that the topological features of these sub-networks were significantly different from randomized networks (details of the randomization test are provided in [101]Materials and Methods). The results indicated that the characteristic path length between genes in the same sub-network was significantly shorter, and the average density of these sub-networks was significantly greater. Moreover, the average in-degree ratio of the sub-networks was significantly higher, indicating that proteins composing the allele-specific sub-network jointly exhibit significant modularity in the integrated network. The in-degree ratio is defined as the ratio (R) of in-degree to out-degree of a sub-network, in which “in-degree” represents the number of its connections within the sub-network, and “out-degree” represents its connections outside sub-network [102][34]. In total, 162 blocks (89.5% of 180 blocks investigated) were demonstrated to perturb interactions located in a module of the integrated network (FDR<0.05). In other words, the alteration of allele state of a block potentially leads to changes inside a network module. In addition, modules in the integrated network are composite modules; thus elucidation of the complex relationships underlying multiple biological interaction types will further the understanding of complex cell processes. For example, we found that B20 on chromosome II regulates 318 interactions among 246 genes, including 93 KPIs, 54 PDIs, 142 PPIs, and 26 EEIs ([103]Figure 1B). Most interactions (92%) perturbed by B20 were connecting components gathered in a network module, moreover, various types of interactions were gathered in certain areas; the PPIs in the area marked by orange constitute a cellular component called nuclear lumen, and the EEIs in the area marked by green are derived from a pathway that controls DNA-directed RNA polymerase activity. These findings suggest that the information exchange and collaboration occurs among multi-layer molecular networks. Table 2. Summary of topological properties of the allele-specific sub-networks and their Z-scores. All ASN Random Mean Mean p-value Mean Char.path length 2.857 2.0548 −12.34 5.57e–035 4.27 Density 0.004 0.2226 21.51 1.18e–102 0.046 In-degree ratio N.A. 0.0175 18.42 8.43e–076 0.0065 [104]Open in a new tab Topological properties of the yeast integrated network (All), the allele-specific sub-networks associated with blocks (ASN), and the random sub-networks (Random). The average Z-scores and the p-values for the observed properties of allele-specific sub-networks are also listed, respectively. N.A. not applicable. Figure 2. A summary of the sub-graph properties of the sub-networks associated with LD blocks. [105]Figure 2 [106]Open in a new tab (A) The characteristic path length among genes in the same allele-specific sub-networks is significantly shorter compared to a randomization test. Moreover, the density and in-degree ratio of these sub-networks are significantly greater compared to a randomization test. Boxes of light color represent the distribution of the sub-graph properties (characteristic path length, density, in-degree ratio) of allele-specific sub-networks, and the black boxes correspond to random ones. (B) The panel shows the difference in CSR values between block pairs on the same chromosomes and on different chromosomes. P-values are calculated using the Wilcoxon rank-sum test. Proximal Blocks Regulating Similar Network Components It has been observed previously that different chromosomal regions (LD blocks) regulate common network modules together. Here, we found a total of 423 pairs of blocks associated with overlapping allele-specific sub-networks. We introduced a score termed Component Share Ratio (CSR) to measure the degree of overlap between two network components regulated by two distinct blocks ([107]Materials and Methods). This value ranges from 0 to 1; the higher the score, the more overlap between two network components, with 1 indicating that two blocks regulate identical network components. We divided 423 pairs of blocks that exhibited a CSR value into two groups: 211 block pairs were located on the same chromosome and 212 on different chromosomes. Next, we calculated the significance of the differences between the two groups and found that block pairs on the same chromosome exhibited significantly higher CSR values (log-transformed) compared to block pairs on different chromosomes (P = 4.9165e-051, [108]Figure 2B). In this study, a higher CSR value reflects a higher gene composition and function similarity between two blocks’ perturbed network modules. It is observed that the regulator-pairs (LD block-pairs) on the same chromosome have much higher CSR values. Moreover, the network modules regulated by the blocks on the same chromosomes, especially on adjacent regions, are enriched in metabolic processes, signaling pathways and enzyme catalyzed processes. It is suggested that LD blocks on the same chromosomes, especially adjacent blocks tend to involve in the regulation of common or correlated functions. This phenomenon may reflect the coordinated regulation requirement of genes involved in related metabolic or regulatory processes [109][29], [110][30]. In addition, some network modules perturbed by block-pairs on separate chromosomes also have high CSR values; this may imply the universality and genetic complexity of the yeast genome in the regulation of molecular networks. Biological Features of the Allele-specific Sub-networks Associated with Blocks We evaluated the biological features of these allele-specific sub-networks. We classified 2286 genes from all of the sub-networks into 85 functionally related gene groups (or classes) (F1–F85) with enrichment score > = 1.3 (equivalent to non-log scale 0.05) using the Functional Classification Tool provided by DAVID [111][35], [112][36], with 833 genes not included in any group. Next, we performed functional enrichment analysis using hypergeometric distribution and detected 158 enrichment relationships (FDR<0.05 P<0.0039) between 87 sub-networks and 48 functional groups. Using blocks to represent their associated sub-networks, these enrichment relationships are presented as a bipartite graph between 87 blocks (triangle) and 48 functional groups (circle) ([113]Figure S3). To create an intuitive display of the role that different chromosome segments (LD blocks) played in molecular network regulation or perturbation, we filtered out 57 relationships between 36 blocks and 16 functional groups (57 red edges in [114]Figure S3). For each enrichment relationship between a block and a functional group, we extracted the perturbed interactions associated with the block and belonging to the enriched functional group. We re-constructed a network, shown in [115]Figure 3, by combining all of the interactions extracted; the functional modular regulation mode of genetic variations on the multi-layer integrated molecular network can be seen. Different chromosomal regions (blocks) that control or affect different parts or modules of the network exhibit intuitive biological features and trigger the modification of specific cellular functions. For instance, blocks on chromosome III (B29–B31) and X (B144) (yellow rectangles) are associated with a network module containing 14 enzymes (black box). This module is generally annotated as the “methionine biosynthetic process” (enrichment significance: P = 8.0E–27), and nine enzymes are located in the metabolism pathway-“sulfur metabolism: reduction and fixation”. Moreover, some functional groupings are found to be under the control of common blocks. For example, blocks on chromosome XIV (B219–B222) are associated with the cellular component “mitochondrial ribosome” and also regulate other functional classes, such as glycolysis, ATP binding, transcription, and the cell cortex. The mitochondrial ribosome is responsible for the biosynthesis of protein components crucial to the generation of ATP in the eukaryotic cell. Similar cases are the perturbation of blocks B237–B238 (simultaneously influencing oxidative phosphorylation and mitochondrial inner membrane), B111 (nuclear chromatin and nuclear division), and B29–B30 (methionine biosynthetic process and branched chain family amino acid biosynthetic processes). Most of the joint perturbations by blocks reflect the coordination among different biological processes, molecular functions, and cellular components in a certain cellular function, and ATP (ATP binding) is one of the most generic sites (targeted by B179–B184, B228, etc.) in these processes ([116]Figure 3). Figure 3. Biological features of the allele-specific sub-networks associated with LD blocks. [117]Figure 3 [118]Open in a new tab A reconstructed network created by combining all of the allele-specific perturbed interactions associated with the 36 blocks and belonging to their enriched 16 functional groups. The genes belonging to different functional groups are marked with the same colors as in [119]figure S3; also, chromosome blocks are marked using the same colors as their regulating functional classes. Functional annotation was conducted for each dyed gene module using DAVID based on GO and KEGG, and the most enriched functional item covering total genes (100%), along with its significance level (p-value), is presented near the dyed gene module. The gene module marked yellow is mainly associated with blocks on chromosome III (B29–B31) and X (B144); the lower right corner is a schematic representation of the metabolic pathway: sulfur metabolism, reduction and fixation, and nine enzymes in the module annotated to the pathway are marked with red rectangles. Application in Understanding Small-molecule Drug Response in Yeast We utilized the model of genetic perturbing networks discussed above to gain a systematic understanding of small molecule perturbagen (SMP) response in genetically distinct yeast individuals. Firstly, we conducted linkage analysis between 324 phenotypes (segregant final yields of 100 SMPs at multiple time points and concentrations) and 2,956 genetic markers previously genotyped in the segregants [120][6], [121][25]. We identified 1470 QTLs with an absolute value t statistic score >3.28 (FDR<0.05) using a method identical to eQTL mapping, and 201 QTLs were in concordance with previous studies [122][25], covering nearly 92% of the original result (219 QTLs with a logarithm of the odds (lod) score Inline graphic 4). Because all time points and concentrations of each SMP were used as independent phenotypes, we combined QTLs associated with the same SMP but different concentrations and time points, and obtained QTLs for 92 SMPs. The position distribution for the QTLs along the genome is shown in [123]Figure 4A. Furthermore, for 91 SMPs, we obtained a sub-network perturbed by QTLs associated with each SMP response trait. Among these SMP response associated sub-networks, 16 were significantly enriched for known drug targets based on the entire chemical–protein interactions set obtained from the STITCH database [124][37], such as hydrogen peroxide (H2O2) (P = 5.70E–06) and menadione (P = 3.50E–07) ([125]Figure 4B and [126]Table 3). In total, 22 high-confidence (SCORE Inline graphic 700) yeast chemical–protein interactions were found in SMP associated networks, such as hydrogen peroxide, which targets the proteins (SOD2, CYB2); rapamycin, which targets the proteins (TOR2, LST8); and staurosporine, which targets the protein (YPL236C). This suggests that these sub-networks potentially reflect the molecular mechanisms of the drug action. Taking hydrogen peroxide (H2O2) as an example, the sub-network response to it is highly enriched for the GO biological processes ‘ergosterol metabolic process’, ‘lipid biosynthetic process’, and ‘oxidation reduction’ (P<5.7E–24, P<2.1E–19, and P<2.1E–7). The expression levels of the majority of the genes annotated to these GO terms were down-regulated in response to H2O2 in wild type ([127]Figure 5, green triangles in left branch under allele state from RM). This observation is in concordance with the report by Pedroso and Folmer [128][38], [129][39] in which the adaptation to H2O2 was observed to repress the expression of genes (ERG3, ERG6, ERG7, ERG25, FEN1) coding for enzymes involved in both ergosterol biosynthesis and lipid metabolism in wild strains. The QTLs associated with H2O2 are located on chromosome XII and chromosome XIII. A LD block on chromosome XII contains the candidate gene HAP1 (YLR256W), which encodes a heme-responsive zinc-finger transcription factor whose expression varies according to oxygen levels in the cell, thus regulating oxygen dependent gene expression. As shown in [130]Figure 5, HAP1 serves as an important regulator and it regulates most of the proteins associated with H2O2 response. It has been previously reported that HAP1 with a Ty1 insertion induces expression changes of sterol biosynthesis genes [131][40]. Additionally, it is a known drug target of H2O2 and is included in the STITCH database. By utilizing our proposed model, we were able to identify molecular network components that respond to small-molecule drugs; this provided insight into how networks states changed from wild-type to mutant-type when responding to SMPs ([132]Figure 5). Characterizing molecular network states that underlie complex traits, such as drug response, potentially provides a comprehensive view, which in turn could potentially lead to the direct identification of genes or pathways underlying drug response processes and aid in the elucidation of the functional roles of these genes with respect to drug response. Thus, based on the modeling of genetic perturbing networks that respond to SMP, we investigated the potential biological application of drug function and drug target prediction, and investigated whether our results would uncover new and possibly non-intuitive relationships between biochemical pathways. Figure 4. The QTLs distribution and drug targets enrichment of SMPs. [133]Figure 4 [134]Open in a new tab (A) The QTLs distribution along the whole genome for 53 small-molecule drugs. QTLs associated with different SMPs are marked with various colors. (B) The sub-networks responding to the SMP response of hydrogen peroxide and menadione. Red nodes are known drug targets included in the STITCH database. Table 3. Summary of enrichment analysis of known drug targets in the sub-networks associated with 16 small-molecule drug responses. Compound response m k x N p q (<0.05) hexylresorcinol 1 96 1 6063 0 0 rapamycin 265 966 76 5799 2.59E–08 2.59E–07 menadione 149 637 36 5915 3.50E–07 2.10E–06 staurosporine 123 755 35 5941 4.21E–07 2.10E–06 cycloheximide 188 365 28 5876 2.04E–06 8.15E–06 hydrogenperoxide 314 205 26 5750 5.70E–06 1.90E–05 anisomycin 33 450 10 6031 1.44E–05 4.11E–05 ascomycin 10 1007 6 6054 0.000257504 0.000643761 tamoxifen 33 147 4 6031 0.001073139 0.002384754 doxorubicin 109 123 7 5955 0.001565578 0.003131156 cerulenin 57 391 9 6007 0.003105336 0.005646066 mastoparan 10 206 2 6054 0.003885136 0.005977133 niguldipine 4 611 2 6060 0.003767185 0.005977133 tomatine 9 238 2 6055 0.004206531 0.00600933 gliotoxin 24 124 2 6040 0.01234541 0.016460546 clotrimazole 26 70 1 6038 0.035718741 0.044648426 [135]Open in a new tab x represents the number of known targets for the SMP drawn from the sub-network associated with the SMP; k represents the number of genes in the sub-network associated with the SMP; m represents the number of known drug targets for the SMP in the background gene set; n represents the number of genes in the background gene set excluded the known drug targets for the SMP; p-value is calculated as 1-phyper(x, m, n, k), where phyper is a function used in R. Figure 5. The state of the sub-networks in response to hydrogen peroxide in segregants inherited alleles from RM and BY. [136]Figure 5 [137]Open in a new tab Solid edges indicate correlated expression between proteins in one allele state, whereas that organization is lost in another allele state (dotted line). Edge colors represent diverse interaction types between proteins, whereas node colors represent changes in gene expression between wild and mutant groups; green denotes down-regulation and red denotes up-regulation. Triangle nodes present genes annotated in ‘ergosterol metabolic process’, ‘lipid biosynthetic process’, and ‘oxidation reduction’. Nodes with dark edges are known drug targets of hydrogen peroxide based on Stitch2 database. In the SMP associated networks, we observed many perturbed interactions (ASCP and ASDP) between known drug targets and other uncharacterized genes in drug response, allowing for potential drug target predictions. Comparing between known drug target genes and other genes, we found that, in response to SMPs, ASCP or ASDP interactions tend to occur among drug target genes. Therefore, we predicted candidate drug targets based on their connecting degree with known drug targets in SMP associated networks. The candidate genes listed in [138]Table S2 were all connected to Inline graphic known drug targets. Among them, 59% (10/17) were found to exhibit resistance to the chemicals analysed in this study [139][41], [140][42], [141][43], [142][44], [143][45], [144][46], [145][47], [146][48], [147][49]. INO4, as an example, which is a transcription factor associated with phospholipids synthesis, shows differential interactions with nine genes targeted by eight drugs. Of these drugs, rapamycin, cycloheximide, lycorine, cerulenin, nocodazole, and resveratrol are classical cell proliferation inhibitors that are extensively utilized in cancer therapy and cell synchronization in molecular biology experiments [148][41], [149][42], [150][46], [151][47]. It has also been reported that gene INO4 may be affected by bleomycin and fenpropimorph, and the mutant type will produce resistance. These two drugs are both antibiotics and have been considered for use in cancer treatment. INO4 is seen as a potential drug target, particularly because it potentially plays a role in cell growth inhibition, and even cancer pathology. Regarding the other candidate genes, RPN4, STE12, FKH2 and CIN5 are also transcription factors that regulate important biological processes, such as RNA transcriptional elongation, MAPK signaling pathway, and proteasome genes expression. YCK3, PKP2, PKP1, and SKM1 are important kinases and exhibit serine/threonine kinase, casein kinase and mitochondrial protein kinase activity. FBA1 is fructose 1,6-biophosphate aldolase, which is required for glycolysis and gluconeogenesis. These genes exhibit certain functional features as drug targets, and published reports demonstrate that they exhibit reactions with certain known drugs and compounds; the natural or induced mutations will lead to mutant phenotypes. These annotations suggest that our genetic perturbing network results may be effective in drug target prediction. Moreover, we conducted pathway enrichment analysis for each SMP associated network using DAVID based on KEGG. We detected 207 enrichment relationships (Beniamini<0.05) between 45 SMPs and 22 KEGG pathways ([152]Figure S4). It has been demonstrated that multiple pathways are involved in a SMP response. This suggests that cooperation among pathways may exist in the process of small molecular drug responses and that there is the possibility of discovering new and non-intuitive relationships between biochemical pathways. We found that 124 pairs of pathways were involved in at least one common SMP response. Literature mining results based on PubMatrix [153][50] have provided strong independent evidence for the correlation among these pathways, 58% (72/124) of the pathway pairs co-occurred in at least one reference (detailed information for pathway pairs involved in >9 common SMPs is listed in [154]Table S3). Some pathway pairs are already recognized as correlated pathways based on KEGG. Ten SMPs (cycloheximide, parthenlide, menadione and rapamycin et al.) clustered in the center of SMP-pathway graph (red rectangle area in [155]Figure S4) targeted approximately the same pathways. These pathways all relate to protein biosynthesis, and are in concordance with the protein translation inhibition functional role of these SMPs. Some internal mechanisms of the SMP effects can be indicated in the SMP-pathway graph. For instance, glycolysis/gluconeogenesis is co-targeted by valinamycin, rapamycin, tunicamycin and fccp. However, the main effect of these four drugs is not regulation of the glucose distribution. A number of references suggest valinamycin plays a role in the glycolysis