Abstract Genome-wide association studies (GWASs) have discovered association of several loci with Type 2 diabetes (T2D), a common complex disease characterized by impaired insulin secretion by pancreatic β cells and insulin signaling in target tissues. However, effect of genetic risk variants on continuous glycemic measures in nondiabetic subjects mainly elucidates perturbation of insulin secretion. Also, the disease associated genes do not clearly converge on functional categories consistent with the known aspects of T2D pathophysiology. We used a systems biology approach to unravel genome to phenome correlation in T2D. We first examined enrichment of pathways in genes identified in T2D GWASs at genome-wide or lower levels of significance. Genes at lower significance threshold showed enrichment of insulin secretion related pathway. Notably, physical and genetic interaction network of these genes showed robust enrichment of insulin signaling and other T2D pathophysiology related pathways including insulin secretion. The network also overrepresented genes reported to interact with insulin secretion and insulin action targeting antidiabetic drugs. The drug interacting genes themselves showed overrepresentation of insulin signaling and other T2D relevant pathways. Next, we generated genome-wide expression profiles of multiple insulin responsive tissues from nondiabetic and diabetic patients. Remarkably, the differentially expressed genes showed significant overlap with the network genes, with the intersection showing enrichment of insulin signaling and other pathways consistent with T2D pathophysiology. Literature search led our genomic, interactomic, transcriptomic and toxicogenomic evidence to converge on TGF-beta signaling, a pathway known to play a crucial role in pancreatic islets development and function, and insulin signaling. Cumulatively, we find that GWAS genes relate directly to insulin secretion and indirectly, through collaborating with other genes, to insulin resistance. This seems to support the epidemiological evidence that environmentally triggered insulin resistance interacts with genetically programmed β cell dysfunction to precipitate diabetes. Introduction Our understanding of genetic basis of disease risk has greatly improved in recent years owing to the advent of genome-wide association studies (GWASs) [35][1], [36][2]. The genes influencing common complex or multifactorial diseases and quantitative traits were largely unknown before GWASs came into being in the year 2006 [37][2], [38][3]. Results obtained from these studies suggest that multiple genetic architectures, including common genetic variants with small effects and rare variants with large effect sizes, underlie susceptibility to common diseases [39][1]. Nearly 1,300 GWASs covering more than 650 diseases and traits have been reported over the past several years [40][4]. A typical GWAS involves typing hundreds of thousands of single nucleotide polymorphisms (SNPs) in thousands of control and affected individuals, and identifying SNPs that differ significantly between the two groups in terms of allele frequency as disease or trait associated [41][5]–[42][7]. Type 2 diabetes mellitus (T2D) is a common complex disease whose pathogenic mechanisms are known to a considerable extent [43][8], [44][9]. Several organs including pancreatic islets, liver, skeletal muscle, adipose tissues, gut, hypothalamus and the immune system play a role in its pathogenesis [45][10]. Numerous multifactorial mechanisms that include genetic and environmental factors related to obesity are involved in the development of insulin resistance and impaired insulin secretion [46][8], [47][9]. Insulin resistance is associated with inactivity, obesity and ageing [48][8]. The insulin secreting pancreatic islet β cells respond to insulin resistance by enhancing their mass and metabolic function. T2D however develops when increase in insulin secretion by β cells is not able to keep pace with the increase in insulin resistance [49][8], [50][11]. The latter thus characterizes both prediabetic condition and T2D. Prediabetic insulin resistance state however does not always lead to diabetes; enhanced secretion of insulin by β cells compensates for deficient insulin action in a considerable proportion of prediabetic individuals who do not develop T2D. Though the inability of β cells to secrete enough insulin primarily typifies T2D, the dysfunction can also be demonstrated in normoglycemic subjects [51][12]. Therefore, derangements in both insulin secretion and insulin signaling, involved in the regulation of several processes including glucose uptake into cells, seem necessary but not sufficient in causing T2D. Based on epidemiological findings, it has been proposed that interaction between environmentally triggered insulin resistance and genetically programmed pancreatic β cell dysfunction leads to the development of T2D [52][12]–[53][14]. Over 20 major GWASs for T2D have been performed and several are underway at present [54][2]. A majority of identified loci have been found consistently across studies [55][15]. However, effect of the risk variants on continuous glycemic measures in nondiabetic subjects shows that T2D susceptibility is primarily mediated through perturbation of insulin secretion rather than insulin signaling [56][2], [57][12], [58][16]–[59][21]. Also, genes associated with T2D poorly represent established pathways of insulin signaling [60][12]. Global approaches to find statistically significant overrepresentation of functional categories in T2D associated genes have, although not provided clear evidence of the potential disease mechanisms, nonetheless identified enrichment of cell cycle regulation [61][2], [62][17], [63][19]. Considering that T2D associated genes representing cell cycle regulation are expressed in pancreatic islets, and that their disease association is mediated mainly through β cell dysfunction, the genetic evidence in the disease may seem to converge, to some extent, on insulin secretion [64][19]. Other than this limited convergence, the associated genes do not clearly confirm other known aspects of T2D pathophysiology including insulin signaling. This has led to the suggestion that either the disease is markedly heterogeneous or the critical aspects of disease pathophysiology are insufficiently captured by the presently available databases [65][2], [66][17], [67][19]. Genes identified in GWASs when evaluated in the context of complementary systems level data such as that related to protein-protein interactions and to and gene expression can provide insights into the mechanisms underlying pathogenesis of complex traits [68][22]–[69][24]. Here, we have combined these approaches toward deciphering genome to phenome correlation in T2D ([70] Figure 1 ). Given that T2D GWAS genes do not directly relate to disease pathophysiology, our main aim was to examine if this genome to phenome correlation gap can be abridged by considering GWAS genes in conjunction with physical and genetic interaction, and gene expression data. Figure 1. Schematic representation of the workflow. [71]Figure 1 [72]Open in a new tab T2D GWAS genes do not directly relate (indicated by ‘X’ on the left side) to pathways associated with disease pathophysiology. Conspicuously, effect of identified risk variants on continuous glycemic measures in nondiabetic subjects chiefly explains only perturbation of insulin secretion, not insulin resistance. Further, the genes found as associated with the disease do not clearly relate to processes and pathways consistent with the known aspects of T2D pathophysiology. The main aim of the present study was to ask the question (indicated by ‘?’ on the right side) if GWAS data when considered in conjunction with interactome, toxicogenome and disease transcriptome data reveal genome to phenome correlation in T2D. Data available in public domain for GWAS, interactome and toxicogenome was used in the analysis. For disease transcriptome, new experimental data was generated. We specifically examined if interaction network of genes reported in T2D GWAS, genes showing altered expression after treatment with various antidiabetic drugs, and genes that are differentially expressed in insulin responsive tissues in male and female T2D patients do converge on insulin secretion, insulin resistance and other T2D associated pathophysiological pathways. Results GWAS Genes A catalog of SNP associations up to a p value cutoff of 1×10^−5, a threshold commonly used for preliminary selection of SNPs in GWASs, exists in public domain [73][25]. As genes at this cutoff are considered meaningful for enrichment analysis [74][26], we retrieved genes reported in T2D GWASs at p value thresholds up to 10^−5 (Dataset S1) and examined enrichment of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways therein. Maturity onset diabetes of the young (MODY), a Mendelian form of diabetes in T2D, was the only pathway that showed enrichment in this analysis ([75] Figure 2 ). Whereas 10^−8 cutoff genes showed enrichment of MODY only at a borderline significance, those at 10^−5 threshold showed a robust overrepresentation. Of the five 10^−5 cutoff genes in MODY, HNF1A and HNF1B were absent in 10^−8 set while the latter alone was missing from 10^−7 and 10^−6 gene lists. These genes, like most of the other MODY genes, encode transcription factors that directly or indirectly affect expression of insulin and other proteins related to pancreatic β cell development and/or glucose metabolism [76][27]. Association of HNF1A and/or HNF1B with T2D has earlier been reported in candidate gene studies [77][28]–[78][30]. We thus focused on the gene list comprising 93 genes established from a GWAS cutoff of 10^−5. Dubbed “T2D genome” henceforth, we used this gene set in subsequent analysis. Figure 2. Enrichment of pathways in genes reported in T2D GWASs at various association p value thresholds. [79]Figure 2 [80]Open in a new tab Genes at different p value cutoffs were examined for pathway enrichment. The pathway along with corresponding genes and enrichment p values are indicated. Note highly significant enrichment of Maturity onset diabetes of the young in genes reported at 10^−5 p value threshold, dubbed “T2D genome” henceforth. Physical and Genetic Interaction Network Next, we retrieved all the direct interactions, both physical and genetic, of T2D genome, called “T2D interactome” hereafter, from Biological General Repository for Interaction Datasets (BioGRID) (Dataset S2). Genes in T2D interactome was examined for pathway enrichment. Several of the enriched pathways relate to known aspects of T2D pathophysiology ([81] Table 1 ). Other than insulin signaling and type II diabetes mellitus pathway that relates to both insulin secretion and insulin signaling, the other pathways such as ErbB, adipocytokine, Jak-STAT, chemokine, TGF-beta, Wnt, VEGF, Notch, MAPK, T cell receptor, B cell receptor, Toll-like receptor, p53 and mTOR signaling, and regulation of actin cytoskeleton have all been implicated in T2D [82][31]–[83][49]. Also, enrichment of cell cycle was consistent with previous global analysis of T2D associated genes [84][2], [85][17], [86][19]. Altogether, T2D interactome was found consistent with disease pathophysiology. Table 1. Enriched pathways in T2D interactome. Term Corrected p value[87]^* Pathways in cancer 4.52E-27 Prostate cancer 2.02E-17 Chronic myeloid leukemia 4.90E-17 B cell receptor signaling pathway 3.44E-15 Pancreatic cancer 1.38E-13 Glioma 3.24E-13 Focal adhesion 1.19E-12 Proteasome 3.38E-12 Acute myeloid leukemia 3.15E-12 Neurotrophin signaling pathway 1.39E-11 Insulin signaling pathway 2.57E-11 Small cell lung cancer 2.98E-11 T cell receptor signaling pathway 3.35E-10 ErbB signaling pathway 4.29E-10 Non-small cell lung cancer 4.09E-10 Colorectal cancer 7.45E-09 Cell cycle 9.14E-09 Fc epsilon RI signaling pathway 9.32E-09 Type II diabetes mellitus 2.00E-08 Adherens junction 4.16E-08 Melanoma 3.49E-07 Fc gamma R-mediated phagocytosis 3.46E-07 Adipocytokine signaling pathway 7.42E-07 Renal cell carcinoma 1.42E-06 Bladder cancer 1.72E-06 Jak-STAT signaling pathway 2.82E-06 Chemokine signaling pathway 3.28E-06 Endometrial cancer 3.65E-06 Notch signaling pathway 6.27E-06 TGF-beta signaling pathway 6.48E-06 Natural killer cell mediated cytotoxicity 6.30E-06 Aldosterone-regulated sodium reabsorption 7.51E-06 Thyroid cancer 1.01E-05 VEGF signaling pathway 1.44E-05 Progesterone-mediated oocyte maturation 8.87E-05 mTOR signaling pathway 9.80E-05 Leukocyte transendothelial migration 1.25E-04 Oocyte meiosis 1.57E-04 Pathogenic Escherichia coli infection 2.39E-04 Ubiquitin mediated proteolysis 3.04E-04 p53 signaling pathway 3.29E-04 Wnt signaling pathway 3.97E-04 NOD-like receptor signaling pathway 5.10E-04 MAPK signaling pathway 8.07E-04 Regulation of actin cytoskeleton 0.001 Toll-like receptor signaling pathway 0.002 Apoptosis 0.003 Epithelial cell signaling in Helicobacter pylori infection 0.004 Gap junction 0.01 RIG-I-like receptor signaling pathway 0.02 Tight junction 0.02 GnRH signaling pathway 0.02 Long-term depression 0.04 [88]Open in a new tab ^* Bejamini-Hochberg correction. Drug-gene Interactions We used a toxicogenomic approach to further investigate if T2D interactome represents pathophysiology related pathways including insulin signaling. The Comparative Toxicogenomics Database (CTD) documenting gene-environment relationships has been used previously towards understanding T2D etiology [89][24]. We searched for genes that interact with antidiabetic drugs using CTD (Dataset S3). We then examined if T2D interactome is enriched in genes interacting with these drugs. For enrichment analysis to be statistically meaningful, gene lists associated with only those drugs were considered which had a minimum of five genes in common with T2D interactome. Notably, significant enrichment or a trend for the same was observed for all the drugs tested ([90] Figure 3 ). Further, genes interacting with all the antidiabetic drugs combined showed enrichment of various pathways relevant in T2D pathophysiology ([91] Table 2 ), as observed previously for T2D interactome ([92] Table 1 ). Cumulatively, the toxicogenomic analysis supported the importance of T2D interactome in unraveling genome to phenome correlation. Figure 3. Overlap between antidiabetic drug interacting genes and T2D interactome. [93]Figure 3 [94]Open in a new tab Compared to total interactome of 14,306 genes, the T2D interactome of 561 genes represent significantly greater number of antidiabetic drug interacting genes. A statistically significant overrepresentation was observed for all the drugs except pioglitazone in hypergeometric test with Bonferroni adjustment of p values for multiple hypotheses testing. Overrepresentation with a borderline significance was nonetheless observed even for pioglitazone. The adjusted enrichment p values are indicated. Table 2. Enriched pathways in antidiabetic drug interacting genes. Term Corrected p value[95]^* Adipocytokine signaling pathway 4.16E-17 Pathways in cancer 3.30E-13 Prostate cancer 1.67E-10 Focal adhesion 4.23E-09 Bladder cancer 2.84E-08 Insulin signaling pathway 2.41E-08 p53 signaling pathway 2.51E-08 PPAR signaling pathway 3.05E-08 Pancreatic cancer 6.98E-08 Small cell lung cancer 6.79E-08 NOD-like receptor signaling pathway 2.84E-06 Drug metabolism 2.84E-06 Type II diabetes mellitus 4.93E-06 Colorectal cancer 5.63E-06 Retinol metabolism 6.51E-06 Metabolism of xenobiotics by cytochrome P450 6.30E-06 Glioma 1.32E-05 Cytokine-cytokine receptor interaction 1.44E-05 Chronic myeloid leukemia 4.34E-05 Melanoma 7.26E-05 mTOR signaling pathway 7.24E-05 Apoptosis 1.03E-04 Non-small cell lung cancer 1.09E-04 Linoleic acid metabolism 2.68E-04 Neurotrophin signaling pathway 2.75E-04 Toll-like receptor signaling pathway 2.67E-04 ErbB signaling pathway 2.94E-04 Thyroid cancer 3.24E-04 Cell cycle 7.72E-04 Acute myeloid leukemia 8.63E-04 Endometrial cancer 9.83E-04 MAPK signaling pathway 0.002 Allograft rejection 0.002 ABC transporters 0.003 Chemokine signaling pathway 0.004 ECM-receptor interaction 0.004 Arachidonic acid metabolism 0.006 Amyotrophic lateral sclerosis (ALS) 0.01 Jak-STAT signaling pathway 0.01 Steroid hormone biosynthesis 0.01 Fc epsilon RI signaling pathway 0.01 Renal cell carcinoma 0.01 Fatty acid metabolism 0.01 Pyruvate metabolism 0.01 Aldosterone-regulated sodium reabsorption 0.02 Intestinal immune network for IgA production 0.02 Type I diabetes mellitus 0.02 T cell receptor signaling pathway 0.02 Prion diseases 0.02 VEGF signaling pathway 0.02 Progesterone-mediated oocyte maturation 0.03 Fc gamma R-mediated phagocytosis 0.03 Glycerolipid metabolism 0.03 Citrate cycle (TCA cycle) 0.03 Gap junction 0.03 Glutathione metabolism 0.05 Starch and sucrose metabolism 0.05 TGF-beta signaling pathway 0.06 [96]Open in a new tab ^* Bejamini-Hochberg correction. Genome-wide Expression Profiling Next, we generated microarray expression profiles of skeletal muscle, visceral adipose and subcutaneous adipose from male and/or female T2D patients ([97] Figure 4 ). For each tissue tested, expression profiles were generated from three diabetic and three non-diabetic individuals. Also, each individual was profiled four times. The differentially expressed genes between diabetic and nondiabetic control groups were identified ([98] Figure 5 and Dataset S4). Significant genes at adjusted p value were subjected to pathway enrichment analysis. For all comparisons except female subcutaneous adipose, these genes showed enrichment of one or more pathways ([99] Table 3 ). Female visceral adipose showed a large number of enriched pathways including ErbB, Wnt, MAPK, T cell receptor, B cell receptor and Toll-like receptor signaling, and regulation of actin cytoskeleton, which, as mentioned above, have all been implicated in T2D. Enrichment of valine, leucine and isoleucine degradation in male visceral adipose is also consistent with metabolomic studies in T2D [100][50], [101][51]. Similarly, enrichment of ECM-receptor interaction in male skeletal muscle is in consonance with known changes in the composition of the extracellular matrix in insulin-resistant muscle [102][52]–[103][57]. Quantitative real time PCR (qRT-PCR) confirmed differential expression in the same direction, or a trend for that, of all the genes tested to validate microarray results ([104] Figure 6 ). The genes used in qRT-PCR validation represented all the enriched pathways ([105] Table 3 ) besides others. These results demonstrated the robustness of our genome-wide expression profiling. Figure 4. Dendrogram of samples based on gene expression profiling. [106]Figure 4 [107]Open in a new tab Correlations between all the eight groups of samples analyzed in microarrays are plotted as a dendrogram. As expected, muscle and adipose form separate clusters. Also, in adipose cluster, subgroups of adipose type and gender are observed. Globally normalized data was used for constructing the dendrogram. Con, control subjects; T2D, diabetic patients; SA, subcutaneous adipose; VA, visceral adipose; SM, skeletal muscle. Figure 5. Numbers of up- and down- regulated genes in multiple insulin responsive tissues in T2D patients. [108]Figure 5 [109]Open in a new tab Expression profiles of skeletal muscle, visceral adipose and subcutaneous adipose from male and/or female subjects were generated using Illumina HumanHT-12 v3 Expression BeadChip arrays that contain more than 25,000 annotated genes. IIlumina custom error model was used to identify up- and down- regulated genes in T2D as compared to controls, with or without Benjamini and Hochberg correction for multiple hypotheses testing. The differentially expressed genes were identified at ±13 Diff score threshold of Illumina custom algorithm, corresponding to a p value of 0.05. Table 3. Enriched pathways in differentially expressed genes between people with and without T2D. Term Corrected p value[110]^* Female visceral adipose Neurotrophin signaling pathway 1.8E-04 Renal cell carcinoma 0.002 Thyroid cancer 0.002 Pathogenic Escherichia coli infection 0.002 Prostate cancer 0.002 Fc gamma R-mediated phagocytosis 0.002 Non-small cell lung cancer 0.002 Pathways in cancer 0.002 Chemokine signaling pathway 0.003 Acute myeloid leukemia 0.004 Endometrial cancer 0.01 Glioma 0.01 Focal adhesion 0.01 Adherens junction 0.01 Wnt signaling pathway 0.01 MAPK signaling pathway 0.01 B cell receptor signaling pathway 0.02 Chronic myeloid leukemia 0.02 ErbB signaling pathway 0.02 Bladder cancer 0.02 Regulation of actin cytoskeleton 0.03 Colorectal cancer 0.03 Toll-like receptor signaling pathway 0.04 T cell receptor signaling pathway 0.04 Gap junction 0.04 Male visceral adipose Valine, leucine and isoleucine degradation 3.0E-05 Propanoate metabolism 0.01 Male skeletal muscle ECM-receptor interaction 0.007 [111]Open in a new tab ^* Bejamini-Hochberg correction. Figure 6. Validation of microarrays using qRT-PCR. [112]Figure 6 [113]Open in a new tab Mean±S.E.M of fold change in gene expression in T2D patients, as compared to controls, is shown for (A) female visceral adipose, (B) male visceral adipose, (C) male skeletal muscle, and (D) female subcutaneous adipose. Fold change was calculated for two to six technical replicates, each representing three biological replicates. The source of RNA used in qRT-PCR analysis was same as in microarray profiling. A total of 47 genes were used for validation. The rationale behind the subsets of genes selected was two fold. One, the genes should represent, wherever applicable, one or more enriched pathways ([114] Table 3 ) in a given condition. Second, the genes should maximally represent those which are differentially expressed at adjusted p value cutoff (Dataset S4) in more than one condition, so that validation of microarrays can be examined more widely. The selected genes, besides others, covered all the pathways that were enriched in the above microarray gene lists. Notably, up- or down- regulation observed in qRT-PCR, shown in red and green, respectively, was consistent with microarrays, for all comparisons. Convergent Pathway Although we found enrichment of several pathways consistent with T2D pathophysiology in our microarray analysis, insulin signaling was conspicuous by its absence in the list of overrepresented pathways ([115] Table 3 ). Statistical corrections may be overly conservative to the point of being counterproductive in interpreting microarrays with genetic knowledgebases [116][58]. We thus explained this counterintuitive result by arguing that application of such corrections for identifying differentially expressed genes in microarrays and for pathway enrichment analysis of these genes may be together responsible for causing false negatives in our results. Since our main goal was to decipher genome to phenome correlation, we examined if T2D interactome (Dataset S2) is enriched for differentially expressed genes in various samples at unadjusted p value threshold ([117] Figure 5 and Dataset S4), called “T2D transcriptome” from now on. Enrichment will be expected if there is a genome to phenome correlation. Also, if enrichment is indeed observed at this level then genes overlapping between T2D interactome and T2D transcriptome will be expected to overrepresent pathways consistent with the disease pathophysiology. Indeed, the overlaps between interactome and transcriptome gene sets were found to be statistically significant even after adjusting the p values for multiple testing ([118] Figure 7 ). Also, the overlapping genes were enriched in T2D pathophysiology related pathways including insulin signaling in female visceral adipose and male skeletal muscle ([119] Table 4 ). In male visceral adipose and female subcutaneous adipose, insulin signaling was however not enriched. Twelve pathways were common to all tissues and both genders ([120] Table 4 ). The common pathways were related to cancer, cell cycle, adherens junction, focal adhesion, pathogenic Escherichia coli infection and TGF-beta signaling. To examine known role of these pathway(s) in T2D pathophysiology, the PubMed was searched using the key words “diabetes AND type AND 2 AND insulin AND (signaling OR action OR resistance OR sensitivity) AND (secretion OR pancreatic OR islets OR beta)” in combination with term(s) representing each of these common pathway. The retrieved abstracts and/or papers and the references therein were