Abstract

   Previous studies have demonstrated the genetic correlations between
   type 2 diabetes, obesity and dyslipidemia, and indicated that many
   genes have pleiotropic effects on them. However, these pleiotropic
   genes have not been well-defined. It is essential to identify
   pleiotropic genes using systematic approaches because systematically
   analyzing correlated traits is an effective way to enhance their
   statistical power. To identify potential pleiotropic genes for these
   three disorders, we performed a systematic analysis by incorporating
   GWAS (genome-wide association study) datasets of six correlated traits
   related to type 2 diabetes, obesity and dyslipidemia using Meta-CCA
   (meta-analysis using canonical correlation analysis). Meta-CCA is an
   emerging method to systematically identify potential pleiotropic genes
   using GWAS summary statistics of multiple correlated traits. In total,
   2,720 genes were significant after multiple-testing correction
   (Bonferroni corrected p value < 0.05). Further, to refine the
   identified genes, we tested their relationship to the six correlated
   traits using VEGAS-2 (versatile gene-based association study-2). Only
   the genes significantly associated (Bonferroni corrected p value <
   0.05) with more than one trait were kept. Finally, 25 genes (including
   two confirmed pleiotropic genes and eleven novel pleiotropic genes)
   were identified as potential pleiotropic genes. They were enriched in 5
   pathways including the statin pathway and the PPAR (peroxisome
   proliferator-activated receptor) Alpha pathway. In summary, our study
   identified potential pleiotropic genes and pathways of type 2 diabetes,
   obesity and dyslipidemia, which may shed light on the common biological
   etiology and pathogenesis of these three disorders and provide
   promising insights for new therapies.

Introduction

   Type 2 diabetes is a serious chronic metabolic disorder characterized
   by hyperglycemia, insulin resistance (IR), destruction of pancreatic
   beta cells and impairment in insulin secretion [[44]1]. Obesity,
   another serious universal health problem, comprising abdominal obesity
   (characterized by excess visceral fat and high Waist-Hip Ratio (WHR))
   and general obesity (defined as having a Body Mass Index (BMI) of 25 or
   higher), is related to several chronic diseases including type 2
   diabetes [[45]2]. Both
   type 2 diabetes and obesity are associated with dyslipidemia [[46]2,
   [47]3], which is characterized by hypertriglyceridemia,
   hypercholesterolemia, decreased HDL (high-density lipoprotein) and/or
   increased LDL (low-density lipoprotein). The coexistence of the three
   diseases is common in the general population, which increases the
   prevalence of other serious fatal diseases such as CVD (cardiovascular
   diseases) [[48]4, [49]5] and stroke [[50]6–[51]8].

   These three disorders are closely connected. Firstly, clinical
   observation and epidemiological data show that a number of type 2
   diabetes patients are obese with dyslipidemia before receiving
   intervention or therapy [[52]9]. Secondly, these three disorders have
   common risk factors and pathophysiological bases. For instance, a long
   term high fat diet is a significant risk factor for type 2 diabetes
   [[53]10], obesity and dyslipidemia [[54]2]. IR is one of the common
   pathophysiological bases for these three diseases [[55]1, [56]11].
   Moreover, previous studies have reported genetic correlations among
   these three disorders and indicated that many genes have pleiotropic
   effects [[57]12–[58]14]. For example, the SREBF1 gene was identified as
   a pleiotropic gene in the progression of type 2 diabetes, obesity, and
   dyslipidemia [[59]12]. In addition, common pathways that could
   influence these three disorders were discovered in recent years, such
   as the AMPK signal pathway (AMP-activated protein kinase signal
   pathway) and the JNK-1 pathway [[60]11, [61]15, [62]16].

   The pleiotropic genes and pathways of the three diseases can partially
   explain the common biological pathogenesis of the three diseases. It is
   essential to identify pleiotropic genes that exert their influence on
   potentially common biological etiology and pathogenesis of these three
   disorders using systematic analysis approaches. Pleiotropic genes and
   their effects have been successfully identified in bivariate analyses
   of type 2 diabetes with obesity, type 2 diabetes with dyslipidemia, and
   obesity with dyslipidemia. Hasstedt et al. [[63]17] performed bivariate
   analyses of type 2 diabetes with BMI and type 2 diabetes with WHR,
   which identified significant pleiotropic loci (chromosome 13 at 26–30
   Mb) for type 2 diabetes with both traits. Li et al. [[64]3] found a
   stronger correlation between dyslipidemia-associated genes (APOB,
   APOE-C1-C2, CETP, CYP7A1, GCKR, MLXIPL, PLTP, TIMD4) and glycemic
   traits including FG (fasting glucose) and HOMA-IR (homeostasis model
   assessment for IR, an important index for evaluating IR, calculated as
   FG × FI (fasting insulin) / 22.5), which revealed the pleiotropic
   effects of dyslipidemia-associated genes on glycemic traits. Although
   bivariate analyses of these three diseases have identified pleiotropic
   genes in recent years, multivariate analyses have not yet been
   performed. Undertaking multivariate analysis is desirable because it
   allows us to explore more systematically the common underlying genetic
   architecture and common etiology of these three disorders. Another
   important advantage of multivariate analysis is that it increases the
   statistical power for identifying associated genes exerting influences
   on multiple traits, which leads to more novel insights into druggable
   gene targets compared to univariate and bivariate analyses [[65]18].

   Meta-CCA (meta-analysis using canonical correlation analysis) [[66]19],
   a systematic multivariate analysis tool recently proposed by Cichonska
   et al., allows multivariate analyses between multiple SNPs and multiple
   traits [[67]19], which enriches the pleiotropic information by
   combining correlation signals among multiple traits.

   Inspired by this, we performed the current work to identify potential
   pleiotropic genes for type 2 diabetes, obesity, and dyslipidemia using
   Meta-CCA. We used six correlated quantitative traits reported to be
   established related factors for type 2 diabetes, obesity or
   dyslipidemia, including FG, FI, BMI, WHR, HDL and triglyceride (TG).
   Interestingly, our findings indicated that 25 genes (including two
   confirmed pleiotropic genes and eleven novel pleiotropic genes) and
   five pathways were potentially pleiotropic for type 2 diabetes,
   obesity, and dyslipidemia. These findings yielded some genetic basis
   for the
   common biological etiology and pathogenesis, and thus provided
   promising insights for a potential common therapy for the three
   disorders.

Materials and methods

GWAS datasets and processing

Step 1: Annotating SNPs to genes and SNP pruning

   The large-scale GWAS datasets of the six correlated traits in the
   present study were downloaded from
   [68]http://diagram-consortium.org/2015_ENGAGE_1KG/ [[69]20, [70]21].
   The glycemic traits (FG and FI) data [[71]20] were derived from a
   meta-analysis of 13 original GWAS studies, including FG data for 46,694
   individuals and FI data for 24,245 individuals. The obesity-related
   data (BMI and WHR) [[72]20] were derived from a meta-analysis of 22
   original GWAS studies, including BMI data for 87,048 individuals and
   WHR data for 54,572 individuals. The lipids (HDL and TG) data [[73]21]
   were derived from a meta-analysis of 22 original GWAS studies,
   including 62,166 individuals for both HDL and TG. The large-scale GWAS
   datasets were collected by the European Network for Genetic and Genomic
   Epidemiology (ENGAGE) Consortium, and all samples were from individuals
   of European ancestry (details are shown in [74]Table 1). These were the
   largest datasets that included all six correlated traits and contained
   all the information needed to conduct the analyses in the Meta-CCA
   framework in one ethnicity (Caucasians). We selected the overlapping
   SNPs (9,411,134 SNPs) of the six traits to perform the multivariate
   analysis.

Table 1. Details and phenotypic Pearson correlation coefficients of the six
traits in European ancestry.

   Trait  Number of  Number of    Phenotypic correlation structure between traits
          SNPs       individuals  FG      FI      BMI     WHR     HDL     TG
   FG     9,967,161  46,694       1       0.35    0.24    0.17    -0.15   0.19
   FI     9,837,043  24,245       0.35    1       0.52    0.39    -0.37   0.40
   BMI    9,953,164  87,048       0.24    0.52    1       0.51    -0.32   0.30
   WHR    9,954,793  54,572       0.17    0.39    0.51    1       -0.30   0.33
   HDL    9,549,054  62,166       -0.15   -0.37   -0.32   -0.30   1       -0.52
   TG     9,544,498  62,166       0.19    0.40    0.30    0.33    -0.52   1

   FG: fasting glucose; FI: fasting insulin; BMI: Body Mass Index; WHR:
   Waist-Hip Ratio; HDL: high-density lipoprotein; TG: triglyceride.

   The analytical workflow of our study is presented in [76]Fig 1.
   Firstly, we annotated SNPs to genes according to the 1000 Genomes
   datasets using PLINK 1.9. We downloaded the reference gene list, which
   contained 26,291 genes, from
   [77]https://www.cog-genomics.org/static/bin/plink/glist-hg19. We
   treated each transcript region, including all of its SNPs (both exonic
   and intronic), as a gene. We selected the SNPs shared between the six
   traits and the reference data. After the gene annotation of SNPs, we
   pruned the SNPs within each gene using the parameter r^2 = 0.01
   [[78]22], where r is the Pearson correlation coefficient between any
   two SNPs in one region. The purpose of SNP pruning is to reduce
   potential biases caused by the linkage disequilibrium (LD) among SNPs
   [[79]22].

Fig 1. The analytical workflow of the present work.


   For the present work, the GWAS datasets we used provided not only the p
   value of the SNPs but also the regression coefficient β and the
   standard error (SE). We normalized the regression coefficient β before
   conducting the Meta-CCA using the following equation (n is the sample
   size in the corresponding GWAS dataset of each trait):
   \[
   \beta^{\mathrm{normal}} = \frac{\beta}{\sqrt{n} \cdot \mathrm{SE}} \tag{1}
   \]
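
   As a worked illustration of Eq (1), the following Python sketch
   normalizes a single regression coefficient. The function name
   `normalize_beta` and the numeric values are hypothetical; only the
   formula itself comes from the text.

```python
import math


def normalize_beta(beta, se, n):
    """Normalize a GWAS regression coefficient as in Eq (1).

    beta : regression coefficient of one SNP for one trait
    se   : standard error of beta
    n    : sample size of the GWAS dataset for that trait
    """
    if se <= 0 or n <= 0:
        raise ValueError("se and n must be positive")
    return beta / (math.sqrt(n) * se)


# Hypothetical SNP: beta = 0.02, SE = 0.005, n = 10,000 individuals.
# beta / SE = 4 and sqrt(n) = 100, so the normalized value is about 0.04.
print(normalize_beta(0.02, 0.005, 10_000))
```

   Because β/SE is the z-score of the SNP, dividing by the square root of
   n puts coefficients from GWAS datasets of different sizes on a
   comparable, correlation-like scale.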

Statistical analysis

Step 2: Meta-CCA analysis

   Meta-CCA was run with the Meta-CCA R package, with each gene treated as
   one unit of analysis. As above, each transcript region, including all
   of its SNPs (both exonic and intronic), was treated as a gene. The
   Meta-CCA program required three basic data inputs: the genotypic
   correlation structures between SNPs (∑XX), the correlation coefficients
   between SNPs and traits (∑XY), and the phenotypic correlation
   structures between traits (∑YY) ([82]Table 1) [[83]19]. In Meta-CCA,
   ∑XX is estimated using a reference SNP dataset representing the study
   population, such as the HapMap data or the 1000 Genomes data. It is
   preferable to estimate ∑XX from the target population or from a
   reference panel of the same ethnicity [[84]19]. As all of the
   participants in our study were Caucasians, we calculated ∑XX based on
   the 1000 Genomes Project Phase 3 European (CEU) population reference
   data ([85]https://www.cog-genomics.org/static/bin/plink/glist-hg19).
   ∑XY was estimated from the normalized regression coefficients β:
   \[
   \Sigma_{XY} = \frac{X^{T} Y}{N-1} =
   \begin{pmatrix}
   \beta_{11} & \beta_{12} & \cdots & \beta_{1P} \\
   \beta_{21} & \beta_{22} & \cdots & \beta_{2P} \\
   \vdots & \vdots & \ddots & \vdots \\
   \beta_{G1} & \beta_{G2} & \cdots & \beta_{GP}
   \end{pmatrix} \tag{2}
   \]

   G and P are the number of genotypic and phenotypic variables,
   respectively.

   ∑YY was estimated from the phenotypic Pearson correlation coefficients
   shown in [86]Table 1 [[87]23]. In the original study, all of the
   participants had clinical data for the six traits (FG, FI, BMI, WHR,
   HDL and TG), and the phenotypic Pearson correlation coefficients
   between the six traits were calculated from the same individuals
   [[88]23]. The exact procedures of the Meta-CCA program are described in
   the paper by Cichonska et al. [[89]19]. After Meta-CCA, we obtained the
   output describing the relationship between each gene and the six
   traits. We used a p value < 0.05 (after Bonferroni correction) as the
   significance threshold.
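
   As a rough illustration of how the three inputs combine, the sketch
   below computes the leading canonical correlation from ∑XX, ∑XY and ∑YY
   using the standard CCA identity, in which the squared canonical
   correlations are the eigenvalues of ∑XX^-1 ∑XY ∑YY^-1 ∑YX. It assumes
   NumPy is available, omits any regularization and significance testing
   performed by the Meta-CCA package, and uses hypothetical input values;
   the function name is ours and is not part of the package.

```python
import numpy as np


def first_canonical_correlation(s_xx, s_xy, s_yy):
    """Leading canonical correlation from summary correlation matrices.

    s_xx : (G, G) correlation matrix between SNPs
    s_xy : (G, P) correlation matrix between SNPs and traits
    s_yy : (P, P) correlation matrix between traits
    """
    a = np.linalg.solve(s_xx, s_xy)    # Sxx^-1 Sxy, shape (G, P)
    b = np.linalg.solve(s_yy, s_xy.T)  # Syy^-1 Syx, shape (P, G)
    eigenvalues = np.linalg.eigvals(a @ b)  # squared canonical correlations
    r2 = float(np.max(eigenvalues.real))
    # Clip to [0, 1] to guard against small numerical overshoot.
    return float(np.sqrt(min(max(r2, 0.0), 1.0)))


# Hypothetical toy case: one SNP (G = 1) and two uncorrelated traits (P = 2).
s_xx = np.array([[1.0]])
s_xy = np.array([[0.3, 0.4]])
s_yy = np.eye(2)
print(round(first_canonical_correlation(s_xx, s_xy, s_yy), 6))  # 0.5
```

   With uncorrelated traits and a single SNP, the result reduces to the
   square root of the sum of squared SNP-trait correlations, which makes
   the toy case easy to check by hand.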

Step 3: Gene-based association analysis

   To refine the genes identified by Meta-CCA, we tested their
   relationships with each of the six traits using VEGAS-2 (Versatile
   Gene-based Association Study–2) [[90]24, [91]25], a gene-based
   algorithm widely used for gene-based p value calculation using GWAS
   summary statistics. VEGAS-2, an approach that accounts for the
   correlation among all the SNPs in one gene region for a single trait,
   also has lower false positive rates compared with other gene-based
   approaches [[92]26, [93]27]. After obtaining the gene-based p value of
   each gene for the six traits using VEGAS-2, we selected as potential
   pleiotropic genes those significantly associated with more than one
   trait (p value < 0.05/2720, Bonferroni correction) [[94]28].
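
   The selection rule in this step can be sketched in a few lines of
   Python. The helper names are hypothetical; the threshold 0.05/2720 comes
   from the text, and the GCKR p values are copied from Table 2.

```python
ALPHA = 0.05
N_GENES_TESTED = 2720  # genes passed on from Meta-CCA
TRAITS = ("BMI", "WHR", "FG", "FI", "HDL", "TG")

# Bonferroni-corrected per-gene threshold, about 1.84e-05.
threshold = ALPHA / N_GENES_TESTED


def significant_traits(pvals, cutoff):
    """Return the traits whose gene-based p value is below the cutoff."""
    return [t for t in TRAITS if pvals[t] < cutoff]


def is_potentially_pleiotropic(pvals, cutoff):
    """A gene is kept only if it is significant for more than one trait."""
    return len(significant_traits(pvals, cutoff)) > 1


# Gene-based p values for GCKR as listed in Table 2.
gckr = {"BMI": 2.06e-1, "WHR": 6.89e-3, "FG": 1.00e-6,
        "FI": 6.10e-5, "HDL": 1.09e-1, "TG": 1.00e-6}

print(significant_traits(gckr, threshold))          # ['FG', 'TG']
print(is_potentially_pleiotropic(gckr, threshold))  # True
```

   Note that the FI p value of GCKR (6.10E-05) is small but does not pass
   the corrected threshold, so the gene qualifies through FG and TG only.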

Functional annotation and gene enrichment analyses

Step 4: Pathway and GO (Gene Ontology) term enrichment analyses of the
potential pleiotropic genes

   To explore the biological functions of the identified potential
   pleiotropic genes, we performed pathway enrichment analyses and GO
   enrichment analyses for the potential pleiotropic genes using Enrichr
   (a web server tool for gene set enrichment analysis:
   [95]http://amp.pharm.mssm.edu/Enrichr/) [[96]29]. The pathway and GO
   term enrichment analyses also provide a better understanding of the
   polygenic associations and the potential mechanisms of the biological
   processes. With this program, we used hypergeometric tests and Fisher’s
   exact tests for the statistical analysis. A Benjamini-Hochberg
   corrected p value < 0.05 was used as the threshold for significance in
   the enrichment analysis.
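
   For readers unfamiliar with these tests, the sketch below shows a
   one-sided hypergeometric enrichment test (equivalent to a one-sided
   Fisher's exact test) and the Benjamini-Hochberg adjustment, using only
   the Python standard library. It is not Enrichr's implementation; the
   function names and all numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k): chance of seeing at least k pathway genes in a query.

    k : pathway genes found in the query list
    K : genes in the pathway
    n : genes in the query list
    N : genes in the background
    """
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return hits / total


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 6 of 25 query genes fall in a 30-gene pathway,
# with a background of 20,000 genes.
p = hypergeom_upper_tail(6, 30, 25, 20_000)
print(p < 0.05)  # True: such an overlap is very unlikely by chance

# Adjusting three hypothetical raw p values gives about [0.03, 0.04, 0.04].
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

   The adjusted values depend on the number of terms tested, so they
   cannot be recomputed from a handful of reported rows alone.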

Results

   After gene annotation and SNP pruning, 21,209 genes were left to
   conduct the Meta-CCA analysis in our study. The number of SNPs in each
   gene ranged from 1 to 280; the average was 13. We used the threshold of
   0.05/21209 (Bonferroni correction) as our target alpha level for the
   Meta-CCA analysis [[97]28]. For the Meta-CCA analysis, 2,720 genes with
   the p value < 0.05/21209 were identified as potential pleiotropic genes
   for the six correlated quantitative traits. After Meta-CCA analysis, we
   tested the 2,720 genes’ relationship to the six traits using VEGAS-2.
   Only the genes significantly associated (Bonferroni corrected p value <
   0.05) with more than one trait were kept [[98]24, [99]25]. There were
   31, 0, 75, 1, 225 and 185 significant genes (Bonferroni corrected p
   value < 0.05) for FG, FI, BMI, WHR, HDL, and TG respectively. By
   screening the genes based on the results of gene-based p value, a total
   of 25 associated genes related to more than one trait in VEGAS-2
   analysis were identified as potential pleiotropic genes for type 2
   diabetes, obesity and dyslipidemia. The details are shown in [100]Table
   2.

Table 2. The features of the significant potential pleiotropic genes.

   Genes                                r^*       P-value for one single trait
                                                  BMI^#     WHR^#     FG^#      FI^#      HDL^#     TG^#
   SIK3^a [[109]30, [110]31]            2.63E-01  6.08E-01  9.69E-02  4.19E-01  5.79E-01  1.00E-06  1.00E-06
   CETP^a [[112]3, [113]32]             2.48E-01  2.70E-01  4.04E-01  5.90E-01  6.25E-01  1.00E-06  1.00E-06
   LIPC^a [[115]33, [116]34]            2.27E-01  2.02E-01  6.78E-01  3.58E-02  2.05E-01  1.00E-06  1.00E-06
   GALNT2^a [[118]21, [119]35]          2.05E-01  1.93E-01  6.87E-01  6.66E-01  3.09E-01  1.00E-06  1.00E-06
   SNX17^a [[121]36]                    1.10E-01  3.08E-01  6.89E-03  6.00E-06  1.19E-03  9.79E-02  1.00E-06
   GCKR^b [[123]21, [124]37, [125]38]   9.73E-02  2.06E-01  6.89E-03  1.00E-06  6.10E-05  1.09E-01  1.00E-06
   LIPC-AS1^c                           9.07E-02  5.01E-01  7.92E-01  1.09E-01  1.71E-01  1.00E-06  1.00E-06
   LPL^a [[128]39, [129]40]             9.05E-02  5.51E-01  3.83E-01  1.57E-01  6.69E-01  1.00E-06  1.00E-06
   HLA-DQA1^d [[131]21, [132]41]        8.42E-02  6.47E-01  2.18E-01  4.26E-01  8.84E-01  1.20E-05  1.00E-06
   IFT172^c                             8.40E-02  3.57E-01  6.32E-03  4.00E-06  3.13E-04  8.69E-02  1.00E-06
   KRTCAP3^c                            8.28E-02  3.30E-01  7.51E-03  1.00E-05  1.46E-03  1.30E-01  1.00E-06
   CSGALNACT1^c                         8.26E-02  6.89E-02  7.66E-01  6.93E-01  2.44E-01  1.00E-05  1.00E-06
   APOA5^a [[137]42, [138]43]           8.23E-02  5.21E-01  3.53E-01  2.22E-01  4.35E-01  1.00E-06  1.00E-06
   EIF2B4^c                             7.28E-02  3.48E-01  7.18E-03  5.00E-06  1.48E-03  7.09E-02  1.00E-06
   GTF3C2^c                             7.07E-02  4.26E-01  1.19E-02  1.20E-05  1.61E-03  7.19E-02  1.00E-06
   ZNF513^c                             7.00E-02  3.19E-01  6.02E-03  1.10E-05  1.41E-03  9.69E-02  1.00E-06
   NRBP1^c                              6.21E-02  3.22E-01  9.40E-03  9.00E-06  1.60E-03  1.46E-01  1.00E-06
   FNDC4^c                              6.19E-02  3.39E-01  5.16E-03  1.00E-06  1.00E-04  1.02E-01  1.00E-06
   APOA1^a [[145]44, [146]45]           5.52E-02  5.43E-01  2.43E-01  3.54E-01  4.51E-01  1.00E-06  1.00E-06
   FADS1^b [[148]21, [149]46, [150]47]  5.22E-02  8.79E-02  5.49E-01  3.00E-06  6.44E-01  1.00E-06  1.00E-06
   TMEM258^c                            5.15E-02  7.69E-02  6.77E-01  1.00E-06  7.05E-01  1.00E-06  1.00E-06
   FEN1^d [[153]21, [154]48]            4.82E-02  1.35E-01  6.67E-01  1.00E-06  6.54E-01  1.00E-06  1.00E-06
   ZPR1^d [[156]21, [157]49]            4.36E-02  5.01E-01  3.51E-01  2.39E-01  4.41E-01  1.00E-06  1.00E-06
   APOC2^d [[159]21, [160]50]           4.04E-02  1.10E-01  5.32E-01  6.99E-02  7.43E-01  1.00E-06  1.00E-06
   CLPTM1^c                             3.98E-02  1.96E-01  6.07E-01  9.79E-02  7.72E-01  1.00E-06  1.00E-06

   ^* Canonical correlation value for the six correlated traits, which is
   the result of Meta-CCA.

   ^# P-value for each trait, which is the result of the gene-based
   analysis.

   ^a This gene has not been identified by any previous GWAS for type 2
   diabetes and obesity, but has been reported to be associated with
   hyperglycemia, obesity and dyslipidemia in other types of previous
   studies.

   ^b This gene was previously reported to be associated with type 2
   diabetes, obesity and dyslipidemia, which was confirmed by our present
   study.

   ^c Novel pleiotropic gene for type 2 diabetes, obesity and
   dyslipidemia.

   ^d This gene has not been identified by any previous studies for
   obesity, but has been reported to be associated with type 2 diabetes
   and dyslipidemia in a previous study.

   Interestingly, four of the top five significant genes (GALNT2, SNX17,
   CETP, LIPC) were regarded as dyslipidemia-associated genes in the
   original GWAS [[163]21]. In particular, two (GALNT2, SNX17) were also
   suggested to be associated with type 2 diabetes and obesity in previous
   studies [[164]35, [165]36]. All 25 potential pleiotropic genes were
   identified as associated genes/loci (with at least one SNP p value <
   5 × 10^−8) for TG in the original GWAS, while eight of these 25 genes
   (GALNT2, GCKR, LPL, FADS1, LIPC, CETP, APOA5, ZPR1) were reported to be
   TG-associated genes in the original GWAS after validation [[166]21].
   Specifically, two of these genes (FADS1, GCKR) have been identified as
   susceptibility candidate genes for type 2 diabetes in early GWAS
   [[167]37, [168]47, [169]51].

   In the pathway enrichment analyses, significant enrichment was
   observed in five human pathways from the 2016 WikiPathways database
   ([170]Table 3) [[171]29, [172]52], namely the Statin Pathway (WP430),
   Composition of Lipid Particles (WP3601), Triacylglyceride Synthesis
   (WP325), PPAR (Peroxisome proliferator-activated receptor) Alpha
   Pathway (WP2878), and Fatty Acid Beta Oxidation (WP143). The most
   significant pathway was the statin pathway (WP430), which contains six
   potential pleiotropic genes (CETP, LIPC, APOC2, APOA1, LPL, APOA5),
   suggesting a close relationship between the statin pathway and the
   three disorders.

Table 3. Pathway enrichment analysis of the potential pleiotropic genes.

   Term (Pathway)                           P-value   Benjamini-Hochberg  Genes
                                                      P-value
   Statin Pathway (WP430)                   1.44E-12  4.03E-10            CETP, LIPC, APOC2, APOA1, LPL, APOA5
   Composition of Lipid Particles (WP3601)  1.44E-07  1.35E-06            CETP, LPL, APOA1
   Triacylglyceride Synthesis (WP325)       4.07E-04  1.89E-03            LIPC, LPL
   PPAR Alpha Pathway (WP2878)              4.79E-04  1.91E-03            APOA1, APOA5
   Fatty Acid Beta Oxidation (WP143)        8.21E-04  2.55E-03            LIPC, LPL

   GO enrichment analyses (conforming to the up-to-date 2017 database)
   [[174]29, [175]52] revealed that the biological functions of these
   pleiotropic genes were mainly involved in the metabolism of lipids. For
   the GO biological process, the top five significant GO terms were
   Triglyceride homeostasis (GO:0070328), Cellular triglyceride
   homeostasis (GO:0035356), Positive regulation of lipoprotein lipase
   activity (GO:0051006), Cholesterol homeostasis (GO:0042632) and Reverse
   cholesterol transport (GO:0043691). For the GO cellular component, the
   top five significant GO terms were Very-low-density lipoprotein
   particle (GO:0034361), Spherical high-density lipoprotein particle
   (GO:0034366), Early endosome (GO:0005769), Early endosome lumen
   (GO:0031905) and Integral component of Golgi medial cisterna membrane
   (GO:1990703). For the GO molecular function, the top five significant
   GO terms were Intermembrane cholesterol transfer activity (GO:0120020),
   Cholesterol transporter activity (GO:0017127), Cholesterol binding
   (GO:0015485), Phosphatidylcholine-sterol O-acyltransferase activator
   activity (GO:0060228), and High-density lipoprotein particle receptor
   binding (GO:0070653). The results of the GO enrichment analysis are
   summarized in [176]Table 4. GO terms are gene sets organized in a
   hierarchical structure [[177]53], so some GO terms are branches of
   others. As a result, there is a considerable overlap of genes between
   related GO terms such as “Triglyceride homeostasis” and “Cellular
   triglyceride homeostasis”.

Table 4. Top five significant GO term enrichment analysis of the potential
pleiotropic genes.

   Term (GO Biological Process)                          GO ID       P-value   Benjamini-Hochberg  Genes
                                                                               P-value
   Triglyceride homeostasis                              GO:0070328  3.19E-16  2.44E-13            CETP, GCKR, LIPC, APOC2, LPL, APOA1, APOA5
   Cellular triglyceride homeostasis                     GO:0035356  2.21E-15  8.42E-13            CETP, GCKR, LIPC, APOC2, LPL, APOA1, APOA5
   Positive regulation of lipoprotein lipase activity    GO:0051006  2.29E-11  5.83E-09            LIPC, APOC2, LPL, APOA1, APOA5
   Cholesterol homeostasis                               GO:0042632  3.06E-11  5.83E-09            CETP, LIPC, APOC2, LPL, APOA1, APOA5
   Reverse cholesterol transport                         GO:0043691  1.58E-10  2.01E-08            CETP, LIPC, APOC2, APOA1, APOA5

   Term (GO Cellular Component)                          GO ID       P-value   Benjamini-Hochberg  Genes
                                                                               P-value
   Very-low-density lipoprotein particle                 GO:0034361  7.77E-07  1.17E-04            APOC2, APOA1, APOA5
   Spherical high-density lipoprotein particle           GO:0034366  4.18E-05  3.16E-03            APOC2, APOA1
   Early endosome                                        GO:0005769  8.43E-04  2.09E-02            SNX17, APOC2, APOA1
   Early endosome lumen                                  GO:0031905  1.11E-03  2.09E-02            SNX17, APOC2, APOA1
   Integral component of Golgi medial cisterna membrane  GO:1990703  1.08E-03  2.09E-02            CSGALNACT1, GALNT2

   Term (GO Molecular Function)                          GO ID       P-value   Benjamini-Hochberg  Genes
                                                                               P-value
   Intermembrane cholesterol transfer activity           GO:0120020  9.56E-07  1.71E-04            CETP, APOA1, APOA5
   Cholesterol transporter activity                      GO:0017127  4.95E-06  4.43E-04            CETP, APOA1, APOA5
   Cholesterol binding                                   GO:0015485  2.06E-05  6.69E-04            CETP, APOA1, APOA5
   Phosphatidylcholine-sterol O-acyltransferase
     activator activity                                  GO:0060228  2.24E-05  6.69E-04            APOA1, APOA5
   High-density lipoprotein particle receptor binding    GO:0070653  2.24E-05  6.69E-04            APOA1, APOA5

   In summary, our present work identified twenty-five potential
   pleiotropic genes as well as the enriched pathways and GO terms of
   potential pleiotropic genes for type 2 diabetes, obesity and
   dyslipidemia.

Discussion

   The present study, the first systematic multivariate analysis of
   genomics data for type 2 diabetes, obesity and dyslipidemia jointly
   using Meta-CCA, identified potential pleiotropic genes as well as
   enriched pathways and GO terms. Importantly, two of the 25 identified
   genes (GCKR, FADS1) were reported to be associated with type 2
   diabetes, obesity and dyslipidemia in different prior studies [[179]21,
   [180]37, [181]38, [182]46, [183]47], which our present study confirmed.
   Other significant genes, excluding the genes that were reported to be
   associated with hyperglycemia, obesity and dyslipidemia in other types
   of previous studies, might be novel pleiotropic candidate genes (such
   as LIPC-AS1, IFT172, KRTCAP3, CSGALNACT1, EIF2B4, GTF3C2, ZNF513,
   NRBP1, FNDC4, TMEM258, and CLPTM1) for the three disorders. We
   performed a functional protein-protein interaction network analysis
   for the potential pleiotropic candidate genes using STRING 10.5
   ([184]https://string-db.org/cgi/input.pl). [185]Fig 2 shows that there
   are interactions between most of the potential pleiotropic candidate
   genes. The results not only revealed some of the shared genetic
   components but also provided novel insights for exploring the potential
   common biological pathogenesis of these three disorders.

Fig 2. Nodes represent the proteins encoded by the corresponding genes,
edges represent protein-protein associations, and line color represents the
type of interaction evidence (e.g., text mining, co-expression and so on).


   All of the interacting proteins shown have an interaction score ≥ 0.15
   (based on a previous study).

   Many genes and pathways have pleiotropic effects on more than one
   disease, a common phenomenon supported by this study of type 2
   diabetes, obesity and dyslipidemia. Recently, animal experiments and
   cross-sectional population-based studies have shown evidence of large
   shared gene components. Pleiotropic genes have been successfully
   identified in bivariate analyses of type 2 diabetes with obesity, type
   2 diabetes with dyslipidemia, and obesity with dyslipidemia. However,
   multivariate analysis had not previously been conducted for these three
   disorders simultaneously. Systematically exploring the pleiotropic
   genes and their effects on these three disorders is essential, and is
   possible because GWAS summary statistics are accessible.
   The advantages of Meta-CCA are listed as follows. Firstly, Meta-CCA can
   detect correlations between multiple variants and multiple traits based
   on GWAS summary statistics [[188]19], which might provide richer clues
   for finding novel gene targets in multivariate analyses compared to the
   univariate and bivariate analysis [[189]18]. For example, TMEM258, a
   gene for adipose tissue regulation, was not identified by any previous
   GWAS studies for type 2 diabetes and obesity, but was one of the novel
   pleiotropic candidate genes for type 2 diabetes, obesity and
   dyslipidemia identified in this study. Secondly and notably, Meta-CCA
   can identify novel candidates, since some of the associations become
   detectable only when multiple variants and multiple traits are tested
   jointly [[190]19]. For example, CETP, a well-known gene for
   dyslipidemia, was not identified by any previous GWAS studies for type
   2 diabetes, but it was one of the novel findings in our study. Finally,
   Meta-CCA is a cost-effective analytical method based on GWAS summary
   statistics, which provides a larger effective sample size to detect
   potential pleiotropic genes for multiple traits. Meta-CCA and similar
   analyses are emerging and powerful tools for detecting pleiotropic
   genes of multiple correlated traits using GWAS summary statistics.

   Among the 25 potential pleiotropic genes, two genes, GCKR, and FADS1,
   were suggested to be pleiotropic genes for type 2 diabetes, obesity and
   dyslipidemia based on the results of previous studies [[191]21,
   [192]37, [193]38, [194]46, [195]47]. GCKR, located at 2p23.3, encodes a
   protein belonging to the glucokinase regulator subfamily, which
   inhibits glucokinase by binding to the enzyme in liver and pancreatic
   islet cells. A recent study found that variants of GCKR were
   associated with obesity in postmenopausal women [[196]38]. FADS1,
   located at 11q12.2, encodes a member of the fatty acid desaturase
   family. FADS1 was reported to be related to type 2 diabetes by a
   previous GWAS, but the mechanism is still unknown [[197]51].

   From a biochemistry point of view, eight (APOA5, APOA1, APOC2, CETP,
   LPL, LIPC, GCKR, GALNT2) of the twenty-five potential pleiotropic genes
   are involved in important metabolic routes. Details are summarized in
   [198]Fig 3. APOA5, APOA1, and APOC2 encode apolipoproteins which mainly
   ferry TG, HDL, and VLDL (very low density lipoprotein), respectively.
   CETP encodes cholesteryl ester transfer protein, which converts HDL
   into VLDL and IDL (intermediate density lipoprotein) by transporting
   cholesteryl ester. LPL encodes lipoprotein lipase, which plays a
   critical role in lipid metabolism such as converting VLDL into IDL.
   Hepatic triglyceride lipase, encoded by LIPC, is important in the
   catabolism of lipids, including converting IDL into LDL. GALNT2 and
   GCKR are involved in the metabolism of glucose as mentioned above.

Fig 3. Eight potential pleiotropic genes (the italic) affected the three
disorders through these important metabolic routes.


   The dotted line stands for the complex metabolic routes of
   gluconeogenesis.

   As described above, our analysis identified 25 genes and five
   pathways associated with type 2 diabetes, obesity, and dyslipidemia.
   Interestingly, all 25 genes were identified as associated genes/loci
   for TG in the original GWAS [[201]21], though only eight of these 25
   genes were confirmed in the validation stage [[202]21]. All five
   pathways are associated with the metabolism of lipids. Specifically,
   two classes of lipid-lowering drugs successfully target the statin
   pathway and the PPAR Alpha pathway, respectively, suggesting that
   abnormal plasma lipid levels play a critical role in the common
   biological pathogenesis of the three disorders. Some drugs targeting
   the statin pathway have been used successfully to treat type 2
   diabetes patients with dyslipidemia. Another significant pathway is the
   PPAR Alpha pathway. The PPAR pathway family, which includes the PPAR
   Alpha, PPAR Beta, and PPAR Gamma pathways, plays a key role in
   substance metabolism (including glucose, lipid, and protein
   metabolism). Specifically, PPAR Alpha is a core factor for fatty acid
   oxidation in the liver; it is activated by ligands or drugs such as
   fibrates, resulting in a decrease in the serum TG level [[203]54].
   PPAR Gamma is also an important factor in the etiology of IR
   [[204]55]. Drugs targeting PPAR Gamma, such as thiazolidinediones, are
   effective in the control of IR [[205]56].

   Our present study also indicates that TG plays an important role in
   these three disorders, as all 25 potential pleiotropic genes were
   identified as associated genes/loci for TG in the original GWAS
   (though only eight of these 25 genes were reported to be TG-associated
   genes after validation) [[206]21]. For type 2 diabetes, hypertriglyceridemia
   is the most common type of dyslipidemia [[207]57], which is mainly
   induced by IR and impairment in insulin secretion. Further, genomic
   studies [[208]57, [209]58] have indicated that hypertriglyceridemia has
   a higher genetic correlation with type 2 diabetes than other types of
   dyslipidemia. For obesity, most of the plasma TG is determined by the
   level of VLDL-TG (the balance between synthesis and clearance of
   VLDL-TG), and the synthesis of VLDL-TG is associated with total fat
   mass and liver fat [[210]59]. Thus, the large fat mass in obese
   patients leads to increased synthesis of VLDL-TG, while the
   clearance of VLDL-TG remains unchanged. Hypertriglyceridemia is a
   principal characteristic of dyslipidemia and is linked to many other
   types of dyslipidemia, such as decreased HDL level and increased small
   dense LDL level [[211]60]. Taken together, the metabolism of TG seems
   to play a core role in the common biological pathogenesis of these
   three disorders.

   Our study not only provides a better understanding of the shared
   genetic background of the three disorders, but also produces a list of
   potential novel pleiotropic candidate genes for follow-up in further
   biological experiments. Eleven of the 25 pleiotropic genes (LIPC-AS1,
   IFT172, KRTCAP3, CSGALNACT1, EIF2B4, GTF3C2, ZNF513, NRBP1, FNDC4,
   TMEM258, and CLPTM1) are reported here for the first time to be
   associated with type 2 diabetes, obesity and dyslipidemia. The
   findings of our present work are not completely consistent with those
   of previous GWASs or other types of systematic analysis studies of
   type 2 diabetes and other metabolic diseases (details of the
   overlapping genes and the novel potential pleiotropic candidate genes
   are shown in [212]Table 2). The differences may be due to the use of
   different datasets and different methods in these studies. For
   example, one published study using integrative omics data showed that
   15 SNPs and the corresponding genes were associated with type 2
   diabetes [[213]61]; however, these genes were not identified in our
   present work. Our present work did not identify all the genes
   identified in GWASs or other types of studies, as it was designed only
   as a supplementary study to identify potential pleiotropic genes for
   chronic complex diseases. We hope that the potential novel pleiotropic
   candidate genes can provide some clues for molecular biologists
   performing future functional validation studies to determine whether
   the findings truly have pathophysiological significance for type 2
   diabetes, obesity and dyslipidemia.

Conclusion

   In this study, we identified and assessed potential pleiotropic
   genes and pathways for type 2 diabetes, obesity and dyslipidemia using
   the novel Meta-CCA method. The findings validated two previously
   identified pleiotropic genes (GCKR, FADS1) for these three disorders
   and highlighted another eleven significant genes (LIPC-AS1, IFT172,
   KRTCAP3, CSGALNACT1, EIF2B4, GTF3C2, ZNF513, NRBP1, FNDC4, TMEM258, and
   CLPTM1) as potential novel pleiotropic candidate genes for the three
   disorders. Further, the potential pleiotropic genes were significantly
   enriched in five pathways including the statin pathway and PPAR Alpha
   pathway. In conclusion, our findings may yield novel insights into
   the common biological pathogenesis of these three disorders,
   which ultimately may lead to the development of effective drug
   therapies.

Acknowledgments