Abstract

Obesity has become a major public health issue and is caused by a combination of genetic and environmental factors. Genome-wide DNA methylation studies have shown that DNA methylation at cytosine-phosphate-guanine (CpG) sites is associated with obesity. However, functional validation of these results has been challenging given the large number of reported associations. In this study, we applied an integrative analysis approach to prioritize candidate genes for drug development from the many associated CpGs. Association data were collected from previous genome-wide DNA methylation studies and combined using a sample-size-weighted strategy. Gene expression data in adipose tissues and enriched pathways of the affiliated genes were overlapped to shortlist the associated CpGs. The CpGs with the most overlapping evidence were considered the most appropriate for future studies. Our results revealed that 119 CpGs were associated with obesity (p ≤ 1.03 × 10^−7). Of the affiliated genes, SOCS3 was the only gene that was involved in all enriched pathways and was differentially expressed in both visceral adipose tissue (VAT) and subcutaneous adipose tissue (SAT). In conclusion, our integrative analysis is an effective approach for highlighting the DNA methylation sites with the highest relevance to drug development. SOCS3 may serve as a drug development target for obesity and its complications.

Keywords: DNA methylation, obesity, association, gene expression, CpG

Introduction

Since 1980, the incidence of obesity has increased throughout the world (Stevens et al., 2012; Ng et al., 2014). The onset of obesity involves the interaction between genetic and environmental factors (Contaldo and Pasanisi, 2004; Ussar et al., 2015).
Genome-wide association studies (GWASs) have successfully identified many genetic variations associated with human complex diseases and have provided crucial new insights into the underlying molecular mechanisms (De La Vega et al., 2011; Fall and Ingelsson, 2014; Winham et al., 2014; Evangelou et al., 2018). To date, the largest obesity GWAS has identified 97 body mass index (BMI)-associated loci (P ≤ 5 × 10^−8) in up to 339,224 individuals. However, most of the genetic susceptibility remains unexplained (Locke et al., 2015). Existing evidence suggests that obesity results from interactions between genetic and environmental factors (Marti et al., 2008). DNA methylation provides a molecular mechanism for the interaction between the environment and obesity, in that it may affect individual susceptibility to obesity by altering gene expression. In recent years, the association between DNA methylation and obesity has been studied intensively (van Dijk et al., 2015; Dhana et al., 2018; Wang et al., 2018). For example, a genome-wide DNA methylation association study of obesity in 5,387 individuals identified 278 CpGs associated with BMI (Wahl et al., 2017). The associated CpGs have provided insight beyond that of previous genetic studies. On the other hand, the large number of associated CpGs has made functional investigation using cell and animal models difficult. In this study, we applied an integrative analysis approach to prioritize the most relevant genes from the many associated CpGs. Using this approach, we identified SOCS3 as a promising candidate for mechanistic studies and drug development. This approach can also be adapted to genome-wide DNA methylation studies of other diseases.

Methods

The integrative analysis approach included three components.
The first component was to nominate the candidate CpGs by combining the association results from previous studies of peripheral blood samples (Steps 1–4, Figure 1). The second component was to estimate the functional relevance of the candidates through pathway enrichment analysis (Step 5). The third component was to validate that the genes affiliated with the candidate CpGs were differentially expressed in adipose tissues (Step 6). Finally, the evidence from these components was put together, and the genes with positive support from all components were prioritized by our approach (Step 7).

Figure 1. The flow chart of the integrative analysis. The circled numbers represent the steps in the pipeline.

Literature Search

The literature search was conducted in the PubMed database using the keywords “CpG”, “DNA methylation” and “obesity” to capture all articles published from 2014 to 2018. We restricted the search results to articles written in English.

Inclusion Criteria and Data Extraction

Both cohort studies and case-control studies reporting the association between DNA methylation and obesity (as measured by BMI) were included in this meta-analysis. Studies that used samples from cancer patients and studies that used non-human subjects were excluded. The full text of each article was carefully read to determine whether the study should be included. Once a study was included, data were extracted from the article, including the publication year, participant characteristics, sample size, association p-value, and effect size.

Meta-Analysis

We employed a sample-size-weighted strategy to combine the p values reported in the included studies, taking into consideration the direction of the association effect size. This strategy was implemented using R software (https://www.r-project.org/).
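The text does not give the exact formula used for the sample-size-weighted combination. One common scheme of this kind is the weighted Z-score (Stouffer) method, in which each two-sided p value is converted to a signed Z-score and weighted by the square root of the study sample size. The sketch below illustrates that scheme in Python rather than R; it is an assumption about how such a combination could be done, not the authors' implementation, and the function names and input layout are hypothetical. The threshold constant is the Bonferroni correction 0.05 / 485,577.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Bonferroni threshold for 485,577 CpGs on the Illumina HM450K array.
BONFERRONI_THRESHOLD = 0.05 / 485_577  # about 1.03e-7


def combine_p_values(studies):
    """Combine per-study results for one CpG with a weighted Z-score.

    `studies` is a list of (p_value, direction, sample_size) tuples, where
    p_value is two-sided (0 < p <= 1) and direction is +1 or -1 (the sign
    of the reported effect size). This layout is hypothetical.
    Returns (combined_z, combined_two_sided_p).
    """
    weighted_sum = 0.0
    weight_sq_sum = 0.0
    for p_value, direction, sample_size in studies:
        # Magnitude of the Z-score for a two-sided p value. Using the lower
        # tail avoids the precision loss of computing 1 - p/2 for tiny p.
        z = -_STD_NORMAL.inv_cdf(p_value / 2.0)
        z = math.copysign(z, direction)
        weight = math.sqrt(sample_size)
        weighted_sum += weight * z
        weight_sq_sum += weight * weight
    combined_z = weighted_sum / math.sqrt(weight_sq_sum)
    # Two-sided p value of a standard normal: erfc(|z| / sqrt(2)).
    combined_p = math.erfc(abs(combined_z) / math.sqrt(2.0))
    return combined_z, combined_p


def is_significant(studies, threshold=BONFERRONI_THRESHOLD):
    """True if the combined p value is below the threshold and all
    studies report the same effect direction."""
    directions = {math.copysign(1, d) for _, d, _ in studies}
    _, combined_p = combine_p_values(studies)
    return len(directions) == 1 and combined_p < threshold
```

With this weighting, larger studies contribute more to the combined statistic, and studies with opposite effect directions partially cancel instead of reinforcing each other.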
In this meta-analysis, CpG sites with a p value below 1.03 × 10^−7 (Bonferroni correction based on the 485,577 CpGs on the Illumina HM450K array) and with effect directions that were consistent across all included studies were considered significant.

Pathway Enrichment Analysis

We investigated the enrichment of the affiliated genes in Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways using the Metascape online tool (http://metascape.org; Tripathi et al., 2015). The genes were annotated using the default resources provided by Metascape. KEGG pathways were filtered using the default settings (number of gene hits ≥ 3, enrichment p-value ≤ 0.05, and enrichment statistic ≥ 1.5). An FDR-adjusted p-value ≤ 0.05 was taken to declare significant enrichment.

Differential Expression Analysis in Adipose Tissues

We investigated whether the associated genes were differentially expressed in the SAT and VAT of obese patients by comparing their gene transcription levels with those of normal individuals. This analysis was performed using the GEO2R tool (https://www.ncbi.nlm.nih.gov/geo/geo2r/) on two datasets, GSE2508 (10 obese vs. 10 lean) and GSE88837 (15 obese vs. 15 lean), for the SAT and VAT, respectively. The gene transcription levels were assayed using Affymetrix Human Genome U95 V2 and U133 arrays. Differential gene expression in obese samples was identified using the Bayesian estimation in GEO2R. Transcription-level data for each sample were queried from the GEO database (Davis and Meltzer, 2007). Empirical Bayes statistics were calculated using the R package “limma” (Smyth, 2004; Ritchie et al., 2015). The fold change in transcription level was calculated from the group means. A p value ≤ 0.05 and |log2(fold change)| ≥ 1 were used as the criteria for differentially expressed genes. CpGs whose affiliated genes were differentially expressed in both tissue types were identified as relevant loci.
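The differential expression criteria above (p ≤ 0.05 and |log2(fold change)| ≥ 1 in both SAT and VAT) can be expressed as a simple filter. The following is a minimal sketch, assuming per-gene p values and log2 fold changes have already been obtained (for example from GEO2R output); the function names, the dictionary layout, and the placeholder gene labels in the usage note are hypothetical and do not come from the study.

```python
def is_differentially_expressed(p_value, log2_fold_change,
                                p_cutoff=0.05, log2_fc_cutoff=1.0):
    """Apply the two stated criteria: p <= 0.05 and |log2 FC| >= 1."""
    return p_value <= p_cutoff and abs(log2_fold_change) >= log2_fc_cutoff


def genes_de_in_both_tissues(sat_results, vat_results):
    """Return the sorted genes that pass the criteria in both tissues.

    Each argument maps a gene symbol to a (p_value, log2_fold_change)
    tuple for one tissue. This input layout is an assumption.
    """
    shared_genes = sat_results.keys() & vat_results.keys()
    return sorted(
        gene for gene in shared_genes
        if is_differentially_expressed(*sat_results[gene])
        and is_differentially_expressed(*vat_results[gene])
    )
```

For example, with made-up inputs `{"GENE_A": (0.01, 1.5), "GENE_B": (0.2, 2.0)}` for SAT and `{"GENE_A": (0.04, 1.1)}` for VAT, only `GENE_A` passes both criteria in both tissues. The sketch does not require the direction of change to agree between tissues, since the text does not state such a requirement.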
The DNA Methylation Associated With Obesity in Human VAT and Liver Tissue

DNA methylation in the included studies was measured exclusively in peripheral blood, but DNA methylation in peripheral blood may differ from that in metabolic tissues. To test whether the associations found in peripheral blood samples extend to obesity-related tissues, we tested the association of the significant CpGs in human VAT and liver tissue using two GEO datasets, GSE88940 (10 obese vs. 10 normal VAT samples) and GSE65057 (8 obese vs. 7 normal liver samples), respectively.

Results

Characteristics of Individual Studies

Using the keywords “CpG”, “DNA methylation” and “obesity”, a total of 350 related articles were retrieved. Two hundred and seventy studies were excluded based on the title and abstract because they did not meet the inclusion criteria, leaving 80 articles. Of those, 67 articles were excluded after a full-text review. As a result, 13 articles were included in the analysis. Most of the excluded articles were functional studies in cells or animals. The basic characteristics of the included studies are detailed in Table 1.

Table 1. Characteristics of the included genome-wide DNA methylation studies.

References Cohort source Ethnicity Subjects Body mass