Abstract Aim: To explore the relationship between Chlamydia pneumonia (Cpn) infection and lung cancer using integrative methylome and transcriptome analyses. Methods: Twelve primary lung cancer patients who were positive for Cpn and twelve patients who were negative were selected for demographic, clinicopathological, and lifestyle matching. Genomic DNA and RNA were extracted and DNA methylation and mRNA levels were detected using the Infinium Human Methylation 450 Beadchip array and mRNA + lncRNA Human Gene Expression Microarray. We identified differentially expressed methylation and genes profiles. Results: Integrative analysis revealed an inverse correlation between differentially expressed genes and DNA methylation. Cpn-related lung cancer methylated genes (target genes) were introduced into the gene ontology and KEGG, PID, BioCarta, Reactome, BioCyc and PANTHER enrichment analyses using a q-value cutoff of 0.05 to identify potentially functional methylation of abnormal genes associated with Cpn infection. Gene sets enrichment analysis was evaluated according to MsigDB. Levels of differentially expressed methylated sites were quantitatively verified. The promoter methylation sites of 62 genes were inversely related to expression levels. According to the quantitative analysis of DNA methylation, the methylation level of the RIPK3 promoter region was significantly different between Cpn-positive cancerous and adjacent tissues, but not between Cpn-negative cancerous and adjacent tissues. Conclusion: Hypomethylation of the RIPK3 promoter region increases RIPK3 expression, leading to regulated programmed necrosis and activation of NF-κB transcription factors, which may contribute to the development and progression of Cpn-related lung cancer. Keywords: lung neoplasms, chlamydia pneumonia, DNA methylation, gene expression Introduction According to GLOBOCAN,[36]^1 lung cancer accounted for the majority of global cancer incidence and deaths in 2012. Pathogenic factors for lung cancer include both genetic and environmental factors.[37]^2^,[38]^3 In recent years, the role of infection in the etiology of cancer has gained scientific interest. Chlamydia pneumoniae (Cpn) is a common cause of acute respiratory infections, and the inflammation caused by chronic infection may lead to carcinogenesis. Epidemiological evidence suggests that Cpn infection may be associated with the development and progression of lung cancer.[39]^4 A recent meta-analysis of 2,549 patients with lung cancer and 2,764 controls showed that Cpn infection significantly increased the risk of lung cancer (odds ratio =2.07, 95% confidence interval: 1.43–2.99).[40]^5 Cell experiments found that Cpn infection can induce human mesothelial cell transformation, thereby increasing the risk of lung cancer.[41]^6 Studies using integrated computer methods found that Cpn triggered lung cancer growth by changing host cell replication, transcription, and DNA damage repair mechanisms[42]^7 The underlying molecular mechanism of Cpn chronic infection-mediated lung cancer is not yet clear. Studies have shown that inflammation-induced lung cancer differs from genetic- and epigenetic-mediated cancer development.[43]^8 DNA methylation is an important epigenetic mechanism that regulates gene transcription and cell function.[44]^9 It regulates gene expression without changing gene sequences. DNA methylation usually inhibits gene expression, while demethylation induces gene activation and expression. Early carcinogenesis can cause abnormal changes in methylation, and methylation can be detected in bodily fluids and early lesions of biopsied tissues. Methylation can be reversed by demethylating drugs, suggesting that methylation-mediated gene abnormalities may serve as a biological target for early detection of lung cancer and treatment interventions. To explore the pathogenic effect of Cpn on lung cancer and its underlying epigenetic mechanism, we used the high-throughput Illumina methylation 450 k array to determine the methylation levels of the whole genome in lung cancer tissues and para-cancer controls. We revealed methylation characteristics of serologic Cpn positive lung cancer and found that the combination of methylation and expression profiles identified 62 potentially functional methylated abnormal genes associated with Cpn. Furthermore, we identified the role of abnormal methylation in the development and progression of lung cancer at the molecular level. Materials and methods Subjects Twelve newly diagnosed primary lung cancer cases that tested positive for Cpn serological antibodies (IgA and IgG) and 12 cases that were negative following thoracic surgery in the First Affiliated Hospital of Fujian Medical University were selected and matched for demographic characteristics, clinicopathological features, and lifestyles ([45]Table 1). Twelve pairs of patients were tested using methylation and expression microarrays, and methylation levels of the promoter region in the target genes were verified using the Sequenom methylation method (see [46]Table 1, No. 1–6); the remaining twelve pairs were validated using the MethylTarget method (see [47]Table 1, No. 7–12). No patient received adjuvant therapy, such as radiotherapy and chemotherapy, before surgery. Lung cancer tissues and corresponding normal tissues were resected and stored at −80 °C within 10 min after surgery. Histopathological diagnosis of each specimen was performed independently by two pathologists ([48]Figure 1). The study protocol was approved by the institutional review board of Fujian Medical University, and written informed consent was obtained from all patients. This study was conducted in accordance with the Declaration of Helsinki. Table 1. Demographic and clinical characteristics of the lung cancer patients No. Age Sex Pathological category TNM staging Cpn IgA&G A1/B1 56 F ADC T4N2M0 Ⅲ b + C1/D1 56 F ADC T4N1M0 Ⅲ a − A2/B2 41 F ADC T1aN0M0 Ⅱ a + C2/D2 44 F ADC T2aN1M0 Ⅱ a − A3/B3 60 M ADC T2N0M0 Ⅱ a + C3/D3 58 M ADC T2N0M0 Ⅱ b − A4/B4 60 M ADC T2aN2M0 Ⅲ a + C4/D4 59 M ADC T2N0M0 Ⅰ b − A5/B5 66 M SCC T1aN2M0 Ⅲ a + C5/D5 67 M SCC T2aN0M0 Ⅰ b − A6/B6 58 M SCC T1aN0M0 Ⅰ a + C6/D6 60 M SCC T1aN0M0 Ⅰ a − A7/B7 59 M SCC T2aN1M0 Ⅱa + C7/D7 57 M SCC T1N1M0 Ⅱa − A8/B8 49 M ADC T1bN0M0 Ⅰa + C8/D8 52 M ADC T1aN0M0 Ⅰa − A9/B9 56 F ADC T1aN0M0 Ⅰa + C9/D9 60 F ADC T1aN0M0 Ⅰa − A10/B10 57 F ADC T2N0M0 Ⅰb + C10/D10 60 F ADC T2aN0M0 Ⅰb − A11/B11 58 M ADC T1aN1M0 Ⅱa + C11/D11 69 M ADC T1bN1M0 Ⅱa − A12/B12 43 M ADC T2bN2M0 Ⅲa + C12/D12 45 M ADC T1aN2M0 Ⅲa − [49]Open in a new tab Abbreviations: No, No. of lung cancer tissue/No. of adjacent normal tissue; F, Female; M, Male; ADC, Adenocarcinoma; SCC, Squamous cell carcinoma. Figure 1. [50]Figure 1 [51]Open in a new tab Pathological diagnosis of (A) lung cancer and (B) adjacent normal tissues (×400 times, hematoxylin and eosin staining). Serological tests, DNA and total RNA extraction We used Cpn IgA SeroFIA^TM and IgG SeroFIA^TM kits (Savyon, Israel) and the themicro-indirect immunofluorescence (MIF) method to detect serum Cpn IgG and IgA antibodies in patients with lung cancer. DNA was extracted using the Qiagen^® Genomic DNA Mini Kit, and total RNA was extracted using the Trizol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacture’s protocol. Purity and concentration were determined using the NanoDrop^® ND-2000 spectrophotometer. DNA and total RNA integrity were determined using 2% agarose gel electrophoresis. DNA methylation and gene expression profiling After bisulfite conversion of DNA, the methylation profile of genomic DNA was assessed using the Infinium Human Methylation 450 Beadchip Kit (Illumina, San Diego, CA, USA), which can detect more than 450,000 methylation sites located within and outside CpG islands.[52]^10 Relative total RNA levels were also determined using the mRNA + lncRNA Human Gene Expression Microarray V4.0 (4×180 K format, CapitalBio Corp, Beijing, China), which is more specific for DNA-DNA hybridization and achieves low abundance of RNA amplification and detection efficiency. All samples were processed at the same time to avoid chip-to-chip variation. The quality of hybridization and overall chip performance were monitored using visual inspection of both the internal quality control checks and the raw scanned data. Quantitative methylation analysis The Sequenom MassARRAY platform (CapitalBio Corp, Beijing, China) was used to perform quantitative methylation analysis of the target gene promoter.[53]^11 This system uses matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry in combination with RNA base-specific cleavage (MassCLEAVE). PCR primers were designed with Methprimer ([54]http://epidesigner.com). Each forward primer was tagged with a 10-mer to balance the PCR, and each reverse primer had an additional T7 promoter tag for in vitro transcription. The target regions were amplified using the primer pairs shown in [55]Table 2. The MethylTarget platform (GeneskyBio Corp, Shanghai, China) was also used to quantitatively analyze DNA methylation levels. The targeted DNA sequence was amplified by net-PCR and designed DNA fragments were sequenced using Illumina Hiseq 2000. PCR primers were designed with primer3 ([56]http://primer3.ut.ee/). Table 2. Primer design for validating the methylation level of gene fragments Name Primer sequence^a Direction Target length Target position^b Target CpG RIPK3 L+GAGTTGATTGAATAATTTTTTTTATAGG F 276 −43 to +232 6 R+AACACCATCTCCACTCATAACCTAA [57]Open in a new tab Notes: ^aL = aggaagagag, R = cagtaatacgactcactatagggagaaggct. ^bThe position of the transcription start site in the sequence is considered to be +1. Accurate experimental methods and uniform judgment standards were adopted. The reagents were kept consistent during the experiment, and the consumables were autoclaved before each experiment. All samples were processed simultaneously to avoid heterogeneity between chips. Hybridization quality and overall chip performance were monitored using internal quality control checks and raw scan data. Data processing and analysis Infinium methylation data were processed using the GenomeStudio Methylation Module (v. 2011.1) in which methylation values (expressed as β-values, ranging from 0 to 1) and detection P-values were calculated. For quality control, methylation measures with a detection P-value >0.05 were removed. The paired empirical Bayes moderated t-test in the limma package was used to obtain differential methylation positions (DMPs) (adjusted P-values (q-values) ≤0.05 and | β-Difference | ≥0.2) between cancer and adjacent normal tissues. The q-values were corrected using the Benjamini Hochberg method for multiple hypotheses, known as the false discovery rate method.[58]^12 Hierarchical cluster analysis of DMPs was carried out using complete linkage and Euclidean distance as a measure of similarity. Heatmaps were generated using the gplots package in R. We also used the hypergeometric test to investigate whether the CpG islets were enriched at a particular genomic location using 450 k as the background. The 450 K methylation probes can be annotated into a gene’s transcription start sites (TSS) 1500, TSS200, the 5ʹ untranslated region (UTR), the 1st exon, body, and the 3ʹUTR. Taking the TSS1500 region as an example, the TSS1500 region of one gene may contain multiple CpG sites. The mean value of the Beta values using multiple CpG sites on the gene TSS1500 region indicates the degree of methylation in this region. Therefore, we were interested in the impact of differential methylation regions (DMRs) on gene expression. The paired empirical Bayes moderated t test in the limma package was used to obtain DMRs (q-values ≤0.05) between cancerous and adjacent normal tissues. Human Gene Expression raw signal data were processed using the Agilent Feature Extraction V12.0 and array normalization using the quantile method with limma packages. The genes detected in more than eight samples were used for differential expression analysis, and the paired empirical Bayes moderated t test in the limma package was used to identify differentially expressed genes (DEGs) (q-values ≤0.05 and | Fold-Change (FC) | >2.0) between cancerous and adjacent normal tissues. The q-values were corrected using the Benjamini Hochberg method for multiple hypotheses. DEGs were further hierarchical clustered and filtered using Heatmaps and Volcano plots in R, respectively. Volcano plots were constructed using | FC | > 2.0 and q-value ≤0.05 as cut-offs. [59]Figure 2 shows a schematic diagram of the steps followed to identify the target genes. DMRs of the methylome and DEGs of the transcriptome were identified for Cpn positive lung cancer vs Cpn positive adjacent normal tissues ([60]Figure 2A vs [61]B) and Cpn negative lung cancer vs Cpn negative adjacent normal tissues ([62]Figure 2C vs [63]D). Integrative analysis using the Pearson’s correlation was used to screen significantly different genes (DGs) with a negative correlation between DNA methylation of up-stream gene regions and expression levels (Pearson Correlation Coefficient < −0.7 and P-value <0.05). After removing the overlap of Cpn+ DGs and Cpn- DGs, the remaining set included Cpn-related lung cancer methylation genes (target genes). The target genes were entered into KOBAS ([64]http://kobas.cbi.pku.edu.cn/) for Gene Ontology (GO) and pathway (KEGG, PID, BioCarta, Reactome, BioCyc and PANTHER) enrichment analysis with a q-value cutoff of 0.05. For all genes, gene sets enrichment analysis (GSEA) was evaluated according to MsigDB. In the validation analysis, the average methylation levels of each DNA fragment and CpG site were compared using the Wilcoxon Rank Sum test with a P-value cutoff of 0.05. Figure 2. [65]Figure 2 [66]Open in a new tab Schematic diagram of integrative methylome and transcriptome analyses. Results Genome-wide DNA methylation distribution in Cpn-related NSCLC We identified 5,309 DMPs including 3,619 hypermethylated and 1,690 hypomethylated probes in A vs B, and 6,213 DMPs including 4,032 hypermethylated and 2,181 hypomethylated probes in C vs D ([67]Figure S1). Hierarchical clustering of DMPs showed separate clustering of non-small cell lung carcinoma (NSCLC) and adjacent normal tissues in both A vs B ([68]Figure S2) and C vs D ([69]Figure S3) indicating distinct epigenetic regulation. Further we classified these DMPs based on their enrichment at CpG islands (CGIs), flanking CGIs (shores and shelves, 2–4 kb from CGIs), and open sea (non-related to CGIs). Our data showed that the majority of the hypermethylated probes (52%, 43%) were enriched in the CGIs ([70]Figure S4A–[71]C), whereas hypomethylated probes (68%, 70%) were mostly enriched in the open sea ([72]Figure S4B–[73]D). Hypermethylation in NSCLC was primarily enriched in the CpG-rich regions, while hypomethylation was primarily enriched in the open sea. When comparing the distribution of hypermethylated probes between A vs B ([74]Figure S4A) and C vs D ([75]Figure S4C), we found both were enriched in the CGIs. However, for all DMPs, we observed that A vs B was more enriched in the CGIs compared to C vs D ([76]Table S1). Gene expression profiles in Cpn-related NSCLC When subjected to principal component analysis (PCA) analysis, the contribution rate of the top 2 principal components was 51.90%. The gene expression profiles of the lung cancer and normal pairs in the two different Cpn status normal tissues (B, D) were relatively close, while that of Cpn positive and negative NSCLC cancer tissues (A, C) were markedly different, which tended to be disperse ([77]Figures S5 and [78]S6). There was a total of 32,676 probe sets (transcripts) representing 20,753 genes on the Gene Chip. We observed significant differences in DEGs in A vs B and C vs D. Among these 2,765 genes 1,454 genes were up-regulated and 1,311 were down-regulated in A vs B, and 50 genes were up-regulated and 792 were down-regulated in C vs D ([79]Figure S1 and [80]Table S2). Hierarchical clustering was applied to the expression profiles of the DEGs for Cpn positive NSCLC vs Cpn positive adjacent normal tissues (A vs B) and Cpn negative NSCLC vs Cpn negative adjacent normal tissues (C vs D). These data are represented by heatmaps ([81]Figure S7 for A vs B; [82]Figure S8 for C vs D) and volcano plots ([83]Figure S9 for A vs B and [84]Figure S10 for C vs D). The results indicate that DEGs could be used to distinguish cancerous tissues from normal tissues. Transcriptional effects of aberrant DNA methylation Changes of methylation in the upstream region of genes were inversely related to differential gene expression, so we screened a set of locus genes that were linked to the TSS1500, TSS200, 5ʹUTR and 1st Exon regions (all in the up-stream region) with an identified methylation site. For Cpn positive patients, 179 methylation probes were identified in A vs B; for Cpn negative patients, 174 methylation probes were identified in C vs D. For A vs B, we removed the data of the probes of C vs D. We found 124 methylation probes that were inversely related to gene expression, which indicated that we found 42 genes ([85]Table S3). We also observed a negative correlation between gene upstream region methylation and gene expression ([86]Figure 3A for A vs B; [87]Figure 3B for C vs D). We used a combined analysis of the DEGs and DMRs. We found that gene expression profiles of 62 aberrantly methylated genes that existed in A vs B but not in C vs D were inversely correlated with promoter methylation ([88]Table 3). Figure 3. [89]Figure 3 [90]Open in a new tab The inverse relationship between gene upstream region methylation and gene expression. (A) Heatmaps of negative correlation between DMRs and DEGs for Cpn positive NSCLC vs Cpn positive adjacent normal tissues (A vs B), and (B) Cpn negative NSCLC vs Cpn negative adjacent normal tissues (C vs D). In DNA methylation, “Red” indicates relative hypermethylation, and “Blue” indicates relative hypomethylation. In gene expression, “Red” indicates up-regulated genes, and “Blue” indicates down-regulated genes. Table 3. Combined analysis of differentially expressed genes and regional differential methylation Gene name Beta-Difference log2FC Pearson Correlation Coefficient P-value Position A2M 0.1705 −3.20614252 −0.837427855 0.000676128 1stExon AGMAT −0.159166667 1.35150382 −0.880935386 0.000153807 TSS200 AKAP2 0.18125 −2.78702291 −0.769815354 0.00340665 5UTR ANKRD34C 0.099666667 −1.43452865 −0.680803777 0.014804841 1stExon ANO9 −0.0463 2.723041022 −0.789208119 0.002272594 TSS200 AURKB −0.016 1.317301713 −0.65441348 0.020945504 5UTR B4GALT7 −0.029944444 1.68726944 −0.787944903 0.00233616 TSS1500 BTBD2 −0.0825 2.170248508 −0.811862002 0.001340802 1stExon CAV3 0.089 −1.98277107 −0.84982591 0.000464848 5UTR CD5 −0.110416667 1.990345448 −0.669599749 0.017222151 5UTR CD5 −0.110416667 1.990345448 −0.669599749 0.017222151 1stExon CHRNE −0.152833333 2.033405563 −0.795905015 0.001957367 TSS200 CRYAB 0.038333333 −1.92940006 −0.679417719 0.015089441 1stExon CTDSPL 0.034366667 −1.50889147 −0.745738098 0.005360185 TSS1500 DCDC2B −0.060208333 1.326446304 −0.643943293 0.023833279 TSS1500 DDR2 0.183666667 −1.28964065 −0.699935449 0.011268077 TSS200 DNASE2 −0.131 1.63560986 −0.745690109 0.005364771 TSS1500 DRD5 0.069055556 −1.31223918 −0.809859441 0.001408625 TSS1500 FAM83G −0.036777778 1.510598922 −0.790554389 0.002206312 TSS200 FCGRT 0.093520833 −1.25654531 −0.747651343 0.005179749 TSS1500 FCGRT 0.114266667 −1.25654531 −0.751752926 0.004808407 TSS200 FXYD1 0.084296296 −2.00910021 −0.851908083 0.000435101 TSS1500 FXYD1 0.1595 −2.00910021 −0.85942177 0.000339841 1stExon FXYD6 0.088020833 −1.52143406 −0.800846771 0.001747102 TSS200 FXYD6 0.062125 −1.52143406 −0.749248154 0.005032697 TSS1500 G0S2 0.049625 −2.55597963 −0.808766391 0.001446739 TSS1500 GPR55 −0.123277778 1.40233994 −0.609104286 0.035530718 TSS1500 GPX3 0.1235 −1.70733757 −0.705306284 0.010399324 TSS200 HNF4A −0.0855 1.934479092 −0.726951301 0.007394059 TSS1500 IBSP −0.053 2.712874451 −0.695283046 0.012063195 5UTR IER2 0.028541667 −1.96991612 −0.690961945 0.01283825 TSS1500 IKBKE −0.062388889 2.697424492 −0.75315101 0.004686547 5UTR INMT 0.120666667 −1.47252997 −0.779760355 0.002781556 1stExon ITIH2 −0.194166667 1.704574286 −0.717254319 0.008646242 1stExon LRRC56 −0.057533333 2.011179421 −0.725065188 0.007626224 5UTR LRRC56 −0.018333333 2.011179421 −0.617585018 0.032370025 1stExon LTA −0.093916667 1.352498731 −0.699220618 0.01138764 TSS1500 MACROD1 −0.062583333 1.503512762 −0.762375578 0.003940007 TSS1500 MAPRE3 −0.030153846 2.418662642 −0.776238853 0.002991938 5UTR MT1A 0.119666667 −1.77417907 −0.684709796 0.0140241 5UTR MYO1C 0.051466667 −1.09237987 −0.60995149 0.035205449 TSS200 MYO1C 0.050545455 −1.09237987 −0.645602098 0.023357564 TSS1500 NEFL 0.059666667 −1.64334004 −0.699773508 0.011295081 TSS1500 NFKBIE −0.099222222 1.249279877 −0.721458957 0.00808527 TSS1500 OLFML1 0.11 −1.65131157 −0.885876403 0.000125511 TSS200 OLFML1 0.202333333 −1.65131157 −0.872479841 0.000213568 5UTR OLFML1 0.202333333 −1.65131157 −0.872479841 0.000213568 1stExon OPTC −0.022375 1.490102398 −0.713128372 0.009224489 TSS1500 PCDHB4 0.096366667 −1.10626023 −0.740056492 0.00592402 TSS200 PCDHB4 0.086277778 −1.10626023 −0.680026954 0.014963855 TSS1500 PMP22 0.05352381 −2.19707598 −0.615847586 0.033000441 TSS1500 PYDC1 −0.084916667 1.458731243 −0.648735975 0.022477711 TSS1500 RCC1 −0.120066667 2.163364397 −0.737576047 0.006183702 TSS1500 RGS14 −0.159111111 2.991134915 −0.726418651 0.00745908 TSS200 RGS14 −0.097111111 2.991134915 −0.681870132 0.014588595 TSS1500 RHOV −0.077333333 1.409150164 −0.651032837 0.02184835 TSS1500 RIPK3 −0.102166667 2.274888316 −0.730116006 0.007016463 TSS200 RIPK3 −0.122041667 2.274888316 −0.703065198 0.010755538 TSS1500 RIPK3 −0.081777778 2.274888316 −0.678747347 0.015228525 5UTR RIPK3 −0.081777778 2.274888316 −0.678747347 0.015228525 1stExon RPAP2 0.016916667 −1.03892684 −0.765039552 0.003742239 5UTR RPAP2 0.016916667 −1.03892684 −0.765039552 0.003742239 1stExon SALL3 0.167583333 −2.25513937 −0.662355223 0.018931937 TSS1500 SALL3 0.171333333 −2.25513937 −0.888822565 0.000110691 1stExon SCRN2 −0.033666667 1.114621646 −0.783750322 0.00255702 TSS1500 SEPT10 0.022785714 −1.62973024 −0.636139296 0.026166172 TSS200 SEPT10 0.155333333 −1.62973024 −0.768207549 0.003516963 5UTR SEPT10 0.155333333 −1.62973024 −0.768207549 0.003516963 1stExon SLC35B2 −0.044055556 1.779433339 −0.677235973 0.015545557 TSS1500 SLIT2 0.049 −1.8350073 −0.87349099 0.000205598 5UTR SLIT2 0.049 −1.8350073 −0.87349099 0.000205598 1stExon SOX5 0.042 −1.73714341 −0.764161182 0.003806595 1stExon SPINT1 −0.046 1.777841017 −0.800378594 0.001766244 1stExon SPON1 0.087027778 −1.88844713 −0.736270052 0.006323822 TSS200 STOML3 −0.042041667 1.232677275 −0.785619588 0.002456714 TSS1500 STX12 0.019083333 −1.28519266 −0.775214895 0.003055325 TSS200 SYTL3 −0.103833333 1.912604554 −0.607955514 0.035975192 1stExon TNF −0.096388889 1.311252028 −0.712540054 0.009309233 TSS1500 TRIM31 −0.2245 1.268419408 −0.742254538 0.005700841 TSS200 TRIM31 −0.162416667 1.268419408 −0.742145274 0.005711784 5UTR ZNF578 0.105916667 −1.34457748 −0.609799168 0.035263773 5UTR [91]Open in a new tab Cpn-related lung cancer genes (target genes) and potential biological function analysis In total, 62 target genes were submitted for GO and pathway (KEGG, PID, BioCarta, Reactome, BioCyc and PANTHER) enrichment analysis. For biological processes, positive regulation of chronic inflammatory response to antigenic stimulus (GO: 0002876) was the most represented GO term, followed by negative regulation of growth (GO: 0045926) and regulation of chronic inflammatory response to antigenic stimulus (GO: 0002874). For cellular components, the major represented categories were membrane raft (GO: 0045121). Genes involved in important molecular functions were also identified, such as nuclear factor-kappa B-inducing kinase activity (GO: 0004704) ([92]Figure 4). Using the KEGG pathway database, we found that tumor necrosis factor (TNF) and receptor-interacting serine/threonine-protein kinase 3 (RIPK3) overexpression by hypomethylation of the promoter region may activate the TNF signaling pathway (hsa04668, [93]Figure S11). RIPK3 and inhibitor of nuclear factor-kappa B kinase subunit epsilon (IKBKE) may play a role in responding to bacterial invasion in the cytosolic DNA-sensing pathway (hsa04623, [94]Figure S12). Furthermore, we also found that the first ranked pathway in the Reactome database was Methylation of MeSeH for excretion (R-HSA-2408552). In PANRHER, we found a number of target genes that were involved in the adrenaline and noradrenaline biosynthesis pathway ([95]P00001) and Toll receptor signaling pathway ([96]P00054). Details are shown in [97]Figure 5. We also conduced GSEA for all 20,198 genes. The Heat map and gene list correlation profile in the dataset for the comparison of A vs B ([98]Figure S13) and C vs D ([99]Figure S14) are shown. Our data indicate that DNA methylation can affect the binding of transcription factors to genes. Figure 4. [100]Figure 4 [101]Open in a new tab Cpn-related lung cancer genes (target genes) and potential biological function analysis. Gene ontology for the top 30 biological processes, cellular components, and molecular functions annotation. Figure 5. [102]Figure 5 [103]Open in a new tab KEGG, PID, BioCarta, Reactome, BioCyc and PANTHER pathway enrichment analysis for all 62 target genes. Quantitative methylation analysis and validation Among the 62 target genes that were the most representative aberrantly methylated genes of Cpn-associated lung cancer, RIPK3 was verified using 24 microarray samples as the most methylated in all promoter regions (TSS1500, TSS200, 5ʹUTR and 1stExon regions) and was enriched in innate immune responses for foreign DNA from invading microbes in pathway analysis. Furthermore, methylation levels were quantified against the DMRs in the promoter regions of RIPK3. The Wilcoxon Rank Sum test showed that, in the Cpn positive group, the average methylation level of RIPK3 in lung cancer samples was significantly lower than that in the para-cancer control samples (7.25% vs 11.67%, P=0.005), but the difference was not significant in the Cpn negative group (8.96% vs 12.54%, P=0.093). The lowest methylation level of RIPK3 was found in Cpn-positive lung cancer samples, and in the 2nd, 4th, and 5th CpG sites there was a significant difference in methylation levels between lung cancer and para-cancer control samples in the Cpn-positive group, but not in the Cpn negative group ([104]Figure 6). For the differentially expressed methylation sites in the RIPK3 promoter regions screened using chip and first validation, DNA methylation levels were verified on samples numbered ABDC 7–12. Only the 5th CpG site of RIPK3 was statistically significant in A vs B (P=0.041), and not in C vs D (P=0.310). Figure 6. [105]Figure 6 [106]Open in a new tab Methylation levels of the CpG sites in the RIPK3 promoter. (A) Schematic diagram of the RPIK3 promoter. The sequence represents a 276-base pair fragment (−43 bp to +232 bp) in RPIK3. The start of the 1st exon was considered as +1 of the sequence. Numbers 1–6 refer to locations of the CpG sites within the RIPK3 elements tested, and underlining shows the number of multiple CpG sites that were tested simultaneously. (B) Comparison of the methylation levels of CpG sites in RIPK3 promoter regions. Data are expressed as Median (P[25], P[75]). * Wilcoxon Rank Sum test was performed: *P<0.05(relative to the respective control: B or D). n[case] =6, n[control] =6. Discussion Lung cancer is the most prevalent and deadly cancer in the world. In addition to an imbalance in proto- and anti-oncogenes, aberrant promoter hypermethylation is now recognized as an essential component in lung cancer progression.[107]^13 Infection has been found to be closely related to abnormal DNA methylation and can further induced cancer.[108]^14^,[109]^15 The development of chip techniques has led to novel approaches to lung cancer classification and diagnosis. We investigated the genome-wide DNA methylation and gene expression profiling of 12 NSCLC (6 Cpn positive and 6 Cpn negative) and 12 paired adjacent lung tissues. The target genes of Cpn infection-associated lung cancer were enriched in several representative pathways, including positive regulation of chronic inflammatory response to antigenic stimulus (GO: 0002876), negative regulation of growth (GO: 0045926), regulation of chronic inflammatory response to antigenic stimulus (GO: 0002874), membrane raft (GO: 0045121) and nuclear factor-kappa B-inducing kinase activity (GO: 0004704). RIPK3 and IKBKE were enriched in the TNF signaling pathway (hsa04668) and the cytosolic DNA-sensing pathway (hsa04623). Chip and validation screening demonstrated that the abnormal methylation sites in the promoter regions of RIPK3 were associated with Cpn-related lung cancer. A recent study[110]^16 on Chlamydia trachomatis (Ct) and ovarian cancer suggested that Chlamydia infection promotes host DNA damage, causing malignant cell proliferation, which permanently affects the host at the genomic and epigenetic levels, particularly through altering host chromatin structure by DNA methylation and post-translational histone modifications. Cpn can induce histone H3 and H4 modifications, which have a major effect on cytokine production.[111]^16 Ct infection was associated with increased expression of two mesenchymal cell markers: fibronectin and α-smooth muscle actin (α-SMA). The DNA methylation status of selected regions of E-cadherin, fibronectin, and α-SMA genes revealed that Ct infection was accompanied by changes in DNA methylation of the E-cadherin promoter.[112]^17 A whole genome sequencing study[113]^18 of Cpn showed that there are many enzymes involved in the synthesis and metabolism of aromatic compounds, such as synthetase, hydroxylase, decarboxylase, and methylase. However, there have been no Cpn-related methylation studies. We speculated that Cpn may lead to abnormal methylation of human genomic DNA, resulting in abnormal activation of oncogenes and transcriptional silencing of tumor suppressor genes, causing disordered cell growth and differentiation. It is well known that DNA methylation of the promoter region strongly correlates with transcriptional repression, and that DNA methylation downstream of the TSS, in particular of the 1st exon, is critical for transcriptional silencing, independent of the cell type. In the current study, we found an inverse relationship between Cpn-related DMPs and DEGs. We identified Cpn-related DMRs for 62 significant target genes. These genes were enriched in several representative pathways, including positive regulation of chronic inflammatory response to antigenic stimulus, regulation of chronic inflammatory response to antigenic stimulus, and nuclear factor-kappa B-inducing kinase activity, among others. The biological function of most of these genes was related to chronic infection, which indicates that Cpn might be involved in the progression of lung cancer through DNA methylation changes. Validation experiments showed that RIPK3 was enriched in the TNF signaling pathway and cytosolic DNA-sensing pathway, and was hypomethylated in the corresponding promoter regions. We also found that RIPK3 was a unique aberrant methylated gene in Cpn-positive lung cancer tissues. There was an inverse relationship between methylation of the promoter region and transcriptional activity.[114]^19^,[115]^20 Hypomethylation of CpG islands in the promoter can result in gene overexpression. The start of the 1st exon was considered as +1 of the sequence, and RIPK3 was located in the promoter region from −43 bp to +232 bp (276 bp total). A previous study showed that reduced methylation of the RIPK3 promoter triggered expression of RIPK3 in RIPK3-null cancer cells[116]^21 and treatment with hypomethylating agents restored RIPK3 expression.[117]^22 RIPK3 was a component of the tumor necrosis factor receptor-I signaling complex that activated NF-κB transcription factors[118]^23^,[119]^24 and further initiated cell proliferation, apoptosis resistance, and malignant transformation. However, recent studies showed that RIPK3, RIPK1, and mixed-lineage kinase domain-like protein (MLKL) formed a multi-protein complex, called a necrotic body,[120]^23^,[121]^25^,[122]^26 which regulated programmed cell necrosis.[123]^27 This was considered a new type of necrosis that is dependent on genetic programming and regulation.[124]^28 Programmed necrosis does not depend on the traditional caspase apoptotic pathway and has been defined as RIPK3-dependent necrotizing cell death.[125]^29 In contrast, programmed necrosis provided the host with an autonomous defense, and cell rupture caused by programmed necrosis induced inflammation by releasing the damage-related molecules. Increasing evidence suggests that programmed necrosis is related to clinical diseases.[126]^29 Necrosis played not only a protective role but also a detrimental role in the pathogenesis of pulmonary diseases, such as chronic obstructive pulmonary disease and lung cancer,[127]^30 and is therefore an important potential therapeutic target.[128]^28 Contrary to the result of our study, recent data suggest that RIPK3 expression is often silenced in cancer cells due to genomic methylation near its transcriptional start site.[129]^22^,[130]^31 Fukasawa et al[131]^32 compared the methylation levels of 288 cancer-associated gene promoters between lung cancer cells and normal lung cells. In the majority of lung cancer cell lines, RIPK3 was hypermethylated and down-regulated. The methylated frequency of RIPK3 was highest in small cell carcinoma. Outcome differences might be due to the different pathological types of lung cancer studied. It would be interesting to discuss the role of RIPK3 in development of lung cancer and its association with methylation by histological subtype. In the present study, we also identified other Cpn-related target genes. Neurofilament (NF) is an intermediate filament found in the cytoplasm of central and peripheral nervous system neurons. It is a type IV intermediate filament heterodimer composed of light (NEFL), middle, and heavy chains,[132]^33 which provide structural support of axons and adjusts axon diameter to control the rate of electrical signals traveling along the axon.[133]^34 A recent study reported that abnormal methylation and expression of NF genes were detected in esophageal squamous cell carcinoma,[134]^35 breast cancer,[135]^36 renal cell carcinoma,[136]^37 hepatocellular carcinoma,[137]^38 and Ewing’s sarcoma.[138]^39 We found that hypermethylation of NEFL TSS1500 regions in Cpn-positive lung cancer tissues inhibited the expression of NEFL, which might contribute to the development of Cpn-associated lung cancer. Another study showed that NEFL expression regulated by promoter hypomethylation, which inhibits the NF-κB pathway, was associated with suppressed invasion and migration of NSCLC cell lines and patients.[139]^40 More recently dopamine and its receptors have been indicated as regulatory factors for immune cells and malignant cell proliferation.[140]^41 The bioavailability of dopamine is regulated by dopamine receptors (DRs),[141]^42 which are G protein-coupled receptors. DRs are divided into different classes - D1 (D1 and D5) and D2 (D2, D3, and D4) – and are expressed in alveolar epithelial cells. The expression of DRs might be affected by corresponding DNA methylation levels. Liao et al[142]^43 used methylation-sensitive representational difference analysis to screen gastric cancer and normal gastric mucosa, and found the promoter and exon regions of DRD5 were differentially methylated DNA sequences. DRD5 promoted adenylate cyclase to synthesize cAMP by activating the Gαs/olf family of G proteins.[143]^44 cAMP is an important second messenger in many biological processes and cAMP pathway dysregulation is related to the development of cancer. Our results showed that, in Cpn-related lung cancer, down-regulated DRD5 expression mediated by changes in corresponding promoter methylation status inhibited adenylate cyclase, thereby decreasing intracellular cAMP accumulation. This promoted the mitotic proliferation of cells, which may ultimately lead to development of lung cancer. The function of DRD5 in lung cancer requires further investigation. There is currently no gold standard for detecting Cpn antigen levels, and the detection rate of existing methods is relatively low, increasing the difficulty of determining if patients have an active infection. Thus, in this study, serum Cpn IgG and IgA levels were detected using MIF, which is the standard for serologic detection of Chlamydia infection. Limited by the design of our case-control study, we only explored the relationship between chlamydia and lung cancer. Thus, Cpn infection may be involved in lung cancer development, and well-designed cohort studies and randomized controlled trials are needed to minimize the effect of disease on antibody titers, reduce selection bias, and better adjust for potential confounders. This study explored the epigenetic mechanism of Cpn infection in lung cancer. Our integrative methylome and transcriptome analyses revealed that Cpn infection contributed to methylation-mediated changes in RIPK3. Activation of the NF-κB pathway, TNF signaling pathway, and cytosolic DNA-sensing pathway might be involved in the development and progression of lung cancer. Case-control studies have inherent limitations, such as selection bias and recall bias, that are difficult to avoid, and thus the same methylation sites may have distinctly different regulatory roles depending on factors such as race, tissue sample, and time of expression. New study designs and sample sizes may provide better insight into the association between Cpn infection, gene methylation, and lung cancer. Therefore, the results of this study need to be further validated in a multi-center, rigorously designed large sample study, and the specific mechanism needs to be further explored using relevant functional studies. Acknowledgments