Abstract

Simple Summary

Aberrant epigenetic modifications in oncogenic pathways can lead to the onset of different cancers. This study aims to explore the role of differential DNA methylation in the regulation of oncogenic signaling pathways by integrating data from multiple sources, including the methylome, transcriptome, and clinical presentation, to uncover the effect of methylation changes on the four most common cancers. We utilized a differential methylation-parsing pipeline, which extracted differentially methylated biomarkers based on feature selection. The extracted biomarkers were integrated with the matching RNA-Seq and clinical data to determine whether these differentially methylated CpGs could serve as potential diagnostic candidates for the four most common cancers. Our results suggest that differential methylation of genes within the NRF2-PI3K pathway may lead to the presentation of various cancers and serve as potential epigenetic biomodifiers.

Abstract

Disruption of signaling pathways that play a role in normal development and cellular homeostasis may lead to the dysregulation of cellular signaling and bring about the onset of different diseases, including cancer. In addition to genetic aberrations, DNA methylation also acts as an epigenetic modifier to drive the onset and progression of cancer by mediating the reversible transcription of related genes. Although the role of DNA methylation as an alternative driver of carcinogenesis has been well established, the global effects of DNA methylation on oncogenic signaling pathways and the presentation of cancer are only beginning to emerge. In this article, we introduced a differential methylation parsing pipeline (MethylMine), which mined for epigenetic biomarkers based on feature selection. This pipeline was used to mine for biomarkers that presented a substantial difference in methylation between the tumor and the matching normal tissue samples.
Combining this pipeline with the Data Integration Analysis for Biomarker discovery (DIABLO) framework for machine learning and multi-omic analysis, we revisited the TCGA DNA methylation and RNA-Seq datasets for breast, colorectal, lung, and prostate cancer, and identified differentially methylated genes within the NRF2-KEAP1/PI3K oncogenic pathway, which regulates the expression of cytoprotective genes, that serve as potential therapeutic targets to treat different cancers.

Keywords: DNA methylation, machine learning, cancer biomarkers, gene expression, p53, NRF2, Wnt, Hippo, signaling pathways

1. Introduction

Over the past few decades, the field of cancer research has continuously evolved [1]. Different techniques have been applied in the detection and treatment of cancer, including screening during the early stages of disease in order to identify different cancers before the onset of symptoms [2,3,4]. Furthermore, new strategies to predict the outcome of cancer treatments have also emerged [5,6,7]. With the advent of new technologies in the field of oncology, large amounts of cancer data have been collected (including genomic, transcriptomic, and epigenomic data). However, the accurate prediction of disease outcome and the differentiation between different cancers remain among the most interesting and challenging tasks for researchers. Machine-learning (ML) methods have become a popular tool in recent years to help medical researchers discover and identify patterns and relationships between different cancers, in order to create more effective panels that can predict potential biomarkers present in different cancers [8,9,10,11,12].
ML refers to the ability of machines to learn from large datasets and to predict future events and outcomes. It has been utilized in health care to improve the interpretation of medical data, especially in the development of novel computational tools for the stratification, grading, and prognostication of patients, with the goal of improving patient care [8,13,14]. Genomic data have been associated with rich clinical annotations, and DNA sequencing has now been incorporated into standard clinical practice [15]. An emerging trend in the application of machine learning in oncology is the integration and analysis of mixed data types (e.g., genomic, transcriptomic, and epigenomic), known as multi-omics, to uncover patterns that are reflected across the different data types. These studies have painted a portrait of the mutation landscape of multiple cancers, including breast, colorectal, lung, and prostate cancer [16,17,18,19]. However, some cancers may have relatively few genetic mutations, with their biology largely driven by other types of variation such as aberrant epigenetic modifications.

Epigenetics is the study of heritable phenotypic changes that do not involve alteration of the genomic sequence, and includes mechanisms such as histone modification, altered microRNA (miRNA), or long non-coding RNA (lncRNA) [20]. Mediated by DNA methyltransferases, DNA methylation is a major epigenetic mechanism that occurs when a methyl group is transferred to the C5 position of cytosine to form 5-methylcytosine (5mC). This modification plays an essential role in various biological processes, including the regulation of gene expression, genomic imprinting, cell differentiation, and normal development [21,22,23]. Aberrant DNA methylation has been associated with various diseases, including cancers [24,25,26].
DNA hypermethylation acts as a regulator of gene expression, and methylation-mediated silencing of tumor suppressor genes or regulatory regions within the genome can lead to dysregulation of cell growth or an altered response to cancer therapies [27,28,29,30]. Despite the varied and complex nature of modifications in the epigenetic landscape, many cancers display a high degree of similarity both across different tissues and within the tissues of origin [31,32]. Therefore, abnormal DNA methylation has been viewed as an attractive avenue for the development of cancer diagnostics and therapeutics. Although thousands of studies have highlighted the value of using changes in DNA methylation as candidate biomarkers in the detection and treatment of various cancers, only a handful of these targets have been approved for clinical use (as summarized in Table 1). Traditionally, these studies have focused on the association between differential methylation and its effects on individual genes in the phenotypic presentation of different cancers. However, few studies have investigated the effects of epigenetic changes (especially DNA methylation) on individual and complex signaling pathways across cancers at different anatomical sites.

Cell signaling (signal transduction) plays a vital role within biological systems by relaying extracellular signals in order to regulate intracellular gene expression. The signal transduction process is typically initiated when a ligand binds to a membrane-bound receptor, triggering a cascade of intracellular signaling activities through the activation of multiple kinases.
This ultimately affects the downstream activation of gene expression in different signaling pathways and can lead to various physiological or cellular responses (e.g., cell proliferation, differentiation, and metabolism). Any disruption within these intra- and/or extracellular communication chains can lead to the development of different diseases, including cancer [33,34,35]. As DNA methylation is a dynamic and reversible process, changes in methylation could act as regulators of certain oncogenic pathways, leading to the development of disease, and provide an attractive target for the development of biomarkers and therapies.

Table 1. Summary of DNA methylation biomarkers in five common cancers that are approved for clinical use.

Cancer Methylated Gene Description References