Abstract Objectives: Colorectal cancer (CRC) is a prevalent disease characterized by significant dysregulation of gene expression. Non-invasive tests that utilize microRNAs (miRNAs) have shown promise for early CRC detection. This study aims to determine the association between miRNAs and key genes in CRC. Methods: Two datasets ([35]GSE106817 and [36]GSE23878) were extracted from the NCBI Gene Expression Omnibus database. Penalized logistic regression (PLR) and artificial neural networks (ANN) were used to identify relevant miRNAs and evaluate the classification accuracy of the selected miRNAs. The findings were validated through bipartite miRNA-mRNA interactions. Results: Our analysis identified 3 miRNAs: miR-1228, miR-6765-5p, and miR-6787-5p, achieving a total accuracy of over 90%. Based on the results of the mRNA-miRNA interaction network, CDK1 and MAD2L1 were identified as target genes of miR-6787-5p. Conclusions: Our results suggest that the identified miRNAs and target genes could serve as non-invasive biomarkers for diagnosing colorectal cancer, pending laboratory confirmation. Keywords: Colorectal neoplasms, microRNA, smoothly clipped absolute deviation, least absolute shrinkage and selection operator, the minimax concave penalty, artificial neural networks Introduction Colorectal cancer (CRC) is the third most prevalent cancer globally and the second leading cause of cancer-related mortality. In 2020 alone, there were approximately 1.93 million new CRC cases and 935 000 deaths. Age is a major risk factor for CRC, with most cases occurring in individuals aged 50 or older. 
Other risk factors include a family history of CRC, inflammatory bowel disease, genetic mutations, poor dietary choices, obesity, and lack of physical activity.^[37]1 -[38]3 In recent decades, developing countries have experienced an epidemiological shift in CRC, marked by a concerning rise in its incidence.^ [39]4 CRC has become a major contributor to cancer-related mortality worldwide.^ [40]5 The early detection of CRC through screening plays a crucial role in enhancing treatment outcomes and improving patient survival rates. This is primarily because early-stage CRC typically presents no noticeable symptoms. Consequently, individuals with early-stage CRC are often diagnosed at later stages, when the cancer is more advanced and treatment is more challenging.^ [41]6 The overall survival of patients is intricately linked to the progression of cancer at the time of diagnosis. This is primarily due to the fact that the extent of cancer progression upon diagnosis serves as a robust predictor of overall survival.^[42]7,[43]8 Early diagnosis has the potential to significantly impact the trajectory of treatment.^ [44]9 Traditional screening methods for CRC, such as fecal immunochemical testing (FIT) and guaiac-based fecal occult blood test (gFOBT), have become routine practices. However, these methods have inherent drawbacks, including low sensitivity and the inability to detect CRC in a timely manner. These limitations have spurred efforts to develop new screening methods that offer improved sensitivity and timely detection.^[45]6,[46]10 Biomarkers, as molecular signatures, hold the potential to serve as more effective tools for cancer screening compared to traditional methods.^ [47]6 The dysregulation of genes, both coding and non-coding, along with perturbed signaling pathways, plays a substantial role in cancer development. 
Recent research has highlighted the significance of leveraging these genes and signaling pathways for early cancer detection.^ [48]11 miRNAs are small non-coding RNAs that regulate the pathways involved in the formation of cancer cells, including in CRC. These miRNAs interact with proteins and other non-coding RNAs, thereby contributing to the pathogenesis of CRC.^ [49]12 Extracellular miRNAs have been identified in serum and plasma, making them candidate non-invasive biomarkers for various disease conditions.^[50]12,[51]13 Circulating miRNAs in the blood exhibit remarkable stability and reproducibility, which makes them promising biomarkers for CRC. Biological processes can influence the expression of miRNAs, and epigenetic changes can further alter miRNA expression in CRC.^[52]14 -[53]16 In recent years, the study of Differentially Expressed miRNAs (DEmiRs) has gained traction in cancer research. DEmiRs are miRNAs whose expression levels differ significantly between normal and disease conditions, such as cancerous vs. healthy tissues. One significant challenge in identifying biomarkers associated with different clinical outcomes, such as distinguishing normal from cancerous tissue samples, is the high-dimensional nature of the data. The number of miRNAs often exceeds the sample size, requiring specialized methods to address this issue. Penalized regression models, including Penalized Logistic Regression (PLR), have garnered considerable attention for analyzing this type of data. These models perform variable selection and coefficient estimation simultaneously. As a result, non-informative miRNAs receive coefficient estimates of zero or close to zero, while the miRNAs remaining in the model are associated with the outcome and are candidates for detecting CRC. 
In this study, we employed PLR with 3 different penalties: Smoothly Clipped Absolute Deviation (SCAD), Least Absolute Shrinkage and Selection Operator (LASSO), and the Minimax Concave Penalty (MCP), to identify miRNAs related to CRC. The primary objective of this article was to identify miRNAs capable of detecting CRC at an early stage. By leveraging systems biology and data mining techniques, we aimed to determine non-invasive biomarkers with high accuracy, facilitating timely treatment through early diagnosis of CRC.

Material and Methods

The bioinformatics strategy presented in [54]Figure 1 used serum microarray datasets to identify miRNAs and key genes associated with CRC through systems biology methods. Initially, miRNAs were extracted from each sample’s profile and evaluated using PLR. Subsequently, an ANN was developed to assess the classification accuracy of the selected miRNAs. The analysis resulted in the identification of Differentially Expressed miRNAs (DEmiRs) and their respective target genes. To validate the findings, common genes were identified between the target genes and Differentially Expressed Genes (DEGs) using bipartite miRNA-mRNA interactions.

Figure 1. Flow chart of bioinformatics analysis.

Notably, factors such as age, health status, and patient risk factors were not accounted for in this study.

miRNA expression profile dataset

Two CRC datasets, one miRNA expression dataset ([57]GSE106817) and one gene expression dataset ([58]GSE23878), were acquired from the Gene Expression Omnibus (GEO) repository. [59]GSE106817 was generated using the “3D-Human miRNA V21_1.0.0” platform ([60]GPL21263) and comprised 4043 samples including various disease conditions and healthy individuals. Among these, 115 samples were from CRC patients, while 2759 samples were from healthy individuals. To balance the two classes, 115 healthy samples were randomly selected using R software. 
The expression levels of 2566 miRNAs were measured in each sample without any initial screening, providing data for subsequent analysis and modeling. Additionally, [61]GSE23878, generated with the “Illumina HumanHT-12 V3.0” platform ([62]GPL6947), consisted of 59 tissue specimens, including 35 CRC samples and 24 normal tissue samples. This dataset was used as a validation set to assess key genes identified in the study.

Statistical Analysis

miRNA selection through penalized model

PLR techniques are a class of statistical learning methods that can be used for variable selection. These techniques add a penalty to the objective function of the logistic regression, which shrinks the estimates of the regression coefficients toward zero. In this way, penalized regression techniques can simultaneously perform variable selection and coefficient estimation. In this study, we used PLR models with (1) the Smoothly Clipped Absolute Deviation (SCAD), (2) the Least Absolute Shrinkage and Selection Operator (LASSO), and (3) the Minimax Concave Penalty (MCP) to identify important miRNAs. Briefly, PLR is a shrinkage regression model that adds a penalty term on the regression coefficients to the likelihood function. The LASSO penalty is the absolute value of each coefficient scaled by the tuning parameter, that is, p_λ(t) = λ|t|. The SCAD penalty is defined as follows:

\[
p_\lambda(t) =
\begin{cases}
\lambda |t|, & |t| \le \lambda \\
-\dfrac{|t|^2 - 2a\lambda|t| + \lambda^2}{2(a-1)}, & \lambda < |t| \le a\lambda \\
\dfrac{(a+1)\lambda^2}{2}, & |t| > a\lambda
\end{cases}
\]

where t is the regression coefficient, λ is the tuning parameter, and a is a second tuning parameter that controls the concavity of the penalty (a > 2 for SCAD). The MCP is a concave penalty function used in penalized regression for variable selection and coefficient estimation. It is defined as follows (with a > 1):

\[
p_\lambda(t) = \left(\lambda|t| - \frac{t^2}{2a}\right) I(|t| \le a\lambda) + \frac{a\lambda^2}{2}\, I(|t| > a\lambda)
\]

We used a 10-fold cross-validation strategy to select the optimal value of λ. The value of λ that minimized the Bayesian Information Criterion was chosen as the optimal value. 
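To make the three penalty functions above concrete, the following is a minimal Python sketch that evaluates them for a single coefficient. It is illustrative only and is not the R code used in the study; the function names and the default values a = 3.7 (SCAD) and a = 3.0 (MCP) are assumptions chosen for the example.

```python
def lasso_penalty(t, lam):
    """LASSO penalty: lam * |t|."""
    return lam * abs(t)


def scad_penalty(t, lam, a=3.7):
    """SCAD penalty for one coefficient t (assumes a > 2).

    a = 3.7 is an assumed default for illustration.
    """
    x = abs(t)
    if x <= lam:
        return lam * x
    if x <= a * lam:
        return -(x * x - 2 * a * lam * x + lam * lam) / (2 * (a - 1))
    return (a + 1) * lam * lam / 2


def mcp_penalty(t, lam, a=3.0):
    """MCP penalty for one coefficient t (assumes a > 1).

    a = 3.0 is an assumed default for illustration.
    """
    x = abs(t)
    if x <= a * lam:
        return lam * x - x * x / (2 * a)
    return a * lam * lam / 2


if __name__ == "__main__":
    # Compare the penalties for a small, a moderate, and a large coefficient.
    for t in (0.5, 2.0, 10.0):
        print(t, lasso_penalty(t, 1.0), scad_penalty(t, 1.0), mcp_penalty(t, 1.0))
```

For large coefficients the SCAD and MCP penalties level off at a constant, whereas the LASSO penalty keeps growing with |t|. This is why the two concave penalties shrink large coefficients less than LASSO does.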
The PLR models with the 3 types of penalties were repeated 1000 times, and the miRNAs that were selected by at least 2 penalties were considered miRNA biomarkers. The “grpreg” package was used for miRNA selection in R software version 4.0.2.^[63]17 -[64]19 The source code used for the analysis is available on GitHub at [65]https://github.com/ARGHAREBAGHI.

Artificial neural networks

An ANN was trained in R software version 4.0.2. To prepare the data for training, the expression values were normalized using their maximum and minimum values. Subsequently, an ANN model incorporating the important variables was designed in R. The model parameters were adjusted to construct a disease prediction model, taking into account the weight information derived from the expression of miRNAs. In this model, the pathogenicity score was computed as the weighted sum of the expression levels of the significant miRNAs. The “neuralnet” package (version 19) within R software version 4.0.2 was used to fit the model. To optimize the performance of the model, a 10-fold cross-validation strategy was used to fine-tune the hyper-parameters.^[66]20 -[67]23

miRNA target prediction

The miRWalk 3.0 online database, available at [68]http://mirwalk.uni-hd.de/, is an easily accessible resource that provides predicted miRNA-target interactions obtained through a machine learning algorithm. The database prioritizes accuracy, simplicity, and up-to-date information to facilitate efficient miRNA research. In this study, miRWalk was used to search for predicted target genes of the selected miRNAs.^ [69]24

Protein-protein interaction (PPI) network analysis

In this study, an interactive network of proteins was used to investigate gene interactions and identify hub genes. 
The protein-protein interaction (PPI) network for the selected genes was constructed using the STRING online tool, with an interaction score threshold of 0.4. To visualize and analyze the constructed network, Cytoscape software version 3.8.2 was used. The CytoHubba plugin version 1.6 within Cytoscape was used to evaluate several network measures, including Maximum Neighborhood Component (MNC), Edge Percolated Component (EPC), and DEGREE, to identify the hub genes within the network. Furthermore, a Venn diagram was used to identify the common genes and select the hub genes that appeared consistently across the different measures.^[70]25,[71]26

DEGs’ enrichment analyses

In this study, the function of DEGs was explored through Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analyses. The GO classification system, encompassing molecular function (MF), cellular component (CC), and biological process (BP), was used to gain insights into the functional characteristics of the DEGs. The functional enrichment analysis of the gene list was conducted with the Database for Annotation, Visualization, and Integrated Discovery (DAVID) program, accessible at [72]https://david.ncifcrf.gov. Gene functions were considered significantly enriched at an adjusted P-value < .05.^[73]27,[74]28

Potential miRNA-mRNA interactions

In this study, DEmiRs were identified between CRC samples and normal tissues, considering an adjusted P-value < .05 and |logFC| > 1 as the criteria for differential expression. Subsequently, the target genes of the DEmiRs were determined using the miRWalk database. To understand the miRNA-mRNA regulatory interactions comprehensively, a bipartite miRNA-mRNA correlation network was constructed and analyzed using Cytoscape version 3.8.2 software. An interaction score threshold of 0.4 was used to filter out weak interactions in the network. 
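The thresholding and overlap steps described above could be sketched as follows. This is a hypothetical Python illustration, not the pipeline used in the study: the function names, the dictionary layout, and the toy gene and miRNA identifiers are all assumptions, and the numeric values are placeholders rather than results.

```python
def is_differentially_expressed(adj_p, log_fc, p_cut=0.05, lfc_cut=1.0):
    """Apply the stated criteria: adjusted P-value < .05 and |logFC| > 1."""
    return adj_p < p_cut and abs(log_fc) > lfc_cut


def overlap_targets_with_degs(de_table, predicted_targets):
    """Keep, for each miRNA, only the predicted targets that are also DEGs.

    de_table: {gene: (adjusted_p, log_fc)}
    predicted_targets: {mirna: iterable of predicted target genes}
    Returns the edge list of a bipartite miRNA-mRNA network as a dict.
    """
    degs = {
        gene
        for gene, (adj_p, log_fc) in de_table.items()
        if is_differentially_expressed(adj_p, log_fc)
    }
    return {
        mirna: sorted(degs & set(targets))
        for mirna, targets in predicted_targets.items()
    }


if __name__ == "__main__":
    # Placeholder values for illustration only (not data from the study).
    toy_de_table = {
        "GENE_A": (0.001, 2.3),   # passes both criteria
        "GENE_B": (0.20, 3.0),    # fails the P-value criterion
        "GENE_C": (0.01, -0.4),   # fails the |logFC| criterion
        "GENE_D": (0.04, -1.5),   # passes both criteria
    }
    toy_targets = {"miR-X": ["GENE_A", "GENE_B", "GENE_D"], "miR-Y": ["GENE_C"]}
    print(overlap_targets_with_degs(toy_de_table, toy_targets))
```

Each key in the returned dictionary is a miRNA node and each listed gene is an mRNA node connected to it, which is the edge structure a bipartite network requires.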
A bipartite network is appropriate for this study because its edges connect only nodes of different types: each miRNA is linked to its target mRNAs, with no direct miRNA-miRNA or mRNA-mRNA edges.

Hub gene validation by GEPIA

The Gene Expression Profiling Interactive Analysis (GEPIA) database ([75]http://gepia.cancer-pku.cn/) is a web-based tool designed for fast and customizable analyses using data from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) projects. In this study, GEPIA was used to validate the expression of key hub genes by comparing cancerous and normal tissue samples, specifically focusing on colorectal cancer. Differential gene expression was analyzed using ANOVA, with statistical significance set at P-value < .05 and a fold change greater than 2.

Results

Differential expression analysis

The miRNA expression data series ([76]GSE106817) was used to identify DEmiRs, as well as DEGs. To validate the findings, a total of 3763 DEGs were identified by applying the criteria of an adjusted P-value < .05 and |logFC| > 1. These genes overlapped with the DEGs identified in the primary data series ([77]GSE23878), which was used for comparison.

Identification of differentially expressed miRNAs

The miRNA expression data were used to train the PLR model, as outlined in the Methods section, with the aim of identifying DEmiRs associated with CRC diagnosis. The PLR model used a binary outcome variable, where 1 represented CRC and 0 denoted healthy controls. [78]Table 1 presents the names of the 14 selected DEmiR profiles and their respective selection frequencies over 1000 repetitions using the LASSO, SCAD, and MCP methods. LASSO selected 11 miRNA profiles, while SCAD and MCP identified 5 and 2 miRNA profiles, respectively. 
Notably, 3 miRNAs (miR-6765-5p, miR-6787-5p, and miR-1228) were confirmed as significant in at least 2 PLR methods.

Table 1. Frequencies of the selected miRNA over 1000 repetitions using penalized logistic regression by SCAD, MCP, and LASSO penalties.

miRNA SCAD MCP LASSO Total accuracy
MIMAT0005582 1 1000 1000 .966
MIMAT0019776 1000 .983
MIMAT0027430 1 1000 .966
MIMAT0027436 961 .966
MIMAT0027474 1 1000 .966
MIMAT0015079 305 .759
MIMAT0003320 1000 .845
MIMAT0004970 1000 .966
MIMAT0005922 1000 .931
MIMAT0015075 389 .879
MIMAT0018949 1000 .931
MIMAT0022259 1000 .966
MIMAT0019776 1000 .931
MIMAT0027392 1000 .931
No. selected miRNA 5 2 11

The results of the univariate logistic regression analysis for the selected miRNAs are presented in [80]Table 2, which includes the regression coefficient, standard error of the coefficient, odds ratio (OR), and corresponding P-values. Notably, the results demonstrate that all 13 miRNAs exhibited statistically significant associations with the diagnosis of CRC.

Table 2. Results of fitting univariate logistic regression for the miRNAs selected using penalized logistic regression by SCAD, MCP, and LASSO penalties.
miRNA SCAD MCP LASSO
β (S.E) OR P-value β (S.E) OR P-value β (S.E) OR P-value
MIMAT0005582 10.95 (2.80) 56954 <.0001 10.95 (2.80) 56954 <.0001 10.95 (2.80) 56954 <.0001
MIMAT0019776 −3.04 (.56) .048 <.0001
MIMAT0027430 12.23 (2.46) 204843 <.0001 12.23 (2.46) 204843 <.0001
MIMAT0027436 1.81 (.34) 6.11 <.0001
MIMAT0027474 −5.84 (1.24) .003 <.0001 −5.84 (1.24) .003 <.0001
MIMAT0015079 −1.44 (.28) .237 <.0001
MIMAT0003320 −2.01 (.26) .134 <.0001
MIMAT0004970 −3.36 (.59) .035 <.0001
MIMAT0005922 9.99 (1.60) 21807 <.0001
MIMAT0015075 −1.74 (.21) .176 <.0001
MIMAT0018949 −4.69 (.59) .009 <.0001
MIMAT0022259 −2.27 (.40) .103 <.0001
MIMAT0019776 −3.04 (.56) .048 <.0001
MIMAT0027392 −6.57 (1.38) .001 <.0001

[82]Table 2 presents the outcomes of unpenalized logistic regression for estimating the regression coefficients of the selected miRNAs. The table reveals that certain miRNAs exhibited a positive association with CRC, whereas others displayed a negative association with CRC.

* • Positively associated miRNAs: miR-1228, miR-6765-5p, miR-6768, and miR-1268. This means that an increase in the expression of these miRNAs increases the odds of CRC.
* • Negatively associated miRNAs: miR-1343, miR-6787-5p, miR-650, miR-920, miR-3190, miR-4433, miR-5100, miR-1343, and miR-6746. This means that a decrease in the expression of these miRNAs increases the odds of CRC.

The miRNAs identified through PLR were used as inputs for an ANN model to develop classifiers capable of diagnosing patients. The ANN model was designed with a 1:1:1 architecture, comprising a single input layer, 1 hidden layer, and 1 output layer. The activation functions used in the model were sigmoid for the input layer, hyperbolic tangent for the hidden layer, and linear for the output layer. The input variables for the ANN model were the miRNA expression values that were chosen in the preceding step. 
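As an illustration of the architecture just described, the following minimal Python sketch applies min-max normalization and then a forward pass with a sigmoid on the inputs, hyperbolic tangent hidden units, and a linear output. It is not the fitted “neuralnet” model from the study: the function names, the number of hidden units, all weights, and the 0.5 decision threshold are assumptions made for the example.

```python
import math


def min_max_scale(values):
    """Rescale a list of expression values to [0, 1] using its min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: sigmoid inputs, tanh hidden layer, linear output.

    x: scaled miRNA expression values for one sample
    w_hidden: one weight row per hidden unit (hypothetical values)
    """
    z = [sigmoid(v) for v in x]
    h = [
        math.tanh(sum(w * v for w, v in zip(row, z)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * v for w, v in zip(w_out, h)) + b_out


def classify(score, threshold=0.5):
    """Map the linear output to 1 (CRC) or 0 (healthy); threshold is assumed."""
    return 1 if score >= threshold else 0


if __name__ == "__main__":
    scaled = min_max_scale([2.0, 4.0, 6.0])
    # Hypothetical weights for a single hidden unit and a single input.
    score = forward([scaled[2]], [[2.0]], [0.0], [1.0], 0.0)
    print(scaled, score, classify(score))
```

In practice the weights would be estimated from the training folds, and the threshold could be tuned on held-out data rather than fixed in advance.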
The model’s output was a binary value, either 0 or 1, classifying patients as non-cancerous or cancerous, respectively. This classification holds potential for early cancer detection. The outcomes of the ANN model are displayed in the final column of [83]Table 1. Notably, a majority of the miRNAs exhibit a total accuracy greater than 90%, underscoring their potential for cancer detection.

Identification of key genes using PPI network analysis

In this study, a protein-protein interaction (PPI) network analysis was conducted to explore the 3763 DEGs. The resulting PPI network consisted of 443 nodes and 8314 edges, as depicted in [84]Figure 4. Additionally, the Venn diagram analysis of the top 10 genes from each of the 3 methods identified 7 hub genes: CDC20, MAD2L1, UBE2C, CDK1, AURKB, CCNA2, and TOP2A. These findings are illustrated in [85]Figure 2.

Figure 4. Bipartite mRNA-miRNA subnetwork for CRC. Blue diamonds consist of hub genes between CRC and normal tissues. Green diamonds consist of 2 hub genes targeted by miR-6787. Cytoscape v.3.8.2 was used to visualize the network.

Figure 2. The overlap between the top 10 predicted target genes, ranked by MNC, EPC, and DEGREE, illustrated in a Venn diagram. The number 7 in the center of the image is the number of genes common to the 3 groups.

Functional and pathway enrichment analysis

In the GO analysis, the following biological process (BP), cellular component (CC), and molecular function (MF) terms were significantly enriched:

* • Top 10 terms BP: rRNA processing, cell division, translation, mitochondrial translation, mitotic spindle organization, protein folding, cytoplasmic translation, ribosomal large subunit biogenesis, proteasomal ubiquitin-independent protein catabolic process, mitotic sister chromatid segregation. 
* • Top 10 terms CC: nucleoplasm, cytosol, membrane, extracellular exosome, cytoplasm, nucleus, endoplasmic reticulum, mitochondrion, chromosome, ribosome.
* • Top 10 terms MF: protein binding, RNA binding, identical protein binding, structural constituent of ribosome, cadherin binding, enzyme binding, chaperone binding, ATPase activity, snoRNA binding, unfolded protein binding.

In addition, KEGG pathway analysis indicated that the following pathways were involved: Nucleocytoplasmic transport, Proteasome, DNA replication, Spliceosome, Glutathione metabolism, Ribosome, Protein processing in endoplasmic reticulum, and p53 signaling pathway ([89]Figure 3).

Figure 3. Gene Ontology (GO) and KEGG pathway enrichment analyses were performed for the module genes. The top 10 GO terms in Biological Process (BP), Molecular Function (MF), and Cellular Component (CC), along with significant KEGG pathways, are presented.

Bipartite miRNA and mRNA network analysis

mRNA-miRNA network analysis is a computational approach for understanding the mechanisms that contribute to CRC pathogenesis. In this study, the miRWalk database was used to identify target genes of DEmiRs. By assessing the overlap between the identified miRNA targets and the validated DEGs, the hub genes CDK1 and MAD2L1 were identified as both targets of miR-6787 and pivotal players in CRC. Notably, the expression of miR-6787-5p was significantly downregulated in cancer samples compared to normal samples, with CDK1 and MAD2L1 identified as its target genes. These findings highlight the regulatory network involving miRNAs and their target genes in CRC ([92]Figure 4).

Gene expression analysis of the central hub genes

We used the GEPIA database to analyze the expression of 2 candidate genes in cancer tissues and normal samples from the TCGA-COAD dataset. 
The results revealed that CDK1 and MAD2L1 were both significantly upregulated in tumors in comparison to normal tissues, as presented in [93]Figure 5.

Figure 5. Validation of hub genes in colorectal cancer using TCGA-COAD. Two hub genes, CDK1 and MAD2L1, were significantly upregulated in CRC tissues compared to normal tissues in TCGA-COAD data.

Discussion

CRC is a leading cause of global mortality, making early detection vital for improved treatment response and reduced mortality rates. Biomarkers play a critical role in CRC diagnosis and treatment, and bioinformatics tools facilitate the identification of CRC-related biomarkers and molecular interactions.^[96]29 -[97]31 In this study, a bioinformatics approach was applied to 2 datasets, [98]GSE106817 and [99]GSE23878, to identify DEmiRs and hub genes associated with the progression of CRC. The analysis of these datasets enabled the identification of specific miRNAs and genes that may play a crucial role in CRC progression. Investigating the expression patterns and interactions of these DEmiRs and hub genes can provide insights into the molecular mechanisms underlying CRC development and progression. The miRNAs miR-6765-5p, miR-6787-5p, and miR-1228 were selected because each was chosen by at least 2 of the LASSO, MCP, and SCAD regression methods. The overall accuracy of these 3 miRNAs exceeded 95%, underscoring their potential as promising biomarkers for stable plasma level determination in CRC patients. The study also demonstrated the utility of PLR with 3 different penalty functions, followed by an ANN, for identifying miRNAs significantly associated with CRC. miRNAs have emerged as key regulators in cancer biology, functioning as both tumor suppressors and oncogenes depending on their expression patterns and the cancer type. 
These small non-coding RNAs play a pivotal role in a range of cancer-related processes, including initiation, malignant transformation, progression, and metastasis. Recent research has demonstrated that certain cancers have unique miRNA signatures, making them valuable diagnostic and prognostic markers as well as potential therapeutic targets. Advances in techniques such as microarray analysis, RT-PCR, and next-generation sequencing have facilitated the profiling of miRNAs in various cancer types, even from archived tumor tissues. Emerging detection methods, such as nanoparticle-based and hybridization chain reaction (HCR) amplification, aim to enhance miRNA detection sensitivity. miRNAs are also stable in body fluids, making them promising candidates for non-invasive cancer diagnostics. Their dysregulation in cancer cells, influenced by both genetic and epigenetic factors, highlights their role in tumorigenesis, and disruptions in the miRNA biogenesis process could significantly contribute to cancer development.^[100]32,[101]33 In CRC, miR-1228 is often downregulated. This downregulation is associated with poor prognosis. The exact role of miR-1228 in CRC is not fully understood, but it is thought to play a role in tumor growth and progression. miR-1228 targets a number of genes that are involved in cell proliferation, angiogenesis, and apoptosis. By targeting these genes, miR-1228 helps prevent cancer cells from growing and spreading. Numerous studies have shown that miR-1228 plays an essential role in the proliferation of cancer cells and can be used for early detection of cancer.^[102]34,[103]35 miR-1228 regulates stress-induced cellular apoptosis by targeting the MOAP1 protein.^ [104]36 In another report, the findings showed that miR-1228 has a role in metabolism, maintaining cell survival, regulating apoptosis, stimulus- response, and survival. 
However, some studies have investigated the target gene miR-1228 for CRC.^[105]37,[106]38 LRP1 is the target gene of miR-1228 and is located on chromosome 12.^[107]39,[108]40 This gene mainly plays a role in basic metabolism and cell structure, which is a key component of maintaining cell survival. In past research, the expression level of miR-1228-3p has been checked in drug resistance of breast cancer, chronic heart failure, endometrial carcinoma, prostate cancer, CRC, and cancer secretions. The expression level of miR-1228-3p is stable in blood circulation and can be used as a biomarker.^ [109]41 In a study by Yang et al,^ [110]37 it was revealed that miR-1228 remained unaffected by surgical treatment, indicating its suitability as an optimal reference gene for treatment studies. Additionally, the circulating level of miR-1228 was found to be independent of tumor stage. In CRC, miR-6787-5p is often downregulated. This downregulation is associated with poor prognosis. The exact role of miR-6787-5p in CRC is not fully understood, but it is thought to play a role in tumor growth and progression. miR-6787-5p targets a number of genes that are involved in cell proliferation, angiogenesis, and apoptosis. By targeting these genes, miR-6787-5p helps prevent cancer cells from growing and spreading.^ [111]42 The exact role of miR-6765-5p in CRC is not fully understood. Bioinformatics analysis was then performed using the MNC, EPC, and DEGREE tools in Cytoscape software. The functional and biological interactions between the DEGs were investigated using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. In the present study, the nucleoplasm was identified as one of the significant enrichment pathways of DEGs in CRC. Network analysis demonstrated that 4 genes of DEGs are involved in this pathway. These findings suggest that the DEGs are involved in a number of biological processes that are important for the pathogenesis of CRC. 
Further research is needed to confirm these findings and to identify new diagnostic targets for CRC.^ [112]43 Therapeutic modulation of cell membrane lipid composition and organization is an emerging field with potential applications in a variety of diseases, including cancer.^ [113]44 It has been shown that GO terms such as rRNA processing,^ [114]45 translation,^ [115]46 mitochondrial translation,^ [116]47 mitotic spindle organization,^ [117]48 extracellular exosome,^ [118]49 and protein binding^ [119]50 are associated with CRC. Using miRNA-mRNA expression profiling, CDK1 and MAD2L1 were identified as the most important genes in CRC. The CDK1 gene encodes a protein known as cyclin-dependent kinase 1, which belongs to a family of enzymes involved in the regulation of the cell cycle. The cell cycle is a fundamental process responsible for cell growth, division, and the generation of new cells. In CRC, the CDK1 gene can undergo mutations, resulting in abnormal functioning. These mutations can lead to excessive production of the cyclin-dependent kinase 1 protein. Scientific investigations have demonstrated that dysregulation of CDK1 accelerates tumor growth and uncontrolled proliferation of cancer cells.^[120]51,[121]52 Zhang et al^[122]53 revealed that CDK1, in addition to being overexpressed and sensitive to apoptosis in CRC cells, plays a crucial role in controlling the cell cycle and contributes to the development of colorectal tumors through an iron-regulated signaling axis. Previous studies have established a link between CDK1 overexpression and the development of colorectal, liver, and lung cancers, ultimately impacting patient survival.^ [123]54 MAD2L1 plays a crucial role as a tumor suppressor gene in regulating the cell cycle. 
Mutations in the MAD2L1 gene can disrupt the normal control of cell growth and division, which can contribute to the development of cancer. Deletion of the MAD2L1 gene has been found to impede the growth of CRC cells.^[124]55,[125]56 Venugopal et al^ [126]57 reported higher expression of MAD2L1 in CRC cell lines and tissues, and this overexpression has been associated with poor prognosis. Li et al^ [127]55 reported that MAD2L1 has potential as a biomarker for colorectal cancer. The present study introduced a novel set of gene expression profiles that are predictive of CRC using a miRNA-mRNA model. This model provides a different perspective than the traditional proportional point of view.

Conclusions

This study identified 3 novel miRNAs (miR-1228, miR-6765-5p, and miR-6787-5p) that are potentially associated with CRC and could serve as biomarkers. Additionally, the target genes related to these miRNAs, namely CDK1 and MAD2L1, were found to be upregulated in CRC compared to normal tissues. The miRNAs associated with the hub genes in the mRNA-miRNA bipartite network played a pivotal role in CRC. However, further molecular studies are warranted to validate the role of these genes in CRC tumorigenesis.

List of Abbreviations

Abbreviation    Definition
miRNAs    microRNAs
CRC    Colorectal cancer
ANN    Artificial neural networks
PLR    Penalized logistic regression
SCAD    Smoothly clipped absolute deviation
LASSO    Least absolute shrinkage and selection operator
MCP    The minimax concave penalty
GEO    Gene Expression Omnibus
DEmiRs    Differentially Expressed miRNAs
PPI    Protein-protein interaction
DEGs    Differentially expressed genes

Acknowledgments