Abstract
Objectives:
Colorectal cancer (CRC) is a prevalent disease characterized by
significant dysregulation of gene expression. Non-invasive tests that
utilize microRNAs (miRNAs) have shown promise for early CRC detection.
This study aims to determine the association between miRNAs and key
genes in CRC.
Methods:
Two datasets ([35]GSE106817 and [36]GSE23878) were extracted from the
NCBI Gene Expression Omnibus database. Penalized logistic regression
(PLR) and artificial neural networks (ANN) were used to identify
relevant miRNAs and evaluate the classification accuracy of the
selected miRNAs. The findings were validated through bipartite
miRNA-mRNA interactions.
Results:
Our analysis identified 3 miRNAs: miR-1228, miR-6765-5p, and
miR-6787-5p, achieving a total accuracy of over 90%. Based on the
results of the mRNA-miRNA interaction network, CDK1 and MAD2L1 were
identified as target genes of miR-6787-5p.
Conclusions:
Our results suggest that the identified miRNAs and target genes could
serve as non-invasive biomarkers for diagnosing colorectal cancer,
pending laboratory confirmation.
Keywords: Colorectal neoplasms, microRNA, smoothly clipped absolute
deviation, least absolute shrinkage and selection operator, the minimax
concave penalty, artificial neural networks
Introduction
Colorectal cancer (CRC) is the third most prevalent cancer globally and
the second leading cause of cancer-related mortality. In 2020 alone,
there were approximately 1.93 million new CRC cases and 935 000 deaths.
Age is a major risk factor for CRC, with most cases occurring in
individuals aged 50 or older. Other risk factors include a family
history of CRC, inflammatory bowel disease, genetic mutations, poor
dietary choices, obesity, and lack of physical activity.^[37]1 -[38]3
In recent decades, developing countries have experienced an
epidemiological shift in CRC, marked by a concerning rise in its
incidence.^ [39]4 CRC has become a major contributor to cancer-related
mortality worldwide.^ [40]5 The early detection of CRC through
screening plays a crucial role in enhancing treatment outcomes and
improving patient survival rates. This is primarily because early-stage
CRC typically presents no noticeable symptoms. Consequently,
individuals with early-stage CRC are often diagnosed at later stages,
when the cancer is more advanced and treatment is more challenging.^
[41]6 Overall survival is closely linked to the extent of cancer
progression at the time of diagnosis, which serves as a robust
predictor of survival.^[42]7,[43]8
Early diagnosis has the potential to significantly impact the
trajectory of treatment.^ [44]9 Traditional screening methods for CRC,
such as fecal immunochemical testing (FIT) and guaiac-based fecal
occult blood test (gFOBT), have become routine practices. However,
these methods have inherent drawbacks, including low sensitivity and
the inability to detect CRC in a timely manner. These limitations have
spurred efforts to develop new screening methods that offer improved
sensitivity and timely detection.^[45]6,[46]10
Biomarkers, as molecular signatures, hold the potential to serve as
more effective tools for cancer screening compared to traditional
methods.^ [47]6 The dysregulation of genes, both coding and non-coding,
along with perturbed signaling pathways, plays a substantial role in
cancer development. Recent research has highlighted the significance of
leveraging these genes and signaling pathways for early cancer
detection.^ [48]11 miRNAs are small non-coding RNAs that regulate the
pathways involved in the formation of cancer cells, including in CRC.
They interact with proteins and other non-coding RNAs, thereby
contributing to the pathogenesis of CRC.^ [49]12 Extracellular miRNAs
have been identified in serum and plasma, making them candidate
non-invasive biomarkers for various disease
conditions.^[50]12,[51]13 Circulating miRNAs in the blood are stable
and can be measured reproducibly, which makes them promising
biomarkers for CRC. Biological processes can influence the expression of
miRNAs, and epigenetic changes can further contribute to alterations in
miRNA expression specifically in CRC.^[52]14 -[53]16 In recent years,
the study of Differentially Expressed miRNAs (DEmiRs) has gained
traction in cancer research. DEmiRs are miRNAs whose expression levels
significantly differ between normal and disease conditions, such as in
cancerous vs. healthy tissues.
One significant challenge in identifying biomarkers associated with
different clinical outcomes, such as distinguishing normal from
cancerous tissue samples, is the high-dimensional nature of the data.
The number of miRNAs often exceeds the sample size, requiring
specialized methods to address this issue. Penalized regression models,
including Penalized Logistic Regression (PLR), have garnered
considerable attention for analyzing this type of data. These models
enable simultaneous variable selection and coefficient estimation. As a
result, the coefficients of non-informative miRNAs are shrunk to zero or
close to zero, while the miRNAs that remain in the model are those
associated with the outcome and are candidates for detecting CRC.
In this study, we employed PLR with 3 different penalties: Smoothly
Clipped Absolute Deviation (SCAD), Least Absolute Shrinkage and
Selection Operator (LASSO), and the Minimax Concave Penalty (MCP), to
identify miRNAs related to CRC. The primary objective of this article
was to identify miRNAs capable of detecting CRC at an early stage. By
leveraging systems biology and data mining techniques, we aimed to
determine non-invasive biomarkers with high accuracy, facilitating
timely treatment through early diagnosis of CRC.
Materials and Methods
The bioinformatics strategy presented in [54]Figure 1 involved the
utilization of serum microarray datasets to identify miRNAs and key
genes associated with CRC through systems biology methods. Initially,
miRNAs were extracted from each sample’s profile and subjected to
evaluation using PLR. Subsequently, an ANN was developed to assess the
accuracy of the selected miRNAs. The analysis resulted in the
identification of Differentially Expressed miRNAs (DEmiRs) and their
respective target genes. To validate the findings, common genes were
identified between the target genes and Differentially Expressed Genes
(DEGs) using bipartite miRNA-mRNA interactions.
Figure 1.
Flow chart of bioinformatics analysis.
Notably, factors such as age, health status, and patient risk factors
were not accounted for in this study.
miRNA expression profile dataset
Two CRC datasets, one of miRNA expression and one of gene expression, were acquired from the
Gene Expression Omnibus (GEO) repository, namely [57]GSE106817 and
[58]GSE23878. [59]GSE106817 was generated using the “3D-Human miRNA
V21_1.0.0” platform ([60]GPL21263) and comprised 4043 samples including
various disease conditions and healthy individuals. Among these, 115
samples were from CRC patients, while 2759 samples were from healthy
individuals. In order to maintain balance, 115 healthy samples were
randomly selected using R software. The expression levels of 2566
miRNAs were measured in each sample without any initial screening,
providing data for subsequent analysis and modeling. Additionally,
[61]GSE23878, generated with the “Illumina HumanHT-12 V3.0” platform
([62]GPL6947), consisted of 59 tissue specimens, including 35 CRC
samples and 24 normal tissue samples. This dataset was used as a
validation set to assess key genes identified in the study.
Statistical Analysis
miRNA selection through penalized model
PLR techniques are a class of statistical learning methods that can be
used for variable selection. These techniques attach a penalty to the
objective function of the PLR, which shrinks the estimates of the
regression coefficients toward zero. In this way, penalized regression
techniques can simultaneously perform variable selection and
coefficient estimation. In this study, we used PLR models with (1) the
SCAD, (2) the LASSO, and (3) the MCP penalties to identify important
miRNAs. Briefly, PLR is a shrinkage regression model that adds a penalty
term on the regression coefficients to the likelihood-based objective
function. The LASSO penalty is the tuning parameter λ multiplied by the
sum of the absolute values of the regression coefficients. The SCAD
penalty is defined as follows:
\[
p_\lambda(t) =
\begin{cases}
\lambda\,|t|, & |t| \le \lambda \\[4pt]
-\dfrac{|t|^2 - 2a\lambda|t| + \lambda^2}{2(a-1)}, & \lambda < |t| \le a\lambda \\[4pt]
\dfrac{(a+1)\lambda^2}{2}, & |t| > a\lambda
\end{cases}
\]
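where t is the regression coefficient, λ is the tuning parameter, and
a > 2 is a constant that controls the concavity of the penalty. The MCP
is a concave penalty function used in penalized regression for variable
selection and coefficient estimation. It is defined through its
derivative with respect to |βj|, the absolute value of the j-th
regression coefficient, with a > 1, as follows: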
\[
p'_\lambda(|\beta_j|) = \left(\lambda - \frac{|\beta_j|}{a}\right) I\left(|\beta_j| \le a\lambda\right)
\]
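The three penalties can be compared numerically with a minimal sketch such as the one below. It is an illustration only and not the code used in the analysis: the function names are hypothetical, the default values of a are arbitrary illustrative choices (the text does not report them), and the closed form of the MCP is obtained here by integrating the derivative given above from 0 to |t|.

```python
def lasso_penalty(t, lam):
    """LASSO penalty for a single coefficient: lam * |t|."""
    return lam * abs(t)


def scad_penalty(t, lam, a=3.7):
    """SCAD penalty, piecewise in |t|; a > 2 is assumed (3.7 is illustrative)."""
    x = abs(t)
    if x <= lam:
        return lam * x
    if x <= a * lam:
        return -(x * x - 2 * a * lam * x + lam * lam) / (2 * (a - 1))
    return (a + 1) * lam * lam / 2


def mcp_derivative(t, lam, a=3.0):
    """MCP derivative in |t|: (lam - |t|/a) when |t| <= a*lam, otherwise 0."""
    x = abs(t)
    return lam - x / a if x <= a * lam else 0.0


def mcp_penalty(t, lam, a=3.0):
    """MCP penalty, i.e. the integral of mcp_derivative from 0 to |t| (a > 1)."""
    x = abs(t)
    if x <= a * lam:
        return lam * x - x * x / (2 * a)
    return a * lam * lam / 2


if __name__ == "__main__":
    # All three penalties agree near zero; SCAD and MCP flatten for large |t|.
    for t in (0.5, 2.0, 10.0):
        print(t, lasso_penalty(t, 1.0), scad_penalty(t, 1.0), mcp_penalty(t, 1.0))
```

Unlike the LASSO, whose penalty keeps growing linearly, SCAD and MCP become constant beyond |t| = aλ, so large coefficients are not shrunk further.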
We used a 10-fold cross-validation strategy to select the optimal value
of λ. The value of λ that minimized the Bayesian Information Criterion
was chosen as the optimal value. The PLR models with the 3 types of
penalties were each repeated 1000 times, and the miRNAs that were
selected by at least 2 penalties were considered miRNA biomarkers. The
“grpreg” package was used for variable selection in R software version
4.0.2.^[63]17
-[64]19 The source code used for the analysis is available on GitHub at
[65]https://github.com/ARGHAREBAGHI.
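The consensus rule described above (repeat each penalized fit many times, then keep the miRNAs chosen by at least 2 penalties) can be sketched as follows. This is a hedged illustration, not the authors' R code: the function names, the toy miRNA labels, and the `min_freq` cutoff that decides when a penalty counts as having "selected" a miRNA are all assumptions, since the text does not state a frequency threshold.

```python
from collections import Counter


def selection_frequencies(runs):
    """Count how many repetitions selected each miRNA.

    runs: list of sets, one set of selected miRNA IDs per repetition.
    """
    counts = Counter()
    for selected in runs:
        counts.update(set(selected))
    return counts


def consensus_mirnas(runs_by_penalty, min_freq, min_penalties=2):
    """Return miRNAs selected (frequency >= min_freq) by >= min_penalties penalties.

    min_freq is a hypothetical cutoff; the source reports frequencies
    over 1000 repetitions but no explicit threshold.
    """
    votes = Counter()
    for runs in runs_by_penalty.values():
        for mirna, n in selection_frequencies(runs).items():
            if n >= min_freq:
                votes[mirna] += 1
    return sorted(m for m, v in votes.items() if v >= min_penalties)


if __name__ == "__main__":
    # Toy example with 2 repetitions per penalty and made-up miRNA labels.
    toy = {
        "LASSO": [{"miR-A", "miR-B"}, {"miR-A", "miR-B"}],
        "SCAD": [{"miR-A", "miR-B"}, {"miR-A", "miR-C"}],
        "MCP": [{"miR-A"}, {"miR-A"}],
    }
    print(consensus_mirnas(toy, min_freq=2))
```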
Artificial neural networks
An ANN was trained in R software version 4.0.2. Before training, the
data were normalized using the maximum and minimum values (min-max
scaling). Subsequently, an ANN model was designed in R, incorporating
the important variables. The
model parameters were adjusted to construct a disease prediction model,
taking into account the weight information derived from the expression
of miRNAs. In this model, the pathogenicity score was computed as the
weighted sum of the significant miRNAs’ expression values. For model
fitting, the “neuralnet” package
(version 19) within R software version 4.0.2 was employed. To optimize the
the performance of the model, a 10-fold cross-validation strategy was
employed, allowing for the fine-tuning of hyper-parameters.^[66]20
-[67]23
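The two preprocessing steps named above, min-max normalization and 10-fold cross-validation, can be sketched as below. This is a minimal illustration under stated assumptions and not the authors' pipeline: the function names and the random seed are hypothetical, and the sample count of 230 simply reflects the 115 CRC and 115 healthy samples described earlier.

```python
import random


def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using its minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def k_fold_indices(n_samples, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs covering all samples once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    print(min_max_scale([2.0, 4.0, 6.0]))
    splits = k_fold_indices(230, k=10)
    print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

In practice the minimum and maximum are often computed on the training folds only and then applied to the held-out fold, which avoids leaking information from the test samples; the text does not say which variant was used.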
miRNA target prediction
The miRWalk 3.0 online database, available at
[68]http://mirwalk.uni-hd.de/, provides miRNA target predictions
obtained through a machine learning algorithm. In this study, miRWalk
was used to search for predicted target genes of the selected
miRNAs.^ [69]24
Protein-protein interaction (PPI) network analysis
In this study, an interactive network of proteins was employed to
investigate gene interactions and identify hub genes. The
protein-protein interaction (PPI) network for the selected genes was
constructed using the STRING online tool, with an interaction score
threshold of 0.4. To visualize and analyze the constructed network,
Cytoscape software version 3.8.2 was utilized. The CytoHubba plugin
version 1.6 within Cytoscape was employed to evaluate various network
measures, including Maximum Neighborhood Component (MNC), Edge
Percolated Component (EPC), and DEGREE, to identify the hub genes within
the network. Furthermore, a Venn diagram was utilized to identify the
common genes and select the hub genes that appeared consistently across
the different measures.^[70]25,[71]26
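The hub-gene step above amounts to ranking genes by several network measures and intersecting the top-ranked sets, which is what the Venn diagram summarizes. The sketch below illustrates this with a degree ranking computed from an edge list; it is an assumption-laden example, not the CytoHubba implementation. The function names and gene labels are hypothetical, and the MNC and EPC rankings are assumed to be exported from CytoHubba rather than recomputed here.

```python
def degree_ranking(edges):
    """Rank nodes of an undirected network by degree (ties broken by name)."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sorted(degree, key=lambda g: (-degree[g], g))


def hub_genes(rankings, top_k=10):
    """Genes that appear in the top_k of every ranking (the Venn-diagram core)."""
    top_sets = [set(r[:top_k]) for r in rankings]
    return sorted(set.intersection(*top_sets))


if __name__ == "__main__":
    # Toy PPI edges with made-up gene labels.
    edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
    by_degree = degree_ranking(edges)
    # Hypothetical rankings standing in for exported MNC and EPC results.
    by_mnc = ["G2", "G1", "G4", "G3"]
    by_epc = ["G1", "G3", "G2", "G4"]
    print(hub_genes([by_degree, by_mnc, by_epc], top_k=2))
```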
DEGs’ enrichment analyses
In this study, the function of DEGs was explored through Kyoto
Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO)
enrichment analyses. The GO classification system, encompassing
molecular function (MF), cellular component (CC), and biological
processes (BP), was utilized to gain insights into the functional
characteristics of the DEGs. To conduct the functional enrichment
analysis of the gene list, the Database for Annotation, Visualization,
and Integrated Discovery (DAVID) program, accessible at
[72]https://david.ncifcrf.gov, was employed. The analysis involved
determining significant enrichment of gene functions using an adjusted
P-value cutoff threshold of <.05.^[73]27,[74]28
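The text does not state which multiple-testing correction produced the adjusted P-values. As one common possibility, the sketch below assumes the Benjamini-Hochberg procedure and shows how enrichment terms would be filtered at the adjusted .05 cutoff; the function name and the example P-values are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.5]  # made-up raw p-values for four terms
    adj = benjamini_hochberg(raw)
    significant = [i for i, q in enumerate(adj) if q < 0.05]
    print(adj, significant)
```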
Potential miRNA-mRNA interactions
In this study, DEmiRs were identified between CRC samples and normal
tissues, considering an adjusted P-value < .05 and |logFC| > 1 as the
criteria for differential expression. Subsequently, the target genes of
the DEmiRs were determined using the miRWalk database. To understand
the miRNA-mRNA regulatory interactions comprehensively, a bipartite
miRNA-mRNA correlation network was constructed and analyzed using
Cytoscape version 3.8.2 software. An interaction score threshold of
0.4 was employed to filter out weak interactions in the network. A
bipartite network is appropriate for this study because it contains 2
types of nodes, miRNAs and mRNAs, and every edge connects a miRNA to one
of its target mRNAs; there are no miRNA-miRNA or mRNA-mRNA edges.
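The differential-expression criteria and the construction of the bipartite edge list can be sketched as follows. This is an illustrative example, not the authors' workflow: the function names are hypothetical, the target list is a toy input (with `GENE_X` as a made-up non-DEG target), and the predicted targets are assumed to have been downloaded from miRWalk beforehand.

```python
def is_differentially_expressed(adj_p, log_fc, p_cut=0.05, lfc_cut=1.0):
    """Apply the stated criteria: adjusted P-value < .05 and |logFC| > 1."""
    return adj_p < p_cut and abs(log_fc) > lfc_cut


def bipartite_edges(targets_by_mirna, degs):
    """Keep only miRNA-to-gene edges whose gene is also a DEG.

    Every edge joins a miRNA node to an mRNA node, so the resulting
    network is bipartite by construction.
    """
    deg_set = set(degs)
    return sorted(
        (mirna, gene)
        for mirna, genes in targets_by_mirna.items()
        for gene in genes
        if gene in deg_set
    )


if __name__ == "__main__":
    targets = {"miR-6787-5p": ["CDK1", "MAD2L1", "GENE_X"]}  # toy target list
    degs = ["CDK1", "MAD2L1", "TOP2A"]
    print(bipartite_edges(targets, degs))
```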
Hub gene validation by GEPIA
The Gene Expression Profiling Interactive Analysis (GEPIA) database
([75]http://gepia.cancer-pku.cn/) is a web-based tool designed for fast
and customizable analyses using data from The
Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx)
projects. In this study, GEPIA was used to validate the expression of
key hub genes by comparing cancerous and normal tissue samples,
specifically focusing on colorectal cancer. Differential gene
expression was analyzed using ANOVA, with statistical significance set
at P-value < .05 and a fold change greater than 2.
Results
Differential expression analysis
The miRNA expression data series ([76]GSE106817) was used to identify
DEmiRs. To validate the findings, a total of 3763 DEGs were identified
in the gene expression data series ([77]GSE23878) by applying the
criteria of an adjusted P-value < .05 and |logFC| > 1, and these DEGs
were compared with the predicted target genes of the DEmiRs.
Identification of differentially expressed miRNAs
The miRNA expression data was utilized to train the PLR model, as
outlined in the Methods section, with the aim of identifying DEmiRs
associated with CRC diagnosis. The PLR model used the binary outcome
variable, where 1 represented CRC and 0 denoted healthy controls. In
[78]Table 1, we present the names of the 14 selected DEmiR profiles and
their respective frequencies, determined over 1000 repetitions using
LASSO, SCAD, and MCP methods. LASSO selected 11 miRNA profiles, while
SCAD and MCP identified 5 and 2 miRNA profiles, respectively. Notably,
3 miRNAs (miR-6765-5p, miR-6787-5p, and miR-1228) were confirmed as
significant in at least 2 PLR methods.
Table 1.
Frequencies of the selected miRNA over 1000 repetitions using penalized
logistic regression by SCAD, MCP, and LASSO penalties.
miRNA SCAD MCP LASSO Total accuracy
MIMAT0005582 1 1000 1000 .966
MIMAT0019776 1000 .983
MIMAT0027430 1 1000 .966
MIMAT0027436 961 .966
MIMAT0027474 1 1000 .966
MIMAT0015079 305 .759
MIMAT0003320 1000 .845
MIMAT0004970 1000 .966
MIMAT0005922 1000 .931
MIMAT0015075 389 .879
MIMAT0018949 1000 .931
MIMAT0022259 1000 .966
MIMAT0019776 1000 .931
MIMAT0027392 1000 .931
No. selected miRNA 5 2 11
The results of the univariate logistic regression analysis for the selected miRNAs are
presented in [80]Table 2, which includes the regression coefficient,
standard error of the coefficient, odds ratio (OR), and corresponding
P-values. Notably, the results demonstrate that all 13 miRNAs exhibited
statistically significant associations with the diagnosis of CRC.
Table 2.
Results of fitting univariate logistic regression for the miRNAs
selected using penalized logistic regression with the SCAD, MCP, and
LASSO penalties.
miRNA SCAD MCP LASSO
β (S.E) OR P-value β (S.E) OR P-value β (S.E) OR P-value
MIMAT0005582 10.95 (2.80) 56954 <.0001 10.95 (2.80) 56954 <.0001 10.95
(2.80) 56954 <.0001
MIMAT0019776 −3.04 (.56) .048 <.0001
MIMAT0027430 12.23 (2.46) 204843 <.0001 12.23 (2.46) 204843 <.0001
MIMAT0027436 1.81 (.34) 6.11 <.0001
MIMAT0027474 −5.84 (1.24) .003 <.0001 −5.84 (1.24) .003 <.0001
MIMAT0015079 −1.44 (.28) .237 <.0001
MIMAT0003320 −2.01 (.26) .134 <.0001
MIMAT0004970 −3.36 (.59) .035 <.0001
MIMAT0005922 9.99 (1.60) 21807 <.0001
MIMAT0015075 −1.74 (.21) .176 <.0001
MIMAT0018949 −4.69 (.59) .009 <.0001
MIMAT0022259 −2.27 (.40) .103 <.0001
MIMAT0019776 −3.04 (.56) .048 <.0001
MIMAT0027392 −6.57 (1.38) .001 <.0001
[82]Table 2 presents the outcomes of unpenalized logistic regression
for estimating the regression coefficients of the selected miRNAs. The
table reveals that certain miRNAs exhibited a positive association with
CRC, whereas others displayed a negative association with CRC.
* Positively associated miRNAs: miR-1228, miR-6765-5p, miR-6768,
  and miR-1268. This means that an increase in the expression of
  these miRNAs increases the odds of CRC.
* Negatively associated miRNAs: miR-1343, miR-6787-5p, miR-650,
  miR-920, miR-3190, miR-4433, miR-5100, miR-1343, and miR-6746. This
  means that a decrease in the expression of these miRNAs increases
  the odds of CRC.
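The direction of each association follows from the relationship OR = exp(β): a positive coefficient gives an odds ratio above 1 and a negative coefficient gives one below 1. The short sketch below reproduces two odds ratios from the coefficients reported in Table 2. The Wald confidence interval is an addition for illustration only and is not reported in the text; the function names are hypothetical.

```python
import math


def odds_ratio(beta):
    """Odds ratio for a one-unit increase in expression: exp(beta)."""
    return math.exp(beta)


def wald_interval(beta, se, z=1.96):
    """Approximate 95% Wald interval for the odds ratio (illustrative only)."""
    return math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    # Coefficients and standard errors taken from Table 2.
    print(round(odds_ratio(10.95)))      # MIMAT0005582, beta = 10.95
    print(round(odds_ratio(-3.04), 3))   # MIMAT0019776, beta = -3.04
    print(wald_interval(-3.04, 0.56))
```

The very large odds ratios for positive coefficients arise because the odds ratio refers to a full one-unit change on the expression scale, which may be large relative to the observed range of the data.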
The miRNAs identified through PLR were employed as inputs for an ANN
model to develop classifiers capable of diagnosing patients. The ANN
model was designed with a 1:1:1 architecture, comprising a single input
layer, 1 hidden layer, and 1 output layer. The activation functions
used in the model were sigmoid for the input layer, hyperbolic tangent
for the hidden layer, and linear for the output layer.
The input variables for the ANN model were the miRNA expression values
that were chosen in the preceding step. The model’s output was a binary
value, either 0 or 1, enabling the classification of patients as
non-cancerous or cancerous, respectively. This classification holds the
potential for early cancer detection, offering valuable diagnostic
capabilities.
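A forward pass through a network with the layer-wise activations described above (sigmoid on the inputs, hyperbolic tangent in the hidden layer, linear output) can be sketched as below. This is a simplified illustration and not the fitted model: the weights are placeholders, the 0.5 decision threshold is an assumption, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, hidden_weights, hidden_bias, out_weights, out_bias):
    """Compute the network output for one sample.

    x: list of (normalized) miRNA expression values.
    hidden_weights: one list of input weights per hidden unit.
    """
    a = [sigmoid(v) for v in x]  # sigmoid applied at the input layer
    h = [
        math.tanh(sum(w * ai for w, ai in zip(ws, a)) + b)  # tanh hidden units
        for ws, b in zip(hidden_weights, hidden_bias)
    ]
    return sum(w * hi for w, hi in zip(out_weights, h)) + out_bias  # linear output


def classify(score, threshold=0.5):
    """Map the continuous output to 1 (CRC) or 0 (healthy); threshold assumed."""
    return 1 if score >= threshold else 0


def accuracy(y_true, y_pred):
    """Fraction of samples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    score = forward([0.0], [[2.0]], [-1.0], [1.0], 0.5)  # placeholder weights
    print(score, classify(score))
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```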
The outcomes of the ANN model are displayed in the final column of
[83]Table 1. Notably, a majority of the miRNAs exhibit a total accuracy
greater than 90%, underscoring their significant potential for cancer
detection.
Identification of key genes using PPI network analysis
In this study, an analysis was conducted using the PPI (Protein-Protein
Interaction) network to explore the 3763 DEGs. The resulting PPI
network consisted of 443 nodes and 8314 edges, as depicted in
[84]Figure 4. Additionally, Venn diagram analysis of the top 10 genes
ranked by each of the 3 measures identified 7 hub
genes: CDC20, MAD2L1, UBE2C, CDK1, AURKB, CCNA2, and TOP2A. These
findings are illustrated in [85]Figure 2.
Figure 4.
Bipartite mRNA-miRNA subnetwork for CRC. Blue diamonds represent hub
genes differentially expressed between CRC and normal tissues. Green
diamonds represent the 2 hub genes targeted by miR-6787. Cytoscape
v.3.8.2 was used to visualize the network.
Figure 2.
Venn diagram of the overlap between the top 10 predicted target genes
ranked by MNC, EPC, and DEGREE. The number 7 in the center of the image
is the number of genes shared by all 3 rankings.
Functional and pathway enrichment analysis
The GO analysis showed significant enrichment of terms in the
biological process (BP), cellular component (CC), and molecular
function (MF) categories:
* Top 10 BP terms: rRNA processing, cell division, translation,
  mitochondrial translation, mitotic spindle organization, protein
  folding, cytoplasmic translation, ribosomal large subunit
  biogenesis, proteasomal ubiquitin-independent protein catabolic
  process, mitotic sister chromatid segregation.
* Top 10 CC terms: nucleoplasm, cytosol, membrane, extracellular
  exosome, cytoplasm, nucleus, endoplasmic reticulum, mitochondrion,
  chromosome, ribosome.
* Top 10 MF terms: protein binding, RNA binding, identical protein
  binding, structural constituent of ribosome, cadherin binding,
  enzyme binding, chaperone binding, ATPase activity, snoRNA binding,
  unfolded protein binding.
In addition, KEGG pathway analysis identified the following enriched
pathways: Nucleocytoplasmic transport, Proteasome, DNA replication,
Spliceosome, Glutathione metabolism, Ribosome, Protein processing in
endoplasmic reticulum, and p53 signaling pathway ([89]Figure 3).
Figure 3.
Gene Ontology (GO) and KEGG pathway enrichment analyses were performed
for the module genes. The top 10 GO terms in Biological Process (BP),
Molecular Function (MF), and Cellular Component (CC), along with
significant KEGG pathways, are presented.
Bipartite miRNA-mRNA network analysis
mRNA-miRNA network analysis is a computational approach for
understanding the mechanisms that contribute to CRC pathogenesis. In
this study, the miRWalk database was used to identify target genes of
the DEmiRs. By assessing the overlap between the predicted miRNA
targets and the validated DEGs, the hub genes CDK1 and MAD2L1 were
identified as both targets of miR-6787-5p and pivotal players in CRC.
Notably, the expression of miR-6787-5p was significantly downregulated
in cancer tissue samples compared to normal tissue samples. These
findings highlight the regulatory network involving miRNAs and their
target genes in CRC ([92]Figure 4).
Gene expression analysis of the central hub genes
We used the GEPIA database to analyze the expression of the 2 candidate
genes in cancer tissues and normal samples from the TCGA-COAD dataset.
The results revealed that CDK1 and MAD2L1 were both significantly
upregulated in tumors compared with normal tissues ([93]Figure 5).
Figure 5.
Validation of hub genes in colorectal cancer using TCGA-COAD. The 2 hub
genes, CDK1 and MAD2L1, were significantly upregulated in CRC
tissues compared to normal tissues in TCGA-COAD data.
Discussion
CRC is a leading cause of global mortality, making early detection
vital for improved treatment response and reduced mortality rates.
Biomarkers play a critical role in CRC diagnosis and treatment, and
bioinformatics tools facilitate the identification of CRC-related
biomarkers and molecular interactions.^[96]29 -[97]31 In this study, a
bioinformatics approach was employed, utilizing 2 datasets,
[98]GSE106817 and [99]GSE23878, to identify DEmiRs and hub genes
associated with the progression of CRC. Investigating the expression
patterns and interactions of these DEmiRs and hub genes can provide
insights into the molecular mechanisms underlying CRC development and
progression. Three miRNAs, miR-6765-5p, miR-6787-5p, and miR-1228, were
selected because each was chosen by at least 2 of the LASSO, MCP, and
SCAD penalties. The overall accuracy of these 3 miRNAs exceeded 95%,
underscoring their potential as circulating biomarkers in CRC patients.
The study also demonstrated the utility of PLR with 3 different penalty
functions, followed by an ANN classifier, for identifying miRNAs
significantly associated with CRC.
miRNAs have emerged as key regulators in cancer biology, functioning as
both tumor suppressors and oncogenes depending on their expression
patterns and the cancer type. These small non-coding RNAs play a
pivotal role in a range of cancer-related processes, including
initiation, malignant transformation, progression, and metastasis.
Recent research has demonstrated that certain cancers have unique miRNA
signatures, making them valuable diagnostic and prognostic markers as
well as potential therapeutic targets. Advances in techniques such as
microarray analysis, RT-PCR, and next-generation sequencing have
facilitated the profiling of miRNAs in various cancer types, even from
archived tumor tissues. Emerging detection methods, such as
nanoparticle-based and hybridization chain reaction (HCR)
amplification, aim to enhance miRNA detection sensitivity. miRNAs are
also stable in body fluids, making them promising candidates for
non-invasive cancer diagnostics. Their dysregulation in cancer cells,
influenced by both genetic and epigenetic factors, highlights their
role in tumorigenesis, and disruptions in the miRNA biogenesis process
could significantly contribute to cancer development.^[100]32,[101]33
In CRC, miR-1228 is often downregulated. This downregulation is
associated with poor prognosis. The exact role of miR-1228 in CRC is
not fully understood, but it is thought to play a role in tumor growth
and progression. miR-1228 targets a number of genes that are involved
in cell proliferation, angiogenesis, and apoptosis. By targeting these
genes, miR-1228 helps prevent cancer cells from growing and spreading.
Numerous studies have shown that miR-1228 plays an essential role in
the proliferation of cancer cells and can be used for early detection
of cancer.^[102]34,[103]35 miR-1228 regulates stress-induced cellular
apoptosis by targeting the MOAP1 protein.^ [104]36 Another report
showed that miR-1228 has a role in metabolism, cell survival,
regulation of apoptosis, and stimulus response.
However, some studies have investigated the target genes of miR-1228 in
CRC.^[105]37,[106]38 LRP1 is a target gene of miR-1228 and is located
on chromosome 12.^[107]39,[108]40 This gene mainly plays a role in
basic metabolism and cell structure, which are key to maintaining cell
survival. In past research, the expression level of miR-1228-3p has
been examined in drug resistance of breast cancer, chronic heart
failure, endometrial carcinoma, prostate cancer, CRC, and cancer
secretions. The expression level of miR-1228-3p is stable in
blood circulation and can be used as a biomarker.^ [109]41 In a study
by Yang et al,^ [110]37 it was revealed that miR-1228 remained
unaffected by surgical treatment, indicating its suitability as an
optimal reference gene for treatment studies. Additionally, the
circulating level of miR-1228 was found to be independent of tumor
stage.
In CRC, miR-6787-5p is often downregulated. This downregulation is
associated with poor prognosis. The exact role of miR-6787-5p in CRC is
not fully understood, but it is thought to play a role in tumor growth
and progression. miR-6787-5p targets a number of genes that are
involved in cell proliferation, angiogenesis, and apoptosis. By
targeting these genes, miR-6787-5p helps prevent cancer cells from
growing and spreading.^ [111]42 The exact role of miR-6765-5p in CRC is
not fully understood.
Hub genes were then identified using the MNC, EPC, and DEGREE measures
of the CytoHubba plugin in Cytoscape software. The functional and
biological interactions between the DEGs were investigated using Gene
Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG)
analyses. In the present study, the nucleoplasm was identified as one
of the significantly enriched GO terms for the DEGs in CRC. Network
analysis demonstrated that 4 of the DEGs are annotated to this term.
These findings suggest that the DEGs are involved in a number of
biological processes that are important for the pathogenesis of CRC.
Further research is needed to confirm these findings and to identify
new diagnostic targets for CRC.^ [112]43 Therapeutic modulation of cell
membrane lipid composition and organization is an emerging field with
potential applications in a variety of diseases, including
cancer.^ [113]44 It has been shown that GO terms
such as rRNA processing,^ [114]45 translation,^ [115]46 mitochondrial
translation,^ [116]47 mitotic spindle organization,^ [117]48
extracellular exosome,^ [118]49 and protein binding^ [119]50 are
associated with CRC.
By using miRNA-mRNA expression profiling, CDK1 and MAD2L1 were
identified as key genes in
CRC. The CDK1 gene encodes a protein known as cyclin-dependent kinase
1, which belongs to a family of enzymes involved in the regulation of
the cell cycle. The cell cycle is a fundamental process responsible for
cell growth, division, and the generation of new cells. In CRC, the
CDK1 gene can undergo mutations, resulting in abnormal functioning.
These mutations can lead to excessive production of the
cyclin-dependent kinase 1 protein. Scientific investigations have
demonstrated that dysregulation of CDK1 accelerates tumor growth and
uncontrolled proliferation of cancer cells.^[120]51,[121]52 Zhang et
al^[122]53 revealed that CDK1, in addition to being overexpressed and
sensitive to apoptosis in CRC cells, plays a crucial role in
controlling the cell cycle and contributes to the development of
colorectal tumors through an iron-regulated signaling axis. Previous
studies have established a link between CDK1 overexpression and the
development of colorectal, liver, and lung cancers, ultimately
impacting patient survival.^ [123]54
MAD2L1 plays a crucial role as a tumor suppressor gene in regulating
the cell cycle. Mutations in the MAD2L1 gene can disrupt the normal
control of cell growth and division, which can contribute to the
development of cancer. Deletion of the MAD2L1 gene has been found to
impede the growth of CRC cells.^[124]55,[125]56 Venugopal et al^
[126]57 revealed that there is a higher expression of MAD2L1 in CRC
cell lines and tissues, and this overexpression has been associated
with poor prognosis. Li et al^ [127]55 reported that MAD2L1 has
potential as a biomarker for colorectal cancer.
The present study introduced a novel set of gene expression profiles
that are predictive of CRC patients using a miRNA-mRNA model. This
model provides a different perspective than the traditional
proportional point of view.
Conclusions
This study identified 3 novel miRNAs (miR-1228, miR-6765-5p, and
miR-6787-5p) that are potentially associated with CRC and could serve
as biomarkers. Additionally, the target genes of miR-6787-5p,
namely CDK1 and MAD2L1, were found to be upregulated in CRC compared to
normal tissues. The miRNAs associated with the hub genes in the
mRNA-miRNA bipartite network played a pivotal role in CRC. However,
further molecular studies are warranted to validate the role of these
genes in CRC tumorigenesis.
List of Abbreviations
Abbreviation Definition
miRNAs microRNAs
CRC Colorectal cancer
ANN Artificial neural networks
PLR Penalized logistic regression
SCAD Smoothly clipped absolute deviation
LASSO Least absolute shrinkage and selection operator
MCP The minimax concave penalty
GEO Gene Expression Omnibus
DEmiRs Differentially Expressed miRNAs
PPI Protein-protein interaction
DEGs Differentially expressed genes
Acknowledgments