Abstract

Background

   Cervical cancer is among the most prevalent malignancies worldwide.
   This study explores the relationships between angiogenesis-related
   genes (ARGs) and immune infiltration, and assesses their implications
   for the prognosis and treatment of cervical cancer. Additionally, it
   develops a diagnostic model based on angiogenesis-related
   differentially expressed genes (ARDEGs).

Methods

   We systematically evaluated 15 ARDEGs using Gene Ontology (GO), Kyoto
   Encyclopedia of Genes and Genomes (KEGG), Gene Set Enrichment Analysis
   (GSEA), and Gene Set Variation Analysis (GSVA). Immune cell
   infiltration was assessed using a single-sample gene-set enrichment
   analysis (ssGSEA) algorithm. We then constructed a diagnostic model for
   ARDEGs using Least Absolute Shrinkage and Selection Operator (LASSO)
   regression analysis and evaluated the diagnostic value of this model
   and the hub genes in predicting clinical outcomes and immunotherapy
   responses in cervical cancer.

Results

   A set of ARDEGs was identified from the Cancer Genome Atlas (TCGA),
   Gene Expression Omnibus (GEO), and UCSC Xena database. We performed
   KEGG, GO, and GSEA analyses on these genes, revealing significant
   involvement in cell proliferation, differentiation, and apoptosis. The
   ARDEGs diagnostic model, constructed using LASSO regression analysis,
   showed high predictive accuracy in cervical cancer patients. We
   developed a reliable nomogram and decision curve analysis to evaluate
   the clinical utility of the ARDEG diagnostic model. The 15 ARDEGs in
   the model were associated with clinicopathological features, prognosis,
   and immune cell infiltration. Notably, ITGA5 expression and the
   abundance of immune cell infiltration (specifically mast cell
   activation) were highly correlated.

Conclusion

   This study identifies the prognostic characteristics of ARGs in
   cervical cancer patients, elucidating aspects of the tumor
   microenvironment. It enhances the predictive accuracy of immunotherapy
   outcomes and establishes new strategies for immunotherapeutic
   interventions.

   Keywords: Cervical cancer, Angiogenesis, Diagnostic model, Prognostic
   index, Tumor microenvironment infiltration, Nomogram

1. Introduction

   Cervical cancer represents a major global health challenge, ranking as
   the fourth leading cause of cancer-related mortality among women
   [[27]1]. This malignancy is particularly prevalent in regions with
   limited healthcare resources, accounting for approximately 84 % of all
   cases and 88 % of deaths from cancer worldwide [[28]2]. Despite
   advances in treatment modalities including surgery, concurrent
   chemoradiotherapy, and novel approaches such as vascular-targeted
   therapy and immunotherapy, outcomes for patients with recurrent or
   metastatic cervical cancer remain poor [[29]3,[30]4]. These innovative
   therapies have shown promise, yet the long-term benefits are limited to
   a small group of patients, highlighting the critical need for enhancing
   response rates to immunotherapy.

   Recent research underscores the pivotal role of the tumor
   microenvironment (TME) in cervical cancer progression and its influence
   on the efficacy of immunotherapy [[31]5]. The TME comprises a complex
   network of extracellular matrix components, blood vessels, and various
   cellular constituents including immune, stromal, and cancer cells
   [[32]6]. Angiogenesis, the process of new blood vessel formation, is a
   key aspect of the TME that contributes to the immunosuppressive
   landscape, facilitating tumor immune evasion [[33]7]. Consequently,
   strategies targeting angiogenesis represent a viable approach for
   treating advanced stages of the disease. Following promising results
   from the KEYNOTE-826 trial, the combination of the antibody-drug
   conjugate paporizumab with bevacizumab was approved in late 2021 as a
   frontline therapy for recurrent/metastatic cervical cancer. However,
   the precise mechanisms by which immunotherapy and targeted therapy
   interact remain poorly understood.

   This study aims to elucidate the relationship between angiogenesis and
   the TME, with a focus on identifying diverse immunophenotypes of
   cervical cancer. We analyzed the expression of ARGs and their impact on
   disease prognosis, clinical outcomes, and immune response by utilizing
   data from TCGA, GEO, and UCSC Xena databases. Differential gene
   expression analysis between normal cells and cervical squamous cell
   carcinoma and endocervical adenocarcinoma (CESC) tissues was performed,
   followed by enrichment analysis using GO, KEGG, and GSEA methodologies.
   Key genes and clinical parameters were integrated through
   univariate/multivariate Cox regression and LASSO regression analysis to
   develop an ARDEGs model. The predictive capability of this model was
   evaluated using Kaplan-Meier analysis, time-dependent ROC analysis, and
   other statistical tools including the C-index, calibration curves, and
   decision curve analysis. Additionally, treatment responses in
   immunotherapy cohorts were examined, and the diagnostic significance of
   gene mutations in hub genes was assessed. Our results aim to provide
   new insights into the potential of immunotherapy in treating cervical
   cancer, offering strategies to anticipate treatment outcomes and
   improve therapeutic efficacy.

2. Materials and methods

2.1. Data acquisition and processing

   Expression profile data for cervical cancer (TCGA-CESC) were retrieved
   from TCGA using the 'TCGABiolinks' R package [[34]8]. A total of 309
   cervical cancer samples (Cancer group) and 3 adjacent non-tumor samples
   (Normal group) were included after excluding those lacking essential
   clinical information. The data were normalized to Fragments Per
   Kilobase Million (FPKM), and relevant clinical information was sourced
   from the UCSC Xena database ([35]http://genome.ucsc.edu) [[36]9]. The
   normalization of count sequencing data in the TCGA-CESC dataset was
   performed using the ‘limma’ package [[37]10].

2.2. External validation datasets

   Related datasets, [38]GSE44001 [[39]11], [40]GSE63514 [[41]12], and
   [42]GSE7803 [[43]13], were downloaded from the GEO [[44]14] using the
   ‘GEOquery’ R package [[45]15]. Specifically, the [46]GSE44001 dataset
   employed the [47]GPL14951 Illumina HumanHT-12 WG-DASL V4.0 R2
   Expression BeadChip platform to analyze gene expression in 300 cervical
   cancer samples. Concurrently, the [48]GSE63514 dataset utilized the
   [49]GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array
   to evaluate gene expression in 28 cervical cancer samples and 24
   matched adjacent normal tissue samples. Additionally, the [50]GSE7803
   dataset used the [51]GPL96 [HG-U133A] Affymetrix Human Genome U133A
   Array to assess gene expression in 21 cervical cancer samples and 10
   partially matched adjacent normal tissues. Probe name annotation for
   each dataset was meticulously performed using the respective GPL
   platform files. All samples were included in subsequent analyses. These
   datasets served as validation sets to confirm the reproducibility and
   reliability of the findings, with summarized information presented in
   [52]Table 1.

Table 1.

   CESC Dataset Information list.
                            TCGA-CESC   [53]GSE44001 [54]GSE63514 [55]GSE7803
   Platform                             [56]GPL14951 [57]GPL570   [58]GPL96
   Species                 Homo sapiens Homo sapiens Homo sapiens Homo sapiens
   Samples in Normal group 3                         24           10
   Samples in CESC group   309          300          28           21
   Reference                            [[59]5]      [[60]6]      [[61]7]
   [62]Open in a new tab

   Abbreviations: TCGA, the cancer genome atlas; CESC, Cervical squamous
   cell carcinoma and endocervical adenocarcinoma.

2.3. Standardization and merging of datasets

   To identify the potential roles and related biological features and
   pathways of differential genes in the CESC group versus the normal
   group, we first standardized the datasets TCGA-CESC, [63]GSE44001,
   [64]GSE63514, and [65]GSE7803 using the R package limma. Subsequently,
   batch effects were removed from the CESC datasets [66]GSE44001,
   [67]GSE63514, and [68]GSE7803 using the R package sva, resulting in a
   merged GEO dataset. The datasets before and after batch effect removal
   were compared using box plots and principal component analysis(PCA)
   graphs.

2.4. Angiogenesis-related genes (ARGs) and data integration

   ARGs were identified via the GeneCards Database
   ([69]https://www.genecards.org/) [[70]16], using 'Angiogenesis' as a
   keyword, retaining only those coded as ‘Protein Coding’ with a
   relevance score greater than 3, culminating in a collection of 365
   ARGs. After cross-referencing and deduplication, 145 additional ARGs
   were included from the literature, totaling 510 ARGs ([71]Supplementary
   Table S1).

2.5. Mutational and CNV analysis

   The somatic mutation dataset, including single nucleotide polymorphisms
   (SNPs), was obtained from TCGA. Data visualization was conducted using
   the 'maftools' R package [[72]17]. To examine gene copy number
   variation (CNV) among CESC patients, we utilized the R environment and
   integrated the data prior to analysis with GISTIC 2.0 [[73]18],
   employing the default parameters.

2.6. Cervical cancer-related differentially expressed genes

   To investigate differentially expressed genes (DEGs) between normal and
   CESC tissues, we utilized data from the GEO dataset. DEGs were
   identified by filtering for |log 2(fold change, FC)| > 0 with P-values
   <0.05. Genes with logFC >1 and P < 0.05 were classified as upregulated,
   while those with logFC < −1 and P < 0.05 were classified as
   downregulated. To discern genes associated with CESC pathology, we
   intersected DEGs with associated response genes (ARGs), and visualized
   these relationships using a Venn diagram. Differential expressions were
   further depicted through volcano plots generated with the R package,
   ggplot 2.

2.7. Calculation of angiogenic score

   The single-sample gene-set enrichment analysis (ssGSEA) [[74]19]
   algorithm effectively quantifies the relative abundance of genes within
   individual dataset samples. This method is crucial for advancing
   high-throughput genomic studies as it enables a meticulous analysis of
   gene expression levels across an extensive array of biological samples.
   Consequently, ssGSEA enhances our comprehension of the specific
   contributions of individual genes to cellular functions and disease
   pathologies, thereby augmenting the prospects for developing targeted
   therapeutic strategies. We applied the ssGSEA algorithm using the GSVA
   package in R to compute the angiogenesis scores of cancer samples from
   TCGA-CESC and GEO datasets. These scores were derived from the
   expression matrix of ARGs in each sample. Based on their angiogenesis
   scores, patients were dichotomized into two groups, and expression
   differences between the groups were analyzed using the Mann-Whitney U
   test, with significance set at P < 0.05.

2.8. Functional enrichment analysis: GO and KEGG

   Gene Ontology (GO) [[75]20] and Kyoto Encyclopedia of Genes and Genomes
   (KEGG) [[76]21] analyses were conducted to understand the functional
   impacts of ARDEGs. Using the R package clusterProfiler [[77]22], GO
   annotation was analyzed, and significant enrichment was identified at
   P < 0.05 with a false discovery rate (FDR) threshold (q-value) of
   <0.05. The Benjamini-Hochberg method was employed for p-value
   correction during these analyses.

2.9. Gene Set Enrichment Analysis (GSEA)

   GSEA [[78]23] is used to evaluate the distribution trend of a
   predefined gene set within a gene list ordered according to their
   relevance to a particular phenotype, thereby assessing the contribution
   of these genes to the phenotype. In this study, we used the logFC value
   to rank molecules and evaluate their enrichment in the predefined gene
   sets. The clusterProfiler package was then used to conduct enrichment
   analysis on all phenotype-related genes. The parameters for this GSEA
   enrichment analysis were as follows: seed set to 2020, 1000
   permutations, minimum of 10 genes per gene set, maximum of 500 genes
   per gene set, and the p-value adjustment method was Benjamini-Hochberg
   (BH). We obtained the “h.all.v7.4.symbols.gmt” gene set from the
   Molecular Signatures Database (MSigDB) and conducted GSEA analysis to
   evaluate the expression of genes in the TCGA-CESC dataset. The criteria
   for significant enrichment were P.value < 0.05 and FDR
   (q.value) < 0.25.

2.10. Construction of angiogenesis diagnostic model

   To develop a diagnostic model for ARDEGs in the TCGA-CESC dataset, we
   initially employed the glmnet package [[79]24] to perform LASSO
   regression [[80]25], using the parameter family = "binomial" with
   ten-fold cross-validation. The process was iterated over 1000 cycles to
   prevent overfitting. LASSO regression, a prevalent machine learning
   algorithm for building diagnostic models, is based on linear regression
   enhanced by adding a penalty term (lambda × absolute value of the
   slope). This regularization helps prevent overfitting during curve
   fitting and improves the model's generalization ability.
   [MATH: <mrow><mtext>riskScore</mtext><mo
   linebreak="badbreak">=</mo><mrow><munder><mo>∑</mo><mi
   mathvariant="normal">i</mi></munder><mrow><mtext>Coefficient</mtext><ms
   pace width="0.25em"></mspace><mrow><mo
   stretchy="true">(</mo><msub><mrow><mtext>hub</mtext><mspace
   width="0.25em"></mspace><mtext>gene</mtext></mrow><mi
   mathvariant="normal">i</mi></msub><mo
   stretchy="true">)</mo></mrow><mo>*</mo><mtext>mRNA</mtext><mspace
   width="0.25em"></mspace><mtext>Expression</mtext><mspace
   width="0.25em"></mspace><mrow><mo
   stretchy="true">(</mo><mrow><mtext>hub</mtext><mspace
   width="0.25em"></mspace><msub><mtext>gene</mtext><mi
   mathvariant="normal">i</mi></msub></mrow><mo
   stretchy="true">)</mo></mrow></mrow></mrow></mrow> :MATH]

   Subsequently, we extracted the penalty coefficients (lambda) of the
   ARDEGs from the LASSO regression diagnostic model, and calculated the
   risk score of the ARDEGs diagnostic model, denoted as riskScore.
   Additionally, we utilized the R package RCircos to annotate the
   locations of the ARDEGs on human chromosomes.

2.11. Clinical prognosis analysis

   To elucidate the prognostic significance of target genes in cervical
   cancer, we employed univariate and multivariate Cox regression analyses
   using the TCGA-CESC dataset, incorporating key genes (hub genes and
   mRNA) and clinical variables to develop a predictive Cox regression
   model. We generated nomograms to estimate 1-, 3-, and 5-year survival
   probabilities for cervical cancer patients, based on multivariate
   regression scores for each clinical variable. The results were
   visualized using a risk factor map that depicted the association
   between molecular expression of ARDEGs, risk scores, and patient
   survival outcomes within the prognostic model. We further validated the
   nomogram's accuracy and resolution with a calibration curve,
   correlating actual survival probabilities against model predictions
   under varying conditions, utilizing the ggDCA package in R.

2.12. Prognostic evaluation using Kaplan-Meier curves

   Our analysis also included Kaplan-Meier curve assessments to analyze
   survival time and its association with various prognostic factors. The
   Kaplan-Meier method calculated survival probabilities, reflecting
   ongoing patient survival beyond specified time points. Significance was
   determined for genes impacting survival (P < 0.05).

2.13. Immune infiltration analysis with CIBERSORT

   CIBERSORT ([81]https://cibersort.stanford.edu/) [[82]26] is an
   innovative online algorithm that leverages gene expression data to
   accurately estimate the composition of immune cells within tissue
   samples. This method utilizes a signature matrix that characterizes 22
   distinct human immune cell subtypes—including T cells, B cells,
   monocytes, and natural killer cells—through their unique gene
   expression profiles. Employing linear support vector regression,
   CIBERSORT effectively deconvolutes RNA expression data from mixed cell
   populations, enabling precise quantification of the relative abundances
   of different immune cell types. This algorithm serves as a crucial tool
   for analyzing the distribution and infiltration of immune cells in
   solid tumors and various complex tissues, thereby enhancing our
   understanding of immune landscapes in diverse pathological contexts.
   Using the CIBERSORT tool, we evaluated the immune cell infiltration in
   the TCGA-CESC samples based on 22 known immune cell subtype gene
   expression profiles. This analysis highlighted significant differences
   in immune cell infiltration among the CESC samples. A heatmap was
   created to illustrate significant correlations among these immune cells
   (P < 0.05).

2.14. Receiver Operating Characteristic(ROC) curve analysis

   To assess the diagnostic utility of hub gene expression, ROC curves
   were plotted for the Normal and CESC groups. We evaluated the area
   under the curve (AUC) values, which indicate the sensitivity and
   specificity of the models; higher AUC values suggest better diagnostic
   performance.

2.15. Statistical analysis

   All data processing and analyses in this study were performed using R
   software (Version 4.1.2). For assessing statistical significance
   between two continuous variables, we applied the independent Student
   t-test for normally distributed data and the Mann-Whitney U test (also
   known as the Wilcoxon rank-sum test) for non-normally distributed data.
   To evaluate differences between two categorical variables, we utilized
   either the Chi-square test or Fisher's exact test, depending on the
   data's distribution. Correlations among various molecules were
   determined using Spearman's correlation analysis, unless otherwise
   specified. We reported all statistical P-values as two-sided, adhering
   to the significance threshold of P < 0.05.

3. Results

3.1. Figure legends

   To meticulously elucidate our methodology for investigating the
   biological characteristics of cervical cancer using bioinformatics
   techniques, we have developed a detailed analysis flowchart
   ([83]Supplementary Fig. 1).

3.2. Merging of GEO datasets

   Initially, we processed the cervical cancer datasets [84]GSE44001,
   [85]GSE63514, and [86]GSE7803 to remove batch effects, resulting in the
   merged dataset named GEO dataset. We then compared the datasets before
   and after batch effect removal using distribution boxplots and
   Principal Component Analysis (PCA) graphs ([87]Supplementary Fig. 2).
   The results from the distribution boxplots and PCA graphs indicate that
   the batch effects in the samples of the GEO dataset were effectively
   eliminated after the batch removal process.

3.3. Analysis of the DEGs in cervical cancer

   We performed a differential expression analysis on the TCGA-CESC
   dataset using the limma package to identify DEGs between the normal and
   CESC groups. Our analysis revealed that 5083 genes met the criteria of
   absolute log fold change (|logFC|) greater than 1 and P-value less than
   0.05. Specifically, we identified 2167 genes as significantly
   upregulated (logFC >1) and 2916 genes as significantly downregulated
   (logFC < −1) in the CESC samples compared to normal tissue. We
   visualized these changes using a volcano plot as depicted in [88]Fig.
   1A. Furthermore, to identify ARDEGs, we intersected the DEGs and ARGs
   that also met the criteria of |logFC|>0 and P < 0.05 within the
   TCGA-CESC dataset, resulting in 202 ARDEGs. These intersections are
   represented in a Venn diagram ([89]Fig. 1B). Additionally, we employed
   the heatmap function from the R package ‘heatmap’ to illustrate the
   expression differences between the normal and CESC groups for these 202
   ARDEGs. To further refine our list of ARDEGs, we intersected these 202
   ARDEGs with 1942 prognostic molecules identified in cervical cancer,
   ultimately identifying 30 significant ARDEGs. This intersection is also
   illustrated in a Venn diagram ([90]Fig. 1C).

Fig. 1.

   [91]Fig. 1
   [92]Open in a new tab

   Differential Gene Expression Analysis in Cervical Cancer (A) Volcano
   plot illustrating the distribution of DEGs within the TCGA-CESC
   dataset. (B) Venn diagram depicting the intersections between ARGs and
   DEGs identified within the TCGA-CESC dataset. (C) Venn diagram of DEGs
   and prognostic molecular ARGs in the TCGA-CESC dataset.

3.4. GO and KEGG Pathway Enrichment Analysis of ARDEGs

   To elucidate the biological processes (BP), molecular functions (MF),
   cellular components (CC), and pathway associations of the 30 ARDEGs in
   cervical cancer, comprehensive GO ([93]Table 2) and KEGG ([94]Table 3)
   enrichment analyses were conducted. We applied stringent criteria for
   these analyses, selecting enrichment entries with both P-values and
   q-values less than 0.05. The results were visually represented through
   bubble charts ([95]Fig. 2A–B), circular network diagrams ([96]Fig.
   2C–D), and histograms ([97]Fig. 2E–F).

Table 2.

   GO enrichment analysis results of ARDEGs.
   ONTOLOGY ID Description GeneRatio BgRatio pvalue p.adjust qvalue
   BP GO:0045765 regulation of angiogenesis 10/29 383/18670 1.66e-10
   2.33e-07 1.33e-07
   BP GO:0043062 extracellular structure organization 10/29 422/18670
   4.27e-10 2.33e-07 1.33e-07
   BP GO:1901342 regulation of vasculature development 10/29 422/18670
   4.27e-10 2.33e-07 1.33e-07
   BP GO:0030198 extracellular matrix organization 9/29 368/18670 2.88e-09
   1.18e-06 6.70e-07
   BP GO:0048146 positive regulation of fibroblast proliferation 5/29
   51/18670 1.41e-08 4.59e-06 2.62e-06
   CC GO:0034774 secretory granule lumen 5/30 321/19717 1.13e-04 0.004
   0.002
   CC GO:0001726 ruffle 4/30 172/19717 1.28e-04 0.004 0.002
   CC GO:0060205 cytoplasmic vesicle lumen 5/30 338/19717 1.44e-04 0.004
   0.002
   CC GO:0031983 vesicle lumen 5/30 339/19717 1.46e-04 0.004 0.002
   CC GO:0005925 focal adhesion 5/30 405/19717 3.33e-04 0.005 0.003
   MF GO:0048018 receptor ligand activity 8/29 482/17697 7.41e-07 7.11e-05
   3.69e-05
   MF GO:0050839 cell adhesion molecule binding 8/29 499/17697 9.62e-07
   7.11e-05 3.69e-05
   MF GO:0005161 platelet-derived growth factor receptor binding 3/29
   15/17697 1.78e-06 7.11e-05 3.69e-05
   MF GO:0005178 integrin binding 5/29 132/17697 2.20e-06 7.11e-05
   3.69e-05
   MF GO:0070851 growth factor receptor binding 5/29 134/17697 2.37e-06
   7.11e-05 3.69e-05
   [98]Open in a new tab

   Abbreviations: ARDEGs, Angiogenesis-related differentially expressed
   genes; GO: Gene Ontology; BP, biological process; CC, cellular
   component; MF, molecular function.

Table 3.

   KEGG enrichment analysis results of ARDEGs.
   ONTOLOGY ID Description GeneRatio BgRatio pvalue p.adjust qvalue
   KEGG hsa04510 Focal adhesion 8/25 201/8076 9.63e-08 1.38e-05 8.42e-06
   KEGG hsa05323 Rheumatoid arthritis 6/25 93/8076 2.94e-07 2.10e-05
   1.29e-05
   KEGG hsa04512 ECM-receptor interaction 5/25 88/8076 6.13e-06 2.02e-04
   1.24e-04
   KEGG hsa 05219 Bladder cancer 4/25 41/8076 6.70e-06 2.02e-04 1.24e-04
   KEGG hsa04151 PI3K-Akt signaling pathway 8/25 354/8076 7.07e-06
   2.02e-04 1.24e-04
   [99]Open in a new tab

   Abbreviations: ARDEGs, Angiogenesis-related differentially expressed
   genes; KEGG, Kyoto Encyclopedia of Genes and Genomes.

Fig. 2.

   [100]Fig. 2
   [101]Open in a new tab

   GO and KEGG Pathway Enrichment Analysis of ARDEGs (A) GO analysis of
   ARDEGs and (B) KEGG analysis of ARDEGs. Results for the bubble diagram
   display. (C) Circular network diagram display of ARDEGs GO analysis and
   (D) KEGG pathway analysis results. (E) ARDEGs bar graph of GO analysis
   and (F) KEGG pathway analysis results. The screening criteria for the
   GO and KEGG enrichment entries were P < 0.05 and q < 0.05.

   In terms of BP, the ARDEGs predominantly contributed to the regulation
   of the JAK-STAT signaling cascade (GO:0046425), Notch signaling in
   cardiac development (GO:0061314), cellular response to hypoxia
   (GO:0001666), and the positive regulation of apoptosis execution phase
   (GO:1900119). For CC, notable enrichments included the secretory
   granule lumen (GO:0034774), ruffles (GO:0001726), ruffle membrane
   (GO:0032587), and lamellipodium membrane (GO:0031258).

   MF analysis revealed significant enrichment in Notch binding
   (GO:0005112), receptor-ligand activity (GO:0048018), platelet-derived
   growth factor receptor binding (GO:0005161), and growth factor receptor
   binding (GO:0070851). Additionally, KEGG pathway analysis identified
   significant enrichments in critical signaling pathways including the
   PI3K-Akt (hsa04151), MAPK (hsa 04010), and focal adhesion pathways
   (hsa04510), as well as those related to rheumatoid arthritis
   (hsa05323), ECM-receptor interactions (hsa04512), bladder cancer (hsa
   05219), IL-17 signaling (hsa 04657), and melanoma(hsa0521).

3.5. GSEA and GSVA enrichment analysis of the TCGA-CESC dataset

   We applied GSEA to evaluate the associations between gene expression
   profiles and biological processes, cellular components, and molecular
   pathways in the Normal and CESC groups within the TCGA-CESC dataset.
   Our analysis revealed significant enrichment of gene expression in key
   oncogenic pathways including PI3K-AKT, WNT, NOTCH, and Hedgehog among
   others, as depicted in [102]Fig. 3A–E and summarized in [103]Table 4.

Fig. 3.

   [104]Fig. 3
   [105]Open in a new tab

   GSEA and GSVA Enrichment Analysis of the TCGA-CESC dataset (A) GSEA for
   the four principal biological pathways. (B-E) Marked enrichment
   observed in the PI3K-AKT, WNT, NOTCH, and Hedgehog pathways,
   respectively. (F) GSVA analysis differentiating the Normal and CESC
   groups within the dataset. Blue denotes samples from normal cervical
   tissue (Normal group), while red indicates samples from cervical cancer
   patients (CESC group). The screening criteria for significant
   enrichment for GSEA and GSVA were P < 0.05 and q < 0.25.

Table 4.

   GSEA analysis of TCGA-CESC.
   Description setSize enrichmentScore NES pvalue p.adjust qvalues
   WP_PI3KAKT_SIGNALING_PATHWAY 339 −0.421338455 −1.583611504 0.002717391
   0.028535354 0.02189818
   KEGG_WNT_SIGNALING_PATHWAY 150 −0.382701026 −1.339565238 0.03652968
   0.145911047 0.111972902
   REACTOME_SIGNALING_BY_NOTCH 234 0.367222844 1.453964692 0.003367003
   0.029921885 0.022962212
   WP_HEDGEHOG_SIGNALING_PATHWAY 43 −0.650532216 −1.890314928 0.001694915
   0.028535354 0.02189818
   NABA_CORE_MATRISOME 274 −0.63471233 −2.343163134 0.00140056 0.028535354
   0.02189818
   REACTOME_PLATELET_ACTIVATION_SIGNALING_AND_AGGREGATION 260 −0.483379036
   −1.774962916 0.001402525 0.028535354 0.02189818
   WP_FOCAL_ADHESION 198 −0.473607215 −1.698359252 0.001453488 0.028535354
   0.02189818
   KEGG_FOCAL_ADHESION 199 −0.520045063 −1.865867367 0.001457726
   0.028535354 0.02189818
   NABA_ECM_GLYCOPROTEINS 195 −0.656522896 −2.349950105 0.001466276
   0.028535354 0.02189818
   REACTOME_MUSCLE_CONTRACTION 205 −0.607228932 −2.179930265 0.001466276
   0.028535354 0.02189818
   KEGG_CALCIUM_SIGNALING_PATHWAY 177 −0.484610146 −1.725678237
   0.001485884 0.028535354 0.02189818
   WP_MYOMETRIAL_RELAXATION_AND_CONTRACTION_PATHWAYS 156 −0.509049892
   −1.787618711 0.001519757 0.028535354 0.02189818
   REACTOME_DISEASES_OF_GLYCOSYLATION 142 −0.49027598 −1.701442089
   0.001552795 0.028535354 0.02189818
   REACTOME_CARDIAC_CONDUCTION 137 −0.588974012 −2.03917596 0.001557632
   0.028535354 0.02189818
   REACTOME_RESPONSE_TO_ELEVATED_PLATELET_CYTOSOLIC_CA2_ 131 −0.557798893
   −1.920741604 0.0015625 0.028535354 0.02189818
   [106]Open in a new tab

   Abbreviations: GSEA, Gene Set Enrichment Analysis.

   To further differentiate the molecular profiles between the Normal and
   CESC groups, we conducted GSVA of the TCGA-CESC dataset ([107]Fig. 3F).
   This analysis highlighted significant differences in the enrichment of
   the Hedgehog pathway and several other gene sets, illustrating distinct
   gene expression patterns between CESC and normal cervical tissues.

3.6. Development of an ARDEGs diagnostic model using LASSO regression
analysis

   To assess the diagnostic potential of 30 ARDEGs within the TCGA-CESC
   dataset, we employed LASSO regression analysis to develop an ARDEGs
   diagnostic model, as depicted in [108]Fig. 4A. The model's risk factors
   were categorized using a novel risk factor graph ([109]Fig. 4B), which
   is split into two sections: risk grouping (determined through the Cox
   regression prognostic model and stratified by the median) and survival
   outcomes (illustrated using a dot plot of survival time and outcomes
   from TCGA-CESC clinical samples).

Fig. 4.

   [110]Fig. 4
   [111]Open in a new tab

   Development of an ARDEGs Diagnostic Model (A) Diagram of the LASSO
   regression diagnostic model for ARDEGs within the TCGA-CESC dataset.
   (B) Risk factor visualization illustrating the ARDEGs in the diagnostic
   model. (C) Trajectory diagram of variables within the LASSO regression
   for the ARDEGs diagnostic model. (D) Comparative diagram depicting
   group classifications in the ARDEGs diagnostic model. (E) Lollipop
   plots representing ARDEGs and their implications in the diagnostic
   model.

   We identified 15 key ARDEGs: BAIAP2L1, FGFR3, NRP1, E2F1, CA9, SPP1,
   DLL4, VAV3, ITGA5, PTX3, EMCN, NDRG2, EFNA1, CXCL8, and JUN. Enhancing
   the LASSO penalty's absolute value (lambda × slope) improved the
   model's generalizability and mitigated overfitting, as demonstrated by
   the LASSO variable trajectory map ([112]Fig. 4C). This map illustrated
   how gene inclusion varied inversely with the lambda coefficient of the
   LASSO penalty term.

   Further analysis revealed significant differences in the expression
   levels of the ARDEGs diagnostic model between high- and low-scoring
   patient groups diagnosed with cervical cancer (P < 0.001), as shown in
   [113]Fig. 4D. Additionally, we constructed a correlation Laplace
   diagram to further elucidate the relationship between ARDEG expression
   and the risk scores derived from the diagnostic model ([114]Fig. 4E).

   We stratified the prediction scores from 15 ARDEGs in TCGA-CESC samples
   into High/Low groups, and assessed expression variability using the
   Wilcoxon signed-rank test, as shown in [115]Fig. 5A. Our findings
   indicated significant differences in the expression of ARDEGs including
   NRP1, E2F1, CA9, SPP1, DLL4, ITGA5, PTX3, NDRG2, EFNA1, CXCL8, and JUN
   between the groups (P < 0.001).

Fig. 5.

   [116]Fig. 5
   [117]Open in a new tab

   Expression of ARDEGs in the TCGA-CESC Dataset (A) Comparative analysis
   of ARDEGs between different groups (CESC vs. Normal) in the TCGA-CESC
   diagnostic model. Blue denotes samples from the normal control group,
   whereas red indicates samples from the cervical cancer (CESC) group.
   (B) Differential expression of ARDEGs across risk categories (high vs.
   low) within the TCGA-CESC dataset. Blue signifies the low-risk
   expression group, and red denotes the high-risk expression group within
   the disease cohort. (C) Heatmap illustrating the correlation of ARDEGs
   between the groups (CESC vs. Normal) in the TCGA-CESC dataset. (D)
   Circular plots depicting the chromosomal distribution of ARDEGs. NS,
   non-significant. *P < 0.05, **P < 0.01, ***P < 0.001.

   Subsequent differential expression analysis demonstrated notable
   variance in the levels of these 15 ARDEGs between normal and CESC
   tissues within the TCGA-CESC cohort ([118]Fig. 5B). Specifically, genes
   such as BAIAP2L1, NRP1, E2F1, DLL4, and EMCN exhibited substantial
   expression differences (P < 0.01), while FGFR3, CA9, SPP1, VAV3, ITGA5,
   PTX3, NDRG2, and EFNA1 also showed significant differences (P < 0.05).

   Further analysis explored the interrelationships among the ARDEGs
   within the TCGA-CESC dataset, revealing statistically significant
   correlations, particularly between BAIAP2L1 and DLL4, ITGA5, PTX3
   (P < 0.01) and between E2F1 and NDRG2 (P < 0.05), as illustrated in
   [119]Fig. 5C. Additionally, we mapped the chromosomal locations of
   these ARDEGs on a circular chromosome diagram ([120]Fig. 5D),
   highlighting that JUN, VAV3, and EFNA1 are located on chromosome 1,
   with BAIAP2L1 on chromosome 7.

   We generated ROC curves for 15 ARDEGs across normal and CESC samples
   from the TCGA-CESC dataset ([121]Supplementary Fig. 3). The diagnostic
   accuracy of these ARDEGs was notably high in distinguishing between
   normal and CESC conditions. Specifically, BAIAP2L1 demonstrated an Area
   Under the Curve (AUC) of 0.999, followed by DLL4 with an AUC of 0.966,
   NRP1 with an AUC of 0.938, FGFR3 with an AUC of 0.931, and CA9 with an
   AUC of 0.923. Similarly, VAV3 (AUC = 0.919), EFNA1 (AUC = 0.906), and
   PTX3 (AUC = 0.902) also exhibited high diagnostic performance. NDRG2
   (AUC = 0.899), SPP1 (AUC = 0.871), ITGA5 (AUC = 0.841), JUN
   (AUC = 0.780), and CXCL8 (AUC = 0.770) demonstrated moderate diagnostic
   accuracy.

3.7. Prognostic Clinical Manifestations of ARDEGs in cervical cancer

   To elucidate the association between the expression of the 15 ARDEGs
   and the onset of cervical cancer, we conducted univariate and
   multivariate Cox regression analyses using clinical data from the
   TCGA-CESC dataset. The initial statistical analysis of these data is
   detailed in [122]Table 5.

Table 5.

   Univariate and Multivariate Cox regression.
   Characteristics Total(N) Univariate analysis
     __________________________________________________________________

   Multivariate analysis
     __________________________________________________________________

   Hazard ratio (95 % CI) P value Hazard ratio (95 % CI) P value
   BAIAP2L1 306
   Low 153 Reference
   High 153 1.835 (1.134–2.969) 0.013 2.228 (1.314–3.777) 0.003
   FGFR3 306
   Low 153 Reference
   High 153 0.574 (0.358–0.922) 0.022 0.716 (0.411–1.246) 0.237
   NRP1 306
   Low 153 Reference
   High 153 1.791 (1.113–2.882) 0.016 1.688 (0.945–3.013) 0.077
   E2F1 306
   Low 153 Reference
   High 153 0.581 (0.362–0.931) 0.024 0.678 (0.402–1.145) 0.146
   CA9 306
   Low 153 Reference
   High 153 1.547 (0.969–2.469) 0.068 1.009 (0.591–1.725) 0.972
   SPP1 306
   Low 153 Reference
   High 153 1.787 (1.104–2.892) 0.018 1.330 (0.776–2.281) 0.299
   DLL4 306
   Low 153 Reference
   High 153 1.816 (1.131–2.915) 0.014 1.226 (0.713–2.109) 0.461
   VAV3 306
   Low 153 Reference
   High 153 0.574 (0.358–0.920) 0.021 0.711 (0.412–1.227) 0.221
   ITGA5 306
   Low 153 Reference
   High 153 2.374 (1.458–3.866) <0.001 1.298 (0.715–2.353) 0.391
   PTX3 306
   Low 153 Reference
   High 153 1.984 (1.226–3.212) 0.005 1.453 (0.830–2.544) 0.191
   EMCN 306
   Low 153 Reference
   High 153 0.587 (0.365–0.943) 0.028 0.555 (0.321–0.959) 0.035
   NDRG2 306
   Low 153 Reference
   High 153 0.577 (0.361–0.923) 0.022 0.577 (0.327–1.019) 0.058
   EFNA1 306
   Low 153 Reference
   High 153 1.937 (1.207–3.107) 0.006 1.890 (1.123–3.182) 0.017
   CXCL8 306
   Low 153 Reference
   High 153 2.426 (1.497–3.934) <0.001 1.725 (1.012–2.941) 0.045
   JUN 306
   Low 153 Reference
   High 153 1.743 (1.092–2.783) 0.020 1.481 (0.868–2.526) 0.149
   [123]Open in a new tab

   Abbreviations: CI, confidence interval.

   The findings from the Cox regression analyses are illustrated through
   forest plots ([124]Fig. 6A). Additionally, we developed a nomogram to
   ascertain the prognostic significance of these models, as depicted in
   [125]Fig. 6B. We further executed prognostic calibration analyses for
   1-, 3-, and 5-year intervals based on the Cox regression variables,
   presenting the results in calibration curves ([126]Fig. 6C–E). These
   curves plot survival probabilities from actual data against predictions
   along the horizontal axis, with the model's accuracy at various
   forecast intervals denoted by lines and points in diverse colors. The
   proximity of these lines to the gray ideal line reflects the
   prediction's precision.

Fig. 6.

   [127]Fig. 6
   [128]Open in a new tab

   Prognostic Clinical Manifestations of ARDEGs in Cervical Cancer (A)
   Univariate regression analysis is visually summarized in ARDEG forest
   plots. (B) The corresponding nomograms provide a detailed graphical
   representation. (C-E) Calibration curves for the Cox regression
   prognostic model are shown for 1-year (C), 3-year (D), and 5-year (E)
   time points, illustrating model accuracy over these periods. (F–H) DCA
   for the Cox regression prognostic model is presented for 1-year (F),
   3-year (G), and 5-year (H) intervals, highlighting the clinical utility
   of the model.

   Moreover, we utilized Decision Curve Analysis (DCA) to assess the
   clinical utility of the Cox regression prognostic model over 1-year
   ([129]Fig. 6F), 3-year ([130]Fig. 6G), and 5-year ([131]Fig. 6H)
   horizons. The DCA plots demonstrate the net benefits across various
   threshold probabilities, with the model's effectiveness indicated by a
   broader x-value range, suggesting superiority over scenarios with
   all-positive or all-negative assumptions.

3.8. Prognostic performance assessment of ARDEGs

   Kaplan-Meier (KM) survival curves were constructed for the fifteen
   autophagy-related differentially expressed genes (ARDEGs) exhibiting
   statistical significance (P < 0.05) ([132]Fig. 7A-N). Notably, elevated
   expression levels of pivotal genes including BAIAP2L1, NRP1, SPP1,
   DLL4, ITGA5, PTX3, EFNA1, and CXCL8 were significantly correlated with
   diminished overall survival (OS) rates. These findings underscore the
   prognostic relevance of ARDEGs in cervical squamous cell carcinoma
   (CESC) and highlight their potential as prognostic biomarkers in
   clinical assessments.

Fig. 7.

   [133]Fig. 7
   [134]Open in a new tab

   Kaplan-Meier Curves Illustrating Overall Survival Associated with
   ARDEGs (A–N) The blue represents the low expression (Low-expression
   group), while the red represents the cervical cancer patient sample
   (High-expression group). OS, Overall survival. KM curve, Kaplan-Meier
   curve. *P < 0.05, **P < 0.01, ***P < 0.001.

3.9. Mutation analysis of ARDEGs in CESC patients

   To systematically analyze the somatic mutations in 15 ARDEGs (BAIAP2L1,
   FGFR3, NRP1, E2F1, CA9, SPP1, DLL4, VAV3, ITGA5, PTX3, EMCN, NDRG2,
   EFNA1, CXCL8, JUN) associated with CESC, we conducted a mutation
   analysis on CESC patient samples from the TCGA-CESC dataset using the R
   package maftools. Our findings reveal that the predominant mutations in
   the dataset were missense mutations, which constituted the majority of
   alterations. Additionally, the mutation spectrum of the 15 ARDEGs in
   these patients predominantly consisted of single nucleotide
   polymorphisms (SNPs), with fewer instances of insertions and deletions.
   Notably, the most frequent single nucleotide variant (SNV) among CESC
   patients was C > T, followed by C > G. Furthermore, we investigated the
   copy number variations (CNVs) in these 15 ARDEGs within the same
   cohort. By downloading and processing CNV data and applying GISTIC 2.0
   for analysis, we visualized our results, which identified significant
   amplifications and deletions in the ARDEGs. Notably, PTX3, EFNA1, and
   E2F1 exhibited the highest amplification frequencies, whereas FGFR3,
   SPP1, and EMCN were most frequently deleted ([135]Supplementary Fig.
   4).

   Comprehensive Analysis of Immune Cell Infiltration Dynamics in Cervical
   Cancer via CIBERSORT and ssGSEA.

   In this study, we leveraged the expression profiles of 15 ARDEGs in
   cervical cancer samples to compute Activity Scores (As) for each
   patient using the ssGSEA algorithm. Following this, the CIBERSORT
   algorithm was applied to delineate the landscape of immune cell
   infiltration based on the expression data, stratifying the patients
   into high and low As groups ([136]Fig. 8A).

Fig. 8.

   [137]Fig. 8
   [138]Open in a new tab

   Comprehensive Analysis of Immune Infiltration in Cervical Cancer
   Utilizing CIBERSORT (A) The histogram illustrates the immune
   infiltration levels for 22 distinct types of immune cells, highlighting
   their distribution and prevalence. (B) The comparison chart delineates
   the relative abundance of these immune cells, providing a clear visual
   representation of their infiltration levels. (C) The heat map offers a
   detailed correlation analysis between aberrantly regulated
   differentially expressed genes (ARDEGs) and immune cell expressions.
   Positive correlations, depicted as red circles, indicate a direct
   relationship between gene expression and immune cell infiltration, with
   the intensity of the association reflected by the size of the circles.
   Conversely, blue circles signify negative correlations, with the circle
   size proportionate to the strength of the inverse relationship.

   Detailed immunoprofiling revealed distinct differences in the
   infiltration patterns of specific immune cells. Statistically
   significant variances were observed particularly in naive B cells,
   angiogenesis-associated macrophages (M0), activated mast cells, and
   resting mast cells, with p-values <0.001. Additionally, notable
   disparities were also found in the expression levels of CD8^+ T cells
   and angiogenic M1 macrophages between the high and low As groups, with
   p-values of <0.01 and <0.05 respectively ([139]Fig. 8B).

   To further elucidate the interaction between ARDEGs and immune cell
   infiltration, correlation heat maps were utilized, demonstrating a
   pronounced association between ITGA5 expression and the prevalence of
   activated mast cells in both high and low As groups ([140]Fig. 8C).
   These findings highlight the critical interplay between specific ARDEGs
   and immune cell populations, potentially offering new insights into the
   mechanisms driving immune evasion and tumor progression in cervical
   cancer.

3.10. Analysis of immunotherapy outcomes based on an angiogenesis risk
scoring model

   To further analyze the predictive capability of the angiogenesis risk
   scoring model for immunotherapy, we downloaded cervical cancer-related
   angiogenesis scores (Angiogenesis score, As) from the TCIA database. We
   employed the ggplot function in R to create boxplots depicting the
   differences in As among different patient risk groups within the
   TCGA-CESC dataset. The results showed significant disparities between
   the high-risk and low-risk groups across four types of Immune Profiles
   Scores (IPS). These include IPS, IPS-PD1/PD-L1/PD-L2, IPS-CTLA4, and
   IPS-PD1/PD-L1/PD-L2 + CTLA4 ([141]Supplementary Fig. 5), with all four
   IPS being significantly higher in the high-risk group (p < 0.05). This
   indicates that the risk score can effectively predict the prognosis of
   immunotherapy.

3.11. Clinical Correlation Analysis of Prognostic ARDEGs

   We conducted a detailed examination of the expression levels of
   prognostic ARDEGs to ascertain their correlation with clinical outcomes
   in CESC) patients. This study analyzed the association between the
   expression profiles of selected predictive ARDEGs and various
   clinicopathological features linked to distinct prognoses. We also
   explored how these expression patterns influenced key prognostic
   indicators such as tumor progression-free interval (PFI),
   disease-specific survival (DSS), and overall survival (OS) in cervical
   cancer tissues.

   Our findings revealed that higher expression levels of specific
   prognostic ARDEGs—including BAIAP2L1, NRP1, E2F1, CA9, SPP1, DLL4,
   ITGA5, PTX3, EMCN, and CXCL8—were significantly correlated with
   improved OS in patients (P < 0.05) as shown in [142]Fig. 9A. Similarly,
   significant associations were observed between the expression of these
   genes and DSS (P < 0.05), presented in [143]Fig. 9B. Furthermore, the
   expression profiles of BAIAP2L1, NRP1, SPP1, DLL4, ITGA5, and PTX3 were
   significantly linked with extended PFI (P < 0.05), depicted in
   [144]Fig. 9C.

Fig. 9.

   [145]Fig. 9
   [146]Open in a new tab

   Clinical Correlation Analysis of Prognostic ARDEGs Prognosis of ARDEGs
   and (A) OS, (B) DSS, (C) PFI. NS, non-significant. *P < 0.05,
   **P < 0.01, ***P < 0.001.

   These results underscore the potential of ARDEGs as biomarkers for
   prognostic evaluation in cervical cancer, suggesting that their
   differential expression may serve as a valuable predictor of patient
   outcomes.

3.12. Analysis of Hub Gene Expression in Normal and CESC datasets

   We evaluated the expression profiles of 15 ARDEGs across normal and
   CESC samples within GEO datasets. Our analysis identified significant
   differential expressions: FGFR3, E2F1, CA9, SPP1, and NDRG2 displayed
   marked differences (P < 0.001), while ITGA5 showed substantial variance
   (P < 0.01). Additionally, the expressions of NRP1 and EFNA1 were
   significantly different (P < 0.05) ([147]Fig. 10A).

Fig. 10.

   [148]Fig. 10
   [149]Open in a new tab

   Analysis of Hub Gene Expression in Normal and CESC Datasets (A)
   Comparative analysis of ARDEGs expression in CESC versus normal tissue
   samples from the GEO dataset. (B) Heatmap illustrating the correlation
   of ARDEGs across the different groups (CESC/Normal) within the GEO
   dataset. (C-J) ROC curve analysis of significantly differentially
   expressed genes in the GEO dataset, including: (C) E2F1, (D) NDRG2, (E)
   SPP1, (F) FGFR3, (G) CA9, (H) ITGA5, (I) EFNA1, and (J) NRP1. NS,
   non-significant (P ≥ 0.05). *P < 0.05, **P < 0.01, ***P < 0.001.

   A correlation heatmap was constructed to investigate the relationships
   among the ARDEGs in the GEO dataset ([150]Fig. 10B). Notably, FGFR3
   showed significant correlations with VAV3, NDRG2, and EFNA1 (P < 0.01),
   as well as with ITGA5, EMCN, and JUN (P < 0.05) in the TCGA-CESC
   dataset.

   Further, ROC curve analysis affirmed the diagnostic potential of the
   differential expressions among these genes. Specifically, E2F1
   (AUC = 0.854, [151]Fig. 10C), NDRG2 (AUC = 0.757, [152]Fig. 10D), SPP1
   (AUC = 0.738, [153]Fig. 10E), and FGFR3 (AUC = 0.728, [154]Fig. 10F)
   demonstrated diagnostic relevance in distinguishing between normal and
   CESC samples in the GEO dataset. Meanwhile, CA9 (AUC = 0.685, [155]Fig.
   10G), ITGA5 (AUC = 0.650, [156]Fig. 10H), EFNA1 (AUC = 0.625, [157]Fig.
   10I), and NRP1 (AUC = 0.611, [158]Fig. 10J) exhibited lower diagnostic
   values in the TCGA-CESC dataset.

4. Discussion

   Cervical cancer ranks as the fourth most common cause of cancer-related
   mortality among women worldwide and is a significant oncological
   challenge. Recent advancements in novel treatment modalities, such as
   immunotherapy, have notably enhanced survival rates. However, the
   prognosis for patients with recurrent or metastatic cervical cancer
   remains dismal. The tumor microenvironment (TME) plays a critical role
   in tumorigenesis and significantly influences the efficacy of
   immunotherapy. A pivotal feature of the TME is its capacity for
   angiogenesis, initiated by angiogenic factors released by tumor cells
   [[159]27]. Furthermore, angiogenesis induces tumors to evade immune
   monitoring by shaping the immunosuppressive microenvironment.
   Angiogenic factors can also activate immune cells that suppress the
   immune system, such as tumor-associated macrophages and T-regulatory
   cells, or suppress immune effectors and antigen-presenting cells. These
   suppressive immune cells may promote angiogenesis, resulting in a
   malignant pattern of reduced immune activity [[160]28]. These findings
   suggest a link between angiogenesis and innate immunity. Accumulating
   evidence indicates that targeting angiogenesis and immunotherapy are
   potential therapeutic strategies for cervical cancer [[161]29].

   Previous reports have focused exclusively on single ARGs or specific
   immune cell subtypes, without considering their associations.
   Therefore, we focused on holistically understanding the relationship
   between ARGs and TME by Bioinformatics method.

   To the best of our knowledge, this study represents the inaugural
   exploration of ARDEGs and their influence on prognostic clinical
   outcomes and immunological responses in cervical cancer. We conducted
   comprehensive analyses using GO, KEGG, and GSEA to elucidate the
   functional roles of ARDEGs in this disease context. Our functional
   enrichment analysis of ARDEGs in CESC revealed significant involvement
   in biological processes and signaling pathways that are pivotal for
   cancer progression. Notably, the Notch signaling pathways, which were
   enriched among the ARDEGs, are known for their roles in cell
   proliferation, differentiation, and apoptosis [[162]30]. The Notch
   pathway is crucial for cell fate determination and has been shown to
   contribute to tumorigenesis through its influence on angiogenesis and
   the maintenance of cancer stem cells [[163]31]. Similarly, The PI3K-Akt
   signaling pathway, another enriched pathway, is a critical mediator of
   cell survival and proliferation, and its aberrant activation is a
   hallmark of many cancers [[164]32]. In conclusion, The significant
   correlation of these ARDEGs with patient prognosis further emphasizes
   their clinical relevance and potential utility as biomarkers for
   diagnosis and prognostication in CESC.

   Therefore, we investigated ARDEGs in cervical cancer, identifying key
   biomarkers—NRP1, E2F1, CA9, SPP1, DLL4, ITGA5, PTX3, NDRG2, EFNA1,
   BAIAP2L1, and EMCN—that are crucial for understanding the disease's
   aggressive clinical traits and immune evasion. We employed LASSO
   regression analysis to construct a diagnostic model that segregates
   patients into high and low angiogenesis subgroups based on their risk
   scores. The expression profiles of these ARDEGs significantly differed
   between the subgroups and compared to the CESC groups. Further, our ROC
   analysis confirmed that the hub genes' expression could diagnose
   cervical cancer with high specificity and sensitivity. To evaluate the
   prognostic significance of these biomarkers, we developed a risk
   prognostic model, constructed a comprehensive nomogram, and performed
   decision curve analysis to confirm the clinical utility of this
   prognostic approach. The calibration curves at 1, 3, and 5 years
   demonstrated strong agreement between the predicted and actual survival
   rates, underscoring the nomogram's discriminatory and calibration
   capacities. This study not only delineates the diagnostic potential of
   ARDEGs in cervical cancer but also highlights the added clinical value
   of our model in enhancing patient outcomes through tailored therapeutic
   strategies.

   We further examined the impact of hub gene expression levels on OS,
   DSS, and PFI in cervical cancer. We discovered significant associations
   between the expression levels of the hub genes BAIAP2L1, NRP1, SPP1,
   DLL4, ITGA5, and PTX3 with OS, DSS, and PFI (P < 0.05). Notably,
   elevated expression of these genes serves as a robust predictive
   biomarker for reduced OS, DSS, and PFI in cervical cancer patients.
   BAIAP2L1 is notably overexpressed in various malignancies including
   breast cancer [[165]33], ovarian cancer [[166]34], gastric cancer
   [[167]35], lung cancer [[168]36], and hepatocellular cancer [[169]37],
   where its higher expression correlates with advanced disease stages and
   metastasis [[170]33,[171]34,[172]38,[173]39]. Similarly, NRP1, which is
   abundantly expressed across a range of tumors such as glioma [[174]40],
   bladder cancer [[175]41] and cervical cancer [[176]42], is linked with
   unfavorable outcomes. These findings align with prior research,
   reinforcing the critical role of NRP1 in processes such as embryonic
   angiogenesis and neurogenesis—mechanisms essential for cell migration,
   proliferation, survival, and apoptosis [[177][43], [178][44],
   [179][45]]. Furthermore, recent studies underscore the importance of
   NRP1 in modulating immunoregulatory receptors, noting its upregulation
   in tumor-associated Tregs, which impacts the efficacy and stability of
   the immune response [[180]46]. NRP1 plays a pivotal role in inhibiting
   the immunological memory of CD8+T cells during anti-tumor responses,
   while simultaneously enhancing Treg function. The binding of NRP1 to
   semaphorin-3A suppresses the motility and the tumor-targeted
   destructive capabilities of cytotoxic T lymphocytes [[181]47].
   Frequently, NRP1 is linked with a wide array of inhibitory receptors
   including CTLA-4, Tim-3, and LAG-3, defining a subset of dysfunctional
   PD-1^hi CD8^+ tumor-infiltrating lymphocytes. Inhibiting NRP1 can boost
   the efficacy of anti-PD-1 therapies and suppress tumor angiogenesis,
   thus amplifying the overall anti-tumor immune response [[182]48].
   Additionally, SPP1, a secreted phosphoprotein, targets and modulates
   matrix metalloproteinases within cancerous tissues. Its fundamental
   biological functions involve regulating cell growth, proliferation,
   migration, apoptosis, and immune responses. Elevated SPP1 expression in
   cervical cancer correlates with increased disease occurrence and
   progression by modulating immune cell infiltration levels, reinforcing
   prior findings [[183]49]. DLL4, a principal Notch ligand, is critically
   involved in tumor angiogenesis, activation of cancer stem cells, tumor
   progression, and metastasis [[184][50], [185][51], [186][52]].
   Similarly, ITGA5, a fibronectin receptor prevalent across various
   malignancies, is essential for tumor development, metastasis, and drug
   resistance. It promotes tumor growth through activation of the FAK/AKT
   signaling pathway, consistent with our observations [[187]53,[188]54].
   ITGA5 also enhances angiogenesis and correlates with adverse outcomes
   in cervical cancer patients [[189]55]. Furthermore, PTX3, a soluble
   inflammatory mediator in TME and an innate immunomodulator, is
   associated with immune escape and plays a significant role in apoptosis
   and inflammation regulation [[190]56,[191]57]. Recent research has
   established that in colorectal cancer, PTX3 significantly enhances
   protumor immunity by promoting M2-like macrophage polarization, an
   effect mediated by stromal cells [[192]58].

   The malignant properties of tumor cells are closely linked with immune
   cell infiltration, mediated by central hub genes. Utilizing the
   CIBERSORT algorithm, we analyzed the correlation between the expression
   profiles of 22 immune cell types in high- and low-scoring groups of
   cervical cancer patients. Our findings revealed significant differences
   in the expression levels of four specific immune cell types: naive B
   cells, M0 angiogenic cells, activated mast cells, and resting mast
   cells (P < 0.001). This suggests that angiogenesis is potentially
   linked to the tumor microenvironment (TME) and may regulate the
   initiation and progression of tumors. Tumor-infiltrating mast cells
   have been associated with resistance to anti-PD-1 therapies, as
   indicated in previous studies [[193]59]. Our analysis further explores
   the relationship between 15 ARDEGs and the significant infiltration of
   immune cells, revealing that hub genes are closely linked to 6 out of
   the 22 examined immune cell types. Notably, a strong correlation was
   observed between ITGA5 expression and the density of infiltrating
   immune cells, particularly activated mast cells. ITGA5 is known to
   promote angiogenesis, which correlates with lower survival rates in
   patients with cervical cancer. Additionally, mast cells contribute to
   the generation of pro-inflammatory environments and play pivotal roles
   in various biological processes including angiogenesis, immune
   modulation, tissue repair, and remodeling in tumor settings [[194]60].
   As one of the earliest immune cells recruited to solid tumors, mast
   cells are implicated in facilitating angiogenesis during the initial
   stages of cervical carcinogenesis [[195]61]. Research has found that
   ITGA5 can facilitate the recruitment and activation of monocytes,
   tumor-associated macrophages (TAMs), Th2 cells, and M2 cells within the
   tumor microenvironment [[196]62]. It can serve as a cellular ‘anchor’,
   promoting the aggregation, adhesion, and migration of certain immune
   cells, and altering the composition of the tumor microenvironment.

   Therefore, the combination of antiangiogenic therapy with immunotherapy
   may represent a promising approach for treating cervical cancer. Our
   analysis of the angiogenesis risk-scoring model's predictive capacity
   for immunotherapy outcomes revealed significant disparities between
   high-risk and low-risk groups across four Immune Profiling Scores
   (IPS): IPS, IPS-PD1/PD-L1/PD-L2, IPS-CTLA4, and
   IPS-PD1/PD-L1/PD-L2+CTLA4 (P < 0.05). These findings suggest that the
   risk score is a potential predictor of immunotherapy prognosis.

   Despite the promising findings of this study, it is important to
   acknowledge its limitations. Firstly, the research did not incorporate
   wet-lab experiments to validate the bioinformatics predictions, which
   are essential for confirming the functional roles of ARDEGs in cervical
   cancer. Secondly, the sample size, particularly for the CESC cohort,
   was relatively small, which may limit the generalizability of the
   results. Thirdly, the absence of clinical validation analysis means
   that the prognostic and diagnostic models developed here require
   further testing in independent patient cohorts to confirm their utility
   in a clinical setting. Lastly, the use of multiple datasets could
   introduce batch effects, despite efforts to mitigate these through
   standardization and batch correction. These factors should be
   considered when interpreting the results and planning future research
   directions.

5. Conclusion

   In conclusion, this study has pinpointed pivotal ARDEGs in CESC,
   elucidating their potential roles in biological processes and signaling
   pathways through GO and KEGG enrichment analyses. Utilizing LASSO
   regression, we successfully constructed a diagnostic model based on
   ARDEGs. Furthermore, a prognostic model was developed and validated via
   a nomogram and decision curve analysis, underscoring its clinical
   applicability. The immune infiltration analysis revealed significant
   variances in immune cell subtypes within CESC, offering profound
   insights into the tumor microenvironment. The ROC curve analysis
   affirmed the diagnostic precision of the identified hub genes.
   Collectively, these findings significantly enhance our understanding of
   the contributions of angiogenic genes to the pathophysiology of
   cervical cancer.

Funding

     * 1.
       Guangxi Natural Science Foundation Project (2022GXNSFAA035648)
     * 2.
       Guangxi Zhuang Autonomous Region Health Commission self-funded
       research project (Z20210600)
     * 3.
       Guangxi Medical and Health Appropriate Technology Development and
       Application Project (S2020096)

Ethics approval and consent to participate

   Ethical review and approval was not required for this study in
   accordance with the local legislation and institutional requirements.

Consent for publication

   Not applicable.

Availability of data and materials

   The datasets used in this study are publicly available through several
   open-access repositories. Detailed data can be provided upon reasonable
   request. All pertinent data are included in the article or uploaded as
   supplemental information. The datasets are accessible from the
   following repositories: UCSC Xena database
   ([197]http://genome.ucsc.edu), TCGA Database
   ([198]https://portal.gdc.cancer.gov/), Gene Expression Omnibus
   ([199]https://www.ncbi.nlm.nih.gov/geo/), GeneCards Database
   ([200]https://www.genecards.org/), and the GEO database
   ([201]https://www.ncbi.nlm.nih.gov/geo/). Specifically, the
   CESC-related datasets [202]GSE44001
   ([203]https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE44001),
   [204]GSE63514
   ([205]https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63514), and
   [206]GSE7803
   ([207]https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE7803) are
   sourced from the GEO database.

CRediT authorship contribution statement

   Shuzhen Li: Writing – review & editing, Writing – original draft,
   Visualization, Validation, Supervision, Software, Resources, Project
   administration, Methodology, Investigation, Funding acquisition, Formal
   analysis, Data curation, Conceptualization. Kun Gao: Validation,
   Supervision, Resources, Formal analysis, Data curation. Desheng Yao:
   Writing – review & editing, Visualization, Validation, Resources,
   Project administration, Methodology, Investigation, Funding
   acquisition, Formal analysis, Data curation, Conceptualization.

Declaration of competing interest

   The authors declare that they have no known competing financial
   interests or personal relationships that could have appeared to
   influence the work reported in this paper.

List of Abbreviations

   ARGs
          angiogenesis-related genes

   ARDEGs
          angiogenesis-related differentially expressed gene

   GO
          Gene Ontology

   KEGG
          Kyoto Encyclopedia of Genes and Genomes

   GSEA
          Gene Set Enrichment Analysis

   GSVA
          Gene Set Variation Analysis

   LASSO
          Least Absolute Shrinkage and Selection Operator

   TCGA
          The Cancer Genome Atlas

   CESC
          Carcinoma and Endocervical Adenocarcinoma

   DEGs
          Differentially Expressed Genes

   DCA
          Decision Curve Analysis

   PCA
          Principal Component Analysis

   CNV
          Copy Number Variations

   PFI
          Progression-Free Interval

   DSS
          Disease-Specific Survival

   ROC
          Receiver Operating Characteristic

   NRP1
          Neuropilin-1

   FPKM
          Fragments Per Kilobase Million

   As
          Angiogenesis scores

   BP
          Biological Processes

   MF
          Molecular Functions

   CC
          Cellular Components

   KM
          Kaplan-Meier

   AUC
          Area Under the Curve

Footnotes

   ^Appendix A

   Supplementary data to this article can be found online at
   [208]https://doi.org/10.1016/j.heliyon.2024.e33277.

Contributor Information

   Shuzhen Li, Email: H2020016@sr.gxmu.edu.cn.

   Desheng Yao, Email: yaodesheng@gxmu.edu.cn.

Appendix A. Supplementary data

   The following are the Supplementary data to this article:
   Multimedia component 1
   [209]mmc1.xlsx^ (25.1KB, xlsx)
   Multimedia component 2
   [210]mmc2.pdf^ (372.4KB, pdf)
   Multimedia component 3
   [211]mmc3.pdf^ (414.7KB, pdf)
   Multimedia component 4
   [212]mmc4.pdf^ (328KB, pdf)
   Multimedia component 5
   [213]mmc5.pdf^ (353.3KB, pdf)
   Multimedia component 6
   [214]mmc6.pdf^ (403.9KB, pdf)

References