Abstract Background Cervical cancer is among the most prevalent malignancies worldwide. This study explores the relationships between angiogenesis-related genes (ARGs) and immune infiltration, and assesses their implications for the prognosis and treatment of cervical cancer. Additionally, it develops a diagnostic model based on angiogenesis-related differentially expressed genes (ARDEGs). Methods We systematically evaluated 15 ARDEGs using Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Set Enrichment Analysis (GSEA), and Gene Set Variation Analysis (GSVA). Immune cell infiltration was assessed using a single-sample gene-set enrichment analysis (ssGSEA) algorithm. We then constructed a diagnostic model for ARDEGs using Least Absolute Shrinkage and Selection Operator (LASSO) regression analysis and evaluated the diagnostic value of this model and the hub genes in predicting clinical outcomes and immunotherapy responses in cervical cancer. Results A set of ARDEGs was identified from the Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), and UCSC Xena database. We performed KEGG, GO, and GSEA analyses on these genes, revealing significant involvement in cell proliferation, differentiation, and apoptosis. The ARDEGs diagnostic model, constructed using LASSO regression analysis, showed high predictive accuracy in cervical cancer patients. We developed a reliable nomogram and decision curve analysis to evaluate the clinical utility of the ARDEG diagnostic model. The 15 ARDEGs in the model were associated with clinicopathological features, prognosis, and immune cell infiltration. Notably, ITGA5 expression and the abundance of immune cell infiltration (specifically mast cell activation) were highly correlated. Conclusion This study identifies the prognostic characteristics of ARGs in cervical cancer patients, elucidating aspects of the tumor microenvironment. It enhances the predictive accuracy of immunotherapy outcomes and establishes new strategies for immunotherapeutic interventions. Keywords: Cervical cancer, Angiogenesis, Diagnostic model, Prognostic index, Tumor microenvironment infiltration, Nomogram 1. Introduction Cervical cancer represents a major global health challenge, ranking as the fourth leading cause of cancer-related mortality among women [[27]1]. This malignancy is particularly prevalent in regions with limited healthcare resources, accounting for approximately 84 % of all cases and 88 % of deaths from cancer worldwide [[28]2]. Despite advances in treatment modalities including surgery, concurrent chemoradiotherapy, and novel approaches such as vascular-targeted therapy and immunotherapy, outcomes for patients with recurrent or metastatic cervical cancer remain poor [[29]3,[30]4]. These innovative therapies have shown promise, yet the long-term benefits are limited to a small group of patients, highlighting the critical need for enhancing response rates to immunotherapy. Recent research underscores the pivotal role of the tumor microenvironment (TME) in cervical cancer progression and its influence on the efficacy of immunotherapy [[31]5]. The TME comprises a complex network of extracellular matrix components, blood vessels, and various cellular constituents including immune, stromal, and cancer cells [[32]6]. Angiogenesis, the process of new blood vessel formation, is a key aspect of the TME that contributes to the immunosuppressive landscape, facilitating tumor immune evasion [[33]7]. Consequently, strategies targeting angiogenesis represent a viable approach for treating advanced stages of the disease. Following promising results from the KEYNOTE-826 trial, the combination of the antibody-drug conjugate paporizumab with bevacizumab was approved in late 2021 as a frontline therapy for recurrent/metastatic cervical cancer. However, the precise mechanisms by which immunotherapy and targeted therapy interact remain poorly understood. This study aims to elucidate the relationship between angiogenesis and the TME, with a focus on identifying diverse immunophenotypes of cervical cancer. We analyzed the expression of ARGs and their impact on disease prognosis, clinical outcomes, and immune response by utilizing data from TCGA, GEO, and UCSC Xena databases. Differential gene expression analysis between normal cells and cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC) tissues was performed, followed by enrichment analysis using GO, KEGG, and GSEA methodologies. Key genes and clinical parameters were integrated through univariate/multivariate Cox regression and LASSO regression analysis to develop an ARDEGs model. The predictive capability of this model was evaluated using Kaplan-Meier analysis, time-dependent ROC analysis, and other statistical tools including the C-index, calibration curves, and decision curve analysis. Additionally, treatment responses in immunotherapy cohorts were examined, and the diagnostic significance of gene mutations in hub genes was assessed. Our results aim to provide new insights into the potential of immunotherapy in treating cervical cancer, offering strategies to anticipate treatment outcomes and improve therapeutic efficacy. 2. Materials and methods 2.1. Data acquisition and processing Expression profile data for cervical cancer (TCGA-CESC) were retrieved from TCGA using the 'TCGABiolinks' R package [[34]8]. A total of 309 cervical cancer samples (Cancer group) and 3 adjacent non-tumor samples (Normal group) were included after excluding those lacking essential clinical information. The data were normalized to Fragments Per Kilobase Million (FPKM), and relevant clinical information was sourced from the UCSC Xena database ([35]http://genome.ucsc.edu) [[36]9]. The normalization of count sequencing data in the TCGA-CESC dataset was performed using the ‘limma’ package [[37]10]. 2.2. External validation datasets Related datasets, [38]GSE44001 [[39]11], [40]GSE63514 [[41]12], and [42]GSE7803 [[43]13], were downloaded from the GEO [[44]14] using the ‘GEOquery’ R package [[45]15]. Specifically, the [46]GSE44001 dataset employed the [47]GPL14951 Illumina HumanHT-12 WG-DASL V4.0 R2 Expression BeadChip platform to analyze gene expression in 300 cervical cancer samples. Concurrently, the [48]GSE63514 dataset utilized the [49]GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array to evaluate gene expression in 28 cervical cancer samples and 24 matched adjacent normal tissue samples. Additionally, the [50]GSE7803 dataset used the [51]GPL96 [HG-U133A] Affymetrix Human Genome U133A Array to assess gene expression in 21 cervical cancer samples and 10 partially matched adjacent normal tissues. Probe name annotation for each dataset was meticulously performed using the respective GPL platform files. All samples were included in subsequent analyses. These datasets served as validation sets to confirm the reproducibility and reliability of the findings, with summarized information presented in [52]Table 1. Table 1. CESC Dataset Information list. TCGA-CESC [53]GSE44001 [54]GSE63514 [55]GSE7803 Platform [56]GPL14951 [57]GPL570 [58]GPL96 Species Homo sapiens Homo sapiens Homo sapiens Homo sapiens Samples in Normal group 3 24 10 Samples in CESC group 309 300 28 21 Reference [[59]5] [[60]6] [[61]7] [62]Open in a new tab Abbreviations: TCGA, the cancer genome atlas; CESC, Cervical squamous cell carcinoma and endocervical adenocarcinoma. 2.3. Standardization and merging of datasets To identify the potential roles and related biological features and pathways of differential genes in the CESC group versus the normal group, we first standardized the datasets TCGA-CESC, [63]GSE44001, [64]GSE63514, and [65]GSE7803 using the R package limma. Subsequently, batch effects were removed from the CESC datasets [66]GSE44001, [67]GSE63514, and [68]GSE7803 using the R package sva, resulting in a merged GEO dataset. The datasets before and after batch effect removal were compared using box plots and principal component analysis(PCA) graphs. 2.4. Angiogenesis-related genes (ARGs) and data integration ARGs were identified via the GeneCards Database ([69]https://www.genecards.org/) [[70]16], using 'Angiogenesis' as a keyword, retaining only those coded as ‘Protein Coding’ with a relevance score greater than 3, culminating in a collection of 365 ARGs. After cross-referencing and deduplication, 145 additional ARGs were included from the literature, totaling 510 ARGs ([71]Supplementary Table S1). 2.5. Mutational and CNV analysis The somatic mutation dataset, including single nucleotide polymorphisms (SNPs), was obtained from TCGA. Data visualization was conducted using the 'maftools' R package [[72]17]. To examine gene copy number variation (CNV) among CESC patients, we utilized the R environment and integrated the data prior to analysis with GISTIC 2.0 [[73]18], employing the default parameters. 2.6. Cervical cancer-related differentially expressed genes To investigate differentially expressed genes (DEGs) between normal and CESC tissues, we utilized data from the GEO dataset. DEGs were identified by filtering for |log 2(fold change, FC)| > 0 with P-values <0.05. Genes with logFC >1 and P < 0.05 were classified as upregulated, while those with logFC < −1 and P < 0.05 were classified as downregulated. To discern genes associated with CESC pathology, we intersected DEGs with associated response genes (ARGs), and visualized these relationships using a Venn diagram. Differential expressions were further depicted through volcano plots generated with the R package, ggplot 2. 2.7. Calculation of angiogenic score The single-sample gene-set enrichment analysis (ssGSEA) [[74]19] algorithm effectively quantifies the relative abundance of genes within individual dataset samples. This method is crucial for advancing high-throughput genomic studies as it enables a meticulous analysis of gene expression levels across an extensive array of biological samples. Consequently, ssGSEA enhances our comprehension of the specific contributions of individual genes to cellular functions and disease pathologies, thereby augmenting the prospects for developing targeted therapeutic strategies. We applied the ssGSEA algorithm using the GSVA package in R to compute the angiogenesis scores of cancer samples from TCGA-CESC and GEO datasets. These scores were derived from the expression matrix of ARGs in each sample. Based on their angiogenesis scores, patients were dichotomized into two groups, and expression differences between the groups were analyzed using the Mann-Whitney U test, with significance set at P < 0.05. 2.8. Functional enrichment analysis: GO and KEGG Gene Ontology (GO) [[75]20] and Kyoto Encyclopedia of Genes and Genomes (KEGG) [[76]21] analyses were conducted to understand the functional impacts of ARDEGs. Using the R package clusterProfiler [[77]22], GO annotation was analyzed, and significant enrichment was identified at P < 0.05 with a false discovery rate (FDR) threshold (q-value) of <0.05. The Benjamini-Hochberg method was employed for p-value correction during these analyses. 2.9. Gene Set Enrichment Analysis (GSEA) GSEA [[78]23] is used to evaluate the distribution trend of a predefined gene set within a gene list ordered according to their relevance to a particular phenotype, thereby assessing the contribution of these genes to the phenotype. In this study, we used the logFC value to rank molecules and evaluate their enrichment in the predefined gene sets. The clusterProfiler package was then used to conduct enrichment analysis on all phenotype-related genes. The parameters for this GSEA enrichment analysis were as follows: seed set to 2020, 1000 permutations, minimum of 10 genes per gene set, maximum of 500 genes per gene set, and the p-value adjustment method was Benjamini-Hochberg (BH). We obtained the “h.all.v7.4.symbols.gmt” gene set from the Molecular Signatures Database (MSigDB) and conducted GSEA analysis to evaluate the expression of genes in the TCGA-CESC dataset. The criteria for significant enrichment were P.value < 0.05 and FDR (q.value) < 0.25. 2.10. Construction of angiogenesis diagnostic model To develop a diagnostic model for ARDEGs in the TCGA-CESC dataset, we initially employed the glmnet package [[79]24] to perform LASSO regression [[80]25], using the parameter family = "binomial" with ten-fold cross-validation. The process was iterated over 1000 cycles to prevent overfitting. LASSO regression, a prevalent machine learning algorithm for building diagnostic models, is based on linear regression enhanced by adding a penalty term (lambda × absolute value of the slope). This regularization helps prevent overfitting during curve fitting and improves the model's generalization ability. [MATH: riskScore=iCoefficient(hubgenei)*mRNAExpression(hubgenei) :MATH] Subsequently, we extracted the penalty coefficients (lambda) of the ARDEGs from the LASSO regression diagnostic model, and calculated the risk score of the ARDEGs diagnostic model, denoted as riskScore. Additionally, we utilized the R package RCircos to annotate the locations of the ARDEGs on human chromosomes. 2.11. Clinical prognosis analysis To elucidate the prognostic significance of target genes in cervical cancer, we employed univariate and multivariate Cox regression analyses using the TCGA-CESC dataset, incorporating key genes (hub genes and mRNA) and clinical variables to develop a predictive Cox regression model. We generated nomograms to estimate 1-, 3-, and 5-year survival probabilities for cervical cancer patients, based on multivariate regression scores for each clinical variable. The results were visualized using a risk factor map that depicted the association between molecular expression of ARDEGs, risk scores, and patient survival outcomes within the prognostic model. We further validated the nomogram's accuracy and resolution with a calibration curve, correlating actual survival probabilities against model predictions under varying conditions, utilizing the ggDCA package in R. 2.12. Prognostic evaluation using Kaplan-Meier curves Our analysis also included Kaplan-Meier curve assessments to analyze survival time and its association with various prognostic factors. The Kaplan-Meier method calculated survival probabilities, reflecting ongoing patient survival beyond specified time points. Significance was determined for genes impacting survival (P < 0.05). 2.13. Immune infiltration analysis with CIBERSORT CIBERSORT ([81]https://cibersort.stanford.edu/) [[82]26] is an innovative online algorithm that leverages gene expression data to accurately estimate the composition of immune cells within tissue samples. This method utilizes a signature matrix that characterizes 22 distinct human immune cell subtypes—including T cells, B cells, monocytes, and natural killer cells—through their unique gene expression profiles. Employing linear support vector regression, CIBERSORT effectively deconvolutes RNA expression data from mixed cell populations, enabling precise quantification of the relative abundances of different immune cell types. This algorithm serves as a crucial tool for analyzing the distribution and infiltration of immune cells in solid tumors and various complex tissues, thereby enhancing our understanding of immune landscapes in diverse pathological contexts. Using the CIBERSORT tool, we evaluated the immune cell infiltration in the TCGA-CESC samples based on 22 known immune cell subtype gene expression profiles. This analysis highlighted significant differences in immune cell infiltration among the CESC samples. A heatmap was created to illustrate significant correlations among these immune cells (P < 0.05). 2.14. Receiver Operating Characteristic(ROC) curve analysis To assess the diagnostic utility of hub gene expression, ROC curves were plotted for the Normal and CESC groups. We evaluated the area under the curve (AUC) values, which indicate the sensitivity and specificity of the models; higher AUC values suggest better diagnostic performance. 2.15. Statistical analysis All data processing and analyses in this study were performed using R software (Version 4.1.2). For assessing statistical significance between two continuous variables, we applied the independent Student t-test for normally distributed data and the Mann-Whitney U test (also known as the Wilcoxon rank-sum test) for non-normally distributed data. To evaluate differences between two categorical variables, we utilized either the Chi-square test or Fisher's exact test, depending on the data's distribution. Correlations among various molecules were determined using Spearman's correlation analysis, unless otherwise specified. We reported all statistical P-values as two-sided, adhering to the significance threshold of P < 0.05. 3. Results 3.1. Figure legends To meticulously elucidate our methodology for investigating the biological characteristics of cervical cancer using bioinformatics techniques, we have developed a detailed analysis flowchart ([83]Supplementary Fig. 1). 3.2. Merging of GEO datasets Initially, we processed the cervical cancer datasets [84]GSE44001, [85]GSE63514, and [86]GSE7803 to remove batch effects, resulting in the merged dataset named GEO dataset. We then compared the datasets before and after batch effect removal using distribution boxplots and Principal Component Analysis (PCA) graphs ([87]Supplementary Fig. 2). The results from the distribution boxplots and PCA graphs indicate that the batch effects in the samples of the GEO dataset were effectively eliminated after the batch removal process. 3.3. Analysis of the DEGs in cervical cancer We performed a differential expression analysis on the TCGA-CESC dataset using the limma package to identify DEGs between the normal and CESC groups. Our analysis revealed that 5083 genes met the criteria of absolute log fold change (|logFC|) greater than 1 and P-value less than 0.05. Specifically, we identified 2167 genes as significantly upregulated (logFC >1) and 2916 genes as significantly downregulated (logFC < −1) in the CESC samples compared to normal tissue. We visualized these changes using a volcano plot as depicted in [88]Fig. 1A. Furthermore, to identify ARDEGs, we intersected the DEGs and ARGs that also met the criteria of |logFC|>0 and P < 0.05 within the TCGA-CESC dataset, resulting in 202 ARDEGs. These intersections are represented in a Venn diagram ([89]Fig. 1B). Additionally, we employed the heatmap function from the R package ‘heatmap’ to illustrate the expression differences between the normal and CESC groups for these 202 ARDEGs. To further refine our list of ARDEGs, we intersected these 202 ARDEGs with 1942 prognostic molecules identified in cervical cancer, ultimately identifying 30 significant ARDEGs. This intersection is also illustrated in a Venn diagram ([90]Fig. 1C). Fig. 1. [91]Fig. 1 [92]Open in a new tab Differential Gene Expression Analysis in Cervical Cancer (A) Volcano plot illustrating the distribution of DEGs within the TCGA-CESC dataset. (B) Venn diagram depicting the intersections between ARGs and DEGs identified within the TCGA-CESC dataset. (C) Venn diagram of DEGs and prognostic molecular ARGs in the TCGA-CESC dataset. 3.4. GO and KEGG Pathway Enrichment Analysis of ARDEGs To elucidate the biological processes (BP), molecular functions (MF), cellular components (CC), and pathway associations of the 30 ARDEGs in cervical cancer, comprehensive GO ([93]Table 2) and KEGG ([94]Table 3) enrichment analyses were conducted. We applied stringent criteria for these analyses, selecting enrichment entries with both P-values and q-values less than 0.05. The results were visually represented through bubble charts ([95]Fig. 2A–B), circular network diagrams ([96]Fig. 2C–D), and histograms ([97]Fig. 2E–F). Table 2. GO enrichment analysis results of ARDEGs. ONTOLOGY ID Description GeneRatio BgRatio pvalue p.adjust qvalue BP GO:0045765 regulation of angiogenesis 10/29 383/18670 1.66e-10 2.33e-07 1.33e-07 BP GO:0043062 extracellular structure organization 10/29 422/18670 4.27e-10 2.33e-07 1.33e-07 BP GO:1901342 regulation of vasculature development 10/29 422/18670 4.27e-10 2.33e-07 1.33e-07 BP GO:0030198 extracellular matrix organization 9/29 368/18670 2.88e-09 1.18e-06 6.70e-07 BP GO:0048146 positive regulation of fibroblast proliferation 5/29 51/18670 1.41e-08 4.59e-06 2.62e-06 CC GO:0034774 secretory granule lumen 5/30 321/19717 1.13e-04 0.004 0.002 CC GO:0001726 ruffle 4/30 172/19717 1.28e-04 0.004 0.002 CC GO:0060205 cytoplasmic vesicle lumen 5/30 338/19717 1.44e-04 0.004 0.002 CC GO:0031983 vesicle lumen 5/30 339/19717 1.46e-04 0.004 0.002 CC GO:0005925 focal adhesion 5/30 405/19717 3.33e-04 0.005 0.003 MF GO:0048018 receptor ligand activity 8/29 482/17697 7.41e-07 7.11e-05 3.69e-05 MF GO:0050839 cell adhesion molecule binding 8/29 499/17697 9.62e-07 7.11e-05 3.69e-05 MF GO:0005161 platelet-derived growth factor receptor binding 3/29 15/17697 1.78e-06 7.11e-05 3.69e-05 MF GO:0005178 integrin binding 5/29 132/17697 2.20e-06 7.11e-05 3.69e-05 MF GO:0070851 growth factor receptor binding 5/29 134/17697 2.37e-06 7.11e-05 3.69e-05 [98]Open in a new tab Abbreviations: ARDEGs, Angiogenesis-related differentially expressed genes; GO: Gene Ontology; BP, biological process; CC, cellular component; MF, molecular function. Table 3. KEGG enrichment analysis results of ARDEGs. ONTOLOGY ID Description GeneRatio BgRatio pvalue p.adjust qvalue KEGG hsa04510 Focal adhesion 8/25 201/8076 9.63e-08 1.38e-05 8.42e-06 KEGG hsa05323 Rheumatoid arthritis 6/25 93/8076 2.94e-07 2.10e-05 1.29e-05 KEGG hsa04512 ECM-receptor interaction 5/25 88/8076 6.13e-06 2.02e-04 1.24e-04 KEGG hsa 05219 Bladder cancer 4/25 41/8076 6.70e-06 2.02e-04 1.24e-04 KEGG hsa04151 PI3K-Akt signaling pathway 8/25 354/8076 7.07e-06 2.02e-04 1.24e-04 [99]Open in a new tab Abbreviations: ARDEGs, Angiogenesis-related differentially expressed genes; KEGG, Kyoto Encyclopedia of Genes and Genomes. Fig. 2. [100]Fig. 2 [101]Open in a new tab GO and KEGG Pathway Enrichment Analysis of ARDEGs (A) GO analysis of ARDEGs and (B) KEGG analysis of ARDEGs. Results for the bubble diagram display. (C) Circular network diagram display of ARDEGs GO analysis and (D) KEGG pathway analysis results. (E) ARDEGs bar graph of GO analysis and (F) KEGG pathway analysis results. The screening criteria for the GO and KEGG enrichment entries were P < 0.05 and q < 0.05. In terms of BP, the ARDEGs predominantly contributed to the regulation of the JAK-STAT signaling cascade (GO:0046425), Notch signaling in cardiac development (GO:0061314), cellular response to hypoxia (GO:0001666), and the positive regulation of apoptosis execution phase (GO:1900119). For CC, notable enrichments included the secretory granule lumen (GO:0034774), ruffles (GO:0001726), ruffle membrane (GO:0032587), and lamellipodium membrane (GO:0031258). MF analysis revealed significant enrichment in Notch binding (GO:0005112), receptor-ligand activity (GO:0048018), platelet-derived growth factor receptor binding (GO:0005161), and growth factor receptor binding (GO:0070851). Additionally, KEGG pathway analysis identified significant enrichments in critical signaling pathways including the PI3K-Akt (hsa04151), MAPK (hsa 04010), and focal adhesion pathways (hsa04510), as well as those related to rheumatoid arthritis (hsa05323), ECM-receptor interactions (hsa04512), bladder cancer (hsa 05219), IL-17 signaling (hsa 04657), and melanoma(hsa0521). 3.5. GSEA and GSVA enrichment analysis of the TCGA-CESC dataset We applied GSEA to evaluate the associations between gene expression profiles and biological processes, cellular components, and molecular pathways in the Normal and CESC groups within the TCGA-CESC dataset. Our analysis revealed significant enrichment of gene expression in key oncogenic pathways including PI3K-AKT, WNT, NOTCH, and Hedgehog among others, as depicted in [102]Fig. 3A–E and summarized in [103]Table 4. Fig. 3. [104]Fig. 3 [105]Open in a new tab GSEA and GSVA Enrichment Analysis of the TCGA-CESC dataset (A) GSEA for the four principal biological pathways. (B-E) Marked enrichment observed in the PI3K-AKT, WNT, NOTCH, and Hedgehog pathways, respectively. (F) GSVA analysis differentiating the Normal and CESC groups within the dataset. Blue denotes samples from normal cervical tissue (Normal group), while red indicates samples from cervical cancer patients (CESC group). The screening criteria for significant enrichment for GSEA and GSVA were P < 0.05 and q < 0.25. Table 4. GSEA analysis of TCGA-CESC. Description setSize enrichmentScore NES pvalue p.adjust qvalues WP_PI3KAKT_SIGNALING_PATHWAY 339 −0.421338455 −1.583611504 0.002717391 0.028535354 0.02189818 KEGG_WNT_SIGNALING_PATHWAY 150 −0.382701026 −1.339565238 0.03652968 0.145911047 0.111972902 REACTOME_SIGNALING_BY_NOTCH 234 0.367222844 1.453964692 0.003367003 0.029921885 0.022962212 WP_HEDGEHOG_SIGNALING_PATHWAY 43 −0.650532216 −1.890314928 0.001694915 0.028535354 0.02189818 NABA_CORE_MATRISOME 274 −0.63471233 −2.343163134 0.00140056 0.028535354 0.02189818 REACTOME_PLATELET_ACTIVATION_SIGNALING_AND_AGGREGATION 260 −0.483379036 −1.774962916 0.001402525 0.028535354 0.02189818 WP_FOCAL_ADHESION 198 −0.473607215 −1.698359252 0.001453488 0.028535354 0.02189818 KEGG_FOCAL_ADHESION 199 −0.520045063 −1.865867367 0.001457726 0.028535354 0.02189818 NABA_ECM_GLYCOPROTEINS 195 −0.656522896 −2.349950105 0.001466276 0.028535354 0.02189818 REACTOME_MUSCLE_CONTRACTION 205 −0.607228932 −2.179930265 0.001466276 0.028535354 0.02189818 KEGG_CALCIUM_SIGNALING_PATHWAY 177 −0.484610146 −1.725678237 0.001485884 0.028535354 0.02189818 WP_MYOMETRIAL_RELAXATION_AND_CONTRACTION_PATHWAYS 156 −0.509049892 −1.787618711 0.001519757 0.028535354 0.02189818 REACTOME_DISEASES_OF_GLYCOSYLATION 142 −0.49027598 −1.701442089 0.001552795 0.028535354 0.02189818 REACTOME_CARDIAC_CONDUCTION 137 −0.588974012 −2.03917596 0.001557632 0.028535354 0.02189818 REACTOME_RESPONSE_TO_ELEVATED_PLATELET_CYTOSOLIC_CA2_ 131 −0.557798893 −1.920741604 0.0015625 0.028535354 0.02189818 [106]Open in a new tab Abbreviations: GSEA, Gene Set Enrichment Analysis. To further differentiate the molecular profiles between the Normal and CESC groups, we conducted GSVA of the TCGA-CESC dataset ([107]Fig. 3F). This analysis highlighted significant differences in the enrichment of the Hedgehog pathway and several other gene sets, illustrating distinct gene expression patterns between CESC and normal cervical tissues. 3.6. Development of an ARDEGs diagnostic model using LASSO regression analysis To assess the diagnostic potential of 30 ARDEGs within the TCGA-CESC dataset, we employed LASSO regression analysis to develop an ARDEGs diagnostic model, as depicted in [108]Fig. 4A. The model's risk factors were categorized using a novel risk factor graph ([109]Fig. 4B), which is split into two sections: risk grouping (determined through the Cox regression prognostic model and stratified by the median) and survival outcomes (illustrated using a dot plot of survival time and outcomes from TCGA-CESC clinical samples). Fig. 4. [110]Fig. 4 [111]Open in a new tab Development of an ARDEGs Diagnostic Model (A) Diagram of the LASSO regression diagnostic model for ARDEGs within the TCGA-CESC dataset. (B) Risk factor visualization illustrating the ARDEGs in the diagnostic model. (C) Trajectory diagram of variables within the LASSO regression for the ARDEGs diagnostic model. (D) Comparative diagram depicting group classifications in the ARDEGs diagnostic model. (E) Lollipop plots representing ARDEGs and their implications in the diagnostic model. We identified 15 key ARDEGs: BAIAP2L1, FGFR3, NRP1, E2F1, CA9, SPP1, DLL4, VAV3, ITGA5, PTX3, EMCN, NDRG2, EFNA1, CXCL8, and JUN. Enhancing the LASSO penalty's absolute value (lambda × slope) improved the model's generalizability and mitigated overfitting, as demonstrated by the LASSO variable trajectory map ([112]Fig. 4C). This map illustrated how gene inclusion varied inversely with the lambda coefficient of the LASSO penalty term. Further analysis revealed significant differences in the expression levels of the ARDEGs diagnostic model between high- and low-scoring patient groups diagnosed with cervical cancer (P < 0.001), as shown in [113]Fig. 4D. Additionally, we constructed a correlation Laplace diagram to further elucidate the relationship between ARDEG expression and the risk scores derived from the diagnostic model ([114]Fig. 4E). We stratified the prediction scores from 15 ARDEGs in TCGA-CESC samples into High/Low groups, and assessed expression variability using the Wilcoxon signed-rank test, as shown in [115]Fig. 5A. Our findings indicated significant differences in the expression of ARDEGs including NRP1, E2F1, CA9, SPP1, DLL4, ITGA5, PTX3, NDRG2, EFNA1, CXCL8, and JUN between the groups (P < 0.001). Fig. 5. [116]Fig. 5 [117]Open in a new tab Expression of ARDEGs in the TCGA-CESC Dataset (A) Comparative analysis of ARDEGs between different groups (CESC vs. Normal) in the TCGA-CESC diagnostic model. Blue denotes samples from the normal control group, whereas red indicates samples from the cervical cancer (CESC) group. (B) Differential expression of ARDEGs across risk categories (high vs. low) within the TCGA-CESC dataset. Blue signifies the low-risk expression group, and red denotes the high-risk expression group within the disease cohort. (C) Heatmap illustrating the correlation of ARDEGs between the groups (CESC vs. Normal) in the TCGA-CESC dataset. (D) Circular plots depicting the chromosomal distribution of ARDEGs. NS, non-significant. *P < 0.05, **P < 0.01, ***P < 0.001. Subsequent differential expression analysis demonstrated notable variance in the levels of these 15 ARDEGs between normal and CESC tissues within the TCGA-CESC cohort ([118]Fig. 5B). Specifically, genes such as BAIAP2L1, NRP1, E2F1, DLL4, and EMCN exhibited substantial expression differences (P < 0.01), while FGFR3, CA9, SPP1, VAV3, ITGA5, PTX3, NDRG2, and EFNA1 also showed significant differences (P < 0.05). Further analysis explored the interrelationships among the ARDEGs within the TCGA-CESC dataset, revealing statistically significant correlations, particularly between BAIAP2L1 and DLL4, ITGA5, PTX3 (P < 0.01) and between E2F1 and NDRG2 (P < 0.05), as illustrated in [119]Fig. 5C. Additionally, we mapped the chromosomal locations of these ARDEGs on a circular chromosome diagram ([120]Fig. 5D), highlighting that JUN, VAV3, and EFNA1 are located on chromosome 1, with BAIAP2L1 on chromosome 7. We generated ROC curves for 15 ARDEGs across normal and CESC samples from the TCGA-CESC dataset ([121]Supplementary Fig. 3). The diagnostic accuracy of these ARDEGs was notably high in distinguishing between normal and CESC conditions. Specifically, BAIAP2L1 demonstrated an Area Under the Curve (AUC) of 0.999, followed by DLL4 with an AUC of 0.966, NRP1 with an AUC of 0.938, FGFR3 with an AUC of 0.931, and CA9 with an AUC of 0.923. Similarly, VAV3 (AUC = 0.919), EFNA1 (AUC = 0.906), and PTX3 (AUC = 0.902) also exhibited high diagnostic performance. NDRG2 (AUC = 0.899), SPP1 (AUC = 0.871), ITGA5 (AUC = 0.841), JUN (AUC = 0.780), and CXCL8 (AUC = 0.770) demonstrated moderate diagnostic accuracy. 3.7. Prognostic Clinical Manifestations of ARDEGs in cervical cancer To elucidate the association between the expression of the 15 ARDEGs and the onset of cervical cancer, we conducted univariate and multivariate Cox regression analyses using clinical data from the TCGA-CESC dataset. The initial statistical analysis of these data is detailed in [122]Table 5. Table 5. Univariate and Multivariate Cox regression. Characteristics Total(N) Univariate analysis __________________________________________________________________ Multivariate analysis __________________________________________________________________ Hazard ratio (95 % CI) P value Hazard ratio (95 % CI) P value BAIAP2L1 306 Low 153 Reference High 153 1.835 (1.134–2.969) 0.013 2.228 (1.314–3.777) 0.003 FGFR3 306 Low 153 Reference High 153 0.574 (0.358–0.922) 0.022 0.716 (0.411–1.246) 0.237 NRP1 306 Low 153 Reference High 153 1.791 (1.113–2.882) 0.016 1.688 (0.945–3.013) 0.077 E2F1 306 Low 153 Reference High 153 0.581 (0.362–0.931) 0.024 0.678 (0.402–1.145) 0.146 CA9 306 Low 153 Reference High 153 1.547 (0.969–2.469) 0.068 1.009 (0.591–1.725) 0.972 SPP1 306 Low 153 Reference High 153 1.787 (1.104–2.892) 0.018 1.330 (0.776–2.281) 0.299 DLL4 306 Low 153 Reference High 153 1.816 (1.131–2.915) 0.014 1.226 (0.713–2.109) 0.461 VAV3 306 Low 153 Reference High 153 0.574 (0.358–0.920) 0.021 0.711 (0.412–1.227) 0.221 ITGA5 306 Low 153 Reference High 153 2.374 (1.458–3.866) <0.001 1.298 (0.715–2.353) 0.391 PTX3 306 Low 153 Reference High 153 1.984 (1.226–3.212) 0.005 1.453 (0.830–2.544) 0.191 EMCN 306 Low 153 Reference High 153 0.587 (0.365–0.943) 0.028 0.555 (0.321–0.959) 0.035 NDRG2 306 Low 153 Reference High 153 0.577 (0.361–0.923) 0.022 0.577 (0.327–1.019) 0.058 EFNA1 306 Low 153 Reference High 153 1.937 (1.207–3.107) 0.006 1.890 (1.123–3.182) 0.017 CXCL8 306 Low 153 Reference High 153 2.426 (1.497–3.934) <0.001 1.725 (1.012–2.941) 0.045 JUN 306 Low 153 Reference High 153 1.743 (1.092–2.783) 0.020 1.481 (0.868–2.526) 0.149 [123]Open in a new tab Abbreviations: CI, confidence interval. The findings from the Cox regression analyses are illustrated through forest plots ([124]Fig. 6A). Additionally, we developed a nomogram to ascertain the prognostic significance of these models, as depicted in [125]Fig. 6B. We further executed prognostic calibration analyses for 1-, 3-, and 5-year intervals based on the Cox regression variables, presenting the results in calibration curves ([126]Fig. 6C–E). These curves plot survival probabilities from actual data against predictions along the horizontal axis, with the model's accuracy at various forecast intervals denoted by lines and points in diverse colors. The proximity of these lines to the gray ideal line reflects the prediction's precision. Fig. 6. [127]Fig. 6 [128]Open in a new tab Prognostic Clinical Manifestations of ARDEGs in Cervical Cancer (A) Univariate regression analysis is visually summarized in ARDEG forest plots. (B) The corresponding nomograms provide a detailed graphical representation. (C-E) Calibration curves for the Cox regression prognostic model are shown for 1-year (C), 3-year (D), and 5-year (E) time points, illustrating model accuracy over these periods. (F–H) DCA for the Cox regression prognostic model is presented for 1-year (F), 3-year (G), and 5-year (H) intervals, highlighting the clinical utility of the model. Moreover, we utilized Decision Curve Analysis (DCA) to assess the clinical utility of the Cox regression prognostic model over 1-year ([129]Fig. 6F), 3-year ([130]Fig. 6G), and 5-year ([131]Fig. 6H) horizons. The DCA plots demonstrate the net benefits across various threshold probabilities, with the model's effectiveness indicated by a broader x-value range, suggesting superiority over scenarios with all-positive or all-negative assumptions. 3.8. Prognostic performance assessment of ARDEGs Kaplan-Meier (KM) survival curves were constructed for the fifteen autophagy-related differentially expressed genes (ARDEGs) exhibiting statistical significance (P < 0.05) ([132]Fig. 7A-N). Notably, elevated expression levels of pivotal genes including BAIAP2L1, NRP1, SPP1, DLL4, ITGA5, PTX3, EFNA1, and CXCL8 were significantly correlated with diminished overall survival (OS) rates. These findings underscore the prognostic relevance of ARDEGs in cervical squamous cell carcinoma (CESC) and highlight their potential as prognostic biomarkers in clinical assessments. Fig. 7. [133]Fig. 7 [134]Open in a new tab Kaplan-Meier Curves Illustrating Overall Survival Associated with ARDEGs (A–N) The blue represents the low expression (Low-expression group), while the red represents the cervical cancer patient sample (High-expression group). OS, Overall survival. KM curve, Kaplan-Meier curve. *P < 0.05, **P < 0.01, ***P < 0.001. 3.9. Mutation analysis of ARDEGs in CESC patients To systematically analyze the somatic mutations in 15 ARDEGs (BAIAP2L1, FGFR3, NRP1, E2F1, CA9, SPP1, DLL4, VAV3, ITGA5, PTX3, EMCN, NDRG2, EFNA1, CXCL8, JUN) associated with CESC, we conducted a mutation analysis on CESC patient samples from the TCGA-CESC dataset using the R package maftools. Our findings reveal that the predominant mutations in the dataset were missense mutations, which constituted the majority of alterations. Additionally, the mutation spectrum of the 15 ARDEGs in these patients predominantly consisted of single nucleotide polymorphisms (SNPs), with fewer instances of insertions and deletions. Notably, the most frequent single nucleotide variant (SNV) among CESC patients was C > T, followed by C > G. Furthermore, we investigated the copy number variations (CNVs) in these 15 ARDEGs within the same cohort. By downloading and processing CNV data and applying GISTIC 2.0 for analysis, we visualized our results, which identified significant amplifications and deletions in the ARDEGs. Notably, PTX3, EFNA1, and E2F1 exhibited the highest amplification frequencies, whereas FGFR3, SPP1, and EMCN were most frequently deleted ([135]Supplementary Fig. 4). Comprehensive Analysis of Immune Cell Infiltration Dynamics in Cervical Cancer via CIBERSORT and ssGSEA. In this study, we leveraged the expression profiles of 15 ARDEGs in cervical cancer samples to compute Activity Scores (As) for each patient using the ssGSEA algorithm. Following this, the CIBERSORT algorithm was applied to delineate the landscape of immune cell infiltration based on the expression data, stratifying the patients into high and low As groups ([136]Fig. 8A). Fig. 8. [137]Fig. 8 [138]Open in a new tab Comprehensive Analysis of Immune Infiltration in Cervical Cancer Utilizing CIBERSORT (A) The histogram illustrates the immune infiltration levels for 22 distinct types of immune cells, highlighting their distribution and prevalence. (B) The comparison chart delineates the relative abundance of these immune cells, providing a clear visual representation of their infiltration levels. (C) The heat map offers a detailed correlation analysis between aberrantly regulated differentially expressed genes (ARDEGs) and immune cell expressions. Positive correlations, depicted as red circles, indicate a direct relationship between gene expression and immune cell infiltration, with the intensity of the association reflected by the size of the circles. Conversely, blue circles signify negative correlations, with the circle size proportionate to the strength of the inverse relationship. Detailed immunoprofiling revealed distinct differences in the infiltration patterns of specific immune cells. Statistically significant variances were observed particularly in naive B cells, angiogenesis-associated macrophages (M0), activated mast cells, and resting mast cells, with p-values <0.001. Additionally, notable disparities were also found in the expression levels of CD8^+ T cells and angiogenic M1 macrophages between the high and low As groups, with p-values of <0.01 and <0.05 respectively ([139]Fig. 8B). To further elucidate the interaction between ARDEGs and immune cell infiltration, correlation heat maps were utilized, demonstrating a pronounced association between ITGA5 expression and the prevalence of activated mast cells in both high and low As groups ([140]Fig. 8C). These findings highlight the critical interplay between specific ARDEGs and immune cell populations, potentially offering new insights into the mechanisms driving immune evasion and tumor progression in cervical cancer. 3.10. Analysis of immunotherapy outcomes based on an angiogenesis risk scoring model To further analyze the predictive capability of the angiogenesis risk scoring model for immunotherapy, we downloaded cervical cancer-related angiogenesis scores (Angiogenesis score, As) from the TCIA database. We employed the ggplot function in R to create boxplots depicting the differences in As among different patient risk groups within the TCGA-CESC dataset. The results showed significant disparities between the high-risk and low-risk groups across four types of Immune Profiles Scores (IPS). These include IPS, IPS-PD1/PD-L1/PD-L2, IPS-CTLA4, and IPS-PD1/PD-L1/PD-L2 + CTLA4 ([141]Supplementary Fig. 5), with all four IPS being significantly higher in the high-risk group (p < 0.05). This indicates that the risk score can effectively predict the prognosis of immunotherapy. 3.11. Clinical Correlation Analysis of Prognostic ARDEGs We conducted a detailed examination of the expression levels of prognostic ARDEGs to ascertain their correlation with clinical outcomes in CESC) patients. This study analyzed the association between the expression profiles of selected predictive ARDEGs and various clinicopathological features linked to distinct prognoses. We also explored how these expression patterns influenced key prognostic indicators such as tumor progression-free interval (PFI), disease-specific survival (DSS), and overall survival (OS) in cervical cancer tissues. Our findings revealed that higher expression levels of specific prognostic ARDEGs—including BAIAP2L1, NRP1, E2F1, CA9, SPP1, DLL4, ITGA5, PTX3, EMCN, and CXCL8—were significantly correlated with improved OS in patients (P < 0.05) as shown in [142]Fig. 9A. Similarly, significant associations were observed between the expression of these genes and DSS (P < 0.05), presented in [143]Fig. 9B. Furthermore, the expression profiles of BAIAP2L1, NRP1, SPP1, DLL4, ITGA5, and PTX3 were significantly linked with extended PFI (P < 0.05), depicted in [144]Fig. 9C. Fig. 9. [145]Fig. 9 [146]Open in a new tab Clinical Correlation Analysis of Prognostic ARDEGs Prognosis of ARDEGs and (A) OS, (B) DSS, (C) PFI. NS, non-significant. *P < 0.05, **P < 0.01, ***P < 0.001. These results underscore the potential of ARDEGs as biomarkers for prognostic evaluation in cervical cancer, suggesting that their differential expression may serve as a valuable predictor of patient outcomes. 3.12. Analysis of Hub Gene Expression in Normal and CESC datasets We evaluated the expression profiles of 15 ARDEGs across normal and CESC samples within GEO datasets. Our analysis identified significant differential expressions: FGFR3, E2F1, CA9, SPP1, and NDRG2 displayed marked differences (P < 0.001), while ITGA5 showed substantial variance (P < 0.01). Additionally, the expressions of NRP1 and EFNA1 were significantly different (P < 0.05) ([147]Fig. 10A). Fig. 10. [148]Fig. 10 [149]Open in a new tab Analysis of Hub Gene Expression in Normal and CESC Datasets (A) Comparative analysis of ARDEGs expression in CESC versus normal tissue samples from the GEO dataset. (B) Heatmap illustrating the correlation of ARDEGs across the different groups (CESC/Normal) within the GEO dataset. (C-J) ROC curve analysis of significantly differentially expressed genes in the GEO dataset, including: (C) E2F1, (D) NDRG2, (E) SPP1, (F) FGFR3, (G) CA9, (H) ITGA5, (I) EFNA1, and (J) NRP1. NS, non-significant (P ≥ 0.05). *P < 0.05, **P < 0.01, ***P < 0.001. A correlation heatmap was constructed to investigate the relationships among the ARDEGs in the GEO dataset ([150]Fig. 10B). Notably, FGFR3 showed significant correlations with VAV3, NDRG2, and EFNA1 (P < 0.01), as well as with ITGA5, EMCN, and JUN (P < 0.05) in the TCGA-CESC dataset. Further, ROC curve analysis affirmed the diagnostic potential of the differential expressions among these genes. Specifically, E2F1 (AUC = 0.854, [151]Fig. 10C), NDRG2 (AUC = 0.757, [152]Fig. 10D), SPP1 (AUC = 0.738, [153]Fig. 10E), and FGFR3 (AUC = 0.728, [154]Fig. 10F) demonstrated diagnostic relevance in distinguishing between normal and CESC samples in the GEO dataset. Meanwhile, CA9 (AUC = 0.685, [155]Fig. 10G), ITGA5 (AUC = 0.650, [156]Fig. 10H), EFNA1 (AUC = 0.625, [157]Fig. 10I), and NRP1 (AUC = 0.611, [158]Fig. 10J) exhibited lower diagnostic values in the TCGA-CESC dataset. 4. Discussion Cervical cancer ranks as the fourth most common cause of cancer-related mortality among women worldwide and is a significant oncological challenge. Recent advancements in novel treatment modalities, such as immunotherapy, have notably enhanced survival rates. However, the prognosis for patients with recurrent or metastatic cervical cancer remains dismal. The tumor microenvironment (TME) plays a critical role in tumorigenesis and significantly influences the efficacy of immunotherapy. A pivotal feature of the TME is its capacity for angiogenesis, initiated by angiogenic factors released by tumor cells [[159]27]. Furthermore, angiogenesis induces tumors to evade immune monitoring by shaping the immunosuppressive microenvironment. Angiogenic factors can also activate immune cells that suppress the immune system, such as tumor-associated macrophages and T-regulatory cells, or suppress immune effectors and antigen-presenting cells. These suppressive immune cells may promote angiogenesis, resulting in a malignant pattern of reduced immune activity [[160]28]. These findings suggest a link between angiogenesis and innate immunity. Accumulating evidence indicates that targeting angiogenesis and immunotherapy are potential therapeutic strategies for cervical cancer [[161]29]. Previous reports have focused exclusively on single ARGs or specific immune cell subtypes, without considering their associations. Therefore, we focused on holistically understanding the relationship between ARGs and TME by Bioinformatics method. To the best of our knowledge, this study represents the inaugural exploration of ARDEGs and their influence on prognostic clinical outcomes and immunological responses in cervical cancer. We conducted comprehensive analyses using GO, KEGG, and GSEA to elucidate the functional roles of ARDEGs in this disease context. Our functional enrichment analysis of ARDEGs in CESC revealed significant involvement in biological processes and signaling pathways that are pivotal for cancer progression. Notably, the Notch signaling pathways, which were enriched among the ARDEGs, are known for their roles in cell proliferation, differentiation, and apoptosis [[162]30]. The Notch pathway is crucial for cell fate determination and has been shown to contribute to tumorigenesis through its influence on angiogenesis and the maintenance of cancer stem cells [[163]31]. Similarly, The PI3K-Akt signaling pathway, another enriched pathway, is a critical mediator of cell survival and proliferation, and its aberrant activation is a hallmark of many cancers [[164]32]. In conclusion, The significant correlation of these ARDEGs with patient prognosis further emphasizes their clinical relevance and potential utility as biomarkers for diagnosis and prognostication in CESC. Therefore, we investigated ARDEGs in cervical cancer, identifying key biomarkers—NRP1, E2F1, CA9, SPP1, DLL4, ITGA5, PTX3, NDRG2, EFNA1, BAIAP2L1, and EMCN—that are crucial for understanding the disease's aggressive clinical traits and immune evasion. We employed LASSO regression analysis to construct a diagnostic model that segregates patients into high and low angiogenesis subgroups based on their risk scores. The expression profiles of these ARDEGs significantly differed between the subgroups and compared to the CESC groups. Further, our ROC analysis confirmed that the hub genes' expression could diagnose cervical cancer with high specificity and sensitivity. To evaluate the prognostic significance of these biomarkers, we developed a risk prognostic model, constructed a comprehensive nomogram, and performed decision curve analysis to confirm the clinical utility of this prognostic approach. The calibration curves at 1, 3, and 5 years demonstrated strong agreement between the predicted and actual survival rates, underscoring the nomogram's discriminatory and calibration capacities. This study not only delineates the diagnostic potential of ARDEGs in cervical cancer but also highlights the added clinical value of our model in enhancing patient outcomes through tailored therapeutic strategies. We further examined the impact of hub gene expression levels on OS, DSS, and PFI in cervical cancer. We discovered significant associations between the expression levels of the hub genes BAIAP2L1, NRP1, SPP1, DLL4, ITGA5, and PTX3 with OS, DSS, and PFI (P < 0.05). Notably, elevated expression of these genes serves as a robust predictive biomarker for reduced OS, DSS, and PFI in cervical cancer patients. BAIAP2L1 is notably overexpressed in various malignancies including breast cancer [[165]33], ovarian cancer [[166]34], gastric cancer [[167]35], lung cancer [[168]36], and hepatocellular cancer [[169]37], where its higher expression correlates with advanced disease stages and metastasis [[170]33,[171]34,[172]38,[173]39]. Similarly, NRP1, which is abundantly expressed across a range of tumors such as glioma [[174]40], bladder cancer [[175]41] and cervical cancer [[176]42], is linked with unfavorable outcomes. These findings align with prior research, reinforcing the critical role of NRP1 in processes such as embryonic angiogenesis and neurogenesis—mechanisms essential for cell migration, proliferation, survival, and apoptosis [[177][43], [178][44], [179][45]]. Furthermore, recent studies underscore the importance of NRP1 in modulating immunoregulatory receptors, noting its upregulation in tumor-associated Tregs, which impacts the efficacy and stability of the immune response [[180]46]. NRP1 plays a pivotal role in inhibiting the immunological memory of CD8+T cells during anti-tumor responses, while simultaneously enhancing Treg function. The binding of NRP1 to semaphorin-3A suppresses the motility and the tumor-targeted destructive capabilities of cytotoxic T lymphocytes [[181]47]. Frequently, NRP1 is linked with a wide array of inhibitory receptors including CTLA-4, Tim-3, and LAG-3, defining a subset of dysfunctional PD-1^hi CD8^+ tumor-infiltrating lymphocytes. Inhibiting NRP1 can boost the efficacy of anti-PD-1 therapies and suppress tumor angiogenesis, thus amplifying the overall anti-tumor immune response [[182]48]. Additionally, SPP1, a secreted phosphoprotein, targets and modulates matrix metalloproteinases within cancerous tissues. Its fundamental biological functions involve regulating cell growth, proliferation, migration, apoptosis, and immune responses. Elevated SPP1 expression in cervical cancer correlates with increased disease occurrence and progression by modulating immune cell infiltration levels, reinforcing prior findings [[183]49]. DLL4, a principal Notch ligand, is critically involved in tumor angiogenesis, activation of cancer stem cells, tumor progression, and metastasis [[184][50], [185][51], [186][52]]. Similarly, ITGA5, a fibronectin receptor prevalent across various malignancies, is essential for tumor development, metastasis, and drug resistance. It promotes tumor growth through activation of the FAK/AKT signaling pathway, consistent with our observations [[187]53,[188]54]. ITGA5 also enhances angiogenesis and correlates with adverse outcomes in cervical cancer patients [[189]55]. Furthermore, PTX3, a soluble inflammatory mediator in TME and an innate immunomodulator, is associated with immune escape and plays a significant role in apoptosis and inflammation regulation [[190]56,[191]57]. Recent research has established that in colorectal cancer, PTX3 significantly enhances protumor immunity by promoting M2-like macrophage polarization, an effect mediated by stromal cells [[192]58]. The malignant properties of tumor cells are closely linked with immune cell infiltration, mediated by central hub genes. Utilizing the CIBERSORT algorithm, we analyzed the correlation between the expression profiles of 22 immune cell types in high- and low-scoring groups of cervical cancer patients. Our findings revealed significant differences in the expression levels of four specific immune cell types: naive B cells, M0 angiogenic cells, activated mast cells, and resting mast cells (P < 0.001). This suggests that angiogenesis is potentially linked to the tumor microenvironment (TME) and may regulate the initiation and progression of tumors. Tumor-infiltrating mast cells have been associated with resistance to anti-PD-1 therapies, as indicated in previous studies [[193]59]. Our analysis further explores the relationship between 15 ARDEGs and the significant infiltration of immune cells, revealing that hub genes are closely linked to 6 out of the 22 examined immune cell types. Notably, a strong correlation was observed between ITGA5 expression and the density of infiltrating immune cells, particularly activated mast cells. ITGA5 is known to promote angiogenesis, which correlates with lower survival rates in patients with cervical cancer. Additionally, mast cells contribute to the generation of pro-inflammatory environments and play pivotal roles in various biological processes including angiogenesis, immune modulation, tissue repair, and remodeling in tumor settings [[194]60]. As one of the earliest immune cells recruited to solid tumors, mast cells are implicated in facilitating angiogenesis during the initial stages of cervical carcinogenesis [[195]61]. Research has found that ITGA5 can facilitate the recruitment and activation of monocytes, tumor-associated macrophages (TAMs), Th2 cells, and M2 cells within the tumor microenvironment [[196]62]. It can serve as a cellular ‘anchor’, promoting the aggregation, adhesion, and migration of certain immune cells, and altering the composition of the tumor microenvironment. Therefore, the combination of antiangiogenic therapy with immunotherapy may represent a promising approach for treating cervical cancer. Our analysis of the angiogenesis risk-scoring model's predictive capacity for immunotherapy outcomes revealed significant disparities between high-risk and low-risk groups across four Immune Profiling Scores (IPS): IPS, IPS-PD1/PD-L1/PD-L2, IPS-CTLA4, and IPS-PD1/PD-L1/PD-L2+CTLA4 (P < 0.05). These findings suggest that the risk score is a potential predictor of immunotherapy prognosis. Despite the promising findings of this study, it is important to acknowledge its limitations. Firstly, the research did not incorporate wet-lab experiments to validate the bioinformatics predictions, which are essential for confirming the functional roles of ARDEGs in cervical cancer. Secondly, the sample size, particularly for the CESC cohort, was relatively small, which may limit the generalizability of the results. Thirdly, the absence of clinical validation analysis means that the prognostic and diagnostic models developed here require further testing in independent patient cohorts to confirm their utility in a clinical setting. Lastly, the use of multiple datasets could introduce batch effects, despite efforts to mitigate these through standardization and batch correction. These factors should be considered when interpreting the results and planning future research directions. 5. Conclusion In conclusion, this study has pinpointed pivotal ARDEGs in CESC, elucidating their potential roles in biological processes and signaling pathways through GO and KEGG enrichment analyses. Utilizing LASSO regression, we successfully constructed a diagnostic model based on ARDEGs. Furthermore, a prognostic model was developed and validated via a nomogram and decision curve analysis, underscoring its clinical applicability. The immune infiltration analysis revealed significant variances in immune cell subtypes within CESC, offering profound insights into the tumor microenvironment. The ROC curve analysis affirmed the diagnostic precision of the identified hub genes. Collectively, these findings significantly enhance our understanding of the contributions of angiogenic genes to the pathophysiology of cervical cancer. Funding * 1. Guangxi Natural Science Foundation Project (2022GXNSFAA035648) * 2. Guangxi Zhuang Autonomous Region Health Commission self-funded research project (Z20210600) * 3. Guangxi Medical and Health Appropriate Technology Development and Application Project (S2020096) Ethics approval and consent to participate Ethical review and approval was not required for this study in accordance with the local legislation and institutional requirements. Consent for publication Not applicable. Availability of data and materials The datasets used in this study are publicly available through several open-access repositories. Detailed data can be provided upon reasonable request. All pertinent data are included in the article or uploaded as supplemental information. The datasets are accessible from the following repositories: UCSC Xena database ([197]http://genome.ucsc.edu), TCGA Database ([198]https://portal.gdc.cancer.gov/), Gene Expression Omnibus ([199]https://www.ncbi.nlm.nih.gov/geo/), GeneCards Database ([200]https://www.genecards.org/), and the GEO database ([201]https://www.ncbi.nlm.nih.gov/geo/). Specifically, the CESC-related datasets [202]GSE44001 ([203]https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE44001), [204]GSE63514 ([205]https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63514), and [206]GSE7803 ([207]https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE7803) are sourced from the GEO database. CRediT authorship contribution statement Shuzhen Li: Writing – review & editing, Writing – original draft, Visualization, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization. Kun Gao: Validation, Supervision, Resources, Formal analysis, Data curation. Desheng Yao: Writing – review & editing, Visualization, Validation, Resources, Project administration, Methodology, Investigation, Funding acquisition, Formal analysis, Data curation, Conceptualization. Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. List of Abbreviations ARGs angiogenesis-related genes ARDEGs angiogenesis-related differentially expressed gene GO Gene Ontology KEGG Kyoto Encyclopedia of Genes and Genomes GSEA Gene Set Enrichment Analysis GSVA Gene Set Variation Analysis LASSO Least Absolute Shrinkage and Selection Operator TCGA The Cancer Genome Atlas CESC Carcinoma and Endocervical Adenocarcinoma DEGs Differentially Expressed Genes DCA Decision Curve Analysis PCA Principal Component Analysis CNV Copy Number Variations PFI Progression-Free Interval DSS Disease-Specific Survival ROC Receiver Operating Characteristic NRP1 Neuropilin-1 FPKM Fragments Per Kilobase Million As Angiogenesis scores BP Biological Processes MF Molecular Functions CC Cellular Components KM Kaplan-Meier AUC Area Under the Curve Footnotes ^Appendix A Supplementary data to this article can be found online at [208]https://doi.org/10.1016/j.heliyon.2024.e33277. Contributor Information Shuzhen Li, Email: H2020016@sr.gxmu.edu.cn. Desheng Yao, Email: yaodesheng@gxmu.edu.cn. Appendix A. Supplementary data The following are the Supplementary data to this article: Multimedia component 1 [209]mmc1.xlsx^ (25.1KB, xlsx) Multimedia component 2 [210]mmc2.pdf^ (372.4KB, pdf) Multimedia component 3 [211]mmc3.pdf^ (414.7KB, pdf) Multimedia component 4 [212]mmc4.pdf^ (328KB, pdf) Multimedia component 5 [213]mmc5.pdf^ (353.3KB, pdf) Multimedia component 6 [214]mmc6.pdf^ (403.9KB, pdf) References