Abstract Background Cervical cancer is the fourth most common cancer in women globally, and the main cause of the disease has been found to be ongoing HPV infection. Cervical cancer remains the primary cause of cancer-related death despite major improvements in screening and treatment approaches, especially in low- and middle-income nations. Therefore, it is crucial to investigate the tumor microenvironment in advanced cervical cancer in order to identify possible treatment targets. Materials and methods In order to better understand malignant cervical cancer epithelial cells (EPCs), this study used bulk RNA-seq data from UCSC in conjunction with single-cell RNA sequencing data from the ArrayExpress database. After putting quality control procedures into place, cell type identification and clustering analysis using the Seurat software were carried out. To clarify functional pathways, enrichment analysis and differential gene expression were carried out. The CIBERSORT and ESTIMATE R packages were used to evaluate the immune microenvironment characteristics, and univariate and multivariate Cox regression analyses were used to extract prognostic features. Furthermore, assessments of drug sensitivity and functional enrichment were carried out. Results Eight cell types were identified, with EPCs showing high proliferative and stemness features. Five EPC subpopulations were defined, with C1 NNMT+ CAEPCs driving tumor differentiation. A NNMT CAEPCs Risk Score (NCRS) model was developed, revealing a correlation between elevated NCRS scores and adverse patient outcomes characterized by immune evasion. In vitro experiments validated that the prognostic gene PLOD2 significantly enhances proliferation, migration, and invasion of cervical cancer cells. Conclusion This investigation delineated eight cell types and five subpopulations of malignant EPCs in cervical cancer, establishing the C1 NNMT+ CAEPCs as a crucial therapeutic target. The NCRS model demonstrated its prognostic capability, indicating that higher scores are associated with poorer clinical outcomes. The validation of PLOD2 as a prognostic gene highlights its therapeutic potential, underscoring the critical need for integrating immunotherapy and targeted treatment strategies to enhance diagnostic and therapeutic approaches in cervical cancer. Keywords: cervical cancer, single-cell RNA sequencing, tumor microenvironment, prognostic model, immune evasion, therapeutic targets Introduction Cervical cancer ranks as the fourth most prevalent malignancy among women globally, with incidence and mortality rates of 13.1% and 6.1%, respectively ([43]1, [44]2). It continues to be the primary cause of cancer-related mortality in nations with few resources ([45]3), primarily due to persistent infection with HPV. Although significant reductions in cervical cancer rates have been achieved in certain developed regions through HPV vaccination, effective screening programs, and advancements in diagnostic and adjuvant therapies ([46]4), a study from 2020 revealed that nearly 90% of cervical cancer fatalities occur in low- and middle-income countries ([47]5). Current treatment modalities encompass conventional chemotherapy regimens utilizing cisplatin and paclitaxel, radiotherapy ([48]6), targeted therapies like bevacizumab, and immunotherapies employing immune checkpoint inhibitors ([49]7). The dearth of treatment choices for cervical cancer that is advanced, recurring, or metastatic emphasizes the critical need for additional research into the tumor microenvironment in order to identify novel therapeutic targets. An increasing number of studies have highlighted the critical role of the tumor microenvironment (TME) in understanding tumor progression and drug resistance ([50]8). Exploring the TME in different types of tumors holds significant clinical value ([51]9–[52]11). Techniques such as high-throughput sequencing and single-cell RNA sequencing (scRNA-seq) have enabled researchers to deeply analyze cellular heterogeneity, immune evasion mechanisms, and the tumor’s response to therapy within the TME ([53]12). ScRNA-seq has transformed our understanding of tumor heterogeneity, particularly enhancing the exploration of the TME ([54]13). It has also significantly advanced our understanding of the mechanisms and evolutionary pathways underlying tumor development ([55]14), For example, single-cell sequencing analysis has been widely used in breast cancer research to analyze intercellular communication within the TME and explore key factors that influence tumor heterogeneity ([56]15–[57]18), scRNA-seq has been widely employed to investigate cervical cancer heterogeneity ([58]19, [59]20). Our research focuses on malignant epithelial cells in cervical cancer, as chronic HPV infection primarily targets the squamous epithelium, and HPV’s life cycle is intricately tied to the differentiation of host epithelial cells ([60]21). This study sought to explore the variability of particular cell types in order to clarify the reasons behind the advancement of cervical cancer. Using scRNA-seq, we performed a thorough investigation of malignant cervical cancer EPCs and created a unique predictive model to find possible treatment targets. To confirm our findings, we also looked at the tumor immune microenvironment and ran cellular tests. Our study offers new perspectives on potential cervical cancer treatment approaches. Methods Acquisition of cervical cancer scRNA-seq data The ArrayExpress database provided the scRNA-seq data for cervical cancer used in this investigation (accession number E-MTAB-12305) ([61]22). Additionally, bulk RNA-seq data, which included information on somatic mutations and clinical variables (such as age, race, tumor stage, and survival time), were retrieved from the University of California, Santa Cruz (UCSC, [62]https://xena.ucsc.edu/). ScRNA-seq data quality control and cell type identification The “Seurat” R package (version 4.3.0) was used to transform the scRNA-seq data into Seurat objects ([63]23). To identify and exclude doublet cells, the “DoubletFinder” R package (version 2.0.3) ([64]24) was applied. Subsequent filtering was conducted based on specific criteria: features between 300 and 6,000 (nFeature), counts ranging from 500 to 50,000 (nCount), and fewer than 5% of genes associated with red blood cells. Additionally, cells with mitochondrial gene expression levels higher than 25% of the total were not included in the analysis. The filtered cell data was then normalized using the “NormalizeData” function, and the “FindVariableFeatures” function ([65]25–[66]27) was utilized to identify the top 2,000 highly variable genes. The “ScaleData” function was then used to normalize the data ([67]28, [68]29). The “CellCycleScoring” function was utilized to evaluate cell cycle phases, followed by dimensionality reduction via “RunPCA,” where the first 30 principal components (PCs) were selected for subsequent analysis. To mitigate batch effects across samples, the “harmony” R package (v0.1.1) was implemented ([69]30, [70]31). The “FindNeighbors” and “FindClusters” projects were used to cluster cells, and the annotation of the clusters was done using previously identified marker genes. Visualization of the clusters was performed using Uniform Manifold Approximation and Projection (UMAP) ([71]32, [72]33). InferCNV analysis To differentiate tumor cells from non-tumor cells, we utilized single-cell sequencing data and chromosomal sorting, employing the CNV algorithm via the “InferCNV” R package ([73]34). EPCs exhibiting heightened copy number variations were classified as Cancer-Associated EPCs (CAEPCs). Additionally, we conducted InferCNV analysis on the identified tumor subpopulations. Identification and enrichment of DEGs in CAEPC subpopulations Differentially expressed genes (DEGs) within CAEPC subpopulations were identified using the “FindAllMarkers” function from the Seurat package. We further employed the Ro/e algorithm to evaluate tissue-specific expression patterns and cell cycle preferences within these