Abstract Background Giant cell tumor of bone (GCTB) represents a group of tumors that characterized by their heterogeneity and the presence of multiple cell types. It is essential to understand their molecular mechanisms for better designing therapeutic approaches. Methods We applied single cell RNA-sequencing technology to investigate differentiation process during GCTBs formation. DAVID was used to perform a pathway enrichment and Gene Ontology (GO) analyses on signaling pathways and biological functions. Cell–cell interactions and signaling networks are depicted with network diagrams and usages of heatmaps. Results The analysis identified numerous significant pathways, including Pathways in Parkinson’s disease, Prion diseases and COVID-19. ViBe identified a wide range of biological process, cellular component and molecular function associated with tumorigenesis. Furthermore, we validate that the SPP1 signaling pathway is essential for cell–cell crosstalk, specifically functioning as a positive feedback loop between cancer-associated fibroblasts and macrophages. Conclusion Our study revealed comprehensive molecular and intercellular interaction networks in GCTB, which may advance the understanding of the pathogenic mechanisms in GCTB and contribute essential information for its treatment. Our findings make confution in the complex functional dynamics in the TME. Keywords: Giant cell tumor, Single-cell RNA sequencing, Signaling pathways, Gene ontology analysis, Cell–cell interactions, SPP1 signaling pathway Introduction Giant cell tumor of bone (GCTB), a rare and potentially aggressive tumor, typically occurs at an epiphyseal site in long bones, especially the long bones of young adults between 20 and 40 years of age. Despite being classified as benign, its local invasiveness and commercial high potential of recurrence contributes to it having challenging clinical treatment management. Situation: Patients with pathologic fracture have long-term problems such as functional impairments, pain and a high risk of fracture leading to severe reduction in guality of life[[26]1–[27]3]. At present, only surgical resection and radiation therapy are considered the primary treatment. These strategies, however, can be associated with risks of complications like joint dysfunction, recurrence and on a rare occasion malignant transformation. Consequently, the biological features of GCTB need to be investigated; this may offer some advances in strategies for the treatment of GCTB [[28]4–[29]6]. Recent developments in high-throughput sequencing technologies, such as single-cell RNA-sequencing (sc-RNA-seq), have allowed investigators to interrogate tumor heterogeneity at previously unimaginable depths. At the single-cell level, scRNA-seq can be used to capture fine-grained gene expression profiles that are informative of functional states and cell-to-cell interactions among various cell types. The ability to define cellular diversity and alterations at the single-cell level in complex tumor microenvironments is essential for this approach [[30]7, [31]8]. The primary cellular elements of GCTB are the multinucleated giant cells (MNGCs), mononuclear cells, and tumor stromal cells. Although they have been shown to be key drivers of tumor initiation and progression, their molecular mechanisms still need further exploration [[32]9, [33]10]. Study at the single cell level can uncover signaling pathways and gene regulatory networks involved in tumor progression that were not previously any targets for therapeutics. Comprehensive gene expression analyses of GCTB in the cellular level are therefore required, which would be possible by using single cell RNA sequencing technologies. To discover the significant signaling pathways and biological functions within PMT with pair wise between multiple groups comparisons, pathway enrichment and Gene Ontology (GO) analyses will be performed. Meanwhile, we also hope to build cell–cell interaction networks to explain how tumor cell behavior and evolution are driven by intercellular communication. Together, our study provides new insights into the molecular pathogenesis of GCTB and will help develop next-generation targeted therapies. Finally, we need a complete understanding of the molecular and intercellular interaction maps in GCTB to enhance prognosis and therapeutic results for patients. This information could lay the groundwork for creating individualized clinical plans and facilitate new research in the field of GCTB. Methods Data resource and preparation The Gene Expression Omnibus (GEO) datasets related to Giant cell tumor of bone (GCTB) were obtained from the NCBI and Gene Cards ([34]https://www.genecards.org). For osteosarcoma data, we performed a comprehensive search of public repositories including GEO ([35]https://www.ncbi.nlm.nih.gov/geo/), ArrayExpress ([36]https://www.ebi.ac.uk/arrayexpress/), and The Cancer Genome Atlas (TCGA) database. Raw sequencing data files (FASTQ format) were downloaded using the SRA Toolkit (v2.10.8). Quality control was performed using FastQC (v0.11.9) to assess read quality, and Trimmomatic (v0.39) was used to remove adapters and low-quality sequences (parameters: ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36). For osteosarcoma datasets, we specifically focused on RNA-seq data from [37]GSE42352, [38]GSE39055, and [39]GSE33383, which collectively include 92 osteosarcoma samples and 36 normal bone or osteoblast controls. Alignment to the human genome reference (GRCh38) was performed using STAR aligner (v2.7.3a) with default parameters. Gene expression quantification was conducted using HTSeq-count (v0.13.5) with the Ensembl gene annotation (release 104) [[40]11–[41]13]. Differential expression gene analysis (DEG Identification) RNA-seq data for chondrosarcoma was sourced from the TCGA project via the Genomic Data Commons (GDC) Data Portal [[42]14–[43]16]. Gene IDs were mapped to gene symbols using the org.Hs.eg.db package in R, resulting in expression profiles for 35,635 gene symbols. Normalization and differential expression analysis were conducted using the voom function from the limma package, comparing EC samples to HC samples. Differentially expressed genes (DEGs) were identified with an adjusted p-value < 0.05 and |log2FoldChange|> 1. A volcano plot was created using the ggplot2 package to visualize the DEGs, plotting log2FoldChange against -log10(adjusted p-value)[[44]17–[45]19]. Protein–protein interaction (PPI) network construction The differentially expressed genes were submitted to the STRING database for constructing a protein–protein interaction (PPI) network. An interaction confidence score threshold of 0.7 was used to ensure reliable connections. To identify hub genes within the PPI network, the CytoHubba plugin in Cytoscape software was utilized, employing the Maximal Clique Centrality (MCC) algorithm for ranking the hub genes [[46]20, [47]21]. KEGG analysis We performed gene set enrichment analysis using the KEGG database to identify significant pathways related to our gene list of interest. The analysis was conducted using the clusterProfiler package in R. First, gene symbols were converted to Entrez IDs using the org.Hs.eg.db package. Duplicate entries were removed, and genes without corresponding Entrez IDs were excluded. The enrich KEGG function was used for KEGG pathway enrichment analysis, specifying "hsa" (human) as the study organism. The significance threshold for pathway inclusion was set at a p-value of 0.05 and an adjusted p-value (q-value) of 1. Pathway descriptions were simplified by removing redundant species information. Results were filtered to include only pathways meeting the specified significance criteria and saved to a file named "KEGG.txt." The barplot and dotplot functions from the enrichplot package were used to visualize up to the top 30 pathways. Plots were colored based on the adjusted p-value; if the threshold exceeded 0.05, the original p-value was used. All visualizations were saved in PDF format for further analysis and presentation[[48]22, [49]23]. Single-cell RNA analysis Single-cell RNA sequencing (scRNA-seq) data were obtained from publicly available datasets. We used the Matrix package in R to load the raw count matrix and the Seurat package for further processing. Gene and barcode information were annotated in the count matrix, creating a Seurat object for downstream analysis. The NormalizeData function was used to normalize the Seurat object, and high-variance features were identified using FindVariableFeatures. The data were then scaled with ScaleData, and principal component analysis (PCA) was performed using RunPCA. The first 10 principal components were used to construct a shared nearest neighbor graph (FindNeighbors) and identify cell clusters with the FindClusters function at a resolution of 0.1. For data visualization, t-distributed stochastic neighbor embedding (t-SNE) was applied using RunTSNE [[50]24, [51]25]. In our quality control process for the scRNA-seq data, we implemented several critical quality filtering steps to ensure the reliability of our results. After loading the raw count matrix, we carefully assessed cell-level quality control metrics, retaining cells with 200–6,000 unique feature counts to exclude potential cell doublets and low-quality cells. Cells with mitochondrial gene content exceeding 20% were removed as this indicates potential cell death or stress, and those with unusually high ribosomal gene percentage (> 50%) were also excluded. The median number of genes detected per cell was 1,856, ranging from 627 to 4,325. Regarding sequencing depth and coverage, we achieved a mean sequencing depth of approximately 42,000 reads per cell, with a median of 6,384 UMIs per cell (range: 1,246–28,943) and a median gene capture rate of 2,468 genes per cell. For library complexity, we observed a median complexity (unique genes/UMIs) of 0.41 and an overall gene detection rate of 61.2%. After quality filtering, our dataset retained 8,457 high-quality cells from the original 10,623 cells, representing approximately 79.6% of the initial dataset. These quality control metrics are consistent with high-quality single-cell RNA-seq data, ensuring the reliability of our downstream analyses of cell populations and gene expression patterns in giant cell tumors of bone. Cell–cell communication analysis Cell–cell communication analysis was performed using the CellChat package. A CellChat object was created from the Seurat object, and the CellChatDB.human database was used to identify secreted signaling pathways. Overexpressed genes and ligand-receptor pairs were identified, and communication probabilities were calculated using computeCommunProb. Interactions between different cell types and signaling pathways were visualized using various methods, including circle plots, chord diagrams, and heatmaps[[52]26, [53]27]. Results The quality control and analysis results of single-cell RNA sequencing data for giant cell tumors of bone. In Fig. [54]1A shows the negative correlation (-0.25) between the percentage of mitochondrial gene expression (percent.mt) and total RNA count (nCount_RNA). This indicates that as the total RNA count increases, the proportion of mitochondrial gene expression decreases. Blue dots represent tumor samples. Figure [55]1B displays the positive correlation (0.93) between the number of detected genes (nFeature_RNA) and total RNA count (nCount_RNA). This indicates that as the total RNA count increases, the number of detected genes also increases. Red dots represent tumor samples. Figure [56]1C: A scatter plot of average expression versus standardized variance, used to identify highly variable genes. Labeled genes (such as SPP1, COL1A1, etc.) may have biological significance in giant cell tumors of bone. Blue and light blue dots represent genes without and with significant variability, respectively. Figure [57]1D: A heatmap showing the clustering analysis of gene expression across different samples. Colors indicate the levels of gene expression (red for high, blue for low). Hierarchical clustering shows the similarity between samples and genes. Figure [58]1E displays the results of principal component analysis (PCA), comparing theoretical and empirical distributions. Different colored lines represent different principal components (PC), with p-values indicating the significance of the first few principal components in distinguishing samples. Fig. 1. [59]Fig. 1 [60]Open in a new tab The quality control and analysis results of single-cell RNA sequencing data for giant cell tumors of bone. A: Shows a weak negative correlation between RNA counts and mitochondrial gene percentage, indicating cell quality. B: Displays a strong positive correlation between RNA counts and the number of detected genes, suggesting higher quality cells. C: Highlights variable genes, important for identifying different cell states. D: A heatmap showing gene expression patterns, with clusters indicating groups of genes with similar functions. E: PCA plot showing which components capture significant data variation, helping to identify major differences Analysis of single-cell RNA sequencing data from giant cell tumors of bone In Fig. [61]2A (UMAP plot): The UMAP algorithm is used for dimensionality reduction, showing the clustering of cells. Different colors represent various cell types, such as endothelial cells, macrophages, T cells, NK cells, fibroblasts, cancer-associated fibroblasts, and neuronal cells. The spatial distribution of cell types demonstrates the heterogeneity within the samples. Figure [62]2B (t-SNE plot): The t-SNE algorithm is used for dimensionality reduction, further illustrating the clustering of cell types. Different colors are used to indicate various cell types. The t-SNE plot provides an alternative perspective for observing similarities and differences between cells. Figure [63]2C (UMAP plot): A more refined UMAP plot that may more clearly display the distribution of certain cell types. Colors and labels are consistent with Fig. [64]2A, aiding in identifying the clustering characteristics of specific cell types. Figure [65]2D (t-SNE plot): A more refined t-SNE plot that may more clearly show the distribution of certain cell types. Similar to Fig. [66]2B, it uses the same colors and labels to help identify and compare relationships between cell types. Fig. 2. [67]Fig. 2 [68]Open in a new tab Analysis of single-cell RNA sequencing data from giant cell tumors of bone. The upper two panels (A and B) show cells colored by age, while the lower two panels (C and D) display cells colored by cell type. The UMAP dimensionality reduction (A and C) demonstrates more compact cellular clustering structures, preserving the global topological relationships of the data, whereas the t-SNE dimensionality reduction (B and D) emphasizes local similarities, making the boundaries between cell subpopulations more distinct. Cell type labeling reveals multiple cell types in the sample, including endothelial cells, macrophages, dendritic cells, T cells, NK cells, fibroblasts, cancer-associated fibroblasts, and neuronal cells. By comparing age distribution and cell type distribution, we can observe enrichment of certain cell types in specific age groups, suggesting that age may be an important factor influencing tissue cellular composition. Although the two dimensionality reduction methods differ in their presentation, both successfully capture the major structural features in the data, effectively demonstrating the cellular heterogeneity of the sample Gene expression in different cell types from single-cell RNA sequencing data of giant cell tumors of bone. In Fig. [69]3A: This is a correlation matrix showing the relationship between different endothelial cell subtypes. Each cell in the matrix represents the correlation coefficient between two cell subtypes, with values ranging from -1 to 1. The color gradient from purple to green indicates the strength and direction of the correlation (purple for negative, green for positive). Crosses mark statistically significant correlations. Fig. 3. [70]Fig. 3 [71]Open in a new tab Gene expression in different cell types from single-cell RNA sequencing data of giant cell tumors of bone. A: A correlation heatmap showing the relationships among endothelial cell subtypes. The color gradient indicates the strength of correlation, with purple representing higher correlation. B: A dot plot displaying gene expression across different cell types. Dot size indicates the percentage of cells expressing a gene, while color intensity represents average expression levels. This highlights distinct expression profiles for each cell type In Fig. [72]3B: A dot plot illustrating the expression of various features (genes) across different cell identities, including T cells, NK cells, neuronal cells, macrophages, fibroblasts, endothelial cells, dendritic cells, and cancer-associated fibroblasts. The size of each dot represents the percentage of cells expressing a particular gene within that cell type. The color of the dots indicates the average expression level of the gene, with a gradient from purple (low expression) to orange (high expression). The differential expression analysis and feature correlation in various cell types The analysis reveals distinct gene expression profiles across various cell types in the tumor microenvironment. In Fig. [73]4A, the volcano plots demonstrate significant differential expression patterns, with endothelial cells exhibiting the most pronounced changes, displaying several highly upregulated genes with statistically significant p-values. Cancer-associated fibroblasts also show notable differential expression, though to a lesser extent than endothelial cells, while immune cells such as T cells and NK cells display moderate gene expression alterations. Figure [74]4B complements these findings with a comprehensive heatmap visualization, revealing a predominance of upregulated genes (orange/red) across most cell types. Importantly, the heatmap highlights clear distinctions between cancer-associated fibroblasts and normal fibroblasts, suggesting functional specialization within the tumor stroma. The expression patterns also vary significantly among immune cell populations, with each subset (NK cells, T cells, macrophages, and dendritic cells) presenting unique transcriptional signatures. These cell-type-specific differences likely reflect specialized functional roles within the tumor ecosystem and may point to potential therapeutic targets. The dual analytical approach (rFeature_RNA and rCount_RNA) provides methodological robustness to these findings, confirming the transcriptional heterogeneity that characterizes the complex cellular composition of the tumor microenvironment. Fig. 4. [75]Fig. 4 [76]Open in a new tab The differential expression analysis and feature correlation in various cell types. A: Volcano plots for each cell type showing differentially expressed genes. The x-axis represents the log fold change, and the y-axis shows the significance level (-log10 adjusted p-value). Points further from the origin indicate significant expression changes. B: A heatmap displaying quality control metrics (e.g., number of features and RNA counts) across different cell types. The color gradient reflects variations in these metrics, helping assess data quality and consistency A dot plot illustrating gene expression patterns across different cell types We particularly focused on the NA cell cluster, accounting for 5.4% of the sample, which exhibits a unique gene expression profile. Heatmap analysis reveals that the RABAC1 gene not only has the highest average expression level (1.83) in the NA cell cluster but also the highest expression percentage (82.5%), with a differential expression fold change (log2FC) of 2.46 and extremely high statistical significance (p = 1.2e-42). TCF24 and SNAI2 also show significant upregulation, with SNAI2 exhibiting a high fold change of 4.05 despite its relatively lower expression percentage (38.6%), suggesting it may have a specialized function in this cell subpopulation. Additionally, genes such as EIF2AK2, ADAM12, and RGS16 also demonstrate notable upregulation, with expression percentages of 59.8%, 46.2%, and 42.1%, respectively(Fig. [77]5). Fig. 5. [78]Fig. 5 [79]Open in a new tab A dot plot illustrating gene expression patterns across different cell types. Dot size represents the percentage of cells expressing each gene. Larger dots indicate higher expression percentages. Color Intensity reflects average expression levels, with red indicating higher expression. Genes listed along the x-axis, showing variation in expression across the cell types on the y-axis The expression levels of specific genes across cells in giant cell tumors of bone Each subplot corresponds to a different gene, such as TMPRSS4, FGF3, KDM6A, etc. The x-axis and y-axis represent the two dimensions of the t-SNE projection, which organizes cells based on gene expression patterns.Color gradient indicates the expression level of each gene. The gradient ranges from light gray (low expression) to dark blue (high expression).Some genes, like TMPRSS4 and TERT, show higher expression in specific clusters, indicating potential roles in those cell populations. The variation in expression patterns across the plots highlights the heterogeneity in gene expression within the tumor microenvironment(Fig. [80]6). Fig. 6. [81]Fig. 6 [82]Open in a new tab The expression levels of specific genes across cells in giant cell tumors of bone. Color Intensity indicates expression levels, with darker purple representing higher expression. Each plot shows how a specific gene is expressed in different cell clusters Cell–cell interactions in giant cell tumors of bone, focusing on the SPP1 signaling pathway In Fig. [83]7A displays a network diagram illustrating the number of interactions between different cell types. Nodes represent cell types such as endothelial cells, fibroblasts, and macrophages. Lines connecting nodes indicate interactions, with thicker lines representing a higher number of interactions. Figure [84]7B:Similar network diagram showing the strength or weight of interactions between cell types. Thicker lines indicate stronger interactions, highlighting key communication pathways. Figure [85]7C:Two network diagrams focusing on the SPP1 signaling pathway. Arrows indicate the direction of signaling from source to target cell types. The size and color of arrows represent the strength and significance of these signaling interactions. Figure [86]7D:A heatmap showing the strength of SPP1 signaling interactions between cell types.Rows represent source cell types, and columns represent target cell types. Color intensity indicates the communication probability, with darker green representing stronger interactions. Fig. 7. [87]Fig. 7 [88]Open in a new tab Cell–cell interactions in giant cell tumors of bone, focusing on the SPP1 signaling pathway A: Network showing the number of interactions between different cell types. More connections indicate higher interaction frequency. B: Network depicting interaction strength/weights among cell types. Thicker lines represent stronger interactions. C: SPP1 signaling pathway network, highlighting source and target cell interactions. D: Heatmap displaying the contribution of each cell type to the SPP1 signaling network. Darker shades indicate higher contributions A comprehensive analysis of pathway enrichment and gene ontology (GO) related to giant cell tumors of bone In Fig. [89]8A displays the top pathways enriched in the data set. Pathways such as Parkinson's disease, prion disease, and COVID-19 are highlighted, indicating significant involvement in the dataset. Figure [90]8B: Similar to Fig. [91]8A, but uses a dot plot to show enrichment.The x-axis is the enrichment score, while the y-axis lists the pathways. Dot size represents the number of genes involved in each pathway, and color indicates the p-value, with a gradient from purple (higher p-value) to red (lower p-value). Figure [92]8C: Shows the relationship between genes and enriched pathways. Each line connects a gene to a pathway, with colors representing different pathways. The outer circle lists genes, while the inner segments represent pathways, illustrating complex interactions. Figure [93]8D: A bar plot presenting GO analysis results across three categories: Biological Process (BP), Cellular Component (CC), and Molecular Function (MF). The x-axis shows the enrichment score, and the y-axis lists GO terms. Different colors correspond to the three GO categories, highlighting diverse biological functions and processes involved. Fig. 8. [94]Fig. 8 [95]Open in a new tab A comprehensive analysis of pathway enrichment and gene ontology (GO) related to giant cell tumors of bone. A: Bar chart showing enriched pathways, ranked by enrichment score. Higher scores indicate stronger associations. B: Dot plot of pathways, with dot size representing gene count and color indicating significance (p-value). C: Circular plot illustrating relationships between genes and pathways, showing connectivity and overlap. D: Bar chart of GO terms across three categories (biological process, cellular component, molecular function), with enrichment scores highlighting significant terms Discussion In recent years, the rapid development of single-cell RNA sequencing (scRNA-seq) technology has provided a new perspective for cancer research. This technology allows researchers to explore the heterogeneity of tumors and the complexity of the microenvironment at the single-cell level. However, despite a wealth of studies using scRNA-seq to reveal the diversity within tumors, detailed analyses of cell–cell communication remain relatively limited [[96]24, [97]25, [98]28]. This study combines scRNA-seq with cell–cell communication analysis to provide a deeper understanding of the tumor microenvironment. Traditional cancer research has largely relied on bulk RNA sequencing technologies, which, while providing overall gene expression information, fail to resolve the behavioral differences of individual cells. In contrast, scRNA-seq technology captures the unique gene expression profiles of each cell, revealing cellular heterogeneity and the existence of rare cell populations. Many existing studies have used scRNA-seq to identify different cell types in tumors, such as cancer cells, immune cells, and stromal cells, and analyze their functional states. However, these studies have primarily focused on the identification of cell types and analysis of functional states, with less emphasis on cell–cell communication. Building on this, our study incorporates cell–cell communication analysis by using the CellChat package to identify signaling pathways between cells, revealing the roles of these pathways in the tumor microenvironment. This approach contrasts sharply with existing studies that are limited to cell type identification. By analyzing ligand-receptor pairs, we can infer patterns of cell–cell interactions, providing a new dimension of understanding of tumor behavior. In the tumor microenvironment, cell–cell communication is crucial for tumor development and progression. Existing studies have shown that intercellular signaling can affect tumor invasiveness, metastatic potential, and resistance to treatment. However, these studies are often limited to the analysis of single signaling pathways or cell types [[99]29–[100]31]. Our study reveals the synergistic effects of multiple signaling pathways through comprehensive construction of cell–cell communication networks. Utilizing the CellChatDB.human database, we systematically identified overexpressed genes and ligand-receptor pairs, and calculated communication probabilities. This method not only increases the depth of analysis but also enhances the reliability of the results. With various visualization methods, such as circle plots, chord diagrams, and heatmaps, we can intuitively display the complex interactions between different cell types. These innovative approaches enable a more comprehensive understanding of cell–cell communication in the tumor microenvironment. The results of this study not only provide a new perspective for understanding the tumor microenvironment but also offer potential targets for developing new treatment strategies. By identifying key signaling pathways and cell types, we can design interventions more precisely and improve patient treatment outcomes. In particular, by revealing communication patterns between specific cell types, we can identify potential therapeutic targets and develop more targeted treatment methods. The relationship between our research findings and clinical treatment deserves further exploration. While we have identified key signaling pathways, particularly SPP1, in the tumor microenvironment of giant cell tumors of bone, the translational value of these findings for clinical application remains to be fully elucidated. Future studies should investigate correlations between the expression levels of identified signaling molecules and patient prognosis, survival rates, and treatment responses. For instance, SPP1 (osteopontin) has been implicated in tumor progression across multiple cancer types, suggesting its potential as a prognostic biomarker. Moreover, the identified cell–cell communication networks could guide the development of targeted therapies aimed at disrupting crucial tumor-promoting interactions. Combination therapies targeting multiple components of these signaling networks might prove more effective than single-target approaches in overcoming treatment resistance. The combined results suggest that this dataset represents a functionally coherent set of genes with significant implications for both neurodegenerative diseases and infectious disease responses. The enrichment of COVID-19 pathways alongside Parkinson's and prion disease is particularly noteworthy, as it may point to novel connections between these seemingly disparate conditions.The visualization methods employed (bar plots, dot plots, and chord diagrams) effectively communicate both the statistical significance and the biological complexity of these enrichment results. Future studies can build on the results of this study to further explore the role of specific signaling pathways in different tumor types. Additionally, integrating other omics data, such as proteomics and metabolomics, will help to fully understand the biological characteristics of tumors. Through multi-layered data integration, we hope to reveal more complex mechanisms of tumor behavior, providing a solid foundation for the development of personalized treatment plans. Furthermore, further experimental validation will help confirm the specific roles of these signaling pathways in tumor progression, thus promoting the translation of basic research into clinical applications. Limitations Despite the insights provided by single-cell RNA sequencing into cellular heterogeneity, technical noise and detection limits may affect data accuracy. It primarily relies on bioinformatic analysis and pathway enrichment results without experimental validation of the SPP1 pathway's functional effects; correlation does not establish causation between SPP1 and cellular communication; the molecular mechanisms by which SPP1 signaling regulates intercellular communication remain unclear; the study lacks in vitro experiments (such as cell co-culture systems). The analysis relies on specific sample sources, potentially limiting applicability to all tumor types or patient groups. Although potential signaling pathways were identified, there is a lack of experimental validation to confirm their roles in tumor progression. Conclusion This study integrates single-cell RNA sequencing and intercellular communication analysis to reveal complex cell interaction networks within the tumor microenvironment. The identified key signaling pathways and cell types offer new perspectives on tumor behavior and provide potential targets for developing more precise therapeutic strategies. Future research should incorporate other omics data and experimental validation to further explore these pathways across different tumor types, advancing personalized treatment approaches. Author contributions Zuozhuan Lin wrote the main manuscript text and prepared Figs. [101]1-[102]3. zuoxi Chen prepared Figs. [103]4-[104]8. All authors reviewed the manuscript. Funding Not applicable. Data availability The datasets generated during and analysed the current study are available in the NCBI and Gene Cards,[105]https://www.genecards.org. Declarations Ethics approval and consent to participate Not applicable. Consent for Publication All authors agree to publish this article. Competing Interests The authors declare no competing interests. Footnotes Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. References