Abstract

Cancer-associated fibroblasts (CAFs) are a heterogeneous cell population that plays a crucial role in remodeling the tumor microenvironment (TME). Here, through integrated analysis of spatial and single-cell transcriptomics data across six common cancer types, we identified four distinct functional subgroups of CAFs and described their spatial distribution characteristics. Additionally, analysis of single-cell RNA sequencing (scRNA-seq) data from three additional common cancer types and two newly generated scRNA-seq datasets of rare cancer types, namely epithelial-myoepithelial carcinoma (EMC) and mucoepidermoid carcinoma (MEC), expanded our understanding of CAF heterogeneity. Cell–cell interaction analysis conducted within the spatial context highlighted the pivotal roles of matrix CAFs (mCAFs) in tumor angiogenesis and inflammatory CAFs (iCAFs) in shaping the immunosuppressive microenvironment. In patients with breast cancer (BRCA) undergoing anti-PD-1 immunotherapy, iCAFs demonstrated a heightened capacity to facilitate cancer cell proliferation, promote epithelial-mesenchymal transition (EMT), and contribute to the establishment of an immunosuppressive microenvironment. Furthermore, a scoring system based on iCAFs showed a significant correlation with immunotherapy response in melanoma patients. Lastly, we provide a web interface (https://chenxisd.shinyapps.io/pancaf/) for the research community to investigate CAFs in the pan-cancer context.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12943-023-01876-x.

Keywords: Spatial transcriptomics, Single-cell RNA sequencing, Pan-cancer analysis, Cancer-associated fibroblasts, Tumor microenvironment, Tumor immunotherapy

Introduction

Tumors display extensive heterogeneity, with cancer cells engaging in reciprocal interactions with their microenvironment, forming a complex ecosystem [1].
Cancer-associated fibroblasts (CAFs), as one of the most prominent and abundant cell populations in the tumor microenvironment (TME) [2], have garnered significant attention in recent years. The intricate interactions of CAFs with stromal components and immune cells play a crucial role in orchestrating TME reorganization, encompassing processes such as angiogenesis, extracellular matrix (ECM) remodeling, and immune evasion [3–5]. At present, most therapies, including immunotherapy and chemotherapy, largely overlook the crucial role of CAFs [3]. Our current understanding of the interplay between CAFs and other components of the TME is insufficient to support the development of reliable treatment strategies. Further research is needed to deepen our understanding of these interactions and pave the way for effective therapeutic interventions.

In recent years, single-cell transcriptomics has unraveled the heterogeneity of CAFs within many cancer types, such as bladder carcinoma (BC) [6], head and neck squamous cell carcinoma (HNSCC) [7], papillary thyroid carcinoma (PTC) [8], and lung cancer (LC) [9]. Furthermore, two recent unbiased studies based on single-cell RNA sequencing (scRNA-seq) explored the heterogeneity and plasticity of CAFs from a pan-cancer perspective and revealed the conservation of CAF phenotypes across cancer types [10, 11]. Although scRNA-seq provides an unprecedented opportunity to systematically dissect the heterogeneity of CAFs, the loss of spatial information during tissue dissociation hinders the study of the crosstalk between CAFs and the TME. Recently developed spatial transcriptomics (ST) can obtain whole-transcriptome data within tissue sections, thereby preserving the spatial position information of cells [12].
Therefore, orthogonal integration of scRNA-seq data and ST data will help determine the spatial distribution characteristics of CAFs and further dissect the cellular communication between CAFs and the TME.

In this study, we delineated the landscape of CAFs in six common cancer types and described the unique functional features of the CAF subtypes. We also analyzed scRNA-seq data of three additional common tumors and two newly sequenced rare tumors to expand our understanding of CAF heterogeneity. A spatial single-cell transcriptomic atlas spanning six tumor types and comprising 744,289 cells, generated by integrating scRNA-seq data and ST data, was used to describe the spatial distribution characteristics of CAFs and to characterize the complex interactions between CAFs and the TME. Notably, a score based on inflammatory CAFs (iCAFs) showed a significant correlation with the response of melanoma patients to immunotherapy. In summary, our integrated data resources provide novel insights and guidance for the development of therapeutic strategies targeting CAFs in the TME.

Results

Construction of a pan-cancer spatial single-cell transcriptome atlas

To establish a pan-cancer spatial single-cell landscape, we acquired scRNA-seq data from 69 samples from 56 patients diagnosed with one of six prevalent cancer types, along with ST data from 22 tissue slices from 22 patients (Fig. 1a and b; Tables S1 and S2). Among them, the ST data of 10 tissue slices had corresponding scRNA-seq data from the same patient (Fig. 1c; Tables S1 and S2). The data we collected covered six types of cancer: BRCA, colorectal cancer (CRC), liver hepatocellular carcinoma (LIHC), ovarian cancer (OVCA), prostate adenocarcinoma (PRAD), and uterine corpus endometrial carcinoma (UCEC) (Fig. 1a; Tables S1 and S2). After strict quality control and filtering, a total of 163,919 cells in the scRNA-seq data and 59,529 spots in the ST data were retained for downstream analysis (Fig.
1d and S1a). In the scRNA-seq dataset, the median number of unique molecular identifiers (UMIs) per cell was 3,955, and the median number of genes per cell was 1,425 (Figure S1b and c). For the ST data, the median number of UMIs per spot was 11,139 and the median number of genes per spot was 3,863 (Figure S1d and e). To minimize batch effects between different scRNA-seq datasets, we analyzed each dataset independently. Taking CRC as an example, we used graph-based clustering and identified seven major clusters based on typical markers of different cell types (Table S3): epithelial cells, fibroblasts, endothelial cells, T&NK cells, B cells, myeloid cells and mast cells (Fig. 1e and f). CopyKAT was used to estimate the single-cell copy number variation (CNV) landscape of tumors in order to distinguish malignant from non-malignant epithelium (Fig. 1e). The myeloid cells were further divided into monocytes, macrophages and dendritic cells (Fig. 1e). CD8+ T cells, CD4+ T cells, regulatory T cells (Treg cells) and natural killer cells (NK cells) were identified within the T&NK cluster (Fig. 1e). Similarly, cells from the other five cancer types were clustered into roughly the same subgroups (Figures S2, S3, S4 and S5). Of note, we detected neutrophils in the scRNA-seq data from the inDrop platform but not in the scRNA-seq data from the 10x Genomics platform (Figure S2). Neutrophils are very fragile and have low RNA content [13], which may be the main reason for the capture failure.

Fig. 1. A pan-cancer spatial single-cell transcriptome atlas. a Schematic depicting the study design. The cancer types included in this pan-cancer study are displayed in the first image on the left, created with Figdraw. b The number of samples in the pan-cancer analysis of scRNA-seq and ST.
c Pie chart showing the proportion of ST sections with and without corresponding scRNA-seq data from the same patient. d The number of cells in the pan-cancer analysis of scRNA-seq and ST. e Uniform Manifold Approximation and Projection (UMAP) plots showing the major cell types in CRC. f Bubble heatmap showing the expression of marker genes for the major cell types in CRC. g Spatial cell charting of CRC using CellTrek

CellTrek is a computational toolkit that enables direct mapping of individual cells back to their spatial coordinates in tissue sections based on scRNA-seq and ST data [14]. Unlike ST deconvolution methods, this approach transfers ST coordinates to single cells, thereby achieving single-cell resolution [14]. We applied it to the quality-controlled pan-cancer scRNA-seq and ST data to reconstruct spatial single-cell atlases. Even without corresponding scRNA-seq data from the same patient, the ST datasets were still largely covered by the scRNA-seq datasets according to the co-embedding results (Figure S6). Because some cells were mapped repeatedly, we ultimately obtained a pan-cancer spatial single-cell transcriptomic atlas containing 744,289 cells, exceeding the 163,919 cells retained after quality control (Fig. 1d and g).

CAF heterogeneity in pan-cancer

To compare the similarity of the main cell lineages across cancer types, we constructed a phylogenetic tree (Figure S7a). In contrast to the biased distribution of epithelial cells, fibroblasts from different cancer types clustered together (Figure S7a), indicating that fibroblasts had similar transcriptional features across cancer types. Interestingly, NK cells and B cells originating from UCEC demonstrated unique features (Figure S7a), implying that the TMEs of different cancer types may exert distinct effects on immune cell phenotypes. We subsequently investigated the heterogeneity of fibroblasts in the scRNA-seq datasets of the six cancer types (Fig. 2a).
Reclustering of the fibroblast cluster identified four CAF subtypes, as well as pericytes and smooth muscle cells (SMCs) (Fig. 2a). After batch correction with Harmony, all cells had a local inverse Simpson's index (LISI) greater than 1, indicating no obvious batch effects (Figure S8). CFD+ fibroblasts showed high expression of chemokines (CCL11, CXCL12, and CXCL14) (Fig. 2b; Table S4), similar to the previously reported iCAFs in various types of tumors such as BC [6] and PTC [8]. GO enrichment analysis of their marker genes showed an association with the response to mechanical stimulation, reactive oxygen species, epithelial cell proliferation, the immune system, and cell migration (Fig. 2c). POSTN+ fibroblasts showed high expression levels of several ECM remodeling genes (MMP11, CTHRC1, COL1A1, COL1A2, COL3A1, COL10A1, and COL11A1) and enriched ECM signatures (Fig. 2b and c; Table S4), consistent with the previously reported matrix CAFs (mCAFs) in cervical squamous cell carcinoma (CESC) [15]. Interestingly, one cluster of cells was related to the response to hypoxia and canonical glycolysis (Fig. 2c), resembling the reported metabolic CAFs (meCAFs) in pancreatic ductal adenocarcinoma (PDAC) [16]. Notably, we also found a cluster of cells that exhibited higher expression of a set of cell cycle-related genes (CENPF, NUSAP1, PTTG1, STMN1, TOP2A, and TUBA1B) (Fig. 2b; Table S4), consistent with the proliferative CAFs (pCAFs) described in a previous pan-cancer study of CAFs [10]. Immunofluorescence on tissue microarrays from BRCA patients further substantiated the existence of the four CAF subtypes (Figure S9). Next, we further investigated the heterogeneity of CAFs using the AUCell algorithm, based on the functional features of CAFs summarized by Lavie et al. [17] (Fig. 2d; Table S5).
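As a rough illustration of the rank-based idea behind AUCell-style gene-set scoring, the following minimal Python sketch ranks genes within each cell and measures how strongly a gene set is concentrated among the top-ranked genes. It is a simplified stand-in, not the AUCell package and not the pipeline used in this study: the function name `aucell_like_score`, the `top_frac` parameter, the deterministic tie-breaking, the normalization, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np


def aucell_like_score(expr, gene_idx, top_frac=0.05):
    """Simplified, AUCell-inspired gene-set score per cell (hypothetical helper).

    expr     : (n_cells, n_genes) expression matrix
    gene_idx : column indices of the genes in the set
    top_frac : fraction of top-ranked genes over which recovery is measured
    Returns one score in [0, 1] per cell.
    """
    expr = np.asarray(expr, dtype=float)
    gene_idx = np.asarray(gene_idx, dtype=int)
    n_cells, n_genes = expr.shape
    max_rank = max(1, int(np.ceil(top_frac * n_genes)))

    # Rank genes within each cell: rank 1 = highest expression.
    # Ties are broken by column order (an assumption of this sketch).
    order = np.argsort(-expr, axis=1, kind="stable")
    ranks = np.empty_like(order)
    ranks[np.arange(n_cells)[:, None], order] = np.arange(1, n_genes + 1)

    # Area under the discrete recovery curve: a set gene at rank r <= max_rank
    # is counted at every cutoff k = r..max_rank, i.e. (max_rank - r + 1) times.
    set_ranks = ranks[:, gene_idx]
    area = np.where(set_ranks <= max_rank, max_rank - set_ranks + 1, 0).sum(axis=1)

    # Normalize by the best case: the set genes occupy ranks 1..m.
    m = min(gene_idx.size, max_rank)
    max_area = m * max_rank - m * (m - 1) / 2
    return area / max_area


# Toy data (not from the study): 3 cells x 20 genes, gene set = columns 0 and 1.
toy = np.vstack([
    np.arange(20, 0, -1),                 # cell 0: set genes are the two highest
    np.arange(1, 21),                     # cell 1: set genes are the two lowest
    np.r_[20, 1, np.arange(19, 1, -1)],   # cell 2: one set gene highest, one lowest
])
scores = aucell_like_score(toy, gene_idx=[0, 1], top_frac=0.25)
```

Under these assumptions, a higher score means the gene set sits nearer the top of a cell's expression ranking. Averaging such per-cell scores within each CAF subtype would give a summary comparable in spirit to a pathway-activity heatmap.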
iCAFs exhibited the highest activity in immune-related functions, including complement activation, chemokine production, and inflammatory response (Fig. 2d). Additionally, the biological processes of angiogenesis, wound healing, regulation of ECM organization and collagen biosynthesis were all enriched in mCAFs (Fig. 2d). As expected, meCAFs exhibited a high level of glycolytic activity (Fig. 2d). Interestingly, in addition to the cell cycle, pCAFs were also involved in IFN-I production and muscle contraction (Fig. 2d).

Fig. 2. CAF heterogeneity in pan-cancer. a UMAP plots showing the integration of fibroblasts across six different cancer types by Harmony. b Differential expression analysis showing the upregulated genes for each fibroblast subtype. An adjusted p value < 0.05 is indicated in red, while an adjusted p value ≥ 0.05 is indicated in blue. c GO enrichment analysis of upregulated genes in each CAF subtype. d Heatmap showing pathway activities scored by AUCell in each CAF subtype. e Proportion of CAF subtypes across multiple cancer types. f Heatmap showing the ORs of CAF subtypes in each cancer type. g Scatter plot showing the RSSs in each CAF subtype. The top 5 regulons are highlighted. h SCAP analysis of metabolic pathways in meCAFs. i Slingshot trajectory analysis of CAFs. j GeneSwitches analysis of pathway activity changes in the transition pathway from pericytes to iCAFs

Although iCAFs and mCAFs were the major CAF cell types across the six cancer types, different CAF subtypes still exhibited significant cancer-type preferences (Fig. 2e and f). iCAFs were enriched in BRCA and CRC,