Graphical abstract graphic file with name fx1.jpg [37]Open in a new tab Highlights * • Immunosuppression in the tumor microenvironment of O and P * • A 30-gene signature of ovarian-derived TAMs was validated to predict O * • Dynamic changes in immune and stromal cell states during the PT-to-M progression * • Identification of new transcription factor CLOCK in the TGPRNs of metastatic GC __________________________________________________________________ Biological sciences; Cancer; Transcriptomics Introduction Gastric cancer (GC) is a major global malignancy, ranking fifth in cancer incidence and fourth in cancer-related deaths, with particularly high prevalence in Asia, Eastern Europe, and Central America, where incidence rates are notably elevated.[38]^1 Despite the implementation of surgical interventions, the prognosis of GC remains poor, with an overall five-year survival rate of less than 30%, primarily due to late-stage diagnoses and challenges associated with organ-specific metastasis.[39]^2 Metastasis is a major contributor to the poor prognosis of GC, yet the molecular mechanisms underlying this cellular behavior remain largely elusive.[40]^3 Tumor cells possess the ability to invade distant sites via various routes, including blood circulation, the lymphatic system, direct infiltration, and transcoelomic spread.[41]^4 During this metastatic process, tumor cells undergo adaptive changes, activating specific genes and pathways that enable them to thrive in particular organs. The interactions between the primary tumor, the host microenvironment, and immune cells recruited by the tumor play a pivotal role in driving metastatic progression.[42]^5 To develop targeted therapeutic strategies and identify potential biomarkers for clinical diagnosis, it is crucial to accurately delineate the organ-specific features of metastasis. However, the current understanding of organ-specific metastasis in GC remains incomplete. The exploration of GC diversity has advanced significantly with the application of single-cell transcriptome sequencing, a cutting-edge methodology.[43]^6 While the conventional approach to investigating organ-specific metastasis typically involves transcriptome profiling, some researchers have employed bulk RNA sequencing (RNA-seq) to identify genes critical to site-specific metastases. The expression of these genes in the primary tumor has been linked to the recurrence of GC metastasis.[44]^7 However, bulk methods have inherent limitations, particularly in discerning cell diversity, resulting in an oversight of the intricate intra- and inter-tumor complexity in GC.[45]^8 The emergence of single-cell RNA sequencing (scRNA-seq) technology has revolutionized this landscape, enabling precise and comprehensive examination of intra- and inter-tumor heterogeneity across diverse cancer types.[46]^9 Transcription factors (TFs) play a crucial role in regulating gene expression and thereby exert indirect control over various cellular processes and states. Overexpression of specific TFs has the potential to induce significant alterations in cell fate, making them a promising approach to cancer diagnosis, prognosis, and treatment.[47]^10 Therefore, to enhance our understanding of GC and develop potential therapeutic interventions, it is essential to identify cell-specific TFs in both primary and metastatic GC. In this study, we characterized a total of 42,968 high-quality cells from six patients across 10 samples to identify seven major cell types and 27 unique immune and stromal cell subsets within the TME of primary and metastatic GC. Our analysis comprehensively explored cellular compositions, developmental trajectories, cell interactions, and transcriptional regulation between primary tumors and metastases. These findings provide valuable insights into the cellular, molecular, and functional changes during GC progression. Furthermore, we constructed a cell-specific TFs regulatory network in both primary and metastatic GC, enhancing our understanding of the transcriptional mechanisms driving the progression from primary to metastatic GC. Our study paves the way for future investigations and offers a valuable resource for subsequent academic research. Results Remodeling the single-cell transcriptional landscape reveals insights into primary-to-metastatic GC progression We performed an in-depth analysis of single-cell transcriptomic data obtained from the Gene Expression Omnibus (GEO) dataset [48]GSE163558 ,[49]^8 which comprises 10 human tissue samples from six patients. Our sample cohort included three primary tumor samples (PT1, PT2, and PT3), one adjacent non-tumor sample (NT1), and six unique metastatic samples (M): liver metastasis (Li1 and Li2), lymph node metastasis (LN1 and LN2), peritoneal metastasis (P1), and ovarian metastasis (O1). Specifically, PT1 and Li1 were from patient 1 (p1); PT2 and NT1 were from patient 2 (p2); O1 was from patient 3 (p3); PT3, Li2, and LN1 were from patient 4 (p4); LN2 was from patient 5 (p5); and P1 was from patient 6 (p6). After rigorous quality control, 42,968 cells were retained for subsequent analysis, with an average of 1,639 genes, 5,529 unique molecular identifiers (UMIs), and only about 6% mitochondrial genes per cell detected ([50]Figures S1B–S1D). Interestingly, the proportion of ribosomal genes, as well as their correlation with nCount_RNA and nFeature_RNA, was found to be very low. Unsupervised clustering, based on transcriptome similarity, identified 12 distinct cell clusters, which were visualized using the t-distributed stochastic neighbor embedding (tSNE) analysis. The spatial distribution of cells from each patient and sample was also obtained ([51]Figure S1A). Correlation analysis validated the reliability of the clusters, demonstrating that clusters from the same cell lineage exhibited higher similarity than those from different lineages ([52]Figure S1E). Immune cells accounted for the majority of our analyzed cells from the GC samples, particularly in the lymph node and liver, which has also been observed in other scRNA-seq studies of cancer.[53]^11 We identified seven major cell types using the following markers: T cells (e.g., CD3D, CD3E, CD2), B cells (e.g., CD79A, IGHG1, IGHG3), myeloid cells (e.g., CD68, CXCL8), natural killer (NK) cells (e.g., KLRD1, GNLY, KLRF1), epithelial cells (e.g., EPCAM, KRT19, ELF3), stromal cells (e.g., VWF, COL1A1, COL12A1), and hematopoietic stem cells (HSCs) (e.g., AREG, ABCC1, GATA2). Intriguingly, gene ontology (GO) analysis showed that multiple mast cell-associated pathways were enriched in HSCs ([54]Figure S1G), indicating that HSCs have the potential to differentiate into mast cells under certain conditions, may be involved in mast cell-related immune responses or tumor progression. These cell clusters were characterized by marker gene expression and distinct physiological functions, exhibited distinct distributions during GC primary to metastatic (PT-to-M) progression ([55]Figures 1A–1C and [56]S1F and [57]Table S1). Notably, myeloid, epithelial, and stromal cells showed decreasing trends during PT-to-M progression, whereas B cells exhibited an increasing trend during GC lymph node metastasis progression, and T cells showed an increasing trend during both GC lymph node and liver metastasis progression ([58]Figures 1A and 1B). Figure 1. [59]Figure 1 [60]Open in a new tab Single-cell atlas of primary tumor, metastasis, and adjacent non-tumor samples (A) tSNE visualization highlighting seven distinct cell types among the 42,968 cells (left), delineating the TME cell clusters across different tissue groups (right). (B) Proportion of each cell type in various samples (NT, PT, M, Li, LN, P, and O) and patients (p1-p6). (C) Left: heatmap showing the expression profiles of the top five genes ranked by LogFC of each cell type. Right: enriched KEGG pathways for the marker genes of each cell type. (D) Venn diagram for the overlap analysis of the marker genes across the seven cell types and five cancer samples. (E) Ring heatmap visualizing the marker genes specific to each of five cancer samples. (F) Dot plot showing the enrichment pathways in the five cancer samples based on the DEGs using KEGG enrichment analysis. (G) Functional changes during PT-to-M progression. Next, we explored the dynamic changes in the TME from additional perspectives. At the molecular level, we identified 114 overlapping genes as potential markers during PT-to-M progression, revealing a correlation between changes in the cellular compositions and the expression of cellular marker genes ([61]Figure 1D). CD79A, a marker gene of B cells, was specifically expressed in LN, suggesting a link between its expression and the increasing trend of B cells in LN ([62]Figure 1E). Additionally, we performed Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis on the differentially expressed genes (DEGs) in our cancer samples, which provided insights into the functional differences between PT and M ([63]Figure 1F and [64]Table S2). The activation of the NF-κB signaling, IL-17 signaling, and TNF signaling pathways in PT suggested dynamic interactions between cancer cells and the immune and inflammatory components of the TME.[65]^12 Furthermore, Li exhibited enrichment for the MAPK signaling pathway, and O showed specific enrichment for cholesterol metabolism. We also observed the activation of ferroptosis in P, a phenomenon reported in other studies,[66]^13 indicating the involvement of cell death regulation in GC peritoneal metastasis.[67]^14 Together, the KEGG pathway enrichment analysis indicated that cells undergo significant functional reprogramming during PT-to-M progression, with upregulated antigen processing and presentation pathways observed during the progression of GC liver and peritoneal metastasis ([68]Figures 1G and [69]S1H). Immunosuppression in the TME of GC ovarian and peritoneal metastasis and a 30 gene signature of ovarian-derived TAMs was validated to predict ovarian metastasis Sub-clustering analyses of immune and stromal cells identified 27 cell subsets ([70]Figures 2A and [71]S2A). T cells were divided into nine clusters: CD8^+ T cells, naive CD8^+ T cells, naive CD4^+ T cells, CD8^+ T effector memory (Tem) cells, natural killer T (NKT) cells, regulatory T cells (Tregs), CD4^+ T effector memory (Tem) cells, exhausted CD8^+ T cells, and natural regulatory T cells. B cells were primarily composed of six subgroups: T cell-like B cells,[72]^8 naive B cells, memory B cells, plasma cells, germinal center (GC) B cells, and myeloid-like B cells (M-B cells). Myeloid cells were further divided into five subsets: neutrophils, monocytes, macrophages, tumor-associated macrophages (TAMs), and plasmacytoid dendritic cells (pDCs). NK cells were subdivided into four subclusters: NK cell progenitors (Pro-NK cells), CD3/CD28-stimulated NK cells, IL7R + NK cells, and SLFN13+ NK cells. Stromal cells were categorized into endothelial cells, pericytes, and cancer-associated fibroblasts (CAFs). Notably, each cluster exhibited a distinct gene expression pattern ([73]Figure 2B and [74]Table S3). Correlation analysis revealed that clusters from the same cell lineage showed higher similarity to one another than to those from other lineages ([75]Figure S2B), confirming the reliability of the clustering. The distribution of UMIs in each cell lineage is shown in [76]Figure S2C. These results highlight a high degree of cell state diversity within the TME of primary and metastatic GC. Figure 2. [77]Figure 2 [78]Open in a new tab Subclusters of immune and stromal cells characterized during GC progression (A) tSNE showing 27 unique immune and stromal cell subsets (upper panels) and their corresponding TME cell clusters (lower panels) across different tissue groups. (B) Heatmap of the expression of subset-specific markers across cell subsets. (C) Reproducible cell subset distributions across the five cancer samples. (D) Heatmap showing tissue preferences of clusters in each immune and