Abstract

   Tumor-infiltrating T cells are essential players in tumor
   immunotherapy. Great progress has been achieved in the investigation of
   T cell heterogeneity. However, little is well known about the shared
   characteristics of tumor-infiltrating T cells across cancers. In this
   study, we conduct a pan-cancer analysis of 349,799 T cells across 15
   cancers. The results show that the same T cell types had similar
   expression patterns regulated by specific transcription factor (TF)
   regulons across cancers. Multiple T cell type transition paths were
   consistent in cancers. We found that TF regulons associated with CD8^+
   T cells transitioned to terminally differentiated effector memory
   (Temra) or exhausted (Tex) states were associated with patient clinical
   classification. We also observed universal activated cell-cell
   interaction pathways of tumor-infiltrating T cells in all cancers, some
   of which specifically mediated crosstalk in certain cell types.
   Moreover, consistent characteristics of TCRs in the aspect of variable
   and joining region genes were found across cancers. Overall, our study
   reveals common features of tumor-infiltrating T cells in different
   cancers and suggests future avenues for rational, targeted
   immunotherapies.

   Keywords: MT: Bioinformatics, tumor-infiltrating T cells, pan-cancer,
   single-cell sequencing, TF regulon, state transition, cell-cell
   communication, T cell receptor

Graphical abstract

   graphic file with name fx1.jpg
   [47]Open in a new tab
     __________________________________________________________________

   Jiang and colleagues integrated extensive single-cell data from 15
   cancers as well as bulk gene expression, systematically delineated the
   immune ecosystem of tumor-infiltrating T cells and reveals the
   landscape of shared characteristics of tumor-infiltrating T cells
   across cancers.

Introduction

   Cancer development and progression are associated with alterations of
   the tumor microenvironment, which is complex and continuously
   evolving.[48]^1 Tumor-infiltrating T cells are important components of
   the tumor microenvironment,[49]^2 and they display a wide range of
   (dysfunctional) functional states that are shaped by a variety of
   inhibitory signals emerging in the tumor microenvironment.[50]^3 CD8^+
   T cells recognize and kill cancer cells directly after recognizing
   antigens, and adjust their developmental lineage and status in response
   to various cues. CD4^+ T cells coordinate a variety of immune responses
   to enhance or suppress immunity against cancer cells.[51]^4 The rise of
   single-cell technology enables the study of T cell state transition in
   the tumor microenvironment at single-cell resolution. Existing studies
   of tumor-infiltrating T cells cover a wide range of cancers. T cells
   show a continuous trajectory of activation and differentiation in
   breast cancer (BRCA).[52]^5 In melanoma, a continuous transition of
   CD8^+ T cells from an effector state to a dysfunctional state has been
   found.[53]^6 In addition to exhausted CD8^+ T cells, pre-exhausted
   cells were also identified in lung cancer patients.[54]^7 The study of
   bladder cancer (BLCA) patients identified several tumor-specific CD4^+
   T cells, including regulatory T cells (Tregs) and cytotoxic CD4^+
   T cells.[55]^8 Meanwhile, the presence of cytotoxic CD4^+ T cells was
   also observed in melanoma patients.[56]^9 Most of these previous
   studies have focused on the heterogeneity of different T cell types
   within cancer. Elucidating the common functional status and
   differentiation process of T cells in different cancers is of great
   significance for understanding cancer immunity.

   A wide range of transcription factor (TF) regulation has been observed
   in cancers.[57]^10 For example, hypoxia induces the expression of
   HIF-1α and subsequently stimulates the expression of CXCL12.[58]^10
   Continuously activated STAT3 and STAT5 increase the proliferation,
   survival, and invasion of tumor cells, and inhibit antitumor
   immunity.[59]^11 Moreover, The complex regulatory network of TFs is a
   key factor controlling the development and differentiation of
   T cells.[60]^12 For instance, TCF1 is essential for the early
   differentiation of T follicular helper (Tfh) cells, and the lack of
   TCF1 impairs the generation of Tfh cells.[61]^13 FOXP3 is a TF of Tregs
   that can inhibit immune response and play a critical role in
   maintaining immune tolerance.[62]^14 However, the development of
   T cells is dynamic and involves a variety of cellular states, and
   current studies are insufficient to elucidate the regulation of the
   complex immune microenvironment in tumors. Further studies are still
   needed to investigate the key TF regulators in pan-cancer.

   Here, we conducted a systematically comprehensive analysis of T cells
   in 15 cancers. We identified T cells with different functional states,
   as well as active TF regulons for each cell state in pan-cancer. T cell
   state transitions were accompanied by changes in TF regulon expression.
   Furthermore, on the basis of cell-cell interaction analysis, we
   identified common activated interaction pathways in cancers. In
   addition, we characterized clonal expansion, preference variable
   (V) and joining (J) segment usage, and CDR3 characteristics of T cell
   receptor (TCR) in CD4^+ and CD8^+ T cells across cancers. Overall, our
   study of common features in tumor-infiltrating T cells across cancers
   may provide new insights into tumor immunity.

Results

Sub-phenotyping of T cells across 15 cancers

   We collected 15 cancer single-cell RNA sequencing (scRNA-seq) datasets
   on T cells ([63]Figure 1A; [64]Table S1). A standard single-cell
   analysis process was performed to identify T cells in esophageal
   squamous cell carcinoma (ESCC) and nasopharyngeal carcinoma (NPC)
   datasets because their cell type labels were not available
   ([65]Figure S1). After quality-control filtering, a total of 349,799
   T cells were obtained from tumor, adjacent normal tissue, and
   peripheral blood of 101 patients, in which 35.5%–87.5% cells had TCR
   data ([66]Figure 1A). To further understand the characteristics of
   tumor-infiltrating T cells, we performed unsupervised clustering
   analysis of CD4^+ and CD8^+ T cells, respectively. A total of 19 CD4^+
   and 17 CD8^+ T cell clusters were identified ([67]Figures 1B, [68]S2A,
   and S2B). These clusters exhibited distinct tissue distributions
   ([69]Figures 1C and [70]S2C). Naive clusters were dominant in PBMCs,
   while Trm were mainly distributed in normal tissue. CD4_CX3CR1^+ Temra
   was mainly distributed in adjacent normal tissue and PBMCs and
   CD8_CX3CR1^+ Temra was mainly distributed in PBMCs. CD8_CXCL13^+ Tex
   and Tregs were tumor enriched. In addition, we identified cell clusters
   with high expression of proliferative genes in both CD4^+ and CD8^+
   T cells. Cell-cycle score showed that these clusters were in G2M or S
   phase, and these clusters also showed high expression of exhausted or
   Treg-related marker genes ([71]Figures 1D, [72]S2A, and S2B).

Figure 1.

   [73]Figure 1
   [74]Open in a new tab

   T cell clusters across cancers

   (A) The data overview of 15 cancers. (B) UMAP visualization of CD8^+
   and CD4^+ T cell clusters. (C) The R(O/E) values of each cluster in
   tumor, normal, and PBMCs. The chi-square test was used to calculate the
   R(O/E) value. R(O/E) > 1 (above the dashed line) indicates enrichment.
   (D) Cell cycle score of T cell clusters.

   We observed similar distribution of typical T cell types in different
   cancers, including naive, memory, and exhausted T cells.
   ([75]Figure S3). However, there are some cell types that were
   heterogeneous across cancers. For example, MAIT cells were enriched in
   hepatocellular carcinoma (HCC). Studies have shown that MAIT cells can
   represent up to 45% of liver lymphocytes.[76]^15 Consistent with a
   previous study, CD8_CD160^+ T cells were intraepithelial lymphocytes,
   which were mainly distributed in colorectal cancer (CRC).[77]^16
   Intraepithelial lymphocytes are an important contributors to intestinal
   immune homeostasis.[78]^17

Shared T cell state-transition paths in 15 cancers

   Because T cells can transition between different functional states, we
   employed Monocle3 to infer cell trajectory. The results showed a
   variety of state-transition paths between T cell clusters. For CD4^+
   T cells, we found that CD4_CCL5^+ Tm and CD4_CXCR6^+ Th17 appeared to
   be a hub for the development of naive cells into other cell types
   ([79]Figure 2A). When developmental trajectories were inferred
   individually for each cancer, this phenomenon was found to be retained
   in B cell lymphoma (BCL), BLCA, esophageal cancer (ESCA), renal
   carcinoma (RC), thyroid carcinoma (THCA), and uterine corpus
   endometrial carcinoma (UCEC) ([80]Figure S4A). However, in ESCC and
   multiple myeloma (MM), only CD4_CXCR6^+ Th17 was shown as a hub, while
   in BRCA, CRC, HCC, NPC, non-small cell lung cancer (NSCLC), ovarian
   cancer (OV), and pancreatic cancer (PACA), CD4_CCL5^+ Tm was shown as a
   hub ([81]FigureS4A). CD4_CCL5^+ Tm and CD4_CXCR6^+ Th17 significantly
   share marker genes with downstream cell clusters ([82]Figure 2B).
   Enrichment analysis showed that the marker genes of CD4_CCL5^+ Tm and
   CD4_CXCR6^+ Th17 were simultaneously enriched in cell adhesion,
   differentiation, and migration ([83]Figure S4B). CD4_CX3CR1^+ Temra was
   transitioned from CD4_GZMK^+ Tem, and TCR similarity analysis showed
   that these two groups shared TCR clonotypes ([84]Figure S4C). We
   observed several clusters with high expression of Treg marker genes,
   among which CD4_TNFRSE9^+ Treg and CD4_TNFRSF9^− Treg were enriched in
   tumor and PBMCs, respectively ([85]Figures 1C and [86]S2C). In
   addition, high TCR similarity was observed among these clusters,
   especially between CD4_TNFRSE9^+ Treg and CD4_HSPA1A^+ Treg
   ([87]Figure S4C). Trajectory analysis revealed that CD4_ISG^+ Treg,
   CD4_NMB^+ Treg, and CD4_HSPA1A^+ Treg appeared as transitional states
   in different cancers and then developed into CD4_TNFRSE9^+ Treg
   ([88]Figures 2A and [89]S4A). The development path between CD4_STMN1^+
   Treg and CD4_TNFRSE9^+ Treg was observed in ESCA, OV, RC, and PACA
   ([90]Figure S4A).

Figure 2.

   [91]Figure 2
   [92]Open in a new tab

   CD8_TXNIP^+ T cells migrated from PBMCs to tumor tissue and developed
   into Temra or Tex cells

   (A) UMAP showing the development trajectories of T cell clusters in all
   cancers inferred by Monocle3. (B) The overlapped marker genes of
   CD4_CXCR6^+ Th17, CD4_CCL5^+ Tm, CD4_Th1-like, CD4_ANXA1^+ Tm, and
   CD4_GZMK^+ Tem. Significance p value was calculated by hypergeometric
   test. (C) Shared marker genes between CD8_TXNIP^+ T cells and other
   T cells. Significance was calculated by hypergeometric test. ∗p < 0.05,
   ∗∗p < 0.01, ∗∗∗∗p < 0.0001. (D) Migration potentials from PBMCs to
   tumor of CD8_TXNIP^+ T cells quantified by STARTRAC-pMigr. (E)
   Developmental transition of CD8_TXNIP^+ T cells with other T cells
   quantified by STARTRAC-pTran. (F) The similarity of GO terms enriched
   by CD8_TXNIP^+ T cell marker genes and annotated with word clouds.

   In CD8^+ T cells, we observed multiple exhausted paths, of which the
   path from CD8_Tn to CD8_IL7R^+ Tm to CD8_ZNF683^+ Trm to Tex in CRC,
   ESCA, MM, OV, PACA, THCA, and UCEC has been found previously
   ([93]Figures 2A and [94]S4D).[95]^18 In addition, the path from CD8_Tn
   to CD8_IL7R^+ Tm to CD8_CD69^+ Trm to CD8_ISG^+ T to Tex was found in
   CRC, ESCA, NPC, and THCA. Remarkably, we observed a branched structure
   in all cancers, whether trajectory inference was made collectively or
   individually for each cancer, which from CD8_TXNIP^+ T to Temra or Tex,
   with CD8_KLRG1^+ T and CD8_GZMK^+ T located in between ([96]Figures 2A
   and [97]S4D). High TCR similarity was also observed among these
   clusters ([98]Figure S4E). Moreover, we found that the marker genes of
   CD8_TXNIP^+ T were significantly shared with that of CD8_ZNF683^+ Trm,
   including ZNF683, CXCR6, ITGAE, and KLRC1 ([99]Figure 2C). CD8_TXNIP^+
   T cells were enriched in all tissues, especially in PBMCs, but
   the clonality was highest in the tumor ([100]Figures S2C and S4F).
   STARTRAC-migr analysis revealed a high degree of TCR sharing of
   CD8_TXNIP^+ T cells between PBMCs and tumor, especially in MM
   ([101]Figure 2D).[102]^16 At the same time, STARTRAC-tran analysis
   indicated that CD8_TXNIP^+ T cells were highly associated with Trm and
   CD8_RPL36A^+ T cells ([103]Figure 2E). Enrichment analysis showed that
   CD8_TXNIP^+ T cells involved immune cell activation and proliferation,
   including T cell activation, lymphocyte differentiation, and regulation
   of T cell proliferation ([104]Figure 2F).

Clinically relevant state-transition TF regulons across cancers

   TFs have been long recognized to play a central role in the maintenance
   of cell identity.[105]^19 We applied SCENIC to identify key regulons.
   SCENIC can simultaneously reconstruct gene regulatory networks and
   identify cell states from scRNA-seq data.[106]^20 Significantly active
   regulons were identified in each cancer ([107]Tables S2 and [108]S3).
   By calculating regulon specificity score (RSS), we identified
   cell-type-specific regulons. For tumor-infiltrating T cells, most
   regulons were cancer-type specific, indicating that T cells undergo
   specific transcriptional regulation in specific contexts
   ([109]Figure S5A). However, we also observed universal
   cell-type-specific regulons of tumor-infiltrating T cells across
   cancers. We screened each cell type for regulons that were present in
   at least five cancers and obtained 67 and 58 regulons in CD4^+ and
   CD8^+ T cell clusters, respectively. These regulons were significantly
   shared between CD4^+ and CD8^+ T cells ([110]Figures 3A, [111]S5B, and
   [112]S6A). The regulons were related to the function of the
   corresponding cell type. For example, naive specific regulons TCF7 and
   LEF1 were essential for early T cell development.[113]^21 Regulons in
   CD69^+ Trm included multiple zinc finger TFs (EGR1/2/3/4, KLF5/10).
   EGR1, EGR2, and EGR3 have been shown to regulate T cell activation,
   antigen-induced T cell proliferation, and effector
   differentiation.[114]^22^,[115]^23 TBX21 was identified as
   Temra-specific regulon in both CD4^+ and CD8^+ T cells
   ([116]Figure 3B). RUNX3, CD4_CX3CR1+ Temra-specific regulon, was a TF
   that regulated the cytotoxic phenotype in CD4^+ cytotoxic
   T cells.[117]^24 Another CD8_CX3CR1+ Temra-specific regulon, KLF2, was
   shown to inhibit tumor cell growth and migration in HCC,[118]^25
   NSCLC,[119]^26 and ccRCC.[120]^27 ETV1, a Tex-specific regulon, was
   associated with immune invasion and poor prognosis in CRC.[121]^28 At
   the same time, we also calculated the correlation between the regulon
   activity score (RAS) of specific regulons and the TCR clonality in each
   cluster ([122]Figures 3B, [123]S5B, and [124]S6A). The activity of
   these regulons was correlated with TCR clonality in different cancers.
   For example, TBX21 was positively correlated with TCR clonality in five
   cancers.

Figure 3.

   [125]Figure 3
   [126]Open in a new tab

   Shared state-transition TF regulons across cancers

   (A) Venn diagrams showing the overlap cluster-specific regulons of CD4
   and CD8. (B) The statistics of cluster-specific regulons (left) and the
   cluster-specific regulons in at least five cancers (right). The colors
   around the dot represent the relationship between RSS and TCR
   clonality. (C) The differential regulons between CD8_TXNIP^+ T,
   CD8_CX3CR1^+ Temra, and CD8_CXCL13^+ Tex. (D) Differential expression
   of regulons (C) in tumor compared with normal or PBMCs. (E) The
   expression level of regulons (C) in KIRC subtypes. (F) The Kaplan-Meier
   plot of KIRC patient-based classifications generated from consensus
   clustering. The survival difference among groups was calculated by log
   rank test. (G) Pathologic stage distribution of KIRC patients in each
   subtype. (H) The volcano plot of differentially expressed genes between
   pre-therapy and post-therapy patients in melanoma ([127]GSE115821).
   Differential expression analysis was performed using wilcox.test. (I)
   The volcano plot of differentially expressed genes between responder
   and non-responder patients in melanoma (PRJEB23709). Differential
   expression analysis was performed using wilcox.test.

   In view of the above-mentioned trajectories, we analyzed the
   differential regulons between CD8_TXNIP^+ T, Temra, and Tex in tumor
   tissue. Finally, in five or more cancers, we identified 14 regulons
   that were upregulated in Temra and downregulated in Tex
   (Temra-activated TFs) or downregulated in Temra and upregulated in Tex
   (Tex-activated TFs) compared with CD8_TXNIP^+ T, which we termed
   state-transition TFs ([128]Figure 3C). The expression of
   state-transition TFs was significantly different between tumor and
   other tissues ([129]Figure 3D). Remarkably, ETV1 was upregulated in
   tumor-infiltrating T cells of all cancers and its target genes shared
   by cancers were also upregulated ([130]Figure S6B). We further
   investigated the association between the expression of state-transition
   TFs and patient survival using TCGA cancer data, and the result showed
   that they significantly affected overall survival of patients with
   different cancers ([131]Figure S6C). Next, we performed cancer
   subtyping based on the expression of state-transition TFs in TCGA. In
   13 cancers, the results showed that they divided tumor samples into
   different subtypes with significantly different survival probability
   ([132]Figure S7A). We observed that the C2 subtype in KIRC with high
   expression levels of Temra-activated TFs had a better prognosis
   ([133]Figures 3E and 3F). There were no significant differences in
   gender and age between the two classes, but the pathologic stage of
   most samples in C2 was low stage ([134]Figures 3G, [135]S7B, and S7C).

   Next, we explored the changes of these TFs after immune checkpoint
   blocking therapy by using cancer immunotherapy data that were
   downloaded from the TIGER database.[136]^29 The result showed that the
   expression of state-transition TFs was significantly upregulated after
   treatment in melanoma, except for IRF4 ([137]Figure 3H). Moreover, in
   another melanoma immunotherapy dataset, state-transition TFs were found
   to be highly expressed in responders, except for ETV1 and EZH2, which
   were highly expressed in non-responders ([138]Figure 3I). Overall, the
   T cell state-transition TF regulons we identified were helpful in tumor
   classification and could be used as independent prognostic factors and
   immunotherapy targets.

Universal activated T cell interaction pathways in cancers

   Cancer development relies on a complicate network of cell-cell
   interactions.[139]^30 We inferred communications for all T cell
   clusters by CellChat.[140]^31 There were more interactions in tumor
   compared with other tissues, with the highest number in CRC and the
   fewest in BLCA ([141]Figure S8A). By calculating incoming and outgoing
   interaction strength, we found that there were significant differences
   in the interaction strength of cell clusters between different tissues
   ([142]Figure S8B). For example, in tumor and normal tissues, the
   interaction strength of CD4_TNFRSF9^+ Treg, CD8_ZNF683^+ Trm, and
   CD8_CXCL13^+ Tex was increased, while it was decreased in naive and
   Temra.

   Cell-cell interactions involve a variety of signaling pathways. Next we
   mapped the signaling pathways inferred from the different tissues of
   each cancer onto a two-dimensional manifold and clustered into
   different groups ([143]Figure S9A). Tumor-specific groups were observed
   in BCL, OV, and PACA, and there were normal-specific groups in PACA and
   OV. Groups dominated by different tissues have also been observed in
   other cancers. By computing the Euclidean distance between any pairs of
   shared signaling pathways in different tissues, we observed a large
   distance for CCL, CXCL, and FASLG pathways and a lesser distance for
   the IFN-II pathway ([144]Figure 4A). Furthermore, according to the
   information flow of each signaling pathway, we found that the changes
   of some pathways were different in cancers, including TNF, PARs, CLEC,
   ITGB2, and IL-16. In contrast, we also found 11 pathways (e.g., CCL,
   CXCL, and CD70) that were turned on or increased in the tumor of all
   cancers compared with other tissues, except BLCA, due to fewer
   interaction pathways found. The contribution of ligand-receptor (LR)
   pairs in these pathways was consistent in all cancers ([145]Figure 4B),
   and the expression of these LR genes was upregulated in
   tumor-infiltrating T cells ([146]Figure S9B). There was no cell type
   bias in the incoming and outgoing patterns of CADM, CCL, CXCL, CD70,
   FASLG, GALECTIN, and LIGHT pathways. But an interaction pattern
   preference for CD4^+ T cells or CD8^+ T cells was observed in VCAM,
   BLTA, and LT pathways ([147]Figure S9C).

Figure 4.

   [148]Figure 4
   [149]Open in a new tab

   Universal activated T cell interaction pathways in 15 cancers

   (A) The Euclidean distances of overlapping signaling pathways between
   different tissues in the shared two-dimensional manifold. Tumor vs.
   normal (left), tumor vs. PBMCs (right). (B) Circos plot of signaling
   pathways that significantly differ in overall information flow between
   different tissues (left) and the relative contribution of
   ligand-receptor (right). (C) The incoming signaling patterns of
   CD8_TXNIP^+ T cells in all cancers. (D) The outgoing signaling patterns
   of the GALECTIN pathway. (E) The expression of LGALS9 in each T cell
   cluster. (F) PPI network of ligand and receptor in the GALECTIN
   pathway; TFs in ISG^+ T cells and differential regulons between
   CD8_TXNIP^+ T, CD8_Temra, and CD8_Tex.

   Next, we found that GALECTIN was the incoming signaling pathway for the
   CD8_TXNIP^+ T cells in all cancers except OV ([150]Figure 4C). The
   outgoing cells of GALECTIN were mainly from STMN1^+ and ISG^+ T cell
   clusters ([151]Figure 4D). The contribution LR pair of the GALECTIN
   pathway was LGALS9-HAVCR2/CD45/CD44 ([152]Figure 4B). LGALS9 was
   specifically expressed in ISG^+ and STMN^+ T cells, especially in ISG^+
   T cells ([153]Figure 4E). Protein-protein interaction (PPI) network
   analysis of LR pairs in GALECTIN, ISG^+ T cell cluster-specific
   regulons, and previously identified differential regulons between
   CD8_TXNIP^+ T, CD8_Temra, and CD8_Tex revealed that they interact with
   each other ([154]Figure 4F). This seems to imply a regulatory
   relationship between ISG^+ T cells and CD8_TXNIP^+ T cells.

   In addition, we observed that the outgoing signaling of the LT pathway
   was only found in Tregs ([155]Figure 5A). Its ligand LTA was the target
   gene of Treg-specific regulons REL and NFKB2, and the downstream TFs of
   its paired receptor were TNF signaling pathway-related genes
   ([156]Figure 5B). At the same time, we observed that TNF signaling
   pathway-related genes were simultaneously upregulated in tumor tissue
   ([157]Figure 5C). The activity of the TNF signaling pathway was
   different in each T cell type ([158]Figure 5D). These results suggested
   that Tregs activate the TNF signaling pathway in receptor cells through
   LTA-TNFRSF1B and then influence the fate of other cells
   ([159]Figure 5E).

Figure 5.

   [160]Figure 5
   [161]Open in a new tab

   Tregs activated the TNF pathway in other cells through the LT pathway

   (A) The outgoing signaling patterns of the LT pathway. (B) The Sankey
   plot of ligand-receptor in the LT pathway and TFs upstream of the
   ligand and downstream of the receptor. (C) Differentially expressed TNF
   signaling pathway-related genes in tumor compared with normal or PBMCs.
   (D) The TNF signaling pathway activity in each CD4 (top) and CD8
   (bottom) clusters. (E) Schematic illustration of the LTA-TNFRSF1B
   interaction activating the TNF signaling pathway. The gene expression
   changes are consistent with (C).

TCR landscape of T cells in pan-cancer

   TCR is a key factor mediating T cell immunity, and TCR analysis is
   helpful to understand T cell immunity. First, we analyzed the baseline
   repertoire of CD4^+ and CD8^+ T cells. The TCR diversity and clonality
   of T cells in tumor tissue were significantly higher than that in
   normal tissue and PBMCs ([162]Figure 6A). TCR clustering by
   GLPH2[163]^32 obtained 26,922 and 14,565 groups in CD4^+ and CD8^+
   T cells, of which 22.1% and 47.6% were cancer type specific
   ([164]Tables S4 and [165]S5). T cells with the same V, J, and CDR3
   sequences were defined as the same clonotype. The percentage of clonal
   TCRs of CD8^+ T cells was significantly higher than that of CD4^+
   T cells ([166]Figure 6B). The clonal TCRs in CD8^+ T cells ranged from
   7.6% of BCL to 20.1% of RC, and from 0.5% of NPC to 7.1% of RC in CD4^+
   T cells. The clone size of individual clonotypes was proportional to
   the overall number of clonal TCRs in cancer ([167]Figure 6C). The
   maximum TCR clone size can reach to more than 2,000 in THCA, which was
   200 times the maximum TCR clone size in NPC ([168]Table S6). In
   addition, there were significant differences in the proportion of V and
   J gene usage between CD4^+ and CD8^+ T cells ([169]Figure S10A). For
   example, TRAV1-2 was the most frequently used Vα gene segment in CD8^+
   T cells, while TRAV9-2 was the most frequently used Vα gene segment in
   CD4^+ T cells. Next, we compared V and J segments across different
   tissues and found significant differences in the usage of V and J
   segments between tumor and other tissues, but only a few segments were
   consistent across cancers ([170]Figure S10B).

Figure 6.

   [171]Figure 6
   [172]Open in a new tab

   The consistent features of V and J segments and CDR3s across cancers

   (A) The TCR diversity and clonality of CD4^+ and CD8^+ T cells. The
   significant p values were calculated by Wilcoxon rank-sum test. (B)
   Comparison of clonal TCRs between CD4^+ and CD8^+ T cells. The
   significant p values were calculated by Wilcoxon rank-sum test. (C) The
   statistics of clonal TCRs in CD8^+ T cells and CD4^+ T cells. (D) The
   TCR clonality of each cluster in tumor, normal, and PBMCs. The
   significant p values were calculated by Wilcoxon rank-sum test.
   ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001,∗∗∗∗p < 0.0001. (E) The fold change
   of clonality between CD8_CX3CR1^+ Temra and CD8_CXCL13^+ Tex in each
   cancer. (F) The line plot represents the trend of TCR clonality between
   T cell clusters on CD8_TXNIP^+ T cell development trajectories. (G) The
   proportion of amino acid types in CDR3s between T cell clusters on
   CD8_TXNIP^+ T cell development trajectories. The significant p values
   were calculated by Wilcoxon rank-sum test.

   Next, by analyzing the TCR characteristics of T cell clusters, we found
   that there was little clonal expansion of naive T cells, and the highly
   expanded TCRs were mainly in Temra and Tex cells ([173]Figure 6D). In
   addition, we compared the expansion of CD8_CX3CR1^+ Temra
   and CD8_CXCL13^+ Tex in tumor, and found that CD8_CX3CR1^+ Temra was
   highly expanded in BRCA, MM, THCA, and RC, while CD8_CXCL13^+ Tex was
   highly expanded in UCEC, PACA, ESCA, and ESCC ([174]Figure 6E). Since
   trajectory analysis showed that CD8_TXNIP^+ T cells transition to Temra
   or Tex, we investigate the changes of TCR characteristics among these
   clusters. The result showed that the clonality of TCR increased
   gradually along the developmental trajectory ([175]Figure 6F). In
   addition, we found that the amino acid types in the CDR3 region were
   different between the two development paths ([176]Figure 6G). For
   example, on the Tex path the proportion of polar amino acids in CDR3β
   significantly increased, but on the Temra path the proportion of
   non-polar amino acids was significantly increased.

Discussion

   The last decade has witnessed tremendous progress in cancer
   immunotherapy, such as immune checkpoint blockade (ICB), chimeric
   antigen receptor T cell therapy, and TCR-T.[177]^33^,[178]^34^,[179]^35
   Despite the successful application of immunotherapy across a broad
   range of cancers, only a minority of patients benefit from it. T cells
   mediate anti-tumor immune responses as targets of ICB therapy.[180]^36
   Each T cell is endowed with a clonally restricted surface TCR, which is
   a key mediator in peptide-MHC recognition and cell
   activation.[181]^37^,[182]^38 The heterogeneity of T cells has made
   great progress. However, the similar characteristics of T cells in
   different cancers remain to be characterized. Therefore, a clear
   understanding of the co-altered biology of T cells across cancers is
   needed to fully exploit the therapeutic capacity of immunotherapy. In
   this study, we performed a systematic analysis of T cells from 15
   cancer datasets. We identified T cell types in a variety of functional
   states that were enriched in different tissues. In particular,
   we observed multiple Treg subsets in CD4^+ T cells, with CD4_TNFRSF9^+
   Treg representing the activated state. Tregs inhibit aberrant immune
   response against autoantigens and also suppress anti-tumor immune
   response.[183]^39 Trajectory analysis revealed the transition of these
   Treg clusters to the activated state across cancers. Among them, the
   transition from CD4_TNFRSF9^− Treg and CD4_ISG^+ Treg to the activated
   state has been observed in pan-cancer.[184]^18 We also observed the
   transition from CD4_HSPA1A^+ Treg to the activated state in multiple
   cancers. Overexpression of HSPA1A can inhibit the antitumor effect of
   histone deacetylase inhibitors.[185]^40 These results suggest a
   potential contribution of interferon-stimulated genes and heat shock
   protein genes in Treg activation, which may promote tumor immune
   evasion.

   T cell exhaustion is a state of T cell dysfunction. It prevents optimal
   control of malignant cells.[186]^41 In addition to previously reported
   T cell exhausted paths,[187]^18 we also found the state transition from
   CD8_TXNIP^+ T to CD8_KLRG^+ T and CD8_GZMK^+ T, and finally to
   CD8_Temra and CD8_Tex in all cancers. This trajectory persisted even in
   HCC with the smallest number of cells. Moreover, there were 14 TF
   regulons that significantly changed their activity in CD8_CX3CR1^+
   Temra and CD8_CXCL13^+ Tex compared with CD8_TXNIP^+ T. Also,
   clinically relevant cancer subtypes were identified in TCGA cancers
   based on the expression of these TFs. Among them, ETV1 has been shown
   to be a drug therapeutic target for gastrointestinal stromal
   tumor.[188]^42 While the expression of TBX21 in tumor CD8^+ T cells was
   induced by immune checkpoint blocking.[189]^43 In addition, CD8_TXNIP^+
   T cells were enriched in all tissues and the STARTRAC analysis
   suggested its ability to migrate from the periphery to the tumor and
   then differentiate into effector cells. These results implied that
   CD8_TXNIP^+ T cells may be a target for tumor immunotherapy.

   Analysis of cell-cell communication patterns can reveal coordinated
   responses between different cell types. We separately analyzed
   cell-cell interactions of all cell types in different tissues and
   revealed the rewired communications in T cells. We found that 11
   pathways play important roles in mediating the tumor-infiltrating
   T cell communications in all cancers. These included chemokine (CCL,
   CXCL) and cell adhesion (CADM, VCAM) pathways associated with T cell
   trafficking. CXCR6 is essential for antitumor effect of CD8^+
   T cells,[190]^44 and the CXCL13/CXCR5 axis plays an important role in
   tumor immunity.[191]^45 We observed that LT was a Treg-specific
   outgoing signaling pathway. Its LR pair and downstream genes were
   involved in the TNF signaling pathway. TNF pathway-related genes were
   significantly overexpressed in tumor-infiltrating T cells, and TNF
   pathway activity was highest in CD69^+ Trm. Studies have shown that
   certain types of Tregs can affect Trm function.[192]^46^,[193]^47 These
   results suggested that Tregs may regulate other cells by activating the
   TNF signaling pathway. Further studies of the TNF pathway are still
   needed to explore its effect on tumor immunity. In addition, the
   GALECTIN pathway mediated the interaction between ISG^+ T cells and
   CD8_TXNIP^+ T cells via LR pairs for LGALS9-HAVCR2/CD45/CD44. LGALS9 is
   highly expressed by induced Tregs.[194]^48 LGALS9 interacting with
   HAVCR2 can regulate T cell exhaustion and apoptosis,[195]^49 while
   LGALS9 interacting with CD44 can promote T cell
   activity.[196]^48^,[197]^50 This may be the potential mechanism that
   triggers CD8_TXNIP^+ T cells to differentiate into different cell
   states. Limiting the interaction between LGALS9 and HAVCR2 may prevent
   T cell exhaustion.

   Then we systematically characterized the TCR of all cancers.
   Tumor-infiltrating T cells had both higher TCR diversity and higher TCR
   clonality. CD8^+ T cells had a significantly higher percentage of
   clonal expansion than CD4^+ T cells, which was found in all cancers.
   Studies have shown that CD8^+ T cells have larger clones than CD4^+
   T cells, and the size of CD4 clones was more tightly regulated.[198]^51
   Further analysis revealed differences in TCR characteristics between
   CD8_CX3CR1^+ Temra and CD8_CXCL13^+ Tex cells.

   In summary, our study systematically delineates the immune ecosystem of
   tumor-infiltrating T cells and reveals the landscape of shared
   characteristics of tumor-infiltrating T cells across cancers, which
   will provide new insights into the underlying mechanisms of tumor
   immunity.

Materials and methods

Single-cell data collection

   We downloaded 15 cancer scRNA-seq datasets from the Gene Expression
   Omnibus (GEO) ([199]https://www.ncbi.nlm.nih.gov/geo/), which were
   generated from different platforms, including 10x and Smart-Seq2. Each
   dataset contains single-cell transcriptome count matrix and paired TCR
   data. A total of 521,472 cells involved 103 patients and 3 different
   tissue types, including tumor, adjacent normal tissue, and peripheral
   blood ([200]Table S1). We extracted T cells for subsequent analysis.
   Low-quality cells (<200 genes/cell and >10% mitochondrial genes) were
   excluded. Finally, a total of 349,799 T cells were obtained from 101
   patients.

Data integration, unsupervised clustering, and cell type annotation

   To subcluster T cells from all cancer data, we used Seurat (version
   3.2.3) to create the gene expression matrix object. All functions were
   run with default parameters, unless otherwise specified. Genes shared
   by all datasets were retained. Then data normalization, variable
   feature identification, data scaling, and principal-component analysis
   were performed. To integrate T cells from different platforms and
   tissues, we used the harmony algorithm.[201]^52 Next, a KNN graph was
   constructed using the FindNeighbors function in Seurat. Finally, cells
   were clustered using the FindClusters function in Seurat.
   The resolution parameter was set to 1 and 1.5 for CD4^+ T cells and
   CD8^+ T cells, respectively. Uniform Manifold Approximation and
   Projection (UMAP) was used to visualize T cells in a two-dimensional
   space.

   Cluster biomarkers were identified using the FindAllMarkers procedure
   in Seurat, which identified differentially expressed genes for each
   cluster using a Wilcoxon rank-sum test. The significant differentially
   expressed gene was determined by the following thresholds: (1) the gene
   was expressed in at least 10% of the cells in one group, (2) a
   Bonferroni-adjusted p value lower 0.05, (3) the average log-scale
   fold-change between two groups greater than 0.25. We annotated the
   clusters as different major cell types based on the top ranked
   differentially expressed genes in each cluster.

Trajectory analysis

   Monocle3 (version 1.0.0) was utilized to investigate transcriptional
   and functional trajectories of T cell clusters.[202]^53 Specifically,
   principal components were calculated from the expression matrix for
   each cancer. The cells were divided into large and well-separated
   groups using Louvain/Leiden community detection by Cluster_Cells
   function, and within each group a principal graph was fitted using the
   function Learn_Graph. The principal graph was shown as a trajectory on
   the UMAP, then the function order_cells was used to calculate the
   pseudotime.

SCENIC analysis

   TF activity was analyzed using pySCENIC per cancer with raw count
   matrices as input.[203]^20 First, the grnboost2 function was used to
   infer gene co-expression modules and then identify the regulons
   enriched for the motifs of the TF. Finally, the aucell function was
   used to calculate the RAS. The RSS was calculated by calcRSS function
   in R.

Subtype classification of TCGA cancer samples

   We classified samples into optimal clusters by referencing the
   expression of TF regulons using the R package “ConsensusClusterPlus.”
   The parameters were “maxK = 3, reps = 1000, clusterAlg = pam,
   distance = spearman.”

Survival analysis

   R package TCGAbiolinks was used to obtain the gene expression of 33
   cancer types. We also downloaded the clinical data by UCSC Xena
   ([204]https://xenabrowser.net/datapages/).[205]^54 We used univariate
   Cox regression analysis to evaluate the association between
   survival time and the expression level of each gene. Genes with hazard
   ratio (HR) > 1 and p < 0.5 were identified as risk factors in cancer,
   while genes with HR < 1 and p < 0.05 were identified as protective
   factors in cancer. The Kaplan-Meier method was used to estimate the
   overall survival time between different groups, and differences in
   survival time were analyzed using the log rank test (p < 0.05).

Pathway enrichment analysis

   We performed pathway enrichment analysis for target genes of
   cell-type-specific TF by clusterProfiler (version 4.4.4).[206]^55 The
   parameters were “pvalueCutoff = 0.05, pAdjustMethod = ‘BH,’ minGSSize =
   10.” Next, all the GO terms were clustered based on the
   implifyEnrichment R package.[207]^56 The pathway activity of TNF
   signaling pathway was calculated by gene set variation analysis
   (version 1.38.1).[208]^57

Cell-cell communication analysis

   We used CellChat (version 1.4.0) for cell-cell communication analysis
   based on known LR pairs.[209]^31 Specifically, the combined CD4 and CD8
   Seurat object was subjected as input for CellChat. Then the functions
   computeCommunProb and computeCommunProbPathway were used to infer
   interactions and communication probability. The
   computeNetSimilarityPairwise function was used to calculate the
   similarity between pathways, and pathways were clustered into different
   groups based on functional similarities. In addition, we calculated the
   difference of the same pathway between different tissues, using the
   function rankSimilarity. The downstream TFs list of receptors were
   downloaded from CellCall
   ([210]https://github.com/ShellyCoder/cellcall).[211]^58

PPI analysis

   The PPI network was constructed by STRING database
   ([212]https://cn.string-db.org/).[213]^59

TCR diversity and clonality

   The clonal diversity was calculated using Shannon’s entropy:
   [MATH: <mrow><mtable><mtr><mtd><msup><mrow><mi
   mathvariant="normal">S</mi><mi mathvariant="normal">h</mi><mi
   mathvariant="normal">a</mi><mi mathvariant="normal">n</mi><mi
   mathvariant="normal">n</mi><mi mathvariant="normal">o</mi><mi
   mathvariant="normal">n</mi></mrow><mo>′</mo></msup><mi
   mathvariant="normal">s</mi><mspace width="0.25em"></mspace><mi
   mathvariant="normal">e</mi><mi mathvariant="normal">n</mi><mi
   mathvariant="normal">t</mi><mi mathvariant="normal">r</mi><mi
   mathvariant="normal">o</mi><mi mathvariant="normal">p</mi><mi
   mathvariant="normal">y</mi><mo linebreak="badbreak">=</mo><mo
   linebreak="badbreak">−</mo><mrow><munderover><mo>∑</mo><mrow><mi>i</mi>
   <mo
   linebreak="badbreak">=</mo><mn>1</mn></mrow><mi>N</mi></munderover><mro
   w><mi>P</mi><mrow><mo>(</mo><mi>i</mi><mo>)</mo></mrow><mi>ln</mi><mrow
   ><mi>P</mi><mrow><mo>(</mo><mi>i</mi><mo>)</mo></mrow></mrow></mrow></m
   row></mtd></mtr></mtable></mrow> :MATH]
   (Equation 1)

   where N is the number of all clonotypes in each cancer, P(i) represents
   the frequency of the ith clonotype. The higher Shannon’s entropy, the
   more diverse the distribution of CDR3 clones.

   Clonality is defined based on diversity:
   [MATH: <mrow><mtable><mtr><mtd><mi mathvariant="normal">C</mi><mi
   mathvariant="normal">l</mi><mi mathvariant="normal">o</mi><mi
   mathvariant="normal">n</mi><mi mathvariant="normal">a</mi><mi
   mathvariant="normal">l</mi><mi mathvariant="normal">i</mi><mi
   mathvariant="normal">t</mi><mi mathvariant="normal">y</mi><mo
   linebreak="badbreak">=</mo><mn>1</mn><mo
   linebreak="badbreak">−</mo><mfrac><mrow><mo>−</mo><mrow><munderover><mo
   >∑</mo><mrow><mi>i</mi><mo
   linebreak="badbreak">=</mo><mn>1</mn></mrow><mi>N</mi></munderover><mro
   w><mi>P</mi><mrow><mo>(</mo><mi>i</mi><mo>)</mo></mrow><mi>ln</mi><mrow
   ><mi>P</mi><mrow><mo>(</mo><mi>i</mi><mo>)</mo></mrow></mrow></mrow></m
   row></mrow><mrow><mi>ln</mi><mspace
   width="0.25em"></mspace><mi>M</mi></mrow></mfrac></mtd></mtr></mtable><
   /mrow> :MATH]
   (Equation 2)

   where M is the number of unique TCR in each cancer.

TCR similarity analysis

   The similarity of V-J pairs between different tissues and the
   similarity of TCRs between different cell clusters were calculated
   using Jaccard similarity coefficient:
   [MATH: <mrow><mtable><mtr><mtd><mi
   mathvariant="normal">J</mi><mrow><mo>(</mo><mi
   mathvariant="normal">A</mi><mo>,</mo><mi
   mathvariant="normal">B</mi><mo>)</mo></mrow><mo
   linebreak="badbreak">=</mo><mfrac><mrow><mi>A</mi><mo
   linebreak="badbreak">∩</mo><mi>B</mi></mrow><mrow><mi>A</mi><mo
   linebreak="badbreak">∪</mo><mi>B</mi></mrow></mfrac></mtd></mtr></mtabl
   e></mrow> :MATH]
   (Equation 3)

   For TCR similarity, we calculated the median similarity across all
   cancers as the final similarity. Then Cytoscape (V.3.7.2) was used for
   network visualization.[214]^60

TCR clustering analysis

   GLIPH2 ([215]http://50.255.35.37:8080/) was used for TCR
   clustering.[216]^32 The input information includes CDR3b, Vb, Jb,
   CDR3a, clone size, and sample information of each TCR. We clustered the
   TCRs of all cancers together.

Statistics analysis

   Statistical analyses were performed using R. Fisher’s exact test was
   used to assess differences in usage of V-J pairs (p < 0.05). The
   Pearson correlation coefficient was used to measure the relation of
   regulon activity score and TCR clonality (p < 0.05). The Wilcoxon
   rank-sum test was used for differentially expressed genes between tumor
   and other tissues (adjust p <0.05).

Acknowledgments