Graphical abstract
graphic file with name fx1.jpg
[25]Open in a new tab
Highlights
* •
Immune cells in the tumor microenvironment show metabolic
heterogeneity
* •
CD8^+ T cells with similar immunological state can demonstrate
metabolic differences
* •
A metabolic gene signature is capable of predicting patient
response to ICI
* •
A unique tumor metabolic signature contributes to acquired ICI
resistance
__________________________________________________________________
Human metabolism; Immunology; Cancer
Introduction
Over the past decade cancer therapy has been revolutionized by the
usage of immune checkpoint inhibitors (ICI), resulting in unprecedented
rates of long-lasting tumor responses in patients with a variety of
cancers.[26]^1 Nevertheless, most patients do not respond to treatment
or acquire resistance,[27]^2^,[28]^3 reaching response rates of less
than 40% in solid tumors[29]^4 with a large number of partial
responders.[30]^5 To address this challenge, multiple biomarkers of
response have been suggested, including different gene
signatures,[31]^6^,[32]^7^,[33]^8 T cell abundance
levels,[34]^9^,[35]^10^,[36]^11 as well as metrics related to the
tumor’s genetics, including the tumor mutational burden and copy number
alteration.[37]^12^,[38]^13^,[39]^14^,[40]^15^,[41]^16 These and other
metrics are, however, rarely predictive across datasets, and certainly
across different cancer types.
More recently, differences in the metabolic adaptation of tumor and
immune cells and their competition for resources have been shown to
contribute to patient response.[42]^17^,[43]^18^,[44]^19^,[45]^20
Alterations in cellular metabolism lead to tumor-immune interactions
that simultaneously deplete vital nutrients from the common environment
and produce immunosuppressive metabolic byproducts.[46]^20^,[47]^21 For
instance, enzymatic activation of IDO depletes tryptophan levels, a
vital nutrient for T cell response. At the same time, this activity
generates kynurenine, a metabolite causing upregulation of PD1 on
activated CD8^+ T cells.[48]^22^,[49]^23 Another example involves the
high lactate levels generated by tumor cells due to up-regulation of
lactate dehydrogenase (LDH), resulting in an acidic environment that
suppresses natural killer (NK) cells cytotoxicity, blocks monocytes and
dendritic cells differentiation, and inhibits effector T cells
activity.[50]^24^,[51]^25^,[52]^26^,[53]^27 These and other findings
were mainly established in in vitro systems or in animal models.
Studying the metabolic transcriptomic changes that occur in immune
cells found in the TME of patients treated with ICI, can therefore shed
light on mechanisms of response to therapy and highlight immune
metabolic vulnerabilities.
Here, we perform a semi-supervised analysis of single-cell RNA-seq data
from tumor biopsies of patients treated with ICI, focusing on ∼1700
metabolic genes that are associated with ∼100 metabolic
pathways.[54]^28 Focusing on this well-defined cellular process and its
narrowed gene set, enabled us to discover novel associations with
patient response that were masked under the genome-wide analyses. We
found that different immune cell states share similar metabolic
activity and vice versa, resulting in a novel classification to
cellular states, mainly involving CD8^+ T cells. Furthermore, we
detected metabolic clusters that are significantly associated with
patient response. A non-discrete analysis of metabolic programs
revealed a metabolic gene signature that is significantly associated
with tumor progression or regression across various cancer types,
including melanoma, Merkel cell carcinoma (MCC), lung, and breast
cancers. Finally, we show a metabolic involvement in the polarization
of macrophages to a suppressive M2-like phenotype in an acquired
resistance patient, together with the associated tumor metabolic
changes.
Results
Immune cellular metabolism and its association with patient response to ICI
To identify immune metabolic changes at the single-cell level and study
their association with ICI treatment and patient response, we
re-analyzed our previously published dataset that includes single-cell
RNA sequencing (scRNA-seq) of 16,291 CD45^+ cells from 48 tumor
biopsies taken from 32 stage IV metastatic melanoma patients treated
with ICI[55]^29 ([56]Table S1). Eleven patients had longitudinal
biopsies and 20 patients with one or two biopsies, taken either at
baseline or during treatment. Tumor samples were classified based on
radiologic assessments into progression/non-responder (NR; n = 31,
including stable disease (SD)/progressive disease (PD) samples) or
regression/responder (R; n = 17, including complete response
(CR)/partial response (PR) samples).[57]^30 Using the previously
defined cell classification into 11 clusters,[58]^29 we devised a score
for 97 metabolic pathways based on the expression of their associated
metabolic genes[59]^28 ([60]Table S2, [61]STAR Methods).
This analysis showed several metabolic pathways which are significantly
more expressed than others across all cells, as well as specific cell
types and states that are more metabolically active ([62]Figures 1B,
1C, and [63]S1). Specifically, we found that pathways such as
glycolysis, oxidative phosphorylation (OXPHOS), and tricarboxylic acid
(TCA) cycle, are generally active across many cell types with
heterogeneous pattern of activation, as was similarly demonstrated in
melanoma and in head and neck squamous cell carcinoma (HNSCC)
patients.[64]^31 In contrast, other pathways such as nucleotide salvage
pathway, heme degradation, and purine synthesis, are active only in
specific cell types ([65]Figure 1D). Moreover, we found that cells from
the myeloid lineage and cycling exhausted T cells, that were previously
associated with negative response to immunotherapy,[66]^29 are the most
metabolically active. On the other hand, memory T cells and B cells
that were previously associated with positive response to
immunotherapy,[67]^29 show lower metabolic expression across all
pathways ([68]Figure 1C). Overall, this continuous metabolic score
showed high heterogeneity across cell types, going beyond sequencing
reading depth ([69]Figure S2).
Figure 1.
[70]Figure 1
[71]Open in a new tab
Immune cellular metabolism and its association with patient response to
ICI
(A) Schematic workflow of the study.
(B) Top 20 active metabolic pathways across all single-cells, colored
by the mean score of each pathway across all single-cells.
(C) Rank of metabolic activity by immune cell types, colored by the
mean score of all metabolic pathways in each cell type.
(D) Selected metabolic pathways and their mean scores across different
cell types separated by commonly expressed, cell-type specific, and
lowly expressed pathways.
(E) Fraction of cells expressing metabolic genes associated with
patient response across all cells within each sample. Wilcoxon rank sum
test.
(F) Fraction of cells expressing metabolic genes associated with
patient response across CD8^+ T cells within each sample. Wilcoxon rank
sum test.
(G) Mean score of metabolic pathways associated with non-responding
samples across all cells within each sample. Wilcoxon rank sum test.
(H) Mean score of metabolic pathways associated with non-responding
samples across CD8^+ T cells within each sample. Wilcoxon rank sum
test. Abbreviations: R = Responders, NR = Non-responders. See also
[72]Figures S1 and [73]S2 and [74]Table S3.
Examining the association between metabolic genes and patient response
to ICI, we found a couple of genes that are significantly up-regulated
in responding samples (PLA2G4B and DGKD), both are involved in
glycerophospholipid metabolism ([75]Figure 1E; [76]Table S3). This
pathway was shown to be involved in different immune cellular
functions, including T cell proliferation and activation, and
macrophages differentiation.[77]^32^,[78]^33^,[79]^34 Additional genes
were found to be significantly up-regulated in non-responding samples,
including glycolytic genes (PGAM1, GPI, TPI1, ENO1, GAPDH), pentose
phosphate pathway genes (TALDO1, G6PD), TCA cycle genes (IDH2, MDH2),
and oxidative phosphorylation genes (NDUFA9, NDUFA12, UQCRFS1, NDUFA13,
NDUFB4, COX5B, [80]Figure 1E; [81]Table S3). When focusing only on
CD8^+ T cells, the N-glycosylation enzyme MGAT4A and PLA2G4B were
significantly expressed in responders, while GAPDH, G6PD, CD38, MDH2,
DBI and others, were significantly expressed in non-responders
([82]Figure 1F; [83]Table S3). At the pathway level, while no metabolic
pathways were found to be significantly expressed in responding
samples, in non-responding samples we saw a significant expression of
pathways such as heme degradation, bile acid synthesis, purine
catabolism, oxidative phosphorylation, fatty acid oxidation (FAO), and
glycolysis ([84]Figure 1G; [85]Table S3). Repeating the pathway
analysis using only CD8^+ T cells, we found a significant expression of
multiple pathways including purine catabolism, glycolysis, OXPHOS, and
pyrimidine catabolism in non-responding samples ([86]Figure 1H;
[87]Table S3). Of note, as previously found in the genome-wide
analysis,[88]^29 no significant metabolic changes were found between
baseline and post-treatment samples. Overall, our analysis identified
specific cell states that are more metabolically active, and metabolic
gene markers associated with regression or progression of individual
tumors in response to ICI therapy.
Metabolic-based clusters of melanoma infiltrated immune cells and their
association with patient response to ICI
To metabolically define the different immune cells in an unbiased
manner, we clustered all ∼16,000 cells using the pre-defined set of
1689 metabolic genes[89]^28 ([90]Table S2). This process yielded 10
distinct metabolic clusters that are partially different from the 11
original clusters obtained using all genes ([91]Figures 2A and 2B;
[92]Table S3). Specifically, the previously defined clusters of B and
plasma-cells (C5 and C9), cycling cells (C6), and myeloid cells (C7, C8
and C10), were also found to have a distinct metabolic signature
([93]Figures 2B and [94]S3; [95]Table S3): C5 significantly expressed
genes related to lipid and glycan metabolism, such as PLCG2, ALOX5,
PIK3C2B, ST6GAL1, and MGAT5; C6 was enriched with genes of OXPHOS, TCA
cycle, and FAO, as well as genes related to nucleotide interconversion
(RRM1, RRM2, DUT, TK1, TYMS) and genes involved in folate metabolism
(DHFR, GGH, MTHFD1 and FPGS), all contribute to cell proliferation.
Macrophages were divided into two metabolic clusters (C7 and C8), where
C8 demonstrated a general upregulation of metabolic pathways compared
to C7, including OXPHOS, TCA cycle, metabolism of fructose and mannose,
pyrimidine synthesis, cholesterol metabolism, and phosphatidylinositol
phosphate metabolism ([96]Table S3). When comparing the two clusters at
the gene level, C8 significantly expressed genes related to its
enriched metabolic pathways, including apolipoproteins (APOC1, APOC2,
APOE), OXPHOS genes (NDUF genes, COX genes and CYC1), and TCA cycle
genes (IDH1, MDH1, SDHA, SDHB, SDHD, and FH), while C7 expressed more
IDO1, AGPAT9, SDS, and others ([97]Table S3).
Figure 2.
[98]Figure 2
[99]Open in a new tab
Metabolic-based clusters of melanoma infiltrated immune cells and their
association with patient response to ICI
(A) Classification of CD45^+ cells to different types and cellular
states according to Sade-Feldman et al.[100]^29
(B) Metabolic clusters of immune cells (left) and their composition by
immune cell-types (right).
(C) Percentage of immune cells found in metabolic clusters associated
with patient response, separated by their response status. Wilcoxon
rank sum test.
(D) Percentage of CD8^+ T cells found in metabolic clusters associated
with patient response, separated by their response status. Wilcoxon
rank sum test.
(E) Abundance of CD8^+ T cells from metabolic clusters C2 and C3 in
each sample by the two response groups. Wilcoxon rank sum test.
(F) The predictive power for response of the metabolic score calculated
as the ratio of CD8^+ T cells from C3 and C2 in each sample. Wilcoxon
rank sum test.
(G) Spearman correlation between the metabolic score of each sample vs.
the predictive score suggested by our previous study.[101]^29
(H) High abundance of CD8^+ T cells from C4 in the misclassified
non-responders (left), including B2M mutated samples (right), and their
response prediction according to the immunological predictor of
Sade-Feldman et al.[102]^29
(I) The abundance of CD8^+ T cells from C4 in each sample colored
according to the fraction of memory-activated CD8^+ T cells in C4 that
express CD38, out of the total number of CD8^+ T cells in each sample.
See also [103]Figures S3–S5, as well as [104]Table S3.
Unlike these metabolic clusters that corresponded to known immune cell
types, each of the metabolic clusters C1-C4 was comprised a mixture of
T cells states, including cytotoxic, exhausted and memory
([105]Figures 2A and 2B). This finding demonstrates that different
immunological states can share metabolic similarity and vice versa. To
examine if any of these clusters is associated with patient response,
we quantified the fraction of cells in each sample and in each
metabolic cluster. We found seven clusters as significantly abundant in
one of the response groups ([106]Figure 2C). Two of them, C2 and C3,
were among the newly defined metabolic clusters, such that C2 was found
to be more abundant in non-responders (two-sided Wilcoxon rank sum p
value = 0.009) and C3 was more abundant in responders (two-sided
Wilcoxon rank sum p value = 0.009). Comparing the two clusters, C2 was
enriched with glycolytic genes (TPI1, PGAM1, PKM, LDHB, GPI), OXPHOS
genes from the cytochrome c oxidase and NADH dehydrogenases complexes,
as well as antioxidant genes such as SOD1, PRDX1, PRDX2, and PRDX3. On
the other hand, C3 was characterized by a general down-regulation of
metabolic genes ([107]Table S3; [108]Figure S3). This finding
corresponds to the general downregulation of metabolic pathways seen in
cell types associated with responding samples ([109]Figure 1C). When we
compared the two clusters at the level of metabolic pathways, C2 was
enriched with pathways such as OXPHOS, TCA cycle, glycolysis, and
pentose phosphate pathway, while C3 had no significant enrichment of
metabolic pathways ([110]Table S3). No significant difference between
baseline and post-therapy samples was detected in any of the metabolic
clusters, in similar to our previous report on the genome-wide
clusters.[111]^29 Finally, a trajectory analysis demonstrated the
metabolic transition from a favorable metabolic cluster (C3), toward an
unfavorable metabolic cluster (C2), eventually reaching to cluster C6,
which has properties of terminal exhaustion[112]^29 ([113]Figure S4).
Overall, our semi-supervised analysis that was focused only on
metabolic genes, detected a sub-classification of known immunological
T cell states, and identified metabolic clusters and genes that were
not previously associated with patient response to ICI.
A metabolic classification of CD8^+ T cells is predictive of patient response
to ICI
Based on the high abundance of CD8^+ T cells in the clusters associated
with patient response, and their role in recognizing tumor antigens, we
next focused our analysis on the metabolism of 6350 CD8^+ T cells found
in this dataset.[114]^29 Examining their fraction in the clustering
groups C2 and C3, we observed the same direction of association with
patient response, but with a higher significance level (two-sided
Wilcoxon rank sum p value = 4.57∗10^−4 for both clusters,
[115]Figure 2D). Notably, the two metabolic states co-existed in all
samples, such that per sample, C2 was generally more abundant in
non-responders (two-sided Wilcoxon rank sum p value = 6.4∗10^−4) and C3
was more abundant in responders (two-sided Wilcoxon rank sum p value =
5.19∗10^−5) ([116]Figure 2E). Classifying samples as responders or
non-responders based on C3/C2 ratios achieved high predictive power
(area under the curve [AUC] of receiver operating characteristic
[ROC] = 0.86, n = 48; two-sided Wilcoxon rank sum p value = 4.41∗10^−5)
([117]Figure 2F).
In our previous study CD8^+ T cells were similarly clustered into two
cellular states, one associated with activation and memory functions
and the other with exhaustion.[118]^29 Classifying samples based on the
ratio between these immunological states achieved similar performance
with AUC = 0.87. However, the R^2 between the previous predictor and
the current metabolic one is 0.55 ([119]Figure 2G), suggesting that the
latter provides additional information. Indeed, a few samples that were
misclassified as responders by the immunological predictor due to a
high abundance of memory and activated CD8^+ T cells, were correctly
classified using the metabolic predictor as non-responders
([120]Figures 2H and [121]S5). This set also includes non-responding
samples which had a mutation in the
[MATH: β2 :MATH]
microglobulin gene, associated with immune evasion ([122]Figure 2H).
From a metabolic point of view, we found that nine out of these ten
misclassified non-responding samples had a high abundance (30–70%) of
CD8^+ T cells from cluster C4, showing a significant difference between
the correctly and incorrectly classified non-responding samples
(two-sided Wilcoxon rank sum p value = 4.98∗10^−5, [123]Figure S5).
While this metabolic cluster was not significantly associated with
patient response, we were able to identify several marker genes that
significantly differentiate between memory-activated CD8^+ T cells in
the true responding samples and the nine samples wrongly classified as
responders ([124]Table S3). The most significantly up-regulated gene in
the samples wrongly classified as responders is CD38 ([125]Figure 2I),
a glycoprotein found on the cell surface of many immune cells and
converts NAD^+ to cyclic ADP ribose (cADPR) and to ADP ribose
(ADPR).[126]^35^,[127]^36 This gene is known to exert immunosuppressive
effects through multiple mechanisms, among them is the depletion of
NAD^+ from the tumor microenvironment which impairs T cell activation
and differentiation.[128]^37 This finding therefore highlights a
sub-cellular state of activated and memory CD8^+ T cells that expresses
CD38 and is abundant in non-responding samples. Overall, our analysis
identified a highly predictive metabolic score for classifying patients
according to their response status, emphasizing the importance of
cellular metabolism in determining patient response.
Metabolic programs predict response to checkpoint immunotherapy
To generalize these findings into a predictive model that can be used
in other datasets, one of two approaches can be taken: in the first,
clustering can be performed in each given dataset. However, it would be
difficult to receive the exact same clustering solution. The second
approach is to use top genes from the two clusters, thus identifying
the predictive metabolic cellular states. This solution sets a
challenge in this case as the main characterization of C3 was a general
down-regulation of metabolic genes. To overcome this obstacle we took a
different approach that goes beyond discrete cell clusters and
identifies transcriptional metabolic programs that exist as continuous
activity programs across and within different cell clusters. This was
done by applying consensus non-negative matrix factorization
(cNMF),[129]^38 an approach that performs soft clustering and assigns
each cell a gene expression program with a usage value ranging between
0 and 1. We identified ten different metabolic programs
([130]Figure 3A, [131]STAR Methods; [132]Table S4), four of them are
“identity programs” that are active in specific cell clusters, and the
other six are “activity programs” that are active across different cell
clusters ([133]Figure 3A). Examining the association of the activity
programs with patient response, we found that two of them, P2 and P6,
are more abundant in responders and non-responders, respectively
(two-sided Wilcoxon rank sum p values = 0.075 and 5.08∗10^−5,
respectively, [134]Figure 3B). As expected, at the single-cell level
these two metabolic programs are also related to the identified C2 and
C3 clusters, such that P2 is more abundant in C3 (two-sided Wilcoxon
rank sum p value = 4.37∗10^−130) and P6 in C2 (p value = 8.09∗10^−178,
[135]Figure 3C).
Figure 3.
[136]Figure 3
[137]Open in a new tab
Metabolic programs predict response to checkpoint immunotherapy
(A) 10 metabolic programs obtained using cNMF.[138]^38
(B) Distribution of the mean usage values in programs P2 and P6,
separated by the samples’ response status. Wilcoxon rank sum test.
(C) The usage of programs P2 and P6 in the single-cell level of
metabolic clusters C2 and C3. Wilcoxon rank sum test.
(D) The association of DGKA (as marker of program P2) with overall
survival in melanoma[139]^29 (n = 32). Logrank test.
(E) The association of DGKA, as well as the glycolytic genes PKM and
ENO1 (as markers of program P6) with progression free survival in
triple-negative breast cancer[140]^51 (TNBC, n = 8). Logrank test.
(F) Classification of CD8^+ T cells into major immunological cellular
states, as previously determined[141]^29 (left) and into major
metabolic states (right) according to the top selected markers of each
metabolic program.
(G) Spearman correlation between the metabolic score calculated as the
ratio of CD8^+ T cells from metabolic states A and B in each sample vs.
the predictive score suggested by Sade-Feldman et al.[142]^29
Abbreviations: R = Responders, NR = Non-responders. See also
[143]Figure S6 and [144]Table S4.
To examine the robustness of these programs we analyzed an additional
scRNA-seq dataset from non-small cell lung cancer patients (NSCLC)
treated with ICI.[145]^39 Similar programs were identified using this
dataset, with similar association with patient response
([146]Figure S6; [147]Table S4). Focusing on these two transcriptional
programs, we found that P6 is enriched with OXPHOS (hypergeometric
one-sided p value = 1.9∗10^−10), glycolysis (p value = 3.8∗10^−4) and
purine synthesis (p value = 0.005), and P2 is enriched with inositol
phosphate metabolism (p value = 0.012, [148]Table S4, [149]STAR
Methods). Indeed, a previous report showed that high OXPHOS in a
specific CD8^+ T cells subset is predictive of immunotherapy resistance
in melanoma patients.[150]^40
Intersecting the gene markers of P2 and P6 between the two datasets, we
assembled a narrowed list of the top metabolic markers ([151]STAR
Methods, [152]Table S4). Following a robustness analysis ([153]STAR
Methods), we converged on the top 6 markers in each program. The first,
P2, that we termed “Metabolic state A” contains GSTK1, DGKA, LDHB,
MGAT4A, NMRK1, and APRT. The second, P6, that we termed “Metabolic
state B” contains PKM, ENO1, COX5A, COX6B1, NDUFB3, and COX8A. While
the second group clearly points to activation of glycolysis and OXPHOS,
the first contains genes that are active across different metabolic
pathways. However, we noticed that most exert known protective effect.
Specifically, GSTK1 is a glutathione S-transferase that plays a role in
cellular detoxification by conjugating reduced glutathione to
electrophilic substrates.[154]^41^,[155]^42 LDHB converts lactate to
pyruvate and by that reduces the acidity of the
environment.[156]^43^,[157]^44 DGK deficient mice were shown to contain
fewer antigen specific memory CD8^+ T cells after LCMV
infection,[158]^45 and NMRK1 synthesizes NAD^+ which is an important
factor for T cell activation and differentiation.[159]^46 MGAT4A is an
N-acetylglucosaminyltransferase which attaches N-acetylglucosamine
(GlcNAc) to the nitrogen atom of an asparagine side-chain, thus playing
a crucial role in the regulation of many intracellular and
extracellular functions.[160]^47^,[161]^48 It was also found to be
differentially expressed in CD4^+ and CD8^+ tumor infiltrating memory
T cells in melanoma patients.[162]^49 In hepatocellular carcinoma
patients (HCC), it was found to be downregulated in PD1^high tumor
infiltrating CD8^+ T cells compared to PD1^intermediate CD8^+
T cells.[163]^50 Here, we found that MGAT4A is differentially expressed
in CD8^+ T cells from samples of responding melanoma patients
([164]Figure 1F). Lastly, APRT is part of the nucleotide salvage
pathway that converts adenine to AMP, and thus may contribute to cell
proliferation. Examining the association of these genes with patient
survival, we found that the expression of some of them clearly
separates between patients, though the difference was not significant
after correction for multiple hypotheses due to a relatively small
sample size. We found that the expression of DGKA in CD8^+ T cells is
associated with a longer overall survival in melanoma[165]^29 (adjusted
log rank p value = 0.242, [166]Figure 3D) and a longer progression free
survival (PFS) in triple-negative breast cancer (TNBC)[167]^51
(adjusted log rank p value = 0.06, [168]Figure 3E). On the other hand,
PKM and ENO1 were associated with shorter PFS in TNBC[169]^51 (adjusted
log rank p value = 0.124 for both genes, [170]Figure 3E).
Based on the expression of these metabolic markers, we divided all
CD8^+ T cells into two major groups of metabolic state A and B
([171]Figure 3F, [172]STAR Methods). We then classified samples as
responders or non-responders based on the ratio between the number of
cells associated with either of these two metabolic states ([173]STAR
Methods). Examining the correlation between this metabolic score and
that achieved by our previous immunological one,[174]^29 we found that
it explains only 47% of the variance, emphasizing the added information
captured by the metabolic classification (R^2 = 0.47, p value =
8.75∗10^−8, [175]Figure 3G). We first tested the predictive power of
this metabolic score in the aforementioned melanoma and NSCLC datasets.
In melanoma,[176]^29 our predictor achieved an AUC of 0.83 (n = 48, p
value = 2∗10^−4, [177]Figure 4A). A significant difference was
maintained also when examining baseline and post-therapy samples
separately (n = 19 and 29, p values = 0.003 and 0.036, respectively).
In NSCLC containing only post-therapy samples,[178]^39 our metabolic
metric achieved an AUC of 0.8 (n = 57, p value = 1.41∗10^−4,
[179]Figure 4B). These results, as well as the sensitivity and
specificity measures, were robust to a different number of top markers
([180]Figure S7; [181]Table S4). Of note, shuffling the markers across
single cells resulted in random results ([182]STAR Methods),
demonstrating that the existence or absence of these genes is mostly a
biological rather than a technical result ([183]Figure S8).
Figure 4.
[184]Figure 4
[185]Open in a new tab
The predictive power of metabolic states across different datasets and
cancer types in biopsies and blood samples
(A) The performance of the metabolic predictor in melanoma (n =
48).[186]^29 ROC and the corresponding AUC achieved by the metabolic
score are shown on the right; Distribution of the metabolic score in
responders and non-responders in the different time points is shown on
the left (n = 19 and 29 for pre and post treatment, respectively).
Wilcoxon rank sum test.
(B) The performance of the metabolic predictor in NSCLC (n = 57, only
post treatment samples).[187]^39 ROC and the corresponding AUC achieved
by the metabolic score are shown on the right; distribution of the
metabolic score in responders and non-responders is shown on the left.
Wilcoxon rank sum test.
(C) The performance of the metabolic predictor in triple negative
breast cancer (TNBC, n = 13).[188]^51 ROC and the corresponding AUC
achieved by the metabolic score are shown on the right; distribution of
the metabolic score in responders and non-responders in the different
time points is shown on the left (n = 8 and 5 for pre and post
treatment, respectively). Wilcoxon rank sum test.
(D) The metabolic states in four datasets including only non-responding
patients from three studies, spanning melanoma[189]^40^,[190]^52 (n =
5,3 respectively) and MCC[191]^53 (n = 1).
(E) The performance of the metabolic predictor in breast cancer samples
having annotations for clonal expansion (n = 28).[192]^54 ROC and the
corresponding AUC achieved by the metabolic score are shown on the
right; distribution of the metabolic score for the two annotated groups
in the different time points is shown on the left (n = 14 and 14 for
pre and post treatment, respectively). Wilcoxon rank sum test.
(F) The performance of the metabolic predictor in breast cancer dataset
annotated by environmental exhaustion state (n = 14).[193]^55 ROC and
the corresponding AUC achieved by the metabolic score are shown on the
right; distribution of the metabolic score for the different annotated
groups is shown on the left. Wilcoxon rank sum test.
(G) A shift in the metabolic states of CD8^+ T cells in tumor samples
and their matched blood samples taken from non-responding patients with
TNBC[194]^51 (8 pairs of tumor and their matched blood samples, with
additional 6 unpaired blood samples), melanoma[195]^40^,[196]^52 (5 and
3 pairs, respectively), and MCC[197]^53 (1 pair). Wilcoxon rank sum
test.
(H) Similar metabolic states of CD8^+ T cells in blood samples (n = 14)
and tumor samples (n = 6) of a single complete pathological responder
having NSCLC.[198]^39
(I) Similar metabolic states of CD8^+ T cells in blood samples of
responding and non-responding patients having TNBC[199]^51 (n = 22)
(left) and cutaneous and ocular melanoma[200]^57 (n = 16) (right).
Wilcoxon rank sum test. Abbreviations: R = Responders, NR =
Non-responders, MPR = Major pathological response, SD = Stable disease,
PR = Partial response, PD = Progressive disease, CR = Complete
response, E = Clonal expansion, NE = No (or low) clonal expansion,
TME = Tumor microenvironment. See also [201]Figures S7–11, as well as
[202]Tables S1 and [203]S4.
To further establish the robustness of this predictor, we analyzed
seven additional datasets from the public domain having scRNA-seq data
from patients treated with
ICI.[204]^40^,[205]^51^,[206]^52^,[207]^53^,[208]^54^,[209]^55 In a
TNBC dataset,[210]^51 our predictor achieved an AUC of 0.88 (n = 13, p
value = 0.03, [211]Figure 4C). In three melanoma datasets (n = 8) and
one MCC dataset (n = 1), all with only non-responding
patients,[212]^40^,[213]^52^,[214]^53 our predictor correctly
classified all samples showing a greater number of CD8^+ T cells found
in metabolic state B ([215]Figure 4D). Importantly, our results
remained significant also when separating patients according to the
different ICI treatments ([216]Figure S9). Next, we applied our
predictor to a dataset of HER2 positive and TNBC patients annotated
with a clonal expansion phenotype.[217]^54 We found that samples
enriched with metabolic state A are more abundant where no (or low)
T cell expansion was observed, while metabolic state B was more
abundant in samples with high clonal expansion (AUC = 0.86, n = 28, p
value = 0.001, [218]Figure 4E). This significant trend was observed
also when examining baseline and post-therapy samples separately (n =
14 and 14, p values = 0.028 and 0.02, respectively, [219]Figure 4E).
These findings are in agreement with the authors’ original findings
that low clonal expansion is enriched in cells with a memory phenotype
(TCF7^+, GZMK^+), while clonal expansion is enriched with cells
expressing exhaustion markers (HAVCR2, LAG3, [220]Figure S10). They
also coincide with a study in melanoma patients showing that highly
expanded clonotype families were distributed predominantly in cells
with exhausted phenotype having decreased diversity of T cell receptors
(TCRs).[221]^56
Finally, we applied our predictor to a dataset containing mostly
treatment-naive breast cancer patients having a classification for
exhausted and non-exhausted TME provided by the authors.[222]^55 This
classification is based on the presence or absence of
PD1^high/CTLA4^high/CD38^high T cells. Our predictor achieved an AUC of
0.92 and significantly differentiated between these two tumor
environmental states (n = 14, p value = 0.009, [223]Figure 4F).
Importantly, our metabolic predictor outperformed the predictor based
on immunological cellular states[224]^29 in all three breast cancer
datasets analyzed in this study,[225]^51^,[226]^54^,[227]^55 further
demonstrating its importance ([228]STAR Methods, [229]Figure S11).
The metabolic states of blood peripheral CD8^+ T cells are non-predictive of
patient response
Applying our metabolic classification to CD8^+ T cells from blood
samples of seven different
datasets,[230]^39^,[231]^40^,[232]^51^,[233]^52^,[234]^53^,[235]^57
spanning melanoma, NSCLC, TNBC, and MCC, we found that CD8^+ T cells in
the blood are mostly found in metabolic state A, both in responders and
non-responders. This finding corresponds with their general
naive-bystander phenotype and lack of checkpoint genes expression. A
shift in the metabolic states of CD8^+ T cells reflects this finding
when comparing tumor samples and their matched blood samples taken from
non-responding patients with melanoma, TNBC, and
MCC[236]^40^,[237]^51^,[238]^52^,[239]^53 ([240]Figure 4G). Measuring
the metabolic score in blood samples taken at different time points
from an NSCLC patient experiencing a complete pathological
response,[241]^39 we observed similar ranges of scores as seen in blood
samples of non-responding patients ([242]Figure 4H). We also could not
identify significant differences between the metabolic scores obtained
from blood samples of responding and non-responding TNBC
patients,[243]^51 as well as cutaneous and ocular melanoma
patients[244]^57 ([245]Figure 4I). These findings demonstrate that the
metabolic state of CD8^+ T cells is mainly predictive within the TME
and emphasize the challenge of identifying reliable blood-based
biomarkers.
A metabolically distinct macrophage cluster in an acquired resistance patient
Clustering the immune cells using the metabolic genes subdivided the
original macrophages/monocytes cluster into two metabolic clusters, C7
and C8 ([246]Figure 2B). While all the metabolic clusters were evenly
distributed across samples, 82% of the cells associated with C8 were of
two specific biopsies (73% and 9%, respectively) from a patient
developed acquired resistance following therapy[247]^29
([248]Figure 5A). Considering known macrophages markers, we found that
C7 cells have a higher expression of the RNA editing enzyme APOBEC3A
and of interferon related genes such as IFITM1, ISG20, IFI44L and IDO1
([249]Table S3; [250]Figure S12), typical to the pro-inflammatory
classic macrophages and interferon-primed
macrophages.[251]^58^,[252]^59 C8 cells have a higher expression of
metallothionein genes such as MT1F, MT1X, MT1E and MT1G, corresponding
to the alternatively activated anti-inflammatory macrophages,[253]^60
together with lipid-related genes (APOE, APOC1, ACP5 and FABP5)
corresponding to lipid-associated macrophages having canonical M2-like
pathways.[254]^58^,[255]^61^,[256]^62^,[257]^63 Focusing on C8 that was
highly enriched in the acquired resistance patient, we found a high
expression of several genes in the heme degradation pathway. These
include, HMOX1, HMOX2, BLVRA, and BLVRB ([258]Table S3;
[259]Figure S12). Interestingly, it was previously shown in vitro and
in animal models that HMOX1 is involved in the polarization of
macrophages into the alternative M2-like phenotype.[260]^64 Here, we
identify these genes in this unique patient, suggesting a potential
mechanism of resistance to ICI, and highlighting metabolic drug targets
for enhancing treatment response.
Figure 5.
[261]Figure 5
[262]Open in a new tab
A metabolically distinct macrophage cluster in an acquired resistance
patient
(A) Three biopsies obtained by a single acquired resistance patient
(P28),[263]^29 with two biopsies containing 73% and 9% respectively of
the cells related to metabolic cluster C8.
(B) Six samples of CD45^− malignant cells[264]^65 and selected
differentially expressed genes in the tumor sample of the resistant
patient.
(C) Metabolic score of sugar-related pathways across the different
tumor samples. See also [265]Figure S12 and [266]Table S5.
We next analyzed the corresponding tumor sample of this patient, and
compared it to 5 additional tumor samples available in this dataset
that lack the M2-like signature[267]^65 ([268]Figure 5B, [269]STAR
Methods). We found a significant increase in the expression of CSF1,
CCL2, and VEGFA (Fisher’s exact test p values = 4.18∗10^−92,
1.04∗10^−29, and 5.78∗10^−68, respectively), all are tumor-derived
factors associated with recruitment of macrophages to the TME and their
polarization to the M2-like phenotype[270]^66^,[271]^67
([272]Figure 5B; [273]Table S5). In addition, we identified a
significantly higher expression of HIF1A (p value = 2.9∗10^−49) which
is known to induce VEGFA secretion.[274]^68 Considering the metabolic
activity of this tumor sample, we found a significantly higher
expression of fructose and mannose metabolism (hypergeometric one-sided
p value = 0.027) and aminosugar metabolism (p value = 0.027,
[275]Figure 5C; [276]Table S5, [277]STAR Methods). This tumor also
differentially expressed 50% of the genes related to glycolysis and
additional genes related to sugar-uptake such as GLUT1 (SLC2A1) and
GLUT3 (SLC2A3), compared to a maximum of 2 glycolytic genes in all
other tumors ([278]Figure 5B; [279]Table S5). In addition, we observed
a high expression of LDHA in all of these tumor cells, and of MCT4
(SLC16A3), an enzyme catalyzing the secretion of lactate, in 84% of the
cells ([280]Figure 5B; [281]Table S5). This finding coincides with
previous reports regarding the role of lactate in inducing the
polarization of macrophages toward the pro-tumorigenic M2-like
phenotype.[282]^69^,[283]^70 Overall, our findings highlight a specific
mechanism of resistance involving a tumor with high activation of
sugar-related pathways and its association with anti-inflammatory
macrophages, which may be controlled, at least in part, by metabolic
genes.
Discussion
In this study, we utilized scRNA-seq data and performed an in-depth
transcriptomic characterization of the metabolic changes that occur in
immune cells found in the TME of patients treated with ICI. We found a
general up-regulation of the metabolic activity in cell types
associated with non-responding samples such as exhausted cycling cells,
dendritic cells, and macrophages. On the other hand, we observed a
general metabolic down-regulation in cell types associated with
responding samples such as B cells and memory T cells. Re-clustering
the data based on metabolic genes uncovered novel metabolic clusters
associated with patient response, each is a mix of different known
immunological cellular states. These clusters demonstrate that
different immunological states can share metabolic similarity, while
similar immunological states can vary in their metabolic properties.
With more data becoming available, the transition between these
metabolic states can be studied using TCR sequencing. Furthermore, we
detected continuous metabolic programs with different activity levels
across CD8^+ T cells. Utilizing the top genes of two key programs, we
were able to accurately classify patients by their response status
across various cancer types, including melanoma, MCC, lung, and breast
cancers. Finally, we found that CD8^+ T cells in blood samples are
mostly found in a single metabolic state, regardless of their response
status.
An unsupervised analysis of scRNA-seq data typically includes the
investigation of all genes passing a set of quality control criteria.
This type of analysis naturally highlights the most significant signals
found in the data. However, at the same time, it might miss important
changes that are weaker and are lost due to multiple hypothesis
correction. While this study is a re-analysis of published data, our
semi-supervised analysis which was focused on the well-defined
biological process of cellular metabolism, was able to uncover
significant changes associated with patient response that were not
detected in the previous genome-wide analyses done by us and others. We
believe that these discoveries were made possible due to the importance
of metabolism in immune cells function and interaction with tumor and
other cells in the TME.
By identifying metabolic programs with various activity levels across
cells, we were able to suggest a sub-classification of CD8^+ T cells
that is predictive of patient response in different cancer types.
Importantly, this metabolic-based classification provides additional
information not captured by our original classification that was based
on the ratio between the number of cells found in an activation∖memory
vs. exhausted cellular state.[284]^29 This was found to be specifically
crucial in breast cancer where the metabolic predictor achieved
significantly improved results. It is therefore plausible that other
cellular processes can form additional sub-classifications that will be
predictive of patient response.
Metabolism of immune cells has been widely studied, though mainly
in vitro and in animal models. To the best of our knowledge, this is
the first study that metabolically analyzed immune cells from cancer
patients treated with ICI, supporting the importance of metabolic
processes in determining patient response to therapy across different
cancer types. Taken together, the increase in clinically annotated data
of patients treated with ICI, and the accumulation of additional
single-cell omics-data, is expected to significantly advance our
understanding of immune cell metabolism and its role in patient
response to ICI.
Limitations of the study
In order to fully understand cellular metabolic changes and aid
biomarker identification, single-cell proteomic and metabolomic data
should be collected and analyzed as well. While these technologies are
rapidly advancing, they have not yet reached the scale of
transcriptomic data.[285]^71^,[286]^72^,[287]^73 Moreover, functional
studies are required to determine the causal effects of metabolic
changes on tumor and immune cellular states. In addition, since
metabolism is a central process in all living cells, the set of
metabolic marker genes found as predictive in our analysis, is also
expressed in tumor cells ([288]Figures S13 and[289]S14). This makes it
challenging to use bulk RNA-seq data for validation purposes.
Nonetheless, our findings provide an improved understanding of immune
cell metabolism in the TME of ICI treated patients. Furthermore, the
highlighted genes might be used for modifying T cell function through
co-engineering strategies, thus improving T cells’ metabolic fitness
and augmenting tumor control of T cell therapies.[290]^74 Finally, more
data have to be collected in order to address potential links between
blood-based metabolic biomarkers and the metabolic landscape of the
TME. This could also include a possible connection between the
predictive value of serum LDH
levels,[291]^75^,[292]^76^,[293]^77^,[294]^78^,[295]^79 and the
secretion of lactate into the TME, which was discussed here as a
plausible factor related to an acquired ICI-resistance.
STAR★Methods
Key resources table
REAGENT or RESOURCE SOURCE IDENTIFIER
Deposited data
__________________________________________________________________
Single-cell RNA sequencing data - Melanoma Sade-Feldman et al.[296]^29
GEO: [297]GSE120575
Single-cell RNA sequencing data - NSCLC Caushi et al.[298]^39 GEO:
[299]GSE176021
Single-cell RNA sequencing data - TNBC Zhang et al.[300]^51 GEO:
[301]GSE169246
Single-cell RNA sequencing data - BC Bassez et al.[302]^54
[303]https://lambrechtslab.sites.vib.be/en
Single-cell RNA sequencing data - BC Tietscher et al.[304]^55
ArrayExpress: [305]E-MTAB-10607
Single-cell RNA sequencing data - Melanoma Li et al.[306]^40 GEO:
[307]GSE138720
Single-cell RNA sequencing data - Melanoma Li et al.[308]^40 GEO:
[309]GSE153098
Single-cell RNA sequencing data - Melanoma Pauken et al.[310]^52 GEO:
[311]GSE159251
Single-cell RNA sequencing data - MCC Paulson et al.[312]^53 GEO:
[313]GSE118056
Single-cell RNA sequencing data - Melanoma Wen et al.[314]^57 GEO:
[315]GSE164237
__________________________________________________________________
Software and algorithms
__________________________________________________________________
Python version 3.8.8 Python Software Foundation
[316]https://www.python.org/
Scanpy version 1.8.2 Wolf et al.[317]^83
[318]https://github.com/scverse/scanpy
Leiden clustering algorithm Traag et al.[319]^84
[320]https://github.com/vtraag/leidenalg
tSNE Maaten and Hinton[321]^82 [322]https://lvdmaaten.github.io/tsne/
cNMF version 1.3.2 Kotliar et al.[323]^38
[324]https://github.com/dylkot/cNMF
R version 4.1.1 R Core Team[325]^88 [326]https://www.r-project.org/
InferCNV version 1.2.1 Tickle et al.[327]^87
[328]https://github.com/broadinstitute/infercnv
Monocle3 version 1.0.0 Cao et al.[329]^85
[330]https://cole-trapnell-lab.github.io/monocle3/
Deep Count Auto-encoder (DCA) version 0.3.4 Eraslan et al.[331]^86
[332]https://github.com/theislab/dca
[333]Open in a new tab
Resource availability
Lead contact
Further information and requests for resources should be directed to
and will be fulfilled by the lead contact, Keren Yizhak
(kyizhak@technion.ac.il).
Materials availability
This study did not generate new unique reagents.
Method details
Single-cell RNA-seq datasets and preprocessing
We used 10 scRNA-seq
datasets,[334]^29^,[335]^39^,[336]^40^,[337]^51^,[338]^52^,[339]^53^,[3
40]^54^,[341]^55^,[342]^57 with six of them containing tumor biopsies
with their matched blood samples, three with only tumor biopsies, and
one dataset containing only blood samples ([343]Table S1). Expression
levels for the single smart-seq2 dataset that we have previously
published[344]^29 were quantified as transcripts per million (TPM). All
other datasets were droplet-based and contained read counts. Expression
levels in these cases were first normalized to a standard sum of 10,000
reads per cell, and then log transformed as follows:
[MATH:
log2(normalized_coun
ts+1)
:MATH]
. It should be noted though that the per-cell read coverage in these
datasets is larger than 10,000. Only samples with defined clinical
annotations (response or clonal expansion) were included in the
analysis. Furthermore, we considered only patients that were treated
with either ICI or a combination of ICI and chemo, as well as a single
Merkel cell carcinoma patient that was treated with a combination of
T cell therapy and ICI.[345]^53 Our analysis also included a pair of
patients having ocular melanoma.[346]^57 The first received a
combination of ICI and Indoximod, while the second was administered a
combination of ICI, radiotherapy and microwave ablation. We also used a
dataset of a non-ICI cohort containing mostly treatment-naive patients
having annotations for exhausted and-non exhausted TME provided by the
authors[347]^55 ([348]Table S1).
Metabolic genes and metabolic pathways scores
The set of metabolic genes and pathways used in the analysis is based
on the Recon 2 metabolic reconstruction.[349]^28 We used the set of
1689 metabolic genes that were expressed in Sade-Feldman et al.[350]^29
([351]Table S2). The glycolysis pathway as defined in Recon 2 is
composed of 78 genes and includes also the gluconeogenesis-related
genes. Therefore, to specifically study glycolysis in our analysis, we
defined a “Glycolysis” pathway which includes only 20 central genes
([352]Table S2). To compute a metabolic pathway score for each pathway
in every single cell, we calculated for every cell the amount of
expressed genes related to each metabolic pathway, out of the total
number of genes in that pathway, yielding a continuous ratio between 0
and 1 for each pathway in each cell. Specifically, for each metabolic
pathway K, the metabolic pathway score of a single cell was calculated
such that:
[MATH: Pk=∑j<
/mi>=1nk<
/msub>is_expressed
(gene
mi>j)nk
msub> :MATH]
Where
[MATH:
is_expr
essed:x<
/mi>→{0,1
mn>} :MATH]
and defined as:
[MATH:
is_expr
essed(
x)≔{1iflog2(<
/mo>x+1)>2.50iflog2(<
/mo>x+1)≤2.5 :MATH]
[MATH: x :MATH]
is the expression level in TPM units,
[MATH: nk :MATH]
is the total number of genes in metabolic pathway K, and the sum is
applied on all of the genes such that
[MATH:
genej∈pathwayk :MATH]
.
Metabolic pathway activity and its association with patient response
In order to characterize the metabolic activity of different cell
types, we averaged the metabolic pathway scores described above across
all cells of each immune cell type. We also ranked the general activity
of each metabolic pathway by the mean score of that pathway across all
single cells.
To identify metabolic genes that are differentially expressed between
responding and non-responding samples, we calculated for every sample
the percentage of cells expressing that gene in the two response
groups. We included only genes with median expression fraction higher
than 5% across all samples and conducted a two-sided Wilcoxon rank sum
test between the two groups. We then corrected the obtained P-values
for multiple hypothesis using the Benjamini-Hochberg false discovery
rate.[353]^80 We considered corrected P-values < 0.05 as significant
and repeated that process focusing only on CD8^+ T cells
([354]Table S3). To identify metabolic pathways that are differentially
expressed between responding and non-responding samples, we calculated
the mean score of each metabolic pathway in each sample and repeated
the statistical analysis described above. We conducted this analysis
using all cell types and CD8^+ T cells separately ([355]Table S3).
Dimensionality reduction and unsupervised metabolic clustering of immune
cells
Principle component analysis (PCA) was applied to all CD45^+ cells in
Sade-Feldman et al.,[356]^29 using 3580 metabolic genes obtained from
Recon 3D[357]^81 ([358]Table S2). The top 50 components obtained by the
PCA were used for the downstream t-distributed stochastic neighbor
embedding (tSNE)[359]^82 with Scanpy’s[360]^83 default perplexity
parameter of 30, learning rate of 1,000 and the Euclidean distance
metric. The reason we used Recon 3D genes here is that they also
contain signaling and regulatory genes, generating a more clear 2D
visualization. Importantly, the tSNE algorithm was used only for
visualization and was not part of the clustering analysis described
below.
To identify cell clusters based on metabolic processes we used Scanpy’s
neighbors function with 1689 metabolic enzymes from Recon 2^28, and
generated the K nearest-neighbors (KNN) graph. We then applied the
Leiden algorithm[361]^84 with the default parameters implemented in
Scanpy, yielding 10 different metabolic clusters of immune cells
([362]Table S3). To identify metabolic clusters that are significantly
associated with patient response, we calculated for every metabolic
cluster the fraction of cells assigned to that cluster in every sample.
We then conducted a two-sided Wilcoxon rank sum test between these
fractions in responders and non-responders. This process was done for
each metabolic cluster and corrected for multiple hypothesis using the
Benjamini-Hochberg false discovery rate.[363]^80 Adjusted
P-values < 0.05 were considered as significant. We repeated this
analysis separately for the set of CD8^+ T cells.
Trajectory analysis of metabolic clusters
We conducted trajectory analysis using Monocle3 pseudo-time
analysis[364]^85 ([365]https://github.com/cole-trapnell-lab/monocle3),
applied on the set of metabolic genes as an input ([366]Figure S4).
This analysis was done for the five central adjacent metabolic clusters
identified by our analysis, i.e. clusters C1-4, C6.
Differential expression analysis
In order to identify differentially expressed genes between two groups
[MATH: G1 :MATH]
and
[MATH: G2 :MATH]
, we first considered only genes that are expressed (
[MATH:
log2(TPM+1)>2.5
:MATH]
) in more than 10% of the cells, in at least one cluster. Following,
for each gene
[MATH: i :MATH]
we count the number of cells in
[MATH: G1 :MATH]
and
[MATH: G2 :MATH]
that express it with an expression level (
[MATH:
log2(TPM+1)>2.5
:MATH]
) or (
[MATH:
log2(TPM+1)≤2.5
:MATH]
). We then applied Fisher’s Exact test for the corresponding
[MATH: 2x2 :MATH]
contingency table. Significant genes were those with an adjusted
P-value < 0.05 after correcting for multiple hypothesis using the
Benjamini-Hochberg false discovery rate[367]^80 and
[MATH:
log2(FoldChange)>0.5 :MATH]
. The set of differentially expressed genes in each cluster is listed
in [368]Table S3. This analysis was done to identify gene markers in
each cluster, comparing it to all other clusters. We repeated this
analysis separately, comparing C2 vs. C3, as well as C7 vs. C8.
Identifying metabolic programs
To identify metabolic programs we applied cNMF[369]^38 on the raw
counts matrix of the melanoma dataset by Sade-Feldman et al.,[370]^29
with the set of 1689 metabolic genes. We filtered out immune cells with
total of zero counts and neglected metabolic genes that were expressed
in less than three cells. We used 200 NMF replicates for each K and
tested the results for K = 5,…,20. Considering the error and stability
values for each K, we selected K = 13 as the optimal solution. Further
reviewing this solution, we filtered out three programs that were
active (considering the max usage value) in less than 0.5% of the
cells, leaving us with 10 metabolic programs overall. The same process
was applied to the NSCLC dataset,[371]^39 obtaining a final set of six
programs. Normalized usage values and gene ranks for each program are
summarized in [372]Table S4. To identify metabolic programs that
significantly differentiate between responding and non-responding
samples, we first calculated the mean usage of each program across all
CD8^+ T cells within each sample, while excluding programs having
median of the mean usage lower than 0.1 across all samples. We
performed a two-sided Wilcoxon rank sum test using the mean usage value
of all CD8^+ T cells within each sample. This process was done for each
metabolic program separately and P-values were adjusted using the
Benjamini-Hochberg false discovery rate.[373]^80 We considered adjusted
P-values < 0.1 as significant for the melanoma dataset, and
P-values < 0.15 as significant for the NSCLC dataset. For the melanoma
dataset, this process yielded two significant programs (P2 and P6). For
the NSCLC dataset, this process resulted with three significant
programs (P2, P4 and P6), with the last two having 66% and 33% overlap
with the melanoma programs P6 and P2, respectively, considering the top
100 genes in each program.
Pathway enrichment analysis
To identify enriched metabolic pathways, we applied a hypergeometric
test for each metabolic pathway using the following probability mass
function:
[MATH:
p(x,M,n,N)=(nx)·(M−nN−x)(MN)
:MATH]
Where
[MATH: M :MATH]
is the effective domain size of all of the 1689 metabolic genes,
[MATH: N :MATH]
is the number of genes in the tested list of genes,
[MATH: n :MATH]
is the total number of genes associated with each metabolic pathway,
and
[MATH: x :MATH]
is the number of genes from the tested metabolic pathway found in the
tested list of genes. We calculated one-sided P-value for every
metabolic pathway using:
[MATH: 1−hypergeom.c
df(x−1,M,
n,N) :MATH]
Where
[MATH:
hyperge
om.cdf :MATH]
is the hypergeometric cumulative distribution function implemented by
Scipy package version 1.6.2 in Python. We corrected multiple hypothesis
using the Benjamini-Hochberg false discovery rate[374]^80 and
considered pathways with adjusted P-value < 0.05 as significant.
This analysis was done using the lists of the differentially expressed
genes of metabolic clusters C2, C3, C7 and C8, and is described in
[375]Table S3. To identify enriched metabolic pathways in P2 and P6 of
the melanoma dataset,[376]^29 we applied this hypergeometric test for
each metabolic pathway using the top 100 genes associated with each
program, with
[MATH: N :MATH]
= 100 accordingly. The results of this analysis are summarized in
[377]Table S4.
Identification of CD8^+ T cells
In our previously published melanoma dataset we used the original
classification of CD8^+ T cells that was based both on gene markers and
manual curation.[378]^29 In all of the other public datasets we used a
stringent criteria such that a CD8^+ T cell must express PTPRC (CD45),
CD3E and CD8A or CD8B, and cannot express either NCR1, NCAM1 or
FOXP3.[379]^29 For these datasets a gene was considered as “expressed”
if
[MATH:
log2(normalized_coun
ts+1)>1
:MATH]
, and “not expressed” otherwise. In all breast cancer datasets of
Bassez et al.,[380]^54 Zhang et al.,[381]^51 and Tietscher
et al.,[382]^55 we did this process only after focusing on T cells
according to the original cell type classification provided by the
authors. In the NSCLC dataset of Caushi et al.,[383]^39 we extracted
CD8^+ T cells out of all of the sequenced cells as they went through
CD3 positive cell filtration by flow cytometry. In the melanoma
datasets of Li et al.,[384]^40 Pauken et al.[385]^52 and Wen
et al.,[386]^57 we extracted CD8^+ T cells out of all of the sequenced
cells as they went through CD8 positive filtration by flow cytometry.
Finally, for the Merkel Cell Carcinoma dataset of Paulson
et al.,[387]^53 we extracted CD8^+ T cells out of all unsorted
sequenced cells.
Testing data imputation for the selection of CD8^+ T cells
Using a Deep Count Auto-encoder (DCA) for data imputation implemented
in Scanpy,[388]^83^,[389]^86 we first fitted a density curve to the
log2-transformed imputed CD8A expression values of all cells from all
samples, using the optimize.curve_fit() function implemented by the
Scipy library in Python. We then used the expression cut-off according
to Caushi et al.[390]^39: “as the trough of the bimodal density curve
(that is, the first location where the first derivative is zero and the
second derivative is positive)”.
An example of the bimodal distribution fitted to the imputed CD8A
expression in the metastatic melanoma dataset by Sade-Feldman
et al.[391]^29 is displayed in [392]Figure S15, including the imputed
expression threshold (blue dashed line).
We used the imputed expression threshold of CD8A for the selection of
CD8^+ T cells as was done by Caushi et al.[393]^39 In [394]Figure S16,
we show the selected CD8^+ T cells according to our stringent inclusion
criteria conducted using the non-imputed expression matrix vs. the
selection using the imputed expression matrix, including various
non-CD8 markers for reference. Setting higher thresholds for the
imputed expression of CD8A, still resulted with CD8^+ T cells having
non-CD8 markers.
Reconstruction of the metabolic predictor
To reconstruct the metabolic predictor, we considered the two programs
that were significantly associated with patient response and had a
large overlap between the melanoma and NSCLC
datasets.[395]^29^,[396]^39 In each dataset, genes of the two programs
were first ranked according to their usage value. We then calculated
the joint rank for the respective programs in both datasets and removed
the single gene in each program that corresponded to a housekeeping
gene (RPL14 and GAPDH). We eventually selected the top K genes within
each program for the reconstruction of the metabolic predictor
([397]Table S4). We scored each CD8^+ T cell based on the number of
expressed genes out of the K ‘Metabolic state A’ markers and the K
’Metabolic state B’ markers, yielding two scores for each CD8^+ T cell.
Then, each CD8^+ T cell was classified as A or B based on the majority
vote of the two scores:
[MATH:
nstate_A_mar
kers≥nst
ate_B_m<
mi>arkers :MATH]
for A, and
[MATH:
nstate_A_mar
kers<nst
ate_B_m<
mi>arkers :MATH]
for B. Finally, we computed a score per sample by taking the ratio
between the number of cells classified as belonging to ‘Metabolic state
A’ and those belonging to ‘Metabolic state B’. CD8^+ T cells
demonstrating an expression of zero markers from both lists of markers
were excluded from this analysis. Samples with a ratio
[MATH: > :MATH]
1 were classified as responders, and those with a ratio
[MATH: ≤ :MATH]
1 were classified as non-responders. We performed this process on
several datasets containing both response
groups[398]^29^,[399]^39^,[400]^51^,[401]^54^,[402]^55 for different
values of K, ranging 3,…,25. For each K we calculated the Receiver
Operating Curve (ROC) and the Area Under the Curve (AUC) score, as well
as the sensitivity and specificity measures. Best results were achieved
for K = 5,6 ([403]Figure S7; [404]Table S4). For simplicity, we
selected K = 6 for all downstream analysis. A two-sided Wilcoxon rank
sum test was used to calculate the P-value for the performance of the
predictor, comparing the metabolic ratio score between true responders
and non-responders. For the dataset of Bassez et al.,[405]^54 we
conducted this statistical test for the performance of the predictor
between samples undergone clonal expansion and those with no (or low)
clonal expansion. For the dataset of Tietscher et al.,[406]^55 we
compared between samples defined with exhausted TME and those with
non-exhausted TME, provided by the authors.
Testing a possible influence of dropouts on the presence of the selected
metabolic markers
Two different approaches were taken in order to address a possible
influence of dropouts on the presence of the metabolic markers selected
in this study.
In the first, we used the imputed expression values of each metabolic
marker in every CD8^+ T cell. Several markers (out of the chosen 12)
didn’t demonstrate a bimodal density curve of their imputed expression
values, making it not trivial to point a certain threshold of
expression in their unimodal representation ([407]Figure S17). In a
case where a marker had a unimodal representation, the threshold was
determined based on visual inspection and empirical considerations. The
performance of our predictor using the imputed expression values is
demonstrated in [408]Figure S18.
For the second approach, we used the non-imputed data and randomly
shuffled the expression of the metabolic markers across all CD8^+
T cells in two ways:
* a.
We first permutated all of the metabolic markers, where each marker
was shuffled separately across the cells.
* b.
We permutated the entire vector of markers across the cells where
the vector of markers itself remained intact within each
single-cell.
The purpose of this experiment is to show that the lack of marker
expression in a given cell, is mostly a biological rather than a
technical phenomenon ([409]Figure S8).
Survival analysis
Two datasets in our analysis contained survival data.[410]^29^,[411]^51
For these two, we performed survival analysis examining whether any of
the 12 genes used by our predictor are significantly associated with
patient survival. As described above, only the set of CD8^+ T cells was
used in this analysis. For the melanoma dataset,[412]^29 in cases where
more than one sample was available for a single patient, we considered
only the baseline sample, yielding 32 samples from 32 patients overall.
Similarly, for the TNBC dataset[413]^51 we considered only samples at
baseline, yielding 8 samples overall. For each sample we first computed
the mean expression of each gene across all of the associated CD8^+
T cells. We then split the samples into two groups using the median
gene expression value, and conducted logrank test between the two
groups. P-values were corrected for multiple hypothesis using the
Benjamini-Hochberg false discovery rate.[414]^80 This process was
conducted in each group of six markers (for metabolic states A and B)
separately.
Application of the cellular states predictor
For each one of the three breast cancer datasets utilized in this
study,[415]^51^,[416]^54^,[417]^55 we calculated for each CD8^+ T cell
identified by the pipeline described above, the fraction of the
expressed memory-activated markers and that of the exhaustion markers
provided by Sade-Feldman et al.[418]^29 We then classified each CD8^+
T cell into one of two cellular states: memory-activated or exhausted,
according to the higher fraction of expressed markers. For each sample
we calculated the ratio of CD8^+ T cells from these two cellular states
and examined the predictive power of this ratio, predicting samples as
responders if the ratio of memory-activated/exhausted CD8^+ T cells was
higher than 1, and the opposite for classification for negative
response ([419]Figure S11).
Melanoma tumor analysis
We utilized seven CD45^- samples from 6 patients available in the
melanoma dataset.[420]^65 To identify the malignant cells we applied
InferCNV[421]^87 ([422]https://github.com/broadinstitute/inferCNV/)
using R software,[423]^88 and marked those showing copy number
variation patterns. We found that all samples had more than 350
malignant cells (out of 384 sequenced cells) except for P5 which had
only 24 malignant cells and P28_2 which had only one malignant cell and
therefore was combined with the malignant cells from her other
available sample. We normalized the TPM matrix as described above and
focused on the previously described 1689 metabolic genes.
Differentially expressed genes and enriched metabolic pathways were
identified in each tumor according to the pipeline described above
([424]Table S5). tSNE[425]^82 was applied using the set of 1689
metabolic genes for visualization. For each cell, we scored four
sugar-related metabolic pathways as described above: “Glycolysis”,
“Fructose and mannose metabolism”, “Aminosugar metabolism”, and
“Galactose metabolism”.
Acknowledgments