Graphical abstract
graphic file with name fx1.jpg
[43]Open in a new tab
Highlights
* •
Construction of a long-term survival predictive score of
immunotherapy (iLSPS)
* •
Rigorous multi-omics feature selection based on extensive TCGA and
clinical trial data
* •
Model incorporates neoantigen, CD8^+ T cell, metabolism, and
MHC_score
* •
High iLSPS identifies ICI beneficiaries and correlates with
inflamed microenvironment
Motivation
Long-term survival (LTS) benefit is the landmark achievement of
immunotherapy, but significant questions remain about the underlying
molecular mechanism and predictive biomarkers. However, the limited
clinical and genetic data on patients with long-term follow up poses
challenges for research into ICI LTS biomarkers. Since conducting
in vitro/in vivo experiments using cell lines or mouse models to
explore the mechanism of human LTS is impractical, in silico
bioinformatic analysis with human genomic sequencing data is a needed
approach. Here, we conduct a systematic analysis to elucidate LTS
predictors, including a panoramic biomarker review, rigorous feature
selection, and Gaussian process regression machine-learning model
training and testing. We validate model performance with multiple
cohorts, discuss the biological interpretation of model findings, and
provide illustrative guidance on real-world clinical application.
__________________________________________________________________
Zhao et al. utilize TCGA multi-omics datasets and cancer immunotherapy
clinical trial and cohort data to develop predictive biomarkers and an
integrative machine-learning model for long-term survival after
immunotherapy in melanoma, lung cancer, and several other cancer types.
This work helps to address the need for developing long-term survival
predictors.
Introduction
Immune checkpoint inhibitor (ICI)-based immunotherapy is a cornerstone
of cancer treatment.[44]^1 Recently, Keynote-001 reported 5-year
overall survival (OS) rates of 23.2% and 15.5% following pembrolizumab
treatment among treatment-naive and previously treated patients with
advanced non-small cell lung cancer (NSCLC), respectively.[45]^2 The
survival benefits from ICI far exceeded those of the pre-ICI era,
representing a chance for long-term survival (LTS) and even curability
for patients with cancer.[46]^3 However, to date, the 5-year OS of
immunotherapy, an indicator of LTS, has been only determined in the
clinical trials of six cancer types, namely kidney renal clear cell
carcinoma (KIRC), lung adenocarcinoma (LUAD), lung squamous cell
carcinoma (LUSC), skin cutaneous melanoma (SKCM), esophageal squamous
carcinoma (ESCC), and bladder urothelial carcinoma (BLCA). Therefore,
the key determinants underpinning the LTS benefit have not been fully
elucidated owing to the limited clinical and genetic data of these
patients.
Discriminating LTS patients implies identifying those who truly benefit
from ICI. From previous studies, sustained responders to ICIs were
enriched among patients with high tumor mutation burden (TMB) and/or
PD-L1 tumor proportion scores (TPSs) >50%.[47]^4^,[48]^5 In the era of
artificial intelligence (AI) and “big data,” machine-learning
techniques are extensively utilized in biomedical
research.[49]^6^,[50]^7^,[51]^8 Pretreatment circulating tumor DNA
(ctDNA), peripheral T cell levels, and early on-treatment ctDNA
dynamics were harmonized to develop a Bayesian probit model to predict
durable response to ICIs.[52]^9 A machine-learning algorithm (XGBoost)
was also deployed to construct a model based on 11 genetic features,
predicting response to ICIs across seven tumor types.[53]^10 Modeling
of LTS could similarly shed light on molecular biomarkers and
mechanisms contributing to this critical clinical parameter.
Previous work[54]^11 has utilized Spearman correlation analysis between
the median value of 36 variables per cancer type from TCGA and the
objective response rate (ORR) from anti-PD-(L)1 of corresponding
clinical trials[55]^12 to identify predictors for ICI response. Another
study[56]^13 also performed Spearman correlation analysis between the
molecular features from TCGA and real-world pharmacovigilance data to
identify biomarkers for immune-related adverse effects. In this study,
to establish a multi-parameter network predictive of LTS, we adopted
the approach of mapping TCGA multi-omics data to survival data from
immunotherapy clinical trials. Each cancer type was treated as a
separate unit. The median value of molecular features and survival of
each cancer type were used to perform primary and secondary feature
selection. The selected markers were harmonized to construct a
machine-learning GPR model to estimate the LTS predictive score of
immunotherapy (iLSPS). We present assessments of prediction accuracy in
ICI-treated cohorts and correlations between iLSPS and immune-related
indexes to explore the biological implications of iLSPS. A study
overview is presented in [57]Figure 1.
Figure 1.
[58]Figure 1
[59]Open in a new tab
Study schematic and surrogate metric for LTS
(A) Study schematic. Clinical data of ICI efficacy and TCGA multi-omics
parameters were collected. Spearman correlation, random forest, and GPR
analyses were performed to select features used to construct the
predictive model. The GPR model-estimated iLSPS was utilized for
clinical application.
(B and C) Pearson’s correlation between all efficacy and AE-related
items from the publications of the included clinical trials. Each
circle represents a trial (C).
See also [60]Figure S1 and [61]Table S2.
Results
Cohort overview
(1) 158 clinical trial cohorts ([62]Table S1) involving 21,023
single-agent ICI-treated advanced/metastatic patients with 25 solid
cancer types ([63]Table S2) were utilized to investigate an optimal
surrogate for 5-year OS benefit. (2) TCGA multi-omics datasets of
corresponding 25 cancer types were used for biomarker selection and GPR
model construction. (3) Two single-cell RNA sequencing (scRNA-seq)
datasets including ICI-treated patients with advanced basal cell
carcinoma (BCC) and renal cell carcinoma (RCC), respectively, were
included to validate the predictive ability of biomarkers in
single-cell resolution. (4) Six external and one internal ICI-treated
cohorts including a total of 634 patients were tested for model
performance, which comprised five melanoma
cohorts,[64]^14^,[65]^15^,[66]^16^,[67]^17^,[68]^18 one BLCA
cohort,[69]^19 and one lung cancer cohort from our center (Zenodo:
8275988).
Establishment of surrogate metric for LTS following immunotherapy
We firstly reviewed 23 clinical trials with available 5-year OS rates
of 3,999 patients with LUAD, LUSC, KIRC, SKCM, ESCC, or BLCA. However,
we were interested in building a model that leverages relationships
between ICI and LTS across more cancer types. We therefore explored the
correlation of 5-year OS rate with near-time point OS from clinical
trials with available 4-year OS (6 cancer types), 3-year OS (10 cancer
types), 2-year OS (18 cancer types), and 1-year OS (25 cancer types)
data. Of the included clinical trials, 22 indices were assessed,
including OS and progression-free survival (PFS) outcomes, responses
(ORR and median duration of response), and immune-related adverse
events (irAEs). Pearson’s correlations across all indices were
measured, and we found that the 5-year OS rate was strongly correlated
with the 1-, 2-, 3-, and 4-year OS rates ([70]Figure 1B). The
correlation between the 1-year OS rate and the 5-year OS rate was
notably significant across 6 cancer types (Pearson’s R = 0.80;
p < 0.0001; [71]Figure 1C).
We would like to further note stepwise correlations between different
time point OS data. 3- and 5-year OS rates were generally correlated
(Pearson’s R = 0.96; p < 0.0001; [72]Figure S1A). We also analyzed 42
clinical trials with available 3-year OS rates ([73]Table S4), finding
that 2- and 3-year OS rates were significantly correlated (Pearson’s
R = 0.94; p < 0.0001; [74]Figure S1B). Then, among 80 clinical trials
with available 2-year OS rates, we found the correlation between the 2-
and the 1-year OS rates to also be significant (Pearson’s R = 0.92;
p < 0.0001; [75]Figure S1C). The above stepwise correlations involving
more cohorts provide further evidence supporting the use of 1-year OS
rate as a surrogate for the 5-year OS rate.
To incorporate more cancer types for feature selection and model
construction, subsequent analyses included 158 clinical trials
involving 25 cancer types and 21,023 patients with available 1-year OS
rates. The number of patients, follow-ups, lines of treatment, and ICI
regimens of these clinical trials are described in detail in
[76]Table S3. However, we want to clearly acknowledge the assumptions
and limitations inherent to utilizing the 1-year OS rate, particularly
when moving beyond the cancer types that had direct 5-year OS data
available.
Exploring multi-omics biomarkers of LTS and primary selection of candidate
biomarkers
To explore the multi-omics predictive biomarkers, we collected exome,
transcriptome, and methylation data from TCGA. A panel of 107 potential
biomarkers belonging to ten categories was considered ([77]Figure 2A),
mapping median values to the 1-year OS rates per cancer type (see
[78]STAR Methods). Spearman’s correlation analyses identified a total
of 29 biomarkers significantly correlated with the 1-year OS rate
following ICI (Spearman’s p < 0.05) but not with the 1-year OS rate
from TCGA (Spearman’s p > 0.05) ([79]Figure 2A). Among these
biomarkers, we selected the top 20 variables according to the
Benjamini-Hochberg (BH) false discovery rate (FDR) adjusted p values on
the 1-year OS rate following ICI ([80]Table S5).
Figure 2.
[81]Figure 2
[82]Open in a new tab
Landscape of multi-omics biomarkers for LTS
(A) The correlation of a panel of predictors with 1-year OS rate after
ICI and TCGA prognostic 1-year OS rate. In the annular stacked bar
chart, the color and the height of each bar represent the p value and
the correlation coefficient from the Spearman’s rank test,
respectively.
(B and C) Spearman’s correlation between CD8^+ T cell, MHC_score, and
1-year OS after ICI.
(D) Spearman’s correlation between MHC_score and ssGSEA scores of 12
biological pathways in all 25 cancer types. ∗p ≤ 0.05.
(E) Validation of MHC_score in two immunotherapy datasets. The efficacy
was evaluated according to RECIST v.1.1. CR, complete response; PR,
partial response; SD, stable disease; PD, progressive disease. p values
were from Wilcoxon rank-sum test. Data are represented as mean ± SEM.
(F) Spearman’s correlation between metabolism score and 1-year OS after
ICI.
(G) GSEA plot showing enrichment of pathways in the population with
superior ICI efficacy. p values from Benjamini-Hochberg (BH) FDR
adjustment are shown on a color scale.
(H and I) Spearman’s correlation between CNV burden, total neoantigen
(log-scaled), and 1-year OS after ICI.
(B, C, F, H, and I) Each circle represents the median value of a
biomarker per cancer type from TCGA and the median value of the 1-year
OS rates from clinical trials of that cancer type.
See also [83]Figure S2 and [84]Table S5.
A significant association between the level of CD8^+ T cells and the
1-year OS rate following ICI (Spearman’s R = 0.55; p = 0.0048;
[85]Figure 2B) reflected the critical role of CD8^+ T cells in
antitumor immune effects. Major histocompatibility complex (MHC) class
I and II genes were utilized to calculate the MHC_score using the Ridge
regression model, which significantly correlated with the 1-year OS
rate following ICI (Spearman’s R = 0.48; p = 0.0167; [86]Figure 2C). In
most cancer types, MHC_score was positively correlated with the level
of antitumor cytokines, coactivation molecules, and inflammatory
response, as assayed by single-sample gene set enrichment analysis
(ssGSEA) scores ([87]Figure 2D). Moreover, the MHC_score for patients
with NSCLC with response to anti-PD-1 (p = 0.041; [88]Figure 2E, left)
and patients with melanoma with response to adoptive T cell therapy
(p = 0.030; [89]Figure 2E, right) was significantly higher than that of
non-responders, supporting the relevance of MHC_score.
We next identified set of 527 genes significantly correlated with the
1-year OS rate after ICI but not with TCGA prognostic 1-year OS rate
([90]Table S6), finding that these were mainly enriched in metabolic
pathways ([91]Figure S2A), which reflects metabolic reprogramming in
most cancers.[92]^20 We calculated ssGSEA scores for all 106 metabolic
pathways in the KEGG database, nine of which significantly correlated
with the 1-year OS rate after ICI but not TCGA 1-year OS rate
([93]Table S7). Among these, five pathways ([94]Figure S2B) were
significantly enriched in ICI beneficiaries (BH FDR adjusted p value
[p.adjust] < 0.01; [95]Figure 2G). The top two pathways were valine,
leucine, and isoleucine degradation and tryptophan metabolism
([96]Figure S2C). The ssGSEA scores of these five pathways were then
added to define the metabolism score, which significantly correlated
with the 1-year OS rate after ICI (Spearman’s R = 0.54; p = 0.0056;
[97]Figure 2F) and 19 out of the 107 identified biomarkers in
[98]Figure 2A ([99]Figure S2D).
Traditionally defined copy-number variation (CNV) burden (“fraction
altered” and “number of segments”) was not significantly predictive
([100]Figure 2A). However, a modified, gene-level CNV burden, based on
the GISTIC2.0 value of the top 500 altered genes ([101]Table S8),
correlated with the 1-year OS rate after ICI (Spearman’s R = −0.48; p =
0.0147; [102]Figure 2H). These altered genes were mainly located on
chromosomes 8q and 3q ([103]Figure S2E).
385 mRNAs and seven long non-coding RNAs (lncRNAs) associated with the
1-year OS rate after ICI (not TCGA 1-year OS rate) were obtained and
yielded 273 significantly interacting lncRNA-mRNA pairs, the top 20 of
which are presented in [104]Figure S2F. LINC00601 potentially targeted
many mRNAs associated with immunotherapy efficacy. The summation of the
four lncRNAs’ expression levels (RNA-TPM) was defined as the lncRNA
score, which significantly correlated with the 1-year OS rate following
ICI (Spearman’s R = 0.46; p = 0.022; [105]Figure S2G).
Since TMB is an FDA-approved tumor-agnostic predictive biomarker of
immunotherapy, we manually incorporated mutation-based variables,
including total TMB, non-synonymous TMB, insertions and deletions
(indels), non-synonymous indels, single-nucleotide variants (SNVs),
non-synonymous SNVs, truncating and missense mutations, and tumor
neoantigen burden (TNB). These variables were highly correlated
([106]Figure S2H), and among them we selected the top 5 variables
according to the adjusted p values. In addition, the correlation
between log-scaled TNB and the 1-year OS rate after ICI was presented
([107]Figure 2I).
In summary, a total of 25 biomarkers were selected from the analyses
for further refinement.
Secondary selection of candidate biomarkers
The median value of the top 25 biomarkers per cancer type, 1-, 2-, and
3-year OS rates from ICI, and their correlations are summarized in
[108]Figure 3A. Spearman’s correlation analyses of the top 25
biomarkers suggest five major groups ([109]Figure 3B): one comprised
TMB-related biomarkers; one comprised tumor microenvironment
(TME)-related biomarkers, such as tumor-infiltrating lymphocytes
(TILs); one comprised genomic instability-related biomarkers, such as
CNV burden; and the other two groups referred to metabolism score,
lncRNA score, and MCAM expression, respectively. Importantly,
inter-group variables had little to no correlation with each other.
Based on the above intra-variable correlations, it would be redundant
to incorporate all of the top 25 biomarkers in the predictive model;
secondary feature selection was thus required to improve its
interpretability and performance. The importance of the 25 features was
ranked using random forests (RFs; [110]Figure 3C). We also used GPR
algorithm-based wrapper methods to select a subset of crucial features.
The intersection of the RF top 10 important features and GPR-selected
features was obtained, and based on this, total neoantigen
(log-scaled), metabolism score, CD8^+ T, and MHC_score were selected as
candidate biomarkers for the predictive model construction
([111]Figure 3D). The correlations between these four biomarkers and
the 1-year OS rate after ICI remained stable when iteratively removing
one cancer type to avoid overfitting ([112]Figure S3B). Moreover, the
likelihood-ratio test of HLM mixed-effects models indicated that these
four biomarkers were significantly correlated with the 1-year OS rate
after ICI across cancers (p < 0.05; [113]Figure S3C).
Figure 3.
[114]Figure 3
[115]Open in a new tab
Candidate biomarker selection and validation using scRNA-seq
(A) Heatmap of normalized median value of top 25 biomarkers correlated
with 1-year OS rate after ICI (not TCGA prognostic 1-year OS rate) in
each cancer type. The annotation on top presents the 1-, 2-, and 3-year
OS rates after ICI of 25 cancer types. The annotation on the right
presents the Spearman’s correlation coefficients of top 25 biomarkers
with 1-, 2-, and 3-year OS rates following ICI treatment.
(B) Spearman’s correlation matrix across top 25 biomarkers (median
value per cancer type).
(C) Feature importance scores assessed via random forest analysis.
(D) Venn plot of features selected from the random forest and GPR
wrapper method.
(E–H) scRNA-seq validation using BCC (E and F) and RCC datasets (G and
H). (E and G) Violin plots of metabolism scores of tumor cells among
patients responsive or non-responsive to ICI. The point represents a
single cell, and the color of the point denotes the patient. (F and H)
Violin plots of immune-related gene expression per cell of CD8^+ T
populations among patients responsive or non-responsive to ICI.
Chi-squared (χ^2) and p values were obtained via one-way ANOVA from the
linear mixed-effects models; ns: p > 0.05, ∗p ≤ 0.05, ∗∗p ≤ 0.01, ∗∗∗p
≤ 0.001. Data are represented as mean ± SEM (E–H).
See also [116]Figures S3 and [117]S4.
Marker validation using scRNA-seq
We utilized two scRNA-seq datasets to validate the biomarkers beyond
what could be resolved via bulk transcriptomic analysis. In the BCC
dataset, the metabolism scores of tumor cells (“tumor_1” cluster) were
significantly more upregulated among ICI responders than among
non-responders (χ^2 = 4.37, p = 0.0366), and the metabolism scores of
the “tumor_2” cluster were also higher among ICI responders, although
they did not achieve statistical significance (χ^2 = 0.29, p = 0.5892;
[118]Figure 3E). Instead, the metabolism scores of CD8^+ T cells and
natural killer (NK) cells indicated no association with ICI response
([119]Figure S4A). The CNV burden of the “Tumor_1” cluster from ICI
responders was significantly lower than that from non-responders (χ^2 =
5.16, p = 0.0231; [120]Figures S4B and S4D). The relative fraction of
CD8^+ T and NK cell populations was seemingly higher in
anti-PD-1-responsive patients ([121]Figure S4C), corresponding with the
bulk analysis results. Furthermore, GZMB, GZMK, and CXCR3 transcript
levels of activated CD8^+ T cells; interferon gamma (IFNG), TNFRSF4,
and CD27 of exhausted CD8^+ T cells; and PRF1 and FASLG of memory CD8^+
T cells were significantly higher in ICI responders than in
non-responders ([122]Figure 3F). The effector gene expression of NK
cells indicated no association with ICI response ([123]Figure S4E).
In another RCC dataset, the metabolism scores of tumor cells, including
“TP1” (χ^2 = 5.11, p = 0.0238), “TP2” (χ^2 = 4.28, p = 0.0387), and
“cycling tumor” clusters (χ^2 = 6.76, p = 0.0093), were significantly
more upregulated among ICI responders than among non-responders
([124]Figure 3G). However, the metabolism scores of CD8^+ T and NK cell
clusters, excluding “FGFBP2- NK,” indicated no association with ICI
response ([125]Figure S4F). The CNV burden of tumor cells in ICI
responders tended to be lower ([126]Figures S4G and S4I). The fraction
of CD8^+ T and NK cell populations were also higher in ICI responders
([127]Figure S4H). Moreover, most cytotoxic and costimulatory genes in
all CD8^+ T cells clusters and partial NK cell clusters were
significantly higher in ICI responders than in non-responders
([128]Figures 3H and [129]S4J).
Overall, these results further demonstrated the significance of the
metabolism score and CD8^+ T cells for model construction of LTS
benefit.
GPR model construction and validation using TCGA datasets
To construct a multi-parametric model, we utilized the GPR
machine-learning algorithm, a non-parametric Bayesian approach, of
which the prior probability and likelihood are generally assumed to
have a Gaussian distribution, and the predictive posterior probability
is speculated using Bayes’ rule.[130]^21 The median value of total
neoantigen, metabolism score, CD8^+ T, and MHC_score for each cancer
type from TCGA database, and the median 1-year OS rate from the
corresponding clinical trials, were input to train the GPR model, and
the predicted value was nominated as the iLSPS. The hyperparameter of
the GPR model was tuned by random search optimization. The predictive
performance of the model was measured using the Pearson correlation
coefficient between the predicted iLSPS and the true 1-year OS rate.
Through the training process in a bootstrap resampling framework
([131]Figure S5A), iLSPS values were closed to the actual 1-year OS
rates (Pearson’s R = 0.81; p < 0.0001) in the training set
([132]Figure 4A). Using the trained GPR model, we also predicted the
iLSPS of 18 cancers with available 2-year OS rates, which significantly
correlated with the actual 2-year OS rates (Pearson’s R = 0.53; p =
0.0235; [133]Figure 4B). Likewise, the iLSPS of eight cancers with
available 3-year OS rates were estimated and correlated with the actual
3-year OS rates (Pearson’s R = 0.89; p = 0.0029; [134]Figure 4C).
Figure 4.
[135]Figure 4
[136]Open in a new tab
GPR model construction with TCGA datasets and utilization in
ICI-treated cohorts
(A) Pearson’s correlation between iLSPS and actual 1-year OS rates
after ICI in the training set of 25 cancer types. The model achieved a
mean square error (MSE) of 0.0098 and a root MSE (RMSE) of 0.0988.
(B) Pearson’s correlation between iLSPS and actual 2-year OS rates
after ICI of 18 cancer types.
(C) Pearson’s correlation between iLSPS and actual 3-year OS rates
after ICI of eight cancer types.
(D) Leave-one-out cross-validation of the model. MSE was 0.0154, and
RMSE was 0.1242. Each point refers to the true 1-year OS rate after ICI
of that cancer type and the iLSPS predicted from the model trained with
the remained 24 cancer types.
(E) Kaplan-Meier (K-M) curves of OS stratified by iLSPS with the
optimal cutoff in six public cohorts and our internal cohort.
(F) Forest plot of OS HRs for patients with high versus low iLSPSs
(with the optimal cutoff) in seven cohorts.
(G and J) Pooled AUC for 1-, 2-, 3-, and 4-year OS of seven datasets
for each biomarker.
(H and K) Distribution of pooled HRs for patients with high versus low
iLSPSs from seven cohorts for each biomarker. A series of cutoffs
across the quartiles of biomarkers were used. Q values were calculated
via the Wilcoxon rank-sum test, followed by BH FDR adjustment.
(E–H) Data are represented as mean ± SEM.
(I) K-M curves of OS stratified by PD-L1 (IHC).
See also [137]Figure S5.
To validate the model, we iteratively removed data of one cancer type
and predicted its survival rate using the model trained with the
remaining 24 cancer types. The correlation between the predicted iLSPS
and the 1-year OS rate was still significant (Pearson R = 0.63; p =
0.0007; [138]Figure 4D). Next, TCGA samples of each cancer type were
randomly split into several subsets according to the number of clinical
trials reporting 1-year OS rates for that specific cancer type, which
constituted a dataset of 158 groups. The median value of total
neoantigen, metabolism score, CD8^+ T, and MHC_score for each of these
groups mapped to the 1-year OS rates from 158 clinical trials,
performing as the test dataset. This random split of TCGA samples was
repeated 500 times, in which the Pearson’s R between the predicted
iLSPS and the 1-year OS rate consistently fluctuated within a narrow
range of 0.60–0.68 ([139]Figure S5B). The most frequent result of
Pearson’s R was 0.64 (p < 0.0001; [140]Figures S5B and S5C), validating
the robustness of the model in this independent test dataset.
Utilization of iLSPS in individual patients from ICI-treated cohorts
We sought to estimate the iLSPSs of individual patients and tested the
predictive performance in seven independent ICI-treated cohorts of
melanoma,[141]^14^,[142]^15^,[143]^16^,[144]^17^,[145]^18 BLCA,[146]^19
and lung cancer (our own cohort, Wang 2022), which were not included in
model training ([147]Table S9). The predictive ability of iLSPS was
measured with a series of cutoff points, and the optimal cut point was
determined with the lowest hazard ratio (HR) for OS in the iLSPS-high
versus iLSPS-low groups in each cohort ([148]Figure S5D). In four
melanoma,[149]^14^,[150]^15^,[151]^17^,[152]^18 one BLCA,[153]^19 and
one lung cancer (Zenodo: 8275988) datasets—with 231, 347, and 30
patients, respectively—a substantially longer OS was observed in
patients with high iLSPS compared with those with low iLSPS under the
optimal cut point ([154]Figure 4E). In the Hugo et al.[155]^16 cohort
with 26 patients, a better OS was observed in the iLSPS-high than the
iLSPS-low group, although the difference was not statistically
significant (HR = 0.35, p = 0.1664). We then performed a pooled
analysis of all seven cohorts, yielding a pooled HR of 0.44 (95%
confidence interval [CI], 0.34–0.56; [156]Figure 4F), indicating that
the predictive effect of the iLSPS on survival after ICIs was
pronounced in these cohorts. Using the upper 1/3 as the cut point,
patients with high iLSPSs also exhibited consistently superior survival
to those with low iLSPSs across most cohorts, with some reaching
statistical significance ([157]Figure S5E). More importantly, among
these ICI-treated cohorts, melanoma,[158]^14^,[159]^18 LUAD, and LUSC
(Zenodo: 8275988) cohorts held relative long-term survival, with some
of the patients living more than 36 months.
We compared the predictive power of iLSPS versus conventional
individual and combined biomarkers. The area under the time-dependent
receiver operating characteristic curve (AUC) was calculated to
quantify the ability of continuous biomarkers to predict
survival.[160]^22 Here, the pooled AUCs of the iLSPS in seven cohorts
were 0.70, 0.75, 0.71, and 0.83 for 1-, 2-, 3-, and 4-year survival
rates, respectively, consistently ranking first compared with each
individual biomarker ([161]Figure 4G). Moreover, HR is an important
index evaluating the predictive efficacy of biomarker for time-to-event
outcomes.[162]^23 Pooled HRs of iLSPSs in seven cohorts were
significantly lower than individual biomarkers with a series of cutoffs
([163]Figure 4H). In BLCA[164]^19 and lung cancer (Zenodo: 8275988)
cohorts with available PD-L1 (immunohistochemistry [IHC]) levels, PD-L1
alone failed to stratify patients benefitting from ICI ([165]Figures 4I
and [166]S5F), while iLSPS displayed significant predictive efficacy
([167]Figure 4E). Furthermore, we compared the predictive performance
of iLSPS with TMB- and IFNG-related signatures, including IFNG gene
expression, IFNG4 signature (IFNG, CD274, LAG3, CXCL9), and effector
T cell (Teff)/IFNG signature (CD8A, GZMA, GZMB, IFNG, EOMES, CXCL9,
CXCL10, TBX21). Similarly, the pooled AUC of iLSPSs in seven cohorts
for 1-, 2-, 3-, and 4-year survival rates consistently ranked first
compared with these well-known biomarkers ([168]Figure 4J). The pooled
HRs of iLSPSs across seven cohorts were significantly lower than TMB-
and IFNG-related signatures across a series of cut points
([169]Figure 4K). In addition, TMB and CD8^+ T cells, TMB and IFNG were
combined, respectively, to construct a linear regression model with
leave-one-out cross-validation ([170]Figures S5G and S5H), which
exhibited inferior predictive ability to iLSPS ([171]Figures 4A–4C and
[172]5C) for survival after ICI. Taken together, iLSPS was predictive
for 1-year OS of 25 cancer types after ICI and presents as a potential
composite biomarker to predict LTS benefit at least for melanoma, LUAD,
and LUSC.
Figure 5.
[173]Figure 5
[174]Open in a new tab
Immune-related biological implications of iLSPS
(A) Stacked bar plot showing the percentage per TME subtype in the
high- and low-iLSPS groups in five immunotherapy datasets. p value was
determined via the Fisher’s exact test.
(B) Boxplots displaying the association between iLSPS and PD-L1 IHC
(TC).
(C) Boxplots showing the association between iLSPS and three
histologically defined TME subtypes.
(D) Boxplots showing the distribution of median iLSPS of TCGA 17 cancer
types in four TME subtypes.
(B–D) Q values were obtained via the Wilcoxon rank-sum test, followed
by BH FDR adjustment.
(E) Boxplots showing the distribution of median IS scores of TCGA 22
cancer types in the high- and low-iLSPS groups.
(F) Boxplots showing the distribution of median iLSPSs of TCGA 14
cancer types in the high- and low-IPS groups.
(D–F) p value (E and F) was determined via the Wilcoxon rank-sum test.
The dots (D–F) denote the median score per cancer type.
(G) Representative microphotographs of mIF (scale: 200 μm) from patient
no. 2 with high iLSPS and patient no. 5 with low iLSPS.
(H) The immune phenotype of iLSPS-high and -low patients.
(I) Percentage of cells in tumor region of iLSPS-high and iLSPS-low
patients.
(J) Boxplots showing the association between iLSPS and TCR richness of
pretreatment and on-treatment samples.
(K) Boxplots showing the association between iLSPS and ITH in Liu
et al.[175]^18 cohort.
p value (I–K) was determined via the Wilcoxon rank-sum test. Data are
represented as mean ± SEM (B–F and I–K).
See also [176]Figures S6 and [177]S7.
We subsequently utilized the model to calculate the iLSPS of all
patients—i.e., not selecting for those receiving immunotherapy—from 22
cancer types in TCGA ([178]Figure S5I). However, no significant
difference in TCGA-reporting OS was observed between the high- (upper
1/3) and low-iLSPS arms of most cancers, suggesting that iLSPS was not
a prognostic indicator ([179]Figure S5J). Importantly, iLSPS in
patients with stage I and II tumors distributed similarly to that of
stages III and IV ([180]Figure S5K), thereby overcoming the bias from
various stages of TCGA samples to some extent.
Immune-related biological implications of iLSPS
Transcriptomic-based analyses are widely used to categorize patients
into specific TME subtype to explain heterogeneous prognoses or
responses to regimens. We assessed the association between iLSPSs and
previously reported TME-related signatures[181]^24 and proposed four
distinct TME clusters: immune-enriched, fibrotic (IE/F);
immune-enriched, non-fibrotic (IE); fibrotic (F); and immune-depleted
(D). We also make reference to an “immune-excluded” phenotype, in which
the CD8^+ Teff signature was intermediate between immune-inflamed and
immune-dessert phenotypes. The IE and IE/F subtypes generally exhibited
superior survival benefits after ICI to the F and D subtypes. In four
melanoma cohorts[182]^15^,[183]^16^,[184]^17^,[185]^18 and one BLCA
dataset,[186]^19 a higher fraction of IE/F and IE phenotypes and a
lower fraction of F and D phenotypes were observed in the iLSPS-high
group ([187]Figure 5A). Generally, patients with the IE or IE/F subtype
had significantly higher iLSPS than those with the F or D subtype
([188]Figure S6A). In the BLCA cohort,[189]^19 iLSPS was statistically
higher in patients with positive PD-L1 expression on tumor cells
(TC2^+ > TC0, p = 7.70 × 10^−5; [190]Figure 5B), and more
PD-L1-positive (TC2^+) patients were observed in the iLSPS-high group
([191]Figure S6B). In addition, in the BLCA cohort,[192]^19 a higher
iLSPS was observed among patients with immune-inflamed and -excluded
phenotypes than those with immune-desert phenotypes ([193]Figure 5C),
and the iLSPS-high group exhibited more immune-inflamed subtypes than
the iLSPS-low group ([194]Figure S6C). Both TILs and immunomodulatory
genes ([195]Table S11) differed between patients with high and low
iLSPSs across the seven ICI-treated cohorts ([196]Figures S6D and S6E).
These results implied that a high iLSPS may reflect an immunoactive or
“hot” microenvironment.
Using TCGA pan-cancer datasets, we found that iLSPS was consistently
connected to the four major TME phenotypes in most cancers
([197]Figure S7A). The IE subtype generally had a higher iLSPS than the
IE/F, F, and D subtypes ([198]Figure 5D). Similarly, a significantly
higher immune signature (IS) was attained in the iLSPS-high group (p =
0.0021; [199]Figures 5E and [200]S7B). Similarly, iLSPS values of
patients were higher in immunophenoscore (IPS)-high groups
([201]Figures 5F and [202]S7C). In most cancer types, the C2 (IFNG
dominant) subset uniformly possessed the highest iLSPS
([203]Figure S7D). This result supported the activated immune status
associated with high iLSPSs, considering that the C2 subset is
characterized by a massive CD8 signal, the greatest TCR diversity, and
highest level of M1/M2 macrophage polarization.
To confirm the indication of iLSPS high for a “hot” TME, multiplex
immunofluorescence (mIF) was performed ([204]Figure 5G) in eight lung
cancer samples from our center (Zenodo: 8275988). Two patients
displayed an immune-inflamed phenotype and one patient displayed an
immune-excluded phenotype in the iLSPS-high group, while five patients
with low iLSPSs were characterized by an immune-desert phenotype
([205]Figure 5H). The infiltration of total CD3^+ T cells (p = 0.036),
PD-1^+CD8^+ T cells (p = 0.026), CD68^+CD163^− M1 macrophages (p =
0.036) and PD-L1^+CD68^+ macrophages (p = 0.046) in the tumor region
was significantly higher in the iLSPS-high than in the iLSPS-low group
([206]Figure 5I). Overall, the mIF depiction of the histological TME
validated the association of high iLSPSs with a molecular
information-derived inflamed TME.
Upon recognition of neoantigens, the characterization of a T cell
receptor (TCR) repertoire reflects the T cell expansion and immune
activation.[207]^25 In the Riaz et al.[208]^17 cohort, for both
pretreatment and on-treatment samples, a significant increase in the
number of TCRβ-chain complementarity-determining region (CDR3)
repertoires and median CDR3 counts per VJ was observed in iLSPS-high
compared with iLSPS-low patients ([209]Figure 5J), indicating the
association between iLSPS and TCR richness. Besides, we found that
patients with lower intra-tumor heterogeneity (ITH) had higher iLSPSs
(p = 0.016, [210]Figure 5K) in the Liu et al.[211]^18 cohort.
Clinical application of iLSPS
Here, we provided an overview of iLSPSs, other indicators, and OS in
ICI-treated cohorts. First, our internal dataset of lung cancer was
displayed, with higher levels of four features and a more prolonged OS
(32.90 versus 13.67 months) in the iLSPS-high group ([212]Figure S8A).
Further, a nomogram of our lung cancer cohort that incorporated
features—including Eastern Cooperative Oncology Group (ECOG) score,
smoking history, primary type of tumor, and iLSPS—was developed to
forecast the survival after ICI ([213]Figure S8B). Likewise, overviews
and nomograms for the BLCA cohort[214]^19 ([215]Figures S9A and S9B)
and the largest melanoma cohort[216]^18 ([217]Figures S9C and S9D) were
presented. The lung cancer, BLCA, and melanoma nomograms all
demonstrated satisfactory accuracy, with C indices of 0.75, 0.64, and
0.67, respectively, and the calibration plots indicated good
concordance between the nomogram-estimated and actual survival
probabilities ([218]Figures S9E–S9G).
To be more intuitive, we selected two typical cases, nos. 1 and 2, of
low and high iLSPS, respectively, from our lung cancer cohort to
demonstrate the calculation and application of iLSPS in predicting
survival in a clinical workflow. The clinicopathological features and
model-calculated iLSPSs were extracted, and the nomogram predicted a
superior survival for patient no. 2 compared with no. 1 after receiving
tislelizumab ([219]Figure S8C). As evidenced by computed tomography
(CT) images ([220]Figure S8D), patient no. 1 with a low iLSPS showed
the best response of stable disease (SD; assessed according to RECIST
criteria) and finally progressed, with an OS of 10.43 months.
Conversely, patient no. 2 with a high iLSPS exhibited the best response
of complete response (CR) on the liver metastatic lesion, with an OS of
39.27 months (still living). These results suggested that the iLSPS is
relevant and convenient in real-world clinical practice, with
considerable predictive power for survival post-immunotherapy.
Discussion
In this study, we integrated TCGA molecular multi-omics data and
immunotherapy clinical trial reports of 1-year OS rates to identify LTS
predictors and construct a predictive machine-learning model. Since
in vivo and in vitro experiments to detect the predictors of human LTS
are impractical and unreachable, in silico prediction models are an
attractive approach.
The 5-year OS rate, which reflects the proportion of potentially
curable patients, is considered an indicator of LTS.[221]^26 Here, we
investigated the correlation between 5-year OS and other survival time
points. The direct correlation between 5-year and 1-year OS rates after
ICI was significant in six cancer types: LUAD, LUSC, KIRC, SKCM, ESCC,
and BLCA. The stepwise correlation analysis, based on more clinical
trials with available 3-, 2-, and 1-year OS rates, also help to support
the use of 1-year OS as a proxy. Since only six cancer types with
5-year OS performed as six samples were unfeasible to develop a
predictive model, we then utilized 1-year OS rate, reported in 158
trials involving 25 cancer types and 21,023 patients, for the following
in silico bioinformatic analysis, which expanded the sample size to 25
cancer types.
Using TCGA database, we considered a panel of 107 variables that might
reflect the LTS predictor landscape. Based on bulk transcriptomic data,
we defined an accessible MHC_score using MHC class I and II genes
expression, which showed a significant association with survival after
ICI and was validated in two independent immunotherapy datasets. The
infiltration of CD8^+ T cells was also significantly correlated with
the 1-year OS rate after ICI. Moreover, CD8^+ T cell clusters among
ICI-responsive patients exhibited higher expression of cytotoxic genes
and costimulatory genes, suggesting a dominant role of CD8^+ T cells
with antitumor cytotoxicity. We calculated a metabolism score based on
bulk sequencing data and pathway analysis. Interestingly, in our
supporting analyses of scRNA-seq datasets, we found that the metabolism
score of TCs, rather than that of CD8^+ T and NK cells, informed
differential clinical outcomes of treatment with ICIs. These findings
underscore the implication of tumor metabolism alterations in shaping
the microenvironment.[222]^27^,[223]^28
We ultimately constructed the GPR model based on log-scaled TNB, CD8^+
T cell, metabolism score, and MHC_score. The model-estimated iLSPS
produced significantly superior pooled HRs and the largest pooled AUC
value for predicting 1-year OS benefit when compared with established
individual and combined features. Moreover, in three trial cohorts of
melanoma,[224]^14^,[225]^18 LUAD and LUSC (our own Wang 2022 cohort)
with relative long-term-surviving, iLSPS-high patients exhibited
increased OS compared with iLSPS-low patients, validating the
predictive value of iLSPS for LTS, at least for melanoma, LUAD, and
LUSC. We found that a higher proportion of immune-inflamed subtypes and
signatures were enriched in the high-iLSPS group, consistent with the
biological underpinnings of ICI therapies.
Above all, neoantigen burden, MHC_score, CD8^+ T cells, and metabolism
score were filtered to develop the model. These four biomarkers covered
the core process of initiating immune defection. Due to the complex
interactions between tumors and the microenvironment, incorporating
tumor and host factors, intrinsic and extrinsic factors to establish a
multi-omics model can enhance prediction accuracy and clinical
applicability more effectively. Among the four biomarkers developed in
this study, CD8^+ T cells may be the most crucial. However, the
contribution weights of these four biomarkers may vary across different
cancer types. Therefore, it is necessary to include more cancer types
with sufficient LTS datasets and conduct in-depth analyses of
identifying key predictive factors in each cancer type to boost the
predictive power for the LTS of patients.
Limitations of the study
First, since the direct correlation between 5- and 1-year OS rates
after ICI was observed in only six cancer types, the utilization of the
1-year OS rate as surrogate metric of LTS in other cancer types should
be cautious. Moreover, the predictive value of iLSPSs for LTS was
tested only in melanoma, LUAD, and LUSC cohorts with relative long-term
survival. Therefore, it remains unclear how iLSPS performs in other
cancer types. However, along with the continuous increment of
immunotherapy cohorts of more tumor types with longer follow-up, we
anticipate that iLSPS could be utilized for predicting LTS in other
cancers, as the four key features reflect fairly general immune
mechanisms. Second, treating each cancer type as a unit to conduct
feature selection and model training might oversimplify tumor biology.
However, differences among cancer types often exceed patient-level
heterogeneity, which was the rationale for the approach. Moreover, the
value of the selected features and iLSPS was validated across external
and internal cohorts at the patient level. We look forward to
consolidating this approach in larger datasets of individuals. Third,
the included clinical trials involved patients receiving other
therapies before immunotherapy, which was reflective of the organic
reality of present-day patients with cancer. However, the predictive
value of iLSPS for LTS would not be diminished since biomarkers are
signposts and proxies and not indications of one-to-one mechanistic
relationships. Finally, due to the heterogeneity and potential bias
across multiple cohorts, the power of the pooled HR and the pooled AUC
of seven cohorts may be diluted; however, it remained superior to that
of individual features in our research. Direct comparison of iLSPS with
other biomarkers in cohorts with larger sample sizes and prospective
exploration of iLSPS as a predictor is warranted.
STAR★Methods
Key resources table
REAGENT or RESOURCE SOURCE IDENTIFIER
Antibodies
__________________________________________________________________
anti-PD-1 Cell Signaling Technology Cat#86163S; RRID:AB_2728833
anti-PD-L1 Cell Signaling Technology Cat#13684S
anti-CD8 Abcam Cat#ab178089
anti-CD163 Abcam Cat#ab182422
anti-CD68 Abcam Cat#ab213363
anti-CD20 Daco Cat#L26, IR604
anti-CD56 Abcam Cat#ab75813
anti-CD4 Abcam Cat#ab133616
anti-FoxP3 Abcam Cat#ab20034
anti-CD3 Daco Cat#A0452
anti-panCK Abcam Cat#ab7753
__________________________________________________________________
Biological samples
__________________________________________________________________
Lung cancer samples (n = 30) This study N/A
Pan-cancer samples TCGA ([226]https://portal.gdc.cancer.gov/)
__________________________________________________________________
Critical commercial assays
__________________________________________________________________
AllPrep DNA/RNA FFPE Kit QIAGEN Cat#80234
NEBNext® Ultra™II RNA First Strand Synthesis Module NEB Cat#E7771L
NEBNext® Ultra™ II Directional RNA Second Strand Synthesis Module NEB
Cat#E7550L
Hieff NGS Ultima DNA Library Prep Kit for MGI Yeasen Cat#13310ES98
DNBSEQ OneStep DNB Make Reagent Kit (OS-DB) MGI Cat#1000026466
__________________________________________________________________
Deposited data
__________________________________________________________________
Processed scRNA-seq data of BCC cohort Yost et al. 2019[227]^29 GEO:
[228]GSE123813
Raw and processed scRNA-seq data of RCC cohort Biet al. 2021[229]^30
dbGaP: phs002065.v1.p1
Single Cell Portal:
([230]https://singlecell.broadinstitute.org/single_cell/study/SCP1288/t
umor-and-immune-reprogramming-during-immunotherapy-in-advanced-renal-ce
ll-carcinoma#study-summary)
Anti-PD-1-treated NSCLC-Cho Cho et al. 2020[231]^31 GEO: [232]GSE100797
Adoptive T cell-treated melanoma-Lauss Lauss et al. 2017[233]^32 GEO:
[234]GSE126044
ICI-treated melanoma-Snyder Snyder et al. 2014[235]^14
[236]https://www.cbioportal.org/study/summary?id=skcm_mskcc_2014
ICI-treated melanoma-Van Allen Van Allen et al. 2015[237]^15 dbGaP:
phs000452
ICI-treated melanoma-Hugo Hugo et al. 2016[238]^16 GEO: [239]GSE78220
ICI-treated melanoma-Riaz Riaz et al. 2017[240]^17 GEO: [241]GSE91061
ICI-treated bladder cancer- Mariathasan Mariathasan et al. 2018[242]^19
EGA: EGAS00001002556
ICI-treated melanoma-Liu Liu et al. 2019[243]^18 dbGaP: phs000452
__________________________________________________________________
Software and algorithms
__________________________________________________________________
iLSPS code base This paper [244]https://doi.org/10.5281/zenodo.8275988
survival (version 3.2–10) CRAN
[245]https://cran.r-project.org/web/packages/survival/index.html
GSVA (version 1.36.3) Hänzelmann et al. 2013[246]^33
[247]https://bioconductor.org/packages/GSVA/
maftools (version 2.7.10) Mayakonda et al. 2018[248]^34
[249]https://github.com/PoisonAlien/maftools
pathview (version 1.28.1) Luo and Brouwer. 2013[250]^35
[251]https://pathview.uncc.edu/
clusterProfiler (version 3.16.1) Wu et al. 2021[252]^36
[253]https://doi.org/10.1016/j.xinn.2021.100141(paper)
ssGSEA Subramanian et al. 2005[254]^37
[255]http://software.broadinstitute.org/gsea
Seurat (version 2.3.4) Butler et al. 2018[256]^38
[257]https://satijalab.org/seurat
infercnv (version 1.4.0) Elyanow et al. 2021[258]^39
[259]https://github.com/broadinstitute/inferCNV/wiki
rms (version 6.1–0) NA [260]https://hbiostat.org/R/rms/
R (version 4.0.2) R Development Core Team, 2010
[261]https://www.r-project.org/
mlr (version 2.18.0) Bischl et al. 2016[262]^40
[263]https://github.com/mlr-org/mlr
[264]Open in a new tab
Resource availability
Lead contact
Further information and requests for resources should be directed to
and will be fulfilled by the lead contact, Dr. Wang (zlhuxi@163.com).
Materials availability
This study did not generate new unique reagents.
Experimental model and study participant details
Tissue samples
Patients diagnosed with advanced/metastatic lung cancer confirmed
histologically or cytologically and received single-agent ICI were
retrospectively collected from our center. A total of 30 patients with
eligible formalin-fixed paraffin-embedded (FFPE) tumor samples were
recruited. All tissue samples were retrieved with the informed consent,
and used to perform WES, WTS, and mIF. The clinicopathological
information of patients were summarized in [265]Table S10. The age of
the included patients ranged between 56 and 64, and 76.67% of the
patients were males. This study was approve by the ethics committee of
the National Cancer Center/Cancer Hospital, Chinese Academy of Medical
Sciences and Peking Union Medical College, National GCP Center for
Anticancer Drugs, The Independent Ethics Committee (22/302–3504 and
23/035–3774). Written consent was obtained from the included patients.
Method details
Study design
We first reviewed clinical trials enrolled with patients receiving
single-agent ICI, to detect a surrogate of the 5-year OS rate. Second,
we depicted the landscape of TCGA predictive biomarkers of
immunotherapy efficacy and defined several novel predictors. Third,
primary and secondary feature selection were conducted to filter
candidate variables, which were validated using scRNA-seq datasets.
Then, four features were filtered to construct the predictive GPR
model, which was utilized in ICI-treated public cohorts and our
internal cohort. Finally, the model application in clinical practice
was visualized. The detailed methods were described as follows.
Cohort review
Immunotherapy-related clinical trials
Clinical trials on immunotherapy were searched and manually selected
from PubMed, Embase, Cochrane Central, clinicaltrials.gov and major
conference proceedings with the key words: “anti-PD-1 or anti-PD-L1 or
anti-CTLA-4 or immune checkpoint inhibitor or immune checkpoint
blockade” and “clinical trial.” To avoid the interference of
chemotherapy or other regimens, only clinical trials that reported the
efficacy of single ICI on advanced/metastatic solid tumors were
included. Trials that only included PD-L1 positive patients or
first-line therapy were excluded.
For each clinical trial, the following information, if provided, were
manually recorded, including the mOS, 1-, 2-, 3-, 4-, 5-year OS rate,
mPFS, 6-month PFS rate, 1-, 2-, 3-, 4-year PFS rate, ORR, mDOR, any
grade treatment-related AE incidence rate (AE-all-rate), grade 3 or 4
treatment-related AE incidence rate (AE-3/4-rate), any grade endocrine
AE incidence rate (AE-endocrine-rate), grade 3 or 4 endocrine AE
incidence rate (AE-endocrine-3/4-rate), any grade gastrointestinal AE
incidence rate (AE-GI-rate), grade 3 or 4 GI AE incidence rate
(AE-GI-3/4-rate), any grade skin AE incidence rate (AE-skin-rate),
grade 3 or 4 skin AE incidence rate (AE-skin-3/4-rate), follow-up time,
number of patients, line of therapy, and treatment. The most recent and
articulated data were documented for a given clinical trial with
multiple publications. Pearson’s correlation analyses were conducted
between these ICI-treated outcomes from all included clinical trials to
identify the surrogate of the 5-year OS rate. The median value of the
survival outcomes of the clinical trials of each cancer type was
computed when regarding each cancer type as a unit.
TCGA tumor datasets
We only included cancers with both TCGA molecular data and 1-year OS
rate after ICIs. We further re-classified these cancers according to
the included clinical trials. For example, we selected patients with
high microsatellite instability (MSI-H) from colon and rectal
adenocarcinoma to constitute the cancer type named CRCH. We also
utilized the MSI status to stratify uterine corpus endometrial
carcinoma (UCEC) into MSI-H (mismatch repair [MMR] deficient, dMMR)
UCEC (UCECH) and MSI-L (MMR-proficient, pMMR) UCEC (UCECL).
Triple-negative breast cancer (TNBC) that did not express estrogen
receptor, progesterone receptor, or human epidermal growth factor
receptor 2 (HER-2) were selected. Esophageal carcinoma was divided into
esophageal squamous carcinoma (ESCC) and esophageal adenocarcinoma
carcinoma, and the latter was combined with gastroesophageal junction
cancer (GEJC) and stomach adenocarcinoma, to produce the “GESC” cancer
type. Data regarding pancreatic ductal adenocarcinoma (PDAC) were
selected from pancreatic adenocarcinoma. Therefore, 25 newly classified
cancer types were produced ([266]Table S2).
The OS time from TCGA clinical data
([267]https://portal.gdc.cancer.gov/) was manually curated from
“days_to_last_follow-up” (CDE_ID: 3008273) if vital_status (CDE_ID:5) =
“Alive” or “days_to_death” (CDE_ID: 3165475) if vital_status = “dead”.
The TCGA 1-year OS rate for each cancer type was estimated using the
Kaplan–Meier method with the R package survival (version 3.2–10). Six
immune subtypes of C1 (wound healing), C2 (IFN-γ dominant), C3
(inflammatory), C4 (lymphocyte depleted), C5 (immunologically quiet),
and C6 (TGF-β dominant) were characterized across multiple TCGA tumor
types.[268]^41 The immune signature (IS) scores of 22 TCGA cancer types
were obtained from.[269]^42 The IPS values of TCGA 14 cancer types were
obtained from.[270]^43
Single-cell RNA sequencing datasets
Two scRNA-seq datasets of patients receiving ICI were obtained. One was
the BCC cohort,[271]^29 and pretreatment biopsy samples of 11 patients
(su001, su002, su003, su004, su005, su006, su007, su008, su009, su010,
and su012) treated with pembrolizumab or cemiplimab were included.
Another was the RCC cohort,[272]^30 and four patients (P55, P906, P913,
and P915) reporting ICI-treated outcomes were included.
Immunotherapy-related datasets
We curated public datasets of PD-L1 non-selective advanced/metastatic
cancer treated with single-agent ICIs. In total, five melanoma
datasets[273]^14^,[274]^15^,[275]^16^,[276]^17^,[277]^18 and one BLCA
dataset[278]^19 were finally included. Besides, a total of 30 lung
cancer patients receiving single-ICI from our center were respectively
collected and utilized as the internal cohort.
Samples with available transcriptomics sequencing data and OS data were
included in the analysis, and those with unknown neoantigen burden were
filled using the k-nearest neighbors algorithm with k = 10. For the
Snyder 2014 cohort, patients SD0346 and SD2051 treated with combined
ICI therapy were excluded. For the Mariathasan 2018 cohort, duplicate
sample data of patient 10285 were excluded. Finally, patients’
clinicopathological information and molecular data of all seven cohorts
were manually curated and summarized ([279]Tables S9, [280]S10, and
[281]S11).
Three distinct pathological immune phenotypes (immune-inflamed,
immune-excluded, or immune-desert) and PD-L1 expression (TC levels) of
patients in the Mariathasan 2018 cohort, as well as four distinct TME
classifications (immune-enriched, fibrotic [IE/F]; immune-enriched,
non-fibrotic [IE]; fibrotic [F]; or immune-depleted [D]) of the cohorts
(Van Allen 2015, Hugo 2016, Riaz 2017, Mariathasan 2018, Liu 2019) were
retrieved from.[282]^24 ITH data was available in,[283]^18 and
TCRβ-chain CDR3s repertoire data was available in.[284]^17
In addition, an NSCLC cohort treated with anti-PD-1,[285]^31 and a
melanoma cohort treated with adoptive T cell therapy[286]^32 were
included, among which patients with available MHC genes expression data
were enrolled.
Bioinformatics
WES processing
For samples from our internal cohort, genomic DNA was extracted using
the TIANamp Genomic DNA kit (Tiangen Biotech, Beijing, China) following
manufacturer’s instruction. DNA concentration and purity were estimated
using Nanodrop 2000 spectrophotometer and Qubit 2.0 Fluorometer with
Quanti-IT dsDNA HS Assay Kit (Thermo Fisher Scientific, MA, USA).
Library construction was performed using a custom 53M length capturing
probe, made by Integrated DNA Technologies (IDT, IA, USA), and covering
the coding regions of all genes and partial non-coding regions.
Captured libraries were then pair-end sequenced in 100 bp lengths with
DNBSEQ-T7R platform (MGI, Shenzhen, China) following manufacturer’s
instructions. Raw data was filtered to remove low-quality reads and
adaptor sequence. Reads were further mapped to the reference human
genome (hg19) utilizing BWA aligner (version 0.7.10) for mutation
calling.
Somatic mutations
MuTect2 and GATK were performed on the WES data of our internal cohort
to call SNVs and Indels mutations. Two or more mutation callers
(NCALLERS >1) were assessed on the WES data of TCGA samples to generate
maf files.[287]^41 The TMB was assessed as the total number of defined
mutations with the variatllele frequency (VAF) ≥ 1% per sample. Data
regarding somatic SNVs were extracted with theed a filter
Variant_Type = ‘‘SNP’, while somatic indel variants were extracted with
the filter Variant_Type = “INS,” ‘‘DEL”. Data regarding non-synonymous
mutations were extracted using Variant_Classification =
“Frame_Shift_Del,” “Frame_Shift_Ins,” “Splice_Site,”
“Translation_Start_Site,” “Nonsense_Mutation,” “Nonstop_Mutation,”
“In_Frame_Del,” “In_Frame_Ins,” “Missense_Mutation”. Data regarding
truncating mutations were extracted using Variant_Classification =
“Nonsense_Mutation,” “Splice_Site,” “Frame_Shift_Del,”
“Frame_Shift_Ins.” Data regarding missense mutations were extracted
using Variant_Classification = “Missense_Mutation”. Transition (Ti),
transversion (Tv), their ratio (Ti/Tv), and mutant-allele tumor
heterogeneity (MATH), a straightforward estimate of intratumor
heterogeneity[288]^44 of each sample were calculated using the R
package maftools (version 2.7.10).[289]^34
Neoantigen prediction
To screen the neoantigen of our internal samples, depth-based filters
were employed,[290]^45 and any variants with normal coverage ≤5× and
normal VAF ≥2% were filtered out. For tumor coverage from DNA, a cutoff
was placed at ≥ 10× with a VAF of ≥40%. To further evaluate the effect
of relevant nearby variants on neoantigen identification, we used
netMHC- 4.0, an updated version of the pVAC tools software, to assess
the binding affinities of the neoantigens with the corrected mutant
peptide sequence.[291]^46 TNB was defined as the number of neoantigens
obtained through the above prediction process.
The TNB of TCGA samples were calculated using NetMHCpan v3.0 by
identifying those with MHC binding affinity <500 nM and retrieved from
[292]https://gdc.cancer.gov/about-data/publications/panimmune.
In addition, the TCGA landscape of ten canonical pathways, cell cycle,
Hippo, Myc, Notch, Nrf2, PI-3-Kinase/Akt, RTK-RAS, TGFβ signaling, p53,
and β-catenin/Wnt were obtained from.[293]^47 For each cancer type, the
proportion of patients with one or more alterations (amplification,
deletion, mutation, epigenetic silencing, or fusion) in at least one
member of each pathway was calculated for each cancer type.
Chromosomal instability
The aneuploidy score reflected the imbalance of chromosome numbers and
was obtained from.[294]^48 Purity, ploidy, whole genomic doubling
(WGD), and ITH, defined as the subclonal genome fraction, were
calculated using ABSOLUTE.[295]^49 The chromosomal instability (CIN)
score was defined as the sum of squares of gene-level GISTIC2.0 values
and derived from.[296]^42 Loss of heterozygosity (LOH) events were
measured by “LOH_n_seg” and “LOH_frac_altered.” The HRD score was
calculated by summing the HRD-LOH, large-scale state transitions, and
telomeric allelic imbalance (NtAI).
Copy number variation
Data regarding focal copy number gains and losses identified by
in silico admixture removal (ISAR)-corrected GISTIC2.0 were downloaded
from [297]https://gdc.cancer.gov/about-data/publications/panimmune. CNV
burden scores (“fraction altered” and “number of segments”), which were
respectively defined as the fraction of altered bases and the total
number of altered segments measured the general copy number variation
profile. To refine the degree of copy number alteration, we retained
genes with a CNV frequency among each cancer type that significantly
correlated with 1-year OS rate after ICI but not TCGA 1-year OS rate.
Top 500 genes according to the absolute value of Spearman correlation
coefficient with 1-year OS rate after ICI were extracted to further
define “CNV burden” as the summation of gene-level GISTIC2.0 absolute
value, with 2 indicating deep amplification/deletion (alterations
beyond the range of the median copy-ratio for each chromosome arm), and
1 indicating shallow amplification/deletion (alterations between 0.1
relative copy number and the thresholds for deep alterations).
DNA damage response
Nine “core DNA damage response (DDR)” pathways comprising a set of 9
DDR genes and 71 DNA repair pathway-specific genes were defined as
previously described: base excision repair (BER), nucleotide excision
repair (NER, including TC-NER and GC-NER), mismatch repair (MMR),
Fanconi anemia (FA), homologous recombination (HR), non-homologous
end-joining (NHEJ), direct repair (DR), and translesion synthesis
(TLS). The fraction of patients with at least one gene alteration
(mutation, deep copy number deletion, or epigenetic silencing) per
given DDR pathway was calculated for each cancer type.
RNA-seq processing
RNA extraction, library construction, sequencing, and FASTQ data
quality control of our internal cohort were performed as previously
described.[298]^50 The RNA-seq expression profiles (tumors only) of
TCGA 25 cancer types were downloaded from The Cancer Genome Atlas
(TCGA) portal ([299]https://portal.gdc.cancer.gov/) and transformed to
log-scaled transcripts-per-million (TPM) values.
Immune cell infiltration
The profiles of 16 immune cell populations[300]^51 were computed using
the ssGSEA algorithm, including CD8 T cells (CD8^+ T), central memory
T cells (Tcm), effector memory T cells (Tem), gamma delta T cells
(Tgd), regulatory T cells (Treg) T helper cells, follicular helper
T cells (Tfh), B cells, macrophages, monocytes, mast cells,
eosinophils, activated dendritic cells (aDCs), immature dendritic cells
(iDCs), NK CD56 dim cells (NKdim), and NK CD56 bright cells
(NKbright)]. “CD8/Treg” cells were determined as the ratio of CD8^+ T
to Treg cells. “dim/bri NK” represented the ratio of NKdim to NKbright.
“aiDC” represented the ratio of aDCs to iDCs and “adp/inat” represented
the ratio of adaptive immune cells (CD8^+ T, Tcm, Tem, Tgd, Treg, and T
helper cells, Tfh, and B cells) to innate immune cells (macrophages,
monocytes, mast cells, eosinophils, aDC, iDC, NKdim, and NKbright).
TCR CDR3 sequences from T cells were inferred via RNAseq using MiTCR
v1.0.3.[301]^52 B cell receptor (BCR) heavy chain contigs were
reconstructed via RNAseq using the V’ DJer tool.[302]^53 V region
identity, V and J gene segments, and CDR3 sequences were further
identified using IMGT/HighV-Quest.[303]^54 TCR and BCR diversity scores
(Shannon entropy, evenness, and richness) of TCGA samples were
calculated.[304]^41
Inflammation signature
The T cell-inflamed gene expression profile (GEP) score was calculated
as the weighted sum of the transcript levels of 18 inflammatory genes
involved in antigen presentation, chemokine expression, immune
response, and immune resistance.[305]^55 The dendritic cell (DC) index
referred to the level of infiltration and was computed as the weighted
sum of chemokines recruiting DCs into the tumor
microenvironment.[306]^56 Immune cytolytic activity (CYT) was defined
as the geometric mean of granzyme A (GZMA) and perforin (PRF1)
transcript levels (0.01 offset).[307]^57 The IFN-γ 4/6/10/18/28 gene
signature,[308]^58 effector T cell (Teff) gene signature,[309]^59 and T
effector and interferon-γ (Teff/IFN-γ) gene signatures[310]^60 were
determined as the geometric means of the associated gene transcript
levels (0.01 offset). The immune score, stromal score, and ESTIMATE
score were analyzed using the ESTIMATE algorithm.[311]^61 Due to
restrictions of access to germline HLA types, MHC classes I and II
genes (HLA-A, HLA-B, HLA-C, MICA, MICB, HLA-DPA1, HLA-DPB1, HLA-DQA1,
HLA-DQA2, HLA-DQB1, HLA-DQB2, HLA-DRA, HLA-DRB1, and HLA-DRB5)
expression (RNA-TPM) were used to train Ridge regression model to
calculate MHC_score.
lncRNA
Protein-coding mRNAs and lncRNAs were identified according to GENCODE
(release 22). Only genes with a median value >0 in more than half of
the cancer types and associated with 1-year OS rate after ICI but not
TCGA prognostic 1-year OS rate were selected to perform Spearman’s
correlation analysis between mRNAs and lncRNAs. Moreover, significant
mRNA-lncRNA interactive pairs with adjusted p value (from BH.fdr
correction) < 0.05 were selected, from which the top 20 pairs according
to the absolute value of Spearman correlation coefficient with 1-year
OS after ICI were extracted to establish a mRNA-lncRNA regulatory
network. The expression of lncRNA from these top 20 pairs was added to
compute the lncRNA score.
Tumor metabolism
KEGG pathway enrichment analysis was performed on genes associated with
immunotherapy efficacy using the R package clusterProfiler (version
3.16.1).[312]^62 The p value adjusted using the BH.fdr method was used
to determine the significance.
Genes panel regarding 106 metabolic pathways were obtained from the
KEGG database and the metabolic activities of the pathways were
assessed using the ssGSEA algorithm.[313]^33 We selected pathways for
which the ssGSEA scores were significantly correlated with 1-year OS
rate after ICI but not with TCGA 1-year OS rate. Gene set enrichment
analysis (GSEA) enrichment analysis was conducted to further select
metabolic pathways enriched in patients with superior survival benefit
from immunotherapy. Genes were ranked according to Spearman’s
correlation coefficients between transcript levels and 1-year OS rate
after ICI. The pathways were portrayed using the R package pathview
(version 1.28.1).
In addition, the activities of immune-related pathways were also
determined using the ssGSEA method.
Methylation
DNA methylation data derived from the Infinium HM450 array (485,577 CpG
site targeting probes) were downloaded from TCGA portal
([314]https://portal.gdc.cancer.gov/). The global genomic methylation
loss of each sample was measured based on the beta values of probes
mapped to long interspersed nuclear element-1 (LINE-1).[315]^63 A probe
was considered hyper- or hypo-methylated as previously defined,[316]^64
and hyper- or hypo-methylated probe frequencies of each sample were
calculated. Then, DNA methylation inversion (DMI) scores were generated
based on hyper- or hypo-methylation frequencies.
Feature selection
Primary feature selection
For primary feature selection from 107 potential variables mentioned
above, Spearman correlation analyses were conducted between the median
value of TCGA multi-omics data and immunotherapy clinical trials’
1-year OS rates per cancer type by regarding each cancer type as a
unit.
Secondary feature selection
Random forest analysis
The random forests (RFs) technique[317]^65 is an ensemble machine
learning technique that aggregates a large number of decision trees,
and predictions for regression from all trees are pooled to arrive at
the final prediction, reducing the risk of overfitting phenomenon. Each
tree was constructed based on bootstrap sample sets randomly drawn from
the original dataset and a randomly selected subset of features. The
remaining samples, the so-called out-of-bag samples, were used as the
test set to determine the predictive ability of trees. The increase in
mean square error (MSE) produced per given randomly permuted variable
was calculated as the %IncMSE, indicating the importance of each
feature.[318]^66 A higher %IncMSE value for a predictor indicates
higher importance. Here, we utilized an RF of 200 trees to identify the
importance of the features. Missing values in the dataset were filled
using proximity imputation from the RF.
Gaussian process regression model
The Gaussian process (GP) is a nonparametric kernel-based machine
learning model[319]^67 that can make predictions relying on a few
parameters. Due to the limited data of high quality in clinical
scenarios, it was not reasonable to utilize other supervised learning
methods that required a myriad of samples.
GP regression (GPR) is a common application of the GP that was used to
solve regression problems in our work. It makes predictions by first
assuming a prior distribution and then updating the posterior
distribution based on the observed samples corresponding to the
Bayesian theorem. The GPR does not predict a specific value; rather, it
predicts a function based on a sample from the GP by specifying a mean
function
[MATH: m(x)=E(f(x)) :MATH]
and a covariance function
[MATH: k(x,z
)=cov(<
/mo>f(x),f(z)) :MATH]
where
[MATH: x,z∈Rd
:MATH]
. This can be rewritten as
[MATH: f(x)∼GP(m(x),k(x,z
)) :MATH]
. The prior mean is assumed to be the mean of the training data, as the
features are centered and scaled. The prior’s covariance is specified
by passing a radial basis function kernel object, given by
[MATH: k(xi,
xj)=exp(−d(xi,
xj)22l2) :MATH]
, which is parameterized by a length-scale parameter
[MATH: l>0 :MATH]
and
[MATH: d(xi,
xj) :MATH]
is the Euclidean distance.
Feature selection via the wrapper method
The wrapper method is an approach to feature selection that aims to
select a feature subset that leads to the best training performance.
Each sample in our dataset contained top 25 different features. To
perform the feature selection,
[MATH: D={(xi,
yi)|i=1,2,
……,n}=(X,
Y),n=25,xi
∈Rd
:MATH]
(
[MATH: d :MATH]
is the 25 dimensions of features, e.g., the median CD8^+ T value and
other features of each cancer type), and
[MATH: yi :MATH]
represents the median 1-year OS rate of the included clinical trials
per cancer type, n is the number of samples, referring to 25 cancer
types). Through repeated training of various possible feature subsets
following backward search, we estimated each model’s performance by
10-fold cross-validation and selected the best feature subset evaluated
by the MSE.
Intersections of the top 10 important features in RFs and the selected
feature subset from GPR were extracted for subsequent model
establishment.
Moreover, taken into consideration cancer type-specific factors, TCGA
samples of each cancer type were split into several parts, mapping to
the 1-year OS rates from 158 clinical trials of 25 cancer types, to
measure the associations between these four selected features and
1-year OS rate after ICI using HLM mixed-effects models. For each
feature of interest, we compared a full model (1) with a null model (2)
using ANOVA to determine if the feature was correlated with survival
after ICI, using the R package lme4 (v1.1-26).
Full = 1-year OS rate ∼ feature + (1 | cancer type) (1)
Null = 1-year OS rate ∼ (1 | cancer type) (2)
Cancer type was incorporated as a random factor while the feature was
incorporated as a fixed factor.
scRNAseq validation of markers
Uniform manifold approximation and projection clustering was performed
using the R package Seurat (version 2.3.4). Cell clusters were obtained
from[320]^29 and.[321]^30 The metabolism scores of individual cells
were computed with the normalized count data following bulk analysis
using the ssGSEA method. CNV profiles of all single cells were
estimated using the R package infercnv (version 1.4.0), based on
scRNA-seq profiles, and the CNV burden was calculated as the sum of
alteration values of top 500 genes from bulk analysis for each given
cell.
Linear mixed-effects model was used to measure the difference of
markers, including metabolism score, CNV burden, and effector genes
expression, between ICI responders and non-responders using the R
package lme4 (v1.1-26). In each cell cluster, we compared a full model
(1) with a null model (2) using an ANOVA to determine if the marker was
correlated with ICI response.
[MATH: Full=marker∼Response+(1|Patie
mi>nt) :MATH]
1
[MATH: Null=marker∼(1|Patient)
:MATH]
2
Patients were incorporated as a random factor while ICI responses were
incorporated as a fixed factor.
Model construction, utilization
Final predictive score for the GPR model
In our research, each cancer type was regarded as a unit, and there was
a total of 25 cancer types, implying that 25 samples were incorporated
to train this model. The GPR model was trained using the dataset D as
shown in the equation
[MATH: D={(xi,
yi)|i=1,2,
……,n}=(X,
Y),n=25,xi
∈Rd
:MATH]
, and the number of features in each sample was optimized to four
(log-scaled TNB, metabolism score, CD8^+ T cell, and MHC_score) to
calculate the posterior distribution of the 1-year OS rate, which was
defined as iLSPS.
Missing values were filled using the k-nearest neighbors algorithm with
k = 10. The space of hyperparameter sigma was searched and located via
the Random search optimization method, and the optimized value was
selected through bootstrap in that space. The fitting performance of
the model was evaluated by Pearson R between the predicted iLSPS and
true 1-year OS rate, and statistics in terms of the mean square error
(MSE) and root-mean-square error (RMSE). The GPR stated above was
achieved with R package mlr version 2.18.0 using the learner
"regr.gausspr".
Leave-one-out validation of the model was conducted. Besides, TCGA
samples of each cancer type were divided into several parts according
to the number of corresponding clinical trials reporting 1-year OS
rates following single-ICI, which constituted a dataset of 158
TCGA-based groups, with the median value of four features mapping to
the 1-year OS rates from 158 clinical trials. This random split of TCGA
samples was repeated 500 times and the performance of the model was
also measured by Pearson R between the predicted iLSPS and 1-year OS
rates of these 158 groups. Moreover, six public cohorts and our
internal cohort mentioned above were used to test the model utility in
ICI-treated patients.
Nomogram
A nomogram was formulated based on the results of multivariable Cox
proportional hazards regression analysis, using the R package rms
(version 6.1–0). The predictive performance of the nomogram was
measured by depicting a calibration plot and quantified by calculating
Harrell’s C-index, with a value larger than 0.5 indicating predictive
performance superior to random guessing.
Multiplex immunofluorescence
Multiplex immunofluorescence (IF) staining was performed on 4 mm
sections of FFPE tumor samples from our internal cohort, using an OPAL
7-color Automation mIF Kit (Akoya Biosciences, Marlborough, MA, USA).
FFPE tissue slides were incubated with specific primary antibodies
including pan-CK, anti-CD8, anti-PD-1, anti-CD68, anti-CD163,
anti-PD-L1 (panel 1), pan-CK, anti-CD3, anti-CD4, anti-CD20, anti-CD56,
anti-FoxP3 (panel 2). Horseradish peroxidase-conjugated secondary
antibody and corresponding reactive Opal fluorophores were then
applied. Multiplex-stained slides were scanned using a Vectra Polaris
Automated Quantitative Pathology Imaging System (Akoya), and scans from
different channels were merged. InForm (Akoya) software was used for
quantitative image analysis. Tumor and stroma regions were divided by
4′,6-diamidino-2-phenylindole (DAPI)-labeled cell nuclei and CK-labeled
tumor cells. Quantities of each cell population were reported as
percentages (immune cells/total cells of DAPI) in the tumor area. The
histology of tumor–immune phenotypes (inflamed, excluded, and desert)
was interpreted as previously described.[322]^19
Quantification and statistical analysis
Survival analysis
For the OS analysis, the log rank test was used to compare Kaplan–Meier
survival curves of two arms. The Cox proportional hazards regression
method was used to determine HRs and 95% CIs using the R package
survival (version 3.2–10). A time-dependent receiver operating
characteristic curve was plotted and the AUC was used to determine the
predictive ability of continuous biomarkers for survival.
Statistical analysis
Pearson’s and Spearman’s correlation analysis were performed using the
R package stats version 4.0.2. For continuous data, the Wilcoxon test
was used to compare two groups, while the Kruskal-Wallis test was used
to compare across multiple groups. BH.fdr adjustment was utilized for
multiple tests correction. Fisher’s exact test was used to compare two
groups for categorical data.
Acknowledgments