Abstract
The immune cell infiltration in TME has been reported to be associated
with prognosis and immunotherapy efficiency of lung cancers. However,
to date, the immune infiltrative landscape of lung adenocarcinoma
(LUAD) has not been elucidated yet. Therefore, this study aimed to
identify a new transcriptomic-based TME classification and develop a
risk scoring system to predict the clinical outcomes of patients with
LUAD. We applied “CIBERSORT” algorithm to analyze the transcriptomic
data of LUAD samples and classified LUAD into four discrete subtypes
according to the distinct immune cell infiltration patterns.
Furthermore, we established a novel predictive tool (TMEscore) to
quantify the immune infiltration patterns for each LUAD patient by
principal component analysis. The TMEscore displayed as a reliable and
independent prognostic biomarker for LUAD, with worse survival in
TMEscrore-high patients and better survival in TMEscrore-low patients
in both TCGA and other five GEO cohorts. In addition, enriched pathways
and genomic alterations were also analyzed and compared in different
TMEscore subgroups, and we observed that high TMEscore was
significantly correlated with more aggressive molecular changes, while
the low TMEscore subgroup enriched in immune active-related pathways.
The TMEscore-low subtype showed overexpression of PD-1, CTLA4, and
associations of other markers of sensitivity to immunotherapy,
including TMB, immunophenoscore (IPS) analysis, and tumor immune
dysfunction and exclusion (TIDE) algorithm. Conclusively, TMEscore is a
promising and reliable biomarker to distinguish the prognosis, the
molecular and immune characteristics, and the benefit from ICIs
treatments in LUAD.
Keywords: immune cell infiltration, prognosis, lung adenocarcinoma,
tumor microenvironment, immunotherapy
Introduction
Although great advances have been achieved in both basic and clinical
cancer research ([40]Gu et al., 2020; [41]Jiao and Yang, 2020), cancer
still caused approximately 10 million of deaths in 2020 ([42]Sung et
al., 2021). With the high prevalence and poor prognosis, lung cancer is
ranked as the first leading cause of cancer-related deaths worldwide,
becoming a major global health problem ([43]Miller et al., 2019;
[44]Siegel et al., 2019; [45]Sung et al., 2021). Recently, the
emergence of checkpoint blockade immunotherapy ([46]Pardoll, 2012;
[47]Topalian et al., 2015) has significantly improved the strategies of
LUAD. However, the minority of response and resistance to these
treatments frequently impedes the clinical outcomes. Additionally, the
effects of ICIs are not only driven by genetic and epigenetic
alterations in tumor cells, but the tumor microenvironment (TME) has
also been reported to be a crucial regulator in tumorigenesis
([48]Dejima et al., 2021; [49]Ye et al., 2022), development, metastasis
([50]Quail and Joyce, 2013), and resistance to therapies ([51]Ostman,
2012; [52]Lu et al., 2020).
TME chiefly consists of multiple subpopulations of T and B lymphocytes,
dendritic cells (DCs), macrophages, neutrophils, and myeloid-derived
suppressor cells (MDSCs) ([53]Belli et al., 2018). The balance between
pro-tumorigenic and anti-tumor factors in the TME conducts tumor growth
([54]Wellenstein and de Visser, 2018; [55]Hinshaw and Shevde, 2019).
Accumulating evidence has indicated the TME immune composition is
generally correlated with prognosis and responsiveness to various
cancer treatments. On one hand, tumor-infiltrating lymphocytes (TILs),
such as CD4^+ and CD8^+ T cells, have been associated with longer
survival and better response to immunotherapy ([56]Kawai et al., 2008;
[57]Fridman et al., 2012). On the other hand, the tumor cells can
promote a suppressive TME, which challenges anti-tumor immunity by
inducing upregulation of inhibitory immune signaling, suppressive
cytokine secretion, and recruitment of suppressive immune cells, such
as tumor-associated macrophages (TAMs) presenting pro-tumor effects by
secreting immunosuppressive cytokines, including interleukin-10 (IL-10)
and transforming growth factor-β (TGF-β) ([58]Mantovani et al., 2017),
as well as immunomodulatory cells, such as myeloid-derived suppressor
cells (MDSCs) ([59]Ostrand-Rosenberg and Fenselau, 2018)and regulatory
T cells (Tregs) ([60]Shimizu et al., 2010), which are all associated
with unfavorable prognosis. To be specific, focusing on cellular
diversity shows that TME heterogeneity could impact clinical outcomes
and provide a challenge for immunotherapy of LUAD ([61]Wu F. et al.,
2021; [62]Nguyen et al., 2021). Therefore, investigating the effects of
TME composition on the tumor cells will help us decode the regulation
of the microenvironment by the tumor.
To date, the emerging predictors for immunotherapy in NSCLC are still
imperfect, such as programmed death-ligand 1 (PD-L1) expression
([63]Dempke et al., 2018) is thought to be induced by interferon-γ
(IFN-γ)- mediated immune responses and tumor mutational burden (TMB)
([64]Klein et al., 2021) is reported to determine the tumor
immunogenicity. It is suggested that only reflecting the tumor cell
intrinsic features but ignoring the extrinsic factor, especially TME,
is attributed to inconsistencies. Thus, the characteristics of TME
should be further comprehensively explored to determine effective
biomarkers that precisely predict prognosis and considerably optimize
personalized immunotherapy.
Progress has been recently achieved by immunotherapy, emphasizing the
importance of TME in LUAD. It elucidates that TME is not the
single-cell population but a complex interface among cancer cells,
stroma, and infiltrating immune cells. Deeper analyses of the NSCLC TME
are necessary to refine the potential application of these findings to
clinical care. We applied “CIBERSORT” algorithm to analyze the
transcriptomic data of 500 LUAD samples in TCGA and classified the LUAD
into four discrete subtypes according to the distinct immune cell
infiltration patterns. Furthermore, we established the TME scores to
characterize and quantify the immune infiltration patterns for each
LUAD patient based on the mRNA expression profiles. Conclusively, we
investigated and validated the association between TME score and the
clinical outcomes, as well as the efficacy of anti-PD- (L)1 treatment
in LUAD, which can facilitate the identification of ideal candidates
for personalized immunotherapeutic strategies.
Methods
Datasets and Preprocessing
A total of 1,518 lung adenocarcinoma (LUAD) and 59 normal tissue
samples were retrieved and downloaded from the corresponding datasets,
including TCGA LUAD from TCGA data portal
([65]https://xenabrowser.net/datapages/) and [66]GSE31210,
[67]GSE37745, [68]GSE50081, [69]GSE68465, and [70]GSE13213 from the
NCBI Gene Expression Omnibus (GEO,
[71]https://www.ncbi.nlm.nih.gov/geo/). The somatic mutation data (SNPs
and small INDELs) were downloaded from TCGA database ([72]MuTect2
Variant Aggregation and Masking). The raw data of the dataset from
Affymetrix were processed using the RMA algorithm in the “Affy”
package. The data from Agilent were downloaded with the processed
version. For TCGA dataset, RNA-sequencing data (FPKM values) were
transformed into transcripts per kilobase million (TPM) values, which
are more similar to those resulting from microarrays and more
comparable between samples ([73]Wagner et al., 2012). The following
inclusion criteria were used: 1) histologically confirmed LUAD, 2)
simultaneously available information on mRNA expression profile data
and OS, 3) the sample tissue was collected from the primary solid tumor
(“01”), and there was no duplication sample in TCGA, 4) genes were
recorded in all datasets, and 5) genes with more than 70% of the
missing value or 0 value were deleted. The remaining missing values
were imputed with KNN imputation approaches. Therefore, 10,320 mRNAs
were included in the analysis.
Consensus Clustering for the Tumor Microenvironment
Distinguishing between tumor and normal tissue difference expression
genes (DEGs) in TCGA was performed with “limma” (FDR <0.05 and |log2FC|
> 1), which better identifies the characteristics of tumor.
Furthermore, the tumor microenvironment was quantified by CIBERSORT
([74]https://cibersort.stanford.edu/)([75]Newman et al., 2015), a
deconvolution method for inference of tumor-infiltrating immune
components from bulk tissue gene expression profiles. Tumors with
qualitatively different immune cell infiltration patterns were grouped
using consensus clustering (100 iterations, resample rate of 80%, and
hierarchical cluster). This procedure was performed with the
“ConsensusClusterPlus” R package.
Identification of DEGs Associated With the TME Phenotype
To functionally elucidate the biological characteristics of the TME
subtypes in LUAD, we employed random forest (RF), an efficient and
reliable machine learning method to identify DEGs between subtypes of
TME. We run RF 100 times with different seeds to find the duplicated
variables with at least 80% repetition rate to further ensure the
stability of variable selection.
Generation of TMEscore
To further elucidate the comprehensive profile of TME characteristics,
the construction of TME metagenes was performed as follows: first, we
further screened candidate prognostic genes from DEGs. Next, a
consensus clustering algorithm was employed to define the cluster of
genes. Then, a principal component analysis (PCA) was performed, and
principal component 1 was extracted to serve as the signature score.
After obtaining the prognostic value of each gene signature score, we
applied a method similar to GGI ([76]Sotiriou et al., 2006) to define
the TMEscore of each patient:
[MATH:
TMEscore=∑PC1i−∑PC1j, :MATH]
where
[MATH: i :MATH]
is the signature score of clusters whose Cox coefficient is positive
and
[MATH: j :MATH]
is the expression of genes whose Cox coefficient is negative.
Functional and Pathway Enrichment Analysis
To further analyze the biological significance of the genes related to
TMEscore with KEGG and GO function analysis, the “clusterProfiler” R
package was adopted to annotate gene patterns ([77]Wu T. et al., 2021).
The Benjamini–Hochberg procedure was used to control the false
discovery rate (FDR). We set the cut-off of adj. p-values to 0.2 so
that we could find more relevant pathways and functions based on the
small number of DEGs. Gene set enrichment analysis (GSEA) illustrated
the significantly different enriched pathways in the high- and
low-TMEscore groups. Gene sets were downloaded from the MSigDB database
of the Broad Institute ([78]Subramanian et al., 2005) and employed the
Hallmark gene sets and 1,000 permutations. An enrichment pathway
between two subtypes was determined with an FDR of <0.25 and the
normalized enrichment score (NES).
Predicting the Patients’ Response to ICIs
The Cancer Immunome Atlas ([79]https://tcia.at/) analyzed the immune
landscapes and antigenomes of 20 solid tumors that were quantified by
Immunophenoscore (IPS, a superior immune response molecular marker)
([80]Charoentong et al., 2017). The IPS value, which ranged from 0 to
10, was positively correlated to tumor immunogenicity and could predict
the patients’ response to immune checkpoint inhibitors (ICI treatment).
Tumor Immune Dysfunction and Exclusion (TIDE,
[81]http://tide.dfci.harvard.edu/), a computational method to predict
immune checkpoint blockade response, was developed by [82]Jiang et al.
(2018). TIDE uses a set of expression markers to profile two primary
mechanisms of tumor immune evasion: T-cell dysfunction and T-cell
exclusion. Patients with higher TIDE prediction scores represent a
greater potential of tumor immune escape; therefore, TIDE could
evaluate patients who are more likely to benefit from ICI. In addition,
the mRNA expression of immune checkpoints was analyzed in different
prognosis groups.
Statistical Analysis
Continuous variables were summarized as mean ± SD, and categorized
variables were described by frequency (n) and proportion (%).
Differences among variables were tested by the Wilcoxon rank-sum test
and Fisher’s exact tests. The relationship between variables was tested
by Spearman rank correlation analysis. The cut-off value of TMEscore
was calculated based on the correlation between the patients’ survival
and the TMEscore in TCGA with the “survminer” package. Univariate and
multivariate Cox regression analyses were used to assess prognostic
analysis. Batch effects from non-biological technical biases were
corrected using the “ComBat” algorithm of the “sva” package. The
“Maftools” package was used to present the mutation landscape and
identify the differential gene mutations between groups. The heatmap
was produced by the R package “ComplexHeatmap.” A two-sided p < 0.05
was regarded as statistically significant. All data processing was
performed in R 4.0.2 software.
Results
Landscape of Lung Adenocarcinoma TME
This study was conducted as per the flow chart shown in
[83]Supplementary Figure S1. The information of 1,518 LUAD patients is
detailed in [84]Supplementary Table S1. To classify the LUAD TME, the
consensus clustering algorithm was used to cluster TME information
obtained by CIBERSORT in TCGA-LUAD dataset ([85]Supplementary Table
S2). The most appropriate clustering number was four ([86]Figures
1A–D), which was selected by consensus matrices and consensus
cumulative distribution function (CDF) curve. This analysis revealed
that LUAD can be clustered into four distinct TME subtypes termed S1–4.
The patients with subtype S4 had significantly longer overall survival
(OS) than patients with subtypes S1 and S2, and subtype S3 demonstrated
the worst survival ([87]Figure 1E). These four TME subtypes varied
significantly based on the expression levels of LM22 gene signatures
([88]Figure 1F). The S4 subtype was characterized by increases in the
infiltration of CD 8+ T cells, resting NK cells, follicular helper T
cells, and M1 macrophages, displaying S4 was significantly associated
with immune activation. Meanwhile, resting mast cells, activated
dendritic cells, and regulatory T cells (Tregs) were enriched in the S1
subtype, and the S2 subtype showed significant increases in the
infiltration of naïve B cells, plasma cells, and CD4^+ memory-activated
T cells; on the contrary, M0 macrophages, M2 macrophages, and CD4^+
memory-resting T cells showed high infiltration in the S3 subtype,
indicating an immunosuppressive milieu. Taken together, we demonstrated
that the four TME subtypes were characterized by distinct immune cell
infiltration and prognosis.
FIGURE 1.
[89]FIGURE 1
[90]Open in a new tab
Unsupervised clustering of the tumor microenvironment (TME) cells for
500 patients in the TCGA-LUAD cohort. (A–C) Consensus matrices of
different clusters. (D) Consensus cumulative distribution function
(CDF) curve. (E) Kaplan–Meier (K-M) curves for overall survival (OS) of
four different subtypes (log-rank test, p =0.039). (F) Abundance
pattern of 22 TME cell types in four TME subtypes.
Identification of DEGs and Functional Annotation
To further identify the biological characteristics and differences
among TME subtypes, RF algorithm was employed to extract the phenotype
signatures. By 100 times analysis, a total of 77 DEGs duplicated at
least 80 times were identified ([91]Supplementary Table S3). Through
consensus clustering analysis based on the expression of the 77 most
representative DEGs, we divided DEGs into two different clusters termed
G1 (62 DEGs) and G2 (15 DEGs) ([92]Figure 2A). These two gene clusters
were closely related to distinct TME and played different biological
roles. Then, GO and KEGG enrichment analyses were performed with the
“clusterProfiler” R package. The G1 cluster was mainly enriched in the
MAPK signaling pathway, PI3K-Akt signaling pathway, aldosterone
syntheses, and focal adhesion pathways ([93]Figure 2B). The G2 cluster
was mainly enriched in hematopoietic cell lineage, B-cell receptor
signaling pathway, cytokine–cytokine receptor interaction, and primary
immunodeficiency ([94]Figure 2C). Significantly enriched pathways and
molecular functions are summarized in [95]Supplementary Tables S4 and
S5. Collectively, the coherence between the prognostic and biological
features in the two gene subgroups indicated that this classification
was reliable and reasonable.
FIGURE 2.
[96]FIGURE 2
[97]Open in a new tab
Construction of the TMEscore for LUAD patients. (A) Consensus matrices
of differentially expressed genes (DEGs) among TME subtypes. (B and C)
KEGG pathway enrichment analysis results in G1 and G2. (D) K-M curve
for OS of different TMEscore groups (log-rank test, p < 0.001). (E)
Forest plots illustrating the results of multivariate Cox proportional
hazards model of clinical feature in TCGA cohort. (F) Heatmap of DEG
expression and clinical characteristics. TMEscore, age, sex, stage,
smoke, therapy outcome, mutation of KARS, and mutation of EGFR are
shown as patient annotations. Gene clusters are shown as gene
annotations. Top legend, gray indicates a missing value.
Construction and Validation of the TMEscore in Six Independent Cohorts
Although the four TME subtypes were identified, their clinical
significance needed to be further evaluated and quantified. Therefore,
we build TMEscore based on TME information in the TCGA-LUAD cohort to
assess the prognostic value. Association with a prognosis of 34 genes
(G1:22, G2:12) was confirmed by Cox regression analysis. First,
principal component analysis (PCA) was used to compute two aggregate
scores, TMBscore A from G1 and TMBscore B from G2. Then, we performed
univariate Cox regression on each TMEscore to evaluate the prognostic
value. Finally, TMEscore A and TMEscore B were integrated to obtain
TMEscore for each sample. The prognostic value of the TMEscore was
further assessed by the log-rank test after classification as high-risk
and low-risk groups based on the corresponding optimal cut-off value
(−0.92) acquired by the “survminer” R package in the TCGA-LUAD cohort.
We visualized gene expression and clinical features distribution in
different risk groups with a heatmap in TCGA and GEO datasets,
respectively ([98]Figures 2F and [99]3H). The Kaplan–Meier curve of
TMEscore subgroups showed that the patients in the low TMEscore group
(median survival time 3,169 days) had significantly better overall
survival than the high-TMEscore group (median survival time 1,235 days;
log-rank test, p < 0.0001; [100]Figure 2D). Moreover, the prognostic
value of the TMEscore was further assessed with five external datasets
in the GEO database: [101]GSE37745, [102]GSE31210, [103]GSE13213,
[104]GSE50081 and [105]GSE68465. Similar results were found that the
survival advantage in the low TMEscore group in above cohorts, with the
corresponding p-value of 0.038, 0.033, 0.018, 0.0036, and 0.004
([106]Figures 3B–F). Meanwhile, we integrated a total of 1,018 samples
in the GEO datasets to evaluate prognostic efficiency, indicating the
low-TMEscore group patients had better overall survival compared to the
high-TMEscore group (log-rank test, p < 0.0001; [107]Figure 3A). These
findings suggested that TMEscore possessed a reliable and robust
capacity for predicting the prognosis for LUAD patients.
FIGURE 3.
[108]FIGURE 3
[109]Open in a new tab
Prognostic value of TMEscore for LUAD patients in five GEO cohorts. (A)
K-M curve of all 1,018 patients in the GEO database between different
TMEscore groups (log-rank test, p < 0.001). (B–F) K-M curves of five
independent GEO datasets in different TMEscore subgroups. (G) Forest
plots illustrating the results of the multivariate Cox proportional
hazards model of clinical features in the GEO database. (H) Heatmap of
DEG expression and clinical characteristics. TMEscore, age, sex, stage,
smoke, mutation of KARS, mutation of EGFR, and mutation of P53 are
shown as patient annotations. Gene clusters are shown as gene
annotations. Top legend, gray indicates a missing value.
TMEscore Was an Independent Prognostic Factor for LUAD Patients
In addition to the TMEscore, other prognostic factors such as
individual and clinicopathological features were included. After
multivariable adjustments with age, sex, smoke situation, TNM stage,
and therapy outcomes in the TCGA cohort, the TMEscore was confirmed as
an independent prognostic indicator with a hazard ratio of 0.383 [95%
CI: 0.210–0.696] in the TCGA-LUAD cohort ([110]Figure 2E), 0.632 [95%
CI: 0.475–0.839] in the GEO datasets ([111]Figure 3G). Elder,
ever-smoker, advanced stage, and non-response to therapy were also
suggested to be independent risk factors in different datasets,
respectively.
Recent studies have reported that specific gene alterations, such as
TP53 ([112]Sun et al., 2020), KRAS ([113]Hamarsheh et al., 2020), EGFR
([114]Chen et al., 2015), and STK11 ([115]Mazzaschi et al., 2021) have
an important role in the regulation of the tumor immune
microenvironment (TIME) and served as biomarkers to tumor therapeutics
([116]Lee et al., 2017; [117]Krishnamurthy et al., 2021). We further
explore the predictive value of this TMEscore in LUAD patients with
EGFR/KRAS mutation (MUT) or wild type (WT). Remarkably, this risk model
had predictive power for both EGFR wild type and EGFR mutation LUAD
patients, except for patients with TP53/EGFR co-mutations
([118]Supplementary Figure S2). Similarly, this risk model exhibited a
robustly predictive value in both KRAS wild type and KRAS mutation LUAD
patients, except for patients with KRAS/STK11 co-mutations LUAD
patients ([119]Supplementary Figure S3). Among the EGFR/KRAS
wild-type/mutation population, the beneficial trends of low TMEscore in
the prognosis of LUAD patients were observed in distinct subgroups,
suggesting that TMEscore was an independent and reliable prognostic
indicator.
Different Biological Processes Between the High-TMEscore Group and the
Low-TMEscore Group
For a comprehensive analysis of the potential regulatory mechanisms
resulting in different TMEscore groups, we performed GSEA analysis
between high and low TMEscore subgroups. The results showed that 23
pathways were enriched in different subgroups with FDR<0.25
([120]Supplementary Table S6). In high TMEscore group, MYC targets V1
(NES = 2.28 and FDR = 0.001), MTORC1 signaling (NES = 2.08 and FDR =
0.011), MYC targets V2 (NES = 2.08 and FDR = 0.012), G2M checkpoint
(NES = 2.00 and FDR = 0.021), glycolysis (NES = 1.79 and FDR = 0.014),
and other pathways were enriched ([121]Figures 4A–D). Meanwhile, the
results revealed that complement (NES = −1.74 and FDR = 0.231),
inflammatory response (NES = −1.69 and FDR = 0.212), IL6/JAK/STAT3
signaling (NES = −1.64 and FDR = 0.181), IL2/STAT5 signaling up (NES =
−1.58 and FDR = 0.212) and interferon gamma response (NES = −1.42 and
FDR = 0.223), and other pathways were correlated with the low TMEscore
([122]Figures 4E–H). It is suggested that the gene sets of the TMEscore
high samples were enriched in cancer and tumor metabolism-related
pathways, while the gene sets of the TMEscore low samples were enriched
in DNA repair and immune response-related pathways.
FIGURE 4.
[123]FIGURE 4
[124]Open in a new tab
Gene set enrichment analysis (GSEA) in TMEscore groups. (A–D)
Enrichment plots showing MYC targets V1, MTORC1 signaling, MYC targets
V2, and G2M checkpoint in the high-TMEscore group. (E–H) Enrichment
plots showing complement, myogenesis, IL6/JAK/STAT3 signaling, and KRAS
signaling up in the low-TMEscore group.
The Molecular Characteristics of Distinct TMEscore Subgroups
Genomic alterations and oncogenic signaling within the tumors have been
reported to affect anti/pro-tumor immunity and TME activity
([125]Hamarsheh et al., 2020; [126]Kumagai et al., 2020; [127]Zhou et
al., 2020; [128]Fountzilas et al., 2021), links between tumor mutations
and TME subtypes needed to be investigated. To illustrate the somatic
variants and acquire further biological insights into the immunological
characteristics of LUAD between TMEscore subgroups, we utilized the
Mutation Annotation Format (MAF) files and performed the variants
annotation. We found higher mutation counts in the TME-high subgroup
than in the TME-low subgroup. Missense variations were the most common
mutation subtype, followed by nonsense and frameshift deletions. The
oncoplot of tumor somatic mutation in the TCGA-LUAD cohort showed that
TP53, TTN, and MUC16 gene mutations in the high-TMEscore group were
approximately 20% higher than those in the low TMEscore group
([129]Figures 5A and B). Among a total of 54 differential mutated genes
between two groups (p < 0.01; [130]Figure 5C), CMA1, HSPA12B, and
FAM196A showed a higher mutation frequency in the low-TMEscore group.
The other genes, such as TP53, TTN, and FBXL7, had a higher mutation
frequency in the high-TMEscore group. Collectively, this analysis
indicated that transcriptomic-based TME classification coupled with
genomics analysis can be exploited for further studies.
FIGURE 5.
[131]FIGURE 5
[132]Open in a new tab
Molecular variations between low-TMEscore and high-TMEscore groups. (A
and B) Mutation profiles of high-TMEscore and low-TMEscore groups. (C)
Comparing differentially mutated genes between two subgroups by
Fisher’s exact tests.
Combinations of TMEscore, Immune Checkpoints, and TMB Improve Risk
Stratification and Survival Prediction
Previous studies have emphasized the importance of immune checkpoint
genes in modulating immune infiltration ([133]Keir et al., 2008;
[134]Andrews et al., 2019). Thus, we first compared the expression
pattern of immune checkpoint genes between different patient groups
delaminated by the TMEscore in TCGA-LUAD and GEO datasets. PDCD1, CD86,
CD80, and CTLA4 showed significantly high expression in the
low-TMEscore group than in the high-TMEscore group ([135]Figures 6A–D,
F–I), which was further confirmed in five independent validating
cohorts ([136]Supplementary Figure S4).
FIGURE 6.
[137]FIGURE 6
[138]Open in a new tab
TMEscore in the prediction of immunotherapeutic benefits. (A–D)
Expression of immune-checkpoint-relevant genes (PDCD1, CD86, CD80, and
CTLA4) between high- and low-TMEscore groups in TCGA and (F–I) all
1,018 patients in GEO datasets after batch correction. (E and J)
Relationships between TIDE and TMEscore in TCGA and GEO datasets. (K
and L) Relative probabilities of response to anti-CTLA-4 and
anti-PD-1/PD-L1 treatment (IPS score) in the low-TMEscore and
high-TMEscore groups in TCGA cohort.
Considering the correlations between immune checkpoint genes and
TMEscore, we next combined TMEscore with immune checkpoints expression
to test whether they have an influence on OS in LUAD patients. Though
survival analyses among four subgroups stratified by TMEscore and
immune checkpoint gene expression, we displayed that patients with low
PD-L1 and low TMEscore have prolonged OS compared to those with low
PD-L1 and high TMEscore (p = 0.005), and among patients with high PD-L1
expression, a lower TMEscore signified a remarkably better survival (p
< 0.001) ([139]Figure 7A). We also found similar survival patterns
among four patient subgroups stratified by TMEscore and PD1/CTLA-4
expressions in the TCGA cohort ([140]Figure 7A). We then confirmed the
results in the other five validation cohorts ([141]Figures 7B,C and
[142]Supplementary Figure S5). In concordance with the TCGA dataset,
patients with low TMEscore have significantly better survival relative
to the high TMEscore group, even though with similar expression levels
of immune checkpoint genes ([143]Figures 7B,C and [144]Supplementary
Figure S5). In addition, TMB has been shown to have the potential to
generate a larger number of neoantigens and make them more immunogenic
([145]Schumacher and Schreiber, 2015), which is strongly associated
with clinical outcomes and response of immune checkpoint blockade
response ([146]Yarchoan et al., 2017; [147]Chan et al., 2019). We found
that patients with low TMB and high TMEscore had the worst prognosis
([148]Figure 8A). It is suggested that TMEscore, immune checkpoint
genes, and TMB can complement each other as prognostic biomarkers.
FIGURE 7.
[149]FIGURE 7
[150]Open in a new tab
Impact of immune checkpoint gene expressions and TMEscore on clinical
outcome. Kaplan–Meier survival curves of overall survival among four
patient groups stratified by TMEscore and immune checkpoint genes (PD1,
PD-L1, and CTLA-4) in TCGA dataset (A), [151]GSE37745 dataset (B), and
[152]GSE50081 dataset (C).
FIGURE 8.
[153]FIGURE 8
[154]Open in a new tab
Kaplan–Meier survival curves of overall survival among four patient
groups stratified by TMEscore and TMB in TCGA dataset (A) and
IMvigor210 cohort (B). Proportional representation of the objective
response rate among subgroups categorized by TMEscore and TMB in the
IMvigor210 cohort (C). Kaplan–Meier survival curves of overall survival
among four patient groups stratified by TMEscore and PD-L1 in the
IMvigor210 cohort. (D) Proportional representation of the objective
response rate among subgroups categorized by TMEscore and PD-L1 in the
IMvigor210 cohort (E). CR, complete response; PR, partial response; SD,
stable disease; PD, progressive disease.
The TMEscore Predicts Clinical Outcomes of Immunotherapy
Given the linkage between TMEscore and immune checkpoint genes as well
as TMB, we further explore the predictive potential of TMEscore for
immune checkpoint blockade response through analyzing the correlation
of TMEscore and published immunotherapy predictors, including TIDE and
IPS. The relationship between TIDE and TMEscore was investigated in
TCGA and GEO datasets. As expected, the high-TMEscore group was
characterized by a significantly higher TIDE score ([155]Figures 6E and
J). The IPS values (IPS-PD-1/PD-L1/PD-L2_pos and IPS-CTLA-4_pos)
increased in the low-TMEscore group compared to the high-TMEscore group
in TCGA ([156]Figures 6K and L). It is likely that the patients in the
low-TMEscore group may have a better immune microenvironment and
respond better to ICIs than those in the high-TMEscore group.
Furthermore, the practicability of the TMEscore was further evaluated
for speculation of the therapeutic benefit for ICI treated patients.
The patients who received anti-PD-L1 immunotherapy in the IMvigor210
cohort were assigned based on high and low TME scores. Given the
contraindicatory prognostic and predictive value of TMEscore, TMB, and
immune checkpoint gene expression (PD-L1, PD-1, and CTLA4), we next
evaluated the synergistic effect of these biomarkers in the prognostic
and predictive stratification of LUAD. Consistent with previous
results, stratified survival analysis revealed that the TMB status did
not interfere with TMEscore-based predictions. Subtypes of the
combination of TMEscore and TMB showed significant survival differences
(log-rank test, p = 0.0076; [157]Figure 8B). On the other hand,
Kaplan–Meier analysis revealed patients in the IMvigor210 cohort with
TMEscore low and PD-L1 high obtained most favorable OS than either
single positive (TME low or PD-L1 high) or dual negative (TMEscore high
PD-L1 low, p = 0.0018, [158]Figure 8D). In addition, analysis of
objective response also supported that TMEscore low and PD-L1 high
subgroup represented an increased proportion of PR/CR/SD than either
single positive (TMEscore low or PD-L1 high) or dual negative (p =
0.047, [159]Figure 8E). Taken together, these findings indicate the TME
classification system and scoring system may explain the effectiveness
of immunotherapy in patients with low TMB and low PD-L1, and these
distinct classification systems, TMEscore, PD-L1, and TMB might
function as complementary factors for the prediction of immunotherapy.
Discussion
Although immune checkpoint inhibitors (ICIs) have revolutionized
treatment strategies of lung cancer, the overall response rate of ICI
monotherapies is still limited and no more than 20% in NSCLC patients
with EGFR/ALK wild-type ([160]Doroshow et al., 2019). It has been
reported that TME plays a crucial role in cancer development and
anti-tumor process, especially the immunotherapy response in cancers
([161]Lu et al., 2020; [162]Ye et al., 2022). Therefore, characterizing
the tumor–immune microenvironment can improve the personalized
immunotherapeutic strategies.
Multi-omics data are often used for generating various predictive or
prognostic models through machine learning or statistical modeling
methods ([163]Xu et al., 2021). However, to date, comprehensive
analyses based on integrated genomic and transcriptomic profiles of the
tumor and its TME remain rare and lack efficient and useful models.
Therefore, we constructed a scoring system to classify and quantify the
comprehensive tumor immune landscape based on an immune-cell phenotype
algorithm and validation in external LUAD cohorts.
Transcriptomic analysis offers an opportunity to dissect the complexity
of tumors, including TME, dynamically regulating cancer progression and
influencing therapeutic outcomes ([164]Cieslik and Chinnaiyan, 2018;
[165]Thorsson et al., 2018). In our study, we identified four distinct
immune subtypes characterized by different biological processes and
prognosis, using “CIBERSORT” algorithm to analyze the transcriptomic
data of TCGA-LUAD samples. Furthermore, we established the TME scores
to characterize and quantify the immune infiltration patterns for each
LUAD patient based on the DEGs among the distinct subtypes. The
TMEscore displayed as a reliable prognostic immune-related biomarker
for LUAD, with worse survival in TMEscore-high patients and better
survival in TMEscore-low patients and in both TCGA and other five
independent GEO cohorts. In addition, enriched pathways and genomic
alterations were also analyzed and compared in different TMEscore
subgroups, and we observed that a high TMEscore was significantly
correlated with more aggressive molecular changes such as TP53
mutations. As expected from its increased immune gene expression, the
TMEscore-low subtype showed overexpression of PD-L1, PD-1, CTLA4, and
associations of other markers of sensitivity to immunotherapy,
including IPS score and TIDE score. Our findings also revealed that the
TMEscore is a robust and reliable prognostic tool and predictive
indicator of the response to immunotherapy in the IMvigor210 cohort.
With further in-depth investigation, our TMEscore might be utilized as
an important supplementary predictor to LUAD immunotherapy.
The tumor microenvironment (TME) is a complex interface between cancer
cells, stroma, and infiltrating immune cells ([166]Fridman et al.,
2012). A previous study demonstrated that the tumor microenvironment
contexture plays a key role in tumor development and immunotherapeutic
efficacy ([167]Stankovic et al., 2018). TME heterogeneity, which
impacts tumor progression and prognosis, has been identified in
cancers, especially LUAD ([168]Lavin et al., 2017; [169]Zhang et al.,
2019; [170]Chen et al., 2020; [171]Nguyen et al., 2021). In addition,
the difference in TME patterns was found to be correlated to tumor
heterogeneity and treatment diversity ([172]Jia et al., 2018;
[173]Vitale et al., 2021). Considering the individual heterogeneity of
the immune milieu, it was demanded to quantify the TME patterns of
individual tumors. Here, using the “CIBERSORT” algorithm, we identified
22 human immune-cell phenotypes and generated an individualized
TMEscore to assess TME patterns. Our study represents an essential step
toward understanding the crosstalk between malignant cells and immune
cells in LUAD.
Our findings of the TCGA molecular mutations displayed significant
differences in distributions across the TMEscore subgroups. The largest
difference in mutations between subgroups was in TP53 mutations, which
were more common in TME-high samples than in TME-low samples (53 vs.
35%). TP53 mutation is not only the most common genetic event in NSCLC
but also reported to be associated with poor prognosis in cancers,
especially non-small cell lung cancer ([174]Ozaki and Nakagawara,
2011). TP53 mutation could affect disease progression, tumor cell
characteristics, and the therapeutic effect of different therapeutics
([175]Wu and Hwang, 2019). In addition, the more enrichment of KEAP1
mutations in TMEscore high tumors than TMEscore low may be one
potential explanation for the distinct performance of ICI efficacy in
LUAD. KEAP1 mutations were reported to be enriched in patients with
high TMB lacking T-cell infiltration and immunologically cold
([176]Marinelli et al., 2020), which have been associated with
decreased efficacy of ICIs in NSCLC in published studies
([177]Papillon-Cavanagh et al., 2020; [178]Di Federico et al., 2021).
The differences in their molecular characteristics between TMEscore
subgroups might contribute to the diverse immunogenic features and
consequently varied responses to immunotherapy.
To acquire a deeper insight into the biological feature of the TMEscore
subgroups, we further investigated enriched pathways and immune
characteristics of different TMEscore subgroups. Patients with the
low-TMEscore subtype, whose molecular traits, including an abundance of
infiltration immune active cells, enhanced enrichment of immune-related
pathways, such as interferon gamma response, complement, inflammatory
response, were previously reported to predict the efficacy of
pembrolizumab. In addition, we also observed elevated IL6/JAK/STAT3
signaling pathway in the low-TMEscore group, modulating the
IFN-γ-induced expression of PD-L1 ([179]Zhang et al., 2021).
Collectively, the designated distinct TMEscore subtypes of LUAD were
identified, and the crucial insights into the immunologic features of
these subtypes were provided. Meanwhile, we proved that the TMEscore
showed significant correlations with immune checkpoint genes (PD-L1,
PD-1, and CTLA-4), TMB, and other biomarkers of immunotherapy,
including IPS and TIDE, indicating that TMEscore possessed the
potential to predict the response to immunotherapy. Previous studies
have demonstrated that some biomarkers, such as TIDE and IPS, could
predict patient response to immunotherapy. TIDE, a creative
computational method to identify the induction of T-cell dysfunction in
tumors with high infiltration of cytotoxic T lymphocytes and the
prevention of T-cell infiltration in tumors with low-CTL levels
([180]Jiang et al., 2018), has been proven to predict the outcomes of
cancers treated with ICIs ([181]Jiang et al., 2018). In addition, IPS
was developed to quantitatively predict patients’ response to
anti-PD-1/PD-L1 and anti-CTLA-4 therapies based on an 18-gene signature
including genes that reflect an ongoing adaptive Th1 and cytotoxic CD8
T-cell response ([182]Charoentong et al., 2017). Thus, the low-TMEscore
patients presenting high IPS and low TIDE scores may have a better
response to immunotherapy. However, both TIDE and IPS focused on the
function and status of T cells, which could not fully reflect the
complexity of the TME involved in the response to immunotherapy.
Therefore, our scoring system exhibits promising clinical flexibility
for the predictive value of anti-PD-(L)1 therapy.
Furthermore, patients in the high-risk subgroup presented with a higher
level of immune checkpoint molecules and showed higher immunogenicity.
However, PD-L1 expression and TMB are neither the only nor the
satisfying tool to identify NSCLC patients that might benefit from
therapy with immune checkpoints inhibitors ([183]Klein et al., 2021).
One critical obstacle impeding the extensive utility of TMB and PD-L1
expression is the determination of feasible cut-off values. Moreover,
these two predictors only focused on the intrinsic features of tumors
and may not cover other situations involved in antitumor immune
responses such as TME. Notably, we demonstrated that it is reasonable
to combine TMEscore with PD-L1 or TMB together, and thus it might help
make clinical decisions in LUAD. Patients with TMEscore-low PD-L1 high
or TMB high should be preferentially recommended for ICI treatment,
while patients with TMEscore-low PD-L1 low, or TMEscore-high PD-L1 high
can optionally consider anti-PD- (L)1 therapy; however, patients with
TMEscore high/low PD-L1 or low TMB should carefully choose anti-PD-(L)1
therapy. Taking this step further, it is suggested that TMEscore can
identify either potential sensitive patients with low PD-L1
expression/low TMB who may benefit or patients who do not respond to
ICIs despite having a high PD-L1 expression/high TMB. In addition, we
also explored the stability of our TMEscore model. We found that
patients with a lower TMEscore were more likely to respond to ICB and
had improved overall survival in the IMvigor210 cohort treated with
checkpoint blockade. Collectively, combinations of TMEscore, TMB, and
PD-L1 could be applied not only as refined prognostic stratification
tools but also as more reliable predictive biomarkers for personalized
immunotherapy treatment.
Our study provides a translational rationale for evaluating TME based
on transcriptomic data and TMEscore as a biomarker for immunotherapy
response in patients with LUAD. However, this study still has several
limitations. First, while the composition of the TME has been
recognized as a determinant of cancer progression and response to
therapy, most analyses have focused on a limited proportion of cell
types. Nonetheless, there are still numerous cellular and molecular
mechanisms involved in immunotherapy, and our TMEscore may not cover
the possible intra/extracellular situations involved in antitumor
immune responses. Second, since this study was a retrospective
analysis, the ability of the TMEscore in predicting survival and
response to immunotherapy should be validated in a large-cohort,
multi-center, and prospective study in the future. Third, all
quantifications of gene expression are relative values, which makes it
difficult to determine the absolute threshold and cut-off values for
clinical application. Therefore, quantitative determinations of gene
expression are also needed. Specifically, the underlying molecular
mechanisms remain to be elucidated in LUAD in vivo and in vitro.
Conclusion
In conclusion, our translational rationale for TME classification may
help in distinguishing immune and molecular characteristics and
predicting clinical outcomes of LUAD patients. These findings will
further improve the implementation and utility of precisely
personalized immunotherapeutic strategies in LUAD.
Data Availability Statement
The original contributions presented in the study are included in the
article/[184]Supplementary Material, further inquiries can be directed
to the corresponding authors.
Author Contributions
YG and JH designed the study. XW, ZX, ZL, WL, ZC, and XF collected the
data. XW and ZX performed the data analysis and interpreted the data.
XW and ZX drafted the manuscript. XW, ZX, ZL, WL, ZC, XF, YG, and JH
revised the manuscript. All authors read and approved the final
manuscript.
Funding
This study was supported by the National Key R&D Program of China
(2020AAA0109500), R&D Program of Beijing Municipal Education Commission
(KJZD20191002302). National Natural Science Foundation of China
(82122053, 82188102), the Beijing Municipal Science & Technology
Commission (Z191100006619115), CAMS Initiative for Innovative Medicine
(2021-I2M-1-012 2021-I2M-1-015), Non-profit Central Research Institute
Fund of Chinese Academy of Medical Sciences (2021-PT310-001), Key-Area
Research and Development Program of Guangdong Province
(2021B0101420005), and Aiyou Foundation (KY201701).
Author Disclaimer
The funders had no roles in study design, data collection and analysis,
decision to publish, or preparation of the manuscript.
Conflict of Interest
The authors declare that the research was conducted in the absence of
any commercial or financial relationships that could be construed as a
potential conflict of interest.
Publisher’s Note
All claims expressed in this article are solely those of the authors
and do not necessarily represent those of their affiliated
organizations, or those of the publisher, the editors, and the
reviewers. Any product that may be evaluated in this article, or claim
that may be made by its manufacturer, is not guaranteed or endorsed by
the publisher.
Supplementary Material
The Supplementary Material for this article can be found online at:
[185]https://www.frontiersin.org/articles/10.3389/fgene.2022.902577/ful
l#supplementary-material
[186]Click here for additional data file.^ (1MB, docx)
[187]Click here for additional data file.^ (184KB, xlsx)
Abbreviations
CI, confidence interval; DEGs, differential expression genes; ESTIMATE,
Estimation of Stromal and Immune cells in Malignant Tumor tissues using
Expression data; FDR, false discovery rate; GEO, Gene Expression
Omnibus; GO, Gene Ontology; GSEA, gene set enrichment analysis; HR,
hazard ratio; ICIs, immune checkpoint inhibitors; IPS,
immunophenoscore; KEGG, Kyoto Encyclopedia of Genes and Genomes; LUAD,
lung adenocarcinoma; PCA, principal component analysis; RNA-seq, RNA
sequencing; TCGA, The Cancer Genome Atlas; TIDE, Tumor Immune
Dysfunction and Exclusion; TIICs, tumor-infiltrating immune cells; TMB,
tumor mutation burden; TME, tumor microenvironment.
References