Abstract

   The immune cell infiltration in TME has been reported to be associated
   with prognosis and immunotherapy efficiency of lung cancers. However,
   to date, the immune infiltrative landscape of lung adenocarcinoma
   (LUAD) has not been elucidated yet. Therefore, this study aimed to
   identify a new transcriptomic-based TME classification and develop a
   risk scoring system to predict the clinical outcomes of patients with
   LUAD. We applied “CIBERSORT” algorithm to analyze the transcriptomic
   data of LUAD samples and classified LUAD into four discrete subtypes
   according to the distinct immune cell infiltration patterns.
   Furthermore, we established a novel predictive tool (TMEscore) to
   quantify the immune infiltration patterns for each LUAD patient by
   principal component analysis. The TMEscore displayed as a reliable and
   independent prognostic biomarker for LUAD, with worse survival in
   TMEscrore-high patients and better survival in TMEscrore-low patients
   in both TCGA and other five GEO cohorts. In addition, enriched pathways
   and genomic alterations were also analyzed and compared in different
   TMEscore subgroups, and we observed that high TMEscore was
   significantly correlated with more aggressive molecular changes, while
   the low TMEscore subgroup enriched in immune active-related pathways.
   The TMEscore-low subtype showed overexpression of PD-1, CTLA4, and
   associations of other markers of sensitivity to immunotherapy,
   including TMB, immunophenoscore (IPS) analysis, and tumor immune
   dysfunction and exclusion (TIDE) algorithm. Conclusively, TMEscore is a
   promising and reliable biomarker to distinguish the prognosis, the
   molecular and immune characteristics, and the benefit from ICIs
   treatments in LUAD.

   Keywords: immune cell infiltration, prognosis, lung adenocarcinoma,
   tumor microenvironment, immunotherapy

Introduction

   Although great advances have been achieved in both basic and clinical
   cancer research ([40]Gu et al., 2020; [41]Jiao and Yang, 2020), cancer
   still caused approximately 10 million of deaths in 2020 ([42]Sung et
   al., 2021). With the high prevalence and poor prognosis, lung cancer is
   ranked as the first leading cause of cancer-related deaths worldwide,
   becoming a major global health problem ([43]Miller et al., 2019;
   [44]Siegel et al., 2019; [45]Sung et al., 2021). Recently, the
   emergence of checkpoint blockade immunotherapy ([46]Pardoll, 2012;
   [47]Topalian et al., 2015) has significantly improved the strategies of
   LUAD. However, the minority of response and resistance to these
   treatments frequently impedes the clinical outcomes. Additionally, the
   effects of ICIs are not only driven by genetic and epigenetic
   alterations in tumor cells, but the tumor microenvironment (TME) has
   also been reported to be a crucial regulator in tumorigenesis
   ([48]Dejima et al., 2021; [49]Ye et al., 2022), development, metastasis
   ([50]Quail and Joyce, 2013), and resistance to therapies ([51]Ostman,
   2012; [52]Lu et al., 2020).

   TME chiefly consists of multiple subpopulations of T and B lymphocytes,
   dendritic cells (DCs), macrophages, neutrophils, and myeloid-derived
   suppressor cells (MDSCs) ([53]Belli et al., 2018). The balance between
   pro-tumorigenic and anti-tumor factors in the TME conducts tumor growth
   ([54]Wellenstein and de Visser, 2018; [55]Hinshaw and Shevde, 2019).
   Accumulating evidence has indicated the TME immune composition is
   generally correlated with prognosis and responsiveness to various
   cancer treatments. On one hand, tumor-infiltrating lymphocytes (TILs),
   such as CD4^+ and CD8^+ T cells, have been associated with longer
   survival and better response to immunotherapy ([56]Kawai et al., 2008;
   [57]Fridman et al., 2012). On the other hand, the tumor cells can
   promote a suppressive TME, which challenges anti-tumor immunity by
   inducing upregulation of inhibitory immune signaling, suppressive
   cytokine secretion, and recruitment of suppressive immune cells, such
   as tumor-associated macrophages (TAMs) presenting pro-tumor effects by
   secreting immunosuppressive cytokines, including interleukin-10 (IL-10)
   and transforming growth factor-β (TGF-β) ([58]Mantovani et al., 2017),
   as well as immunomodulatory cells, such as myeloid-derived suppressor
   cells (MDSCs) ([59]Ostrand-Rosenberg and Fenselau, 2018)and regulatory
   T cells (Tregs) ([60]Shimizu et al., 2010), which are all associated
   with unfavorable prognosis. To be specific, focusing on cellular
   diversity shows that TME heterogeneity could impact clinical outcomes
   and provide a challenge for immunotherapy of LUAD ([61]Wu F. et al.,
   2021; [62]Nguyen et al., 2021). Therefore, investigating the effects of
   TME composition on the tumor cells will help us decode the regulation
   of the microenvironment by the tumor.

   To date, the emerging predictors for immunotherapy in NSCLC are still
   imperfect, such as programmed death-ligand 1 (PD-L1) expression
   ([63]Dempke et al., 2018) is thought to be induced by interferon-γ
   (IFN-γ)- mediated immune responses and tumor mutational burden (TMB)
   ([64]Klein et al., 2021) is reported to determine the tumor
   immunogenicity. It is suggested that only reflecting the tumor cell
   intrinsic features but ignoring the extrinsic factor, especially TME,
   is attributed to inconsistencies. Thus, the characteristics of TME
   should be further comprehensively explored to determine effective
   biomarkers that precisely predict prognosis and considerably optimize
   personalized immunotherapy.

   Progress has been recently achieved by immunotherapy, emphasizing the
   importance of TME in LUAD. It elucidates that TME is not the
   single-cell population but a complex interface among cancer cells,
   stroma, and infiltrating immune cells. Deeper analyses of the NSCLC TME
   are necessary to refine the potential application of these findings to
   clinical care. We applied “CIBERSORT” algorithm to analyze the
   transcriptomic data of 500 LUAD samples in TCGA and classified the LUAD
   into four discrete subtypes according to the distinct immune cell
   infiltration patterns. Furthermore, we established the TME scores to
   characterize and quantify the immune infiltration patterns for each
   LUAD patient based on the mRNA expression profiles. Conclusively, we
   investigated and validated the association between TME score and the
   clinical outcomes, as well as the efficacy of anti-PD- (L)1 treatment
   in LUAD, which can facilitate the identification of ideal candidates
   for personalized immunotherapeutic strategies.

Methods

Datasets and Preprocessing

   A total of 1,518 lung adenocarcinoma (LUAD) and 59 normal tissue
   samples were retrieved and downloaded from the corresponding datasets,
   including TCGA LUAD from TCGA data portal
   ([65]https://xenabrowser.net/datapages/) and [66]GSE31210,
   [67]GSE37745, [68]GSE50081, [69]GSE68465, and [70]GSE13213 from the
   NCBI Gene Expression Omnibus (GEO,
   [71]https://www.ncbi.nlm.nih.gov/geo/). The somatic mutation data (SNPs
   and small INDELs) were downloaded from TCGA database ([72]MuTect2
   Variant Aggregation and Masking). The raw data of the dataset from
   Affymetrix were processed using the RMA algorithm in the “Affy”
   package. The data from Agilent were downloaded with the processed
   version. For TCGA dataset, RNA-sequencing data (FPKM values) were
   transformed into transcripts per kilobase million (TPM) values, which
   are more similar to those resulting from microarrays and more
   comparable between samples ([73]Wagner et al., 2012). The following
   inclusion criteria were used: 1) histologically confirmed LUAD, 2)
   simultaneously available information on mRNA expression profile data
   and OS, 3) the sample tissue was collected from the primary solid tumor
   (“01”), and there was no duplication sample in TCGA, 4) genes were
   recorded in all datasets, and 5) genes with more than 70% of the
   missing value or 0 value were deleted. The remaining missing values
   were imputed with KNN imputation approaches. Therefore, 10,320 mRNAs
   were included in the analysis.

Consensus Clustering for the Tumor Microenvironment

   Distinguishing between tumor and normal tissue difference expression
   genes (DEGs) in TCGA was performed with “limma” (FDR <0.05 and |log2FC|
   > 1), which better identifies the characteristics of tumor.
   Furthermore, the tumor microenvironment was quantified by CIBERSORT
   ([74]https://cibersort.stanford.edu/)([75]Newman et al., 2015), a
   deconvolution method for inference of tumor-infiltrating immune
   components from bulk tissue gene expression profiles. Tumors with
   qualitatively different immune cell infiltration patterns were grouped
   using consensus clustering (100 iterations, resample rate of 80%, and
   hierarchical cluster). This procedure was performed with the
   “ConsensusClusterPlus” R package.

Identification of DEGs Associated With the TME Phenotype

   To functionally elucidate the biological characteristics of the TME
   subtypes in LUAD, we employed random forest (RF), an efficient and
   reliable machine learning method to identify DEGs between subtypes of
   TME. We run RF 100 times with different seeds to find the duplicated
   variables with at least 80% repetition rate to further ensure the
   stability of variable selection.

Generation of TMEscore

   To further elucidate the comprehensive profile of TME characteristics,
   the construction of TME metagenes was performed as follows: first, we
   further screened candidate prognostic genes from DEGs. Next, a
   consensus clustering algorithm was employed to define the cluster of
   genes. Then, a principal component analysis (PCA) was performed, and
   principal component 1 was extracted to serve as the signature score.
   After obtaining the prognostic value of each gene signature score, we
   applied a method similar to GGI ([76]Sotiriou et al., 2006) to define
   the TMEscore of each patient:
   [MATH:
   <mrow><mi>T</mi><mi>M</mi><mi>E</mi><mtext>score</mtext><mo>=</mo><msty
   le
   displaystyle="true"><mo>∑</mo><mrow><mi>P</mi><mi>C</mi><msub><mn>1</mn
   ><mi>i</mi></msub></mrow></mstyle><mo>−</mo><mstyle
   displaystyle="true"><mo>∑</mo><mrow><mi>P</mi><mi>C</mi><msub><mn>1</mn
   ><mi>j</mi></msub></mrow></mstyle><mo>,</mo></mrow> :MATH]

   where
   [MATH: <mi>i</mi> :MATH]
   is the signature score of clusters whose Cox coefficient is positive
   and
   [MATH: <mi>j</mi> :MATH]
   is the expression of genes whose Cox coefficient is negative.

Functional and Pathway Enrichment Analysis

   To further analyze the biological significance of the genes related to
   TMEscore with KEGG and GO function analysis, the “clusterProfiler” R
   package was adopted to annotate gene patterns ([77]Wu T. et al., 2021).
   The Benjamini–Hochberg procedure was used to control the false
   discovery rate (FDR). We set the cut-off of adj. p-values to 0.2 so
   that we could find more relevant pathways and functions based on the
   small number of DEGs. Gene set enrichment analysis (GSEA) illustrated
   the significantly different enriched pathways in the high- and
   low-TMEscore groups. Gene sets were downloaded from the MSigDB database
   of the Broad Institute ([78]Subramanian et al., 2005) and employed the
   Hallmark gene sets and 1,000 permutations. An enrichment pathway
   between two subtypes was determined with an FDR of <0.25 and the
   normalized enrichment score (NES).

Predicting the Patients’ Response to ICIs

   The Cancer Immunome Atlas ([79]https://tcia.at/) analyzed the immune
   landscapes and antigenomes of 20 solid tumors that were quantified by
   Immunophenoscore (IPS, a superior immune response molecular marker)
   ([80]Charoentong et al., 2017). The IPS value, which ranged from 0 to
   10, was positively correlated to tumor immunogenicity and could predict
   the patients’ response to immune checkpoint inhibitors (ICI treatment).
   Tumor Immune Dysfunction and Exclusion (TIDE,
   [81]http://tide.dfci.harvard.edu/), a computational method to predict
   immune checkpoint blockade response, was developed by [82]Jiang et al.
   (2018). TIDE uses a set of expression markers to profile two primary
   mechanisms of tumor immune evasion: T-cell dysfunction and T-cell
   exclusion. Patients with higher TIDE prediction scores represent a
   greater potential of tumor immune escape; therefore, TIDE could
   evaluate patients who are more likely to benefit from ICI. In addition,
   the mRNA expression of immune checkpoints was analyzed in different
   prognosis groups.

Statistical Analysis

   Continuous variables were summarized as mean ± SD, and categorized
   variables were described by frequency (n) and proportion (%).
   Differences among variables were tested by the Wilcoxon rank-sum test
   and Fisher’s exact tests. The relationship between variables was tested
   by Spearman rank correlation analysis. The cut-off value of TMEscore
   was calculated based on the correlation between the patients’ survival
   and the TMEscore in TCGA with the “survminer” package. Univariate and
   multivariate Cox regression analyses were used to assess prognostic
   analysis. Batch effects from non-biological technical biases were
   corrected using the “ComBat” algorithm of the “sva” package. The
   “Maftools” package was used to present the mutation landscape and
   identify the differential gene mutations between groups. The heatmap
   was produced by the R package “ComplexHeatmap.” A two-sided p < 0.05
   was regarded as statistically significant. All data processing was
   performed in R 4.0.2 software.

Results

Landscape of Lung Adenocarcinoma TME

   This study was conducted as per the flow chart shown in
   [83]Supplementary Figure S1. The information of 1,518 LUAD patients is
   detailed in [84]Supplementary Table S1. To classify the LUAD TME, the
   consensus clustering algorithm was used to cluster TME information
   obtained by CIBERSORT in TCGA-LUAD dataset ([85]Supplementary Table
   S2). The most appropriate clustering number was four ([86]Figures
   1A–D), which was selected by consensus matrices and consensus
   cumulative distribution function (CDF) curve. This analysis revealed
   that LUAD can be clustered into four distinct TME subtypes termed S1–4.
   The patients with subtype S4 had significantly longer overall survival
   (OS) than patients with subtypes S1 and S2, and subtype S3 demonstrated
   the worst survival ([87]Figure 1E). These four TME subtypes varied
   significantly based on the expression levels of LM22 gene signatures
   ([88]Figure 1F). The S4 subtype was characterized by increases in the
   infiltration of CD 8+ T cells, resting NK cells, follicular helper T
   cells, and M1 macrophages, displaying S4 was significantly associated
   with immune activation. Meanwhile, resting mast cells, activated
   dendritic cells, and regulatory T cells (Tregs) were enriched in the S1
   subtype, and the S2 subtype showed significant increases in the
   infiltration of naïve B cells, plasma cells, and CD4^+ memory-activated
   T cells; on the contrary, M0 macrophages, M2 macrophages, and CD4^+
   memory-resting T cells showed high infiltration in the S3 subtype,
   indicating an immunosuppressive milieu. Taken together, we demonstrated
   that the four TME subtypes were characterized by distinct immune cell
   infiltration and prognosis.

FIGURE 1.

   [89]FIGURE 1
   [90]Open in a new tab

   Unsupervised clustering of the tumor microenvironment (TME) cells for
   500 patients in the TCGA-LUAD cohort. (A–C) Consensus matrices of
   different clusters. (D) Consensus cumulative distribution function
   (CDF) curve. (E) Kaplan–Meier (K-M) curves for overall survival (OS) of
   four different subtypes (log-rank test, p =0.039). (F) Abundance
   pattern of 22 TME cell types in four TME subtypes.

Identification of DEGs and Functional Annotation

   To further identify the biological characteristics and differences
   among TME subtypes, RF algorithm was employed to extract the phenotype
   signatures. By 100 times analysis, a total of 77 DEGs duplicated at
   least 80 times were identified ([91]Supplementary Table S3). Through
   consensus clustering analysis based on the expression of the 77 most
   representative DEGs, we divided DEGs into two different clusters termed
   G1 (62 DEGs) and G2 (15 DEGs) ([92]Figure 2A). These two gene clusters
   were closely related to distinct TME and played different biological
   roles. Then, GO and KEGG enrichment analyses were performed with the
   “clusterProfiler” R package. The G1 cluster was mainly enriched in the
   MAPK signaling pathway, PI3K-Akt signaling pathway, aldosterone
   syntheses, and focal adhesion pathways ([93]Figure 2B). The G2 cluster
   was mainly enriched in hematopoietic cell lineage, B-cell receptor
   signaling pathway, cytokine–cytokine receptor interaction, and primary
   immunodeficiency ([94]Figure 2C). Significantly enriched pathways and
   molecular functions are summarized in [95]Supplementary Tables S4 and
   S5. Collectively, the coherence between the prognostic and biological
   features in the two gene subgroups indicated that this classification
   was reliable and reasonable.

FIGURE 2.

   [96]FIGURE 2
   [97]Open in a new tab

   Construction of the TMEscore for LUAD patients. (A) Consensus matrices
   of differentially expressed genes (DEGs) among TME subtypes. (B and C)
   KEGG pathway enrichment analysis results in G1 and G2. (D) K-M curve
   for OS of different TMEscore groups (log-rank test, p < 0.001). (E)
   Forest plots illustrating the results of multivariate Cox proportional
   hazards model of clinical feature in TCGA cohort. (F) Heatmap of DEG
   expression and clinical characteristics. TMEscore, age, sex, stage,
   smoke, therapy outcome, mutation of KARS, and mutation of EGFR are
   shown as patient annotations. Gene clusters are shown as gene
   annotations. Top legend, gray indicates a missing value.

Construction and Validation of the TMEscore in Six Independent Cohorts

   Although the four TME subtypes were identified, their clinical
   significance needed to be further evaluated and quantified. Therefore,
   we build TMEscore based on TME information in the TCGA-LUAD cohort to
   assess the prognostic value. Association with a prognosis of 34 genes
   (G1:22, G2:12) was confirmed by Cox regression analysis. First,
   principal component analysis (PCA) was used to compute two aggregate
   scores, TMBscore A from G1 and TMBscore B from G2. Then, we performed
   univariate Cox regression on each TMEscore to evaluate the prognostic
   value. Finally, TMEscore A and TMEscore B were integrated to obtain
   TMEscore for each sample. The prognostic value of the TMEscore was
   further assessed by the log-rank test after classification as high-risk
   and low-risk groups based on the corresponding optimal cut-off value
   (−0.92) acquired by the “survminer” R package in the TCGA-LUAD cohort.
   We visualized gene expression and clinical features distribution in
   different risk groups with a heatmap in TCGA and GEO datasets,
   respectively ([98]Figures 2F and [99]3H). The Kaplan–Meier curve of
   TMEscore subgroups showed that the patients in the low TMEscore group
   (median survival time 3,169 days) had significantly better overall
   survival than the high-TMEscore group (median survival time 1,235 days;
   log-rank test, p < 0.0001; [100]Figure 2D). Moreover, the prognostic
   value of the TMEscore was further assessed with five external datasets
   in the GEO database: [101]GSE37745, [102]GSE31210, [103]GSE13213,
   [104]GSE50081 and [105]GSE68465. Similar results were found that the
   survival advantage in the low TMEscore group in above cohorts, with the
   corresponding p-value of 0.038, 0.033, 0.018, 0.0036, and 0.004
   ([106]Figures 3B–F). Meanwhile, we integrated a total of 1,018 samples
   in the GEO datasets to evaluate prognostic efficiency, indicating the
   low-TMEscore group patients had better overall survival compared to the
   high-TMEscore group (log-rank test, p < 0.0001; [107]Figure 3A). These
   findings suggested that TMEscore possessed a reliable and robust
   capacity for predicting the prognosis for LUAD patients.

FIGURE 3.

   [108]FIGURE 3
   [109]Open in a new tab

   Prognostic value of TMEscore for LUAD patients in five GEO cohorts. (A)
   K-M curve of all 1,018 patients in the GEO database between different
   TMEscore groups (log-rank test, p < 0.001). (B–F) K-M curves of five
   independent GEO datasets in different TMEscore subgroups. (G) Forest
   plots illustrating the results of the multivariate Cox proportional
   hazards model of clinical features in the GEO database. (H) Heatmap of
   DEG expression and clinical characteristics. TMEscore, age, sex, stage,
   smoke, mutation of KARS, mutation of EGFR, and mutation of P53 are
   shown as patient annotations. Gene clusters are shown as gene
   annotations. Top legend, gray indicates a missing value.

TMEscore Was an Independent Prognostic Factor for LUAD Patients

   In addition to the TMEscore, other prognostic factors such as
   individual and clinicopathological features were included. After
   multivariable adjustments with age, sex, smoke situation, TNM stage,
   and therapy outcomes in the TCGA cohort, the TMEscore was confirmed as
   an independent prognostic indicator with a hazard ratio of 0.383 [95%
   CI: 0.210–0.696] in the TCGA-LUAD cohort ([110]Figure 2E), 0.632 [95%
   CI: 0.475–0.839] in the GEO datasets ([111]Figure 3G). Elder,
   ever-smoker, advanced stage, and non-response to therapy were also
   suggested to be independent risk factors in different datasets,
   respectively.

   Recent studies have reported that specific gene alterations, such as
   TP53 ([112]Sun et al., 2020), KRAS ([113]Hamarsheh et al., 2020), EGFR
   ([114]Chen et al., 2015), and STK11 ([115]Mazzaschi et al., 2021) have
   an important role in the regulation of the tumor immune
   microenvironment (TIME) and served as biomarkers to tumor therapeutics
   ([116]Lee et al., 2017; [117]Krishnamurthy et al., 2021). We further
   explore the predictive value of this TMEscore in LUAD patients with
   EGFR/KRAS mutation (MUT) or wild type (WT). Remarkably, this risk model
   had predictive power for both EGFR wild type and EGFR mutation LUAD
   patients, except for patients with TP53/EGFR co-mutations
   ([118]Supplementary Figure S2). Similarly, this risk model exhibited a
   robustly predictive value in both KRAS wild type and KRAS mutation LUAD
   patients, except for patients with KRAS/STK11 co-mutations LUAD
   patients ([119]Supplementary Figure S3). Among the EGFR/KRAS
   wild-type/mutation population, the beneficial trends of low TMEscore in
   the prognosis of LUAD patients were observed in distinct subgroups,
   suggesting that TMEscore was an independent and reliable prognostic
   indicator.

Different Biological Processes Between the High-TMEscore Group and the
Low-TMEscore Group

   For a comprehensive analysis of the potential regulatory mechanisms
   resulting in different TMEscore groups, we performed GSEA analysis
   between high and low TMEscore subgroups. The results showed that 23
   pathways were enriched in different subgroups with FDR<0.25
   ([120]Supplementary Table S6). In high TMEscore group, MYC targets V1
   (NES = 2.28 and FDR = 0.001), MTORC1 signaling (NES = 2.08 and FDR =
   0.011), MYC targets V2 (NES = 2.08 and FDR = 0.012), G2M checkpoint
   (NES = 2.00 and FDR = 0.021), glycolysis (NES = 1.79 and FDR = 0.014),
   and other pathways were enriched ([121]Figures 4A–D). Meanwhile, the
   results revealed that complement (NES = −1.74 and FDR = 0.231),
   inflammatory response (NES = −1.69 and FDR = 0.212), IL6/JAK/STAT3
   signaling (NES = −1.64 and FDR = 0.181), IL2/STAT5 signaling up (NES =
   −1.58 and FDR = 0.212) and interferon gamma response (NES = −1.42 and
   FDR = 0.223), and other pathways were correlated with the low TMEscore
   ([122]Figures 4E–H). It is suggested that the gene sets of the TMEscore
   high samples were enriched in cancer and tumor metabolism-related
   pathways, while the gene sets of the TMEscore low samples were enriched
   in DNA repair and immune response-related pathways.

FIGURE 4.

   [123]FIGURE 4
   [124]Open in a new tab

   Gene set enrichment analysis (GSEA) in TMEscore groups. (A–D)
   Enrichment plots showing MYC targets V1, MTORC1 signaling, MYC targets
   V2, and G2M checkpoint in the high-TMEscore group. (E–H) Enrichment
   plots showing complement, myogenesis, IL6/JAK/STAT3 signaling, and KRAS
   signaling up in the low-TMEscore group.

The Molecular Characteristics of Distinct TMEscore Subgroups

   Genomic alterations and oncogenic signaling within the tumors have been
   reported to affect anti/pro-tumor immunity and TME activity
   ([125]Hamarsheh et al., 2020; [126]Kumagai et al., 2020; [127]Zhou et
   al., 2020; [128]Fountzilas et al., 2021), links between tumor mutations
   and TME subtypes needed to be investigated. To illustrate the somatic
   variants and acquire further biological insights into the immunological
   characteristics of LUAD between TMEscore subgroups, we utilized the
   Mutation Annotation Format (MAF) files and performed the variants
   annotation. We found higher mutation counts in the TME-high subgroup
   than in the TME-low subgroup. Missense variations were the most common
   mutation subtype, followed by nonsense and frameshift deletions. The
   oncoplot of tumor somatic mutation in the TCGA-LUAD cohort showed that
   TP53, TTN, and MUC16 gene mutations in the high-TMEscore group were
   approximately 20% higher than those in the low TMEscore group
   ([129]Figures 5A and B). Among a total of 54 differential mutated genes
   between two groups (p < 0.01; [130]Figure 5C), CMA1, HSPA12B, and
   FAM196A showed a higher mutation frequency in the low-TMEscore group.
   The other genes, such as TP53, TTN, and FBXL7, had a higher mutation
   frequency in the high-TMEscore group. Collectively, this analysis
   indicated that transcriptomic-based TME classification coupled with
   genomics analysis can be exploited for further studies.

FIGURE 5.

   [131]FIGURE 5
   [132]Open in a new tab

   Molecular variations between low-TMEscore and high-TMEscore groups. (A
   and B) Mutation profiles of high-TMEscore and low-TMEscore groups. (C)
   Comparing differentially mutated genes between two subgroups by
   Fisher’s exact tests.

Combinations of TMEscore, Immune Checkpoints, and TMB Improve Risk
Stratification and Survival Prediction

   Previous studies have emphasized the importance of immune checkpoint
   genes in modulating immune infiltration ([133]Keir et al., 2008;
   [134]Andrews et al., 2019). Thus, we first compared the expression
   pattern of immune checkpoint genes between different patient groups
   delaminated by the TMEscore in TCGA-LUAD and GEO datasets. PDCD1, CD86,
   CD80, and CTLA4 showed significantly high expression in the
   low-TMEscore group than in the high-TMEscore group ([135]Figures 6A–D,
   F–I), which was further confirmed in five independent validating
   cohorts ([136]Supplementary Figure S4).

FIGURE 6.

   [137]FIGURE 6
   [138]Open in a new tab

   TMEscore in the prediction of immunotherapeutic benefits. (A–D)
   Expression of immune-checkpoint-relevant genes (PDCD1, CD86, CD80, and
   CTLA4) between high- and low-TMEscore groups in TCGA and (F–I) all
   1,018 patients in GEO datasets after batch correction. (E and J)
   Relationships between TIDE and TMEscore in TCGA and GEO datasets. (K
   and L) Relative probabilities of response to anti-CTLA-4 and
   anti-PD-1/PD-L1 treatment (IPS score) in the low-TMEscore and
   high-TMEscore groups in TCGA cohort.

   Considering the correlations between immune checkpoint genes and
   TMEscore, we next combined TMEscore with immune checkpoints expression
   to test whether they have an influence on OS in LUAD patients. Though
   survival analyses among four subgroups stratified by TMEscore and
   immune checkpoint gene expression, we displayed that patients with low
   PD-L1 and low TMEscore have prolonged OS compared to those with low
   PD-L1 and high TMEscore (p = 0.005), and among patients with high PD-L1
   expression, a lower TMEscore signified a remarkably better survival (p
   < 0.001) ([139]Figure 7A). We also found similar survival patterns
   among four patient subgroups stratified by TMEscore and PD1/CTLA-4
   expressions in the TCGA cohort ([140]Figure 7A). We then confirmed the
   results in the other five validation cohorts ([141]Figures 7B,C and
   [142]Supplementary Figure S5). In concordance with the TCGA dataset,
   patients with low TMEscore have significantly better survival relative
   to the high TMEscore group, even though with similar expression levels
   of immune checkpoint genes ([143]Figures 7B,C and [144]Supplementary
   Figure S5). In addition, TMB has been shown to have the potential to
   generate a larger number of neoantigens and make them more immunogenic
   ([145]Schumacher and Schreiber, 2015), which is strongly associated
   with clinical outcomes and response of immune checkpoint blockade
   response ([146]Yarchoan et al., 2017; [147]Chan et al., 2019). We found
   that patients with low TMB and high TMEscore had the worst prognosis
   ([148]Figure 8A). It is suggested that TMEscore, immune checkpoint
   genes, and TMB can complement each other as prognostic biomarkers.

FIGURE 7.

   [149]FIGURE 7
   [150]Open in a new tab

   Impact of immune checkpoint gene expressions and TMEscore on clinical
   outcome. Kaplan–Meier survival curves of overall survival among four
   patient groups stratified by TMEscore and immune checkpoint genes (PD1,
   PD-L1, and CTLA-4) in TCGA dataset (A), [151]GSE37745 dataset (B), and
   [152]GSE50081 dataset (C).

FIGURE 8.

   [153]FIGURE 8
   [154]Open in a new tab

   Kaplan–Meier survival curves of overall survival among four patient
   groups stratified by TMEscore and TMB in TCGA dataset (A) and
   IMvigor210 cohort (B). Proportional representation of the objective
   response rate among subgroups categorized by TMEscore and TMB in the
   IMvigor210 cohort (C). Kaplan–Meier survival curves of overall survival
   among four patient groups stratified by TMEscore and PD-L1 in the
   IMvigor210 cohort. (D) Proportional representation of the objective
   response rate among subgroups categorized by TMEscore and PD-L1 in the
   IMvigor210 cohort (E). CR, complete response; PR, partial response; SD,
   stable disease; PD, progressive disease.

The TMEscore Predicts Clinical Outcomes of Immunotherapy

   Given the linkage between TMEscore and immune checkpoint genes as well
   as TMB, we further explore the predictive potential of TMEscore for
   immune checkpoint blockade response through analyzing the correlation
   of TMEscore and published immunotherapy predictors, including TIDE and
   IPS. The relationship between TIDE and TMEscore was investigated in
   TCGA and GEO datasets. As expected, the high-TMEscore group was
   characterized by a significantly higher TIDE score ([155]Figures 6E and
   J). The IPS values (IPS-PD-1/PD-L1/PD-L2_pos and IPS-CTLA-4_pos)
   increased in the low-TMEscore group compared to the high-TMEscore group
   in TCGA ([156]Figures 6K and L). It is likely that the patients in the
   low-TMEscore group may have a better immune microenvironment and
   respond better to ICIs than those in the high-TMEscore group.

   Furthermore, the practicability of the TMEscore was further evaluated
   for speculation of the therapeutic benefit for ICI treated patients.
   The patients who received anti-PD-L1 immunotherapy in the IMvigor210
   cohort were assigned based on high and low TME scores. Given the
   contraindicatory prognostic and predictive value of TMEscore, TMB, and
   immune checkpoint gene expression (PD-L1, PD-1, and CTLA4), we next
   evaluated the synergistic effect of these biomarkers in the prognostic
   and predictive stratification of LUAD. Consistent with previous
   results, stratified survival analysis revealed that the TMB status did
   not interfere with TMEscore-based predictions. Subtypes of the
   combination of TMEscore and TMB showed significant survival differences
   (log-rank test, p = 0.0076; [157]Figure 8B). On the other hand,
   Kaplan–Meier analysis revealed patients in the IMvigor210 cohort with
   TMEscore low and PD-L1 high obtained most favorable OS than either
   single positive (TME low or PD-L1 high) or dual negative (TMEscore high
   PD-L1 low, p = 0.0018, [158]Figure 8D). In addition, analysis of
   objective response also supported that TMEscore low and PD-L1 high
   subgroup represented an increased proportion of PR/CR/SD than either
   single positive (TMEscore low or PD-L1 high) or dual negative (p =
   0.047, [159]Figure 8E). Taken together, these findings indicate the TME
   classification system and scoring system may explain the effectiveness
   of immunotherapy in patients with low TMB and low PD-L1, and these
   distinct classification systems, TMEscore, PD-L1, and TMB might
   function as complementary factors for the prediction of immunotherapy.

Discussion

   Although immune checkpoint inhibitors (ICIs) have revolutionized
   treatment strategies of lung cancer, the overall response rate of ICI
   monotherapies is still limited and no more than 20% in NSCLC patients
   with EGFR/ALK wild-type ([160]Doroshow et al., 2019). It has been
   reported that TME plays a crucial role in cancer development and
   anti-tumor process, especially the immunotherapy response in cancers
   ([161]Lu et al., 2020; [162]Ye et al., 2022). Therefore, characterizing
   the tumor–immune microenvironment can improve the personalized
   immunotherapeutic strategies.

   Multi-omics data are often used for generating various predictive or
   prognostic models through machine learning or statistical modeling
   methods ([163]Xu et al., 2021). However, to date, comprehensive
   analyses based on integrated genomic and transcriptomic profiles of the
   tumor and its TME remain rare and lack efficient and useful models.
   Therefore, we constructed a scoring system to classify and quantify the
   comprehensive tumor immune landscape based on an immune-cell phenotype
   algorithm and validation in external LUAD cohorts.

   Transcriptomic analysis offers an opportunity to dissect the complexity
   of tumors, including TME, dynamically regulating cancer progression and
   influencing therapeutic outcomes ([164]Cieslik and Chinnaiyan, 2018;
   [165]Thorsson et al., 2018). In our study, we identified four distinct
   immune subtypes characterized by different biological processes and
   prognosis, using “CIBERSORT” algorithm to analyze the transcriptomic
   data of TCGA-LUAD samples. Furthermore, we established the TME scores
   to characterize and quantify the immune infiltration patterns for each
   LUAD patient based on the DEGs among the distinct subtypes. The
   TMEscore displayed as a reliable prognostic immune-related biomarker
   for LUAD, with worse survival in TMEscore-high patients and better
   survival in TMEscore-low patients and in both TCGA and other five
   independent GEO cohorts. In addition, enriched pathways and genomic
   alterations were also analyzed and compared in different TMEscore
   subgroups, and we observed that a high TMEscore was significantly
   correlated with more aggressive molecular changes such as TP53
   mutations. As expected from its increased immune gene expression, the
   TMEscore-low subtype showed overexpression of PD-L1, PD-1, CTLA4, and
   associations of other markers of sensitivity to immunotherapy,
   including IPS score and TIDE score. Our findings also revealed that the
   TMEscore is a robust and reliable prognostic tool and predictive
   indicator of the response to immunotherapy in the IMvigor210 cohort.
   With further in-depth investigation, our TMEscore might be utilized as
   an important supplementary predictor to LUAD immunotherapy.

   The tumor microenvironment (TME) is a complex interface between cancer
   cells, stroma, and infiltrating immune cells ([166]Fridman et al.,
   2012). A previous study demonstrated that the tumor microenvironment
   contexture plays a key role in tumor development and immunotherapeutic
   efficacy ([167]Stankovic et al., 2018). TME heterogeneity, which
   impacts tumor progression and prognosis, has been identified in
   cancers, especially LUAD ([168]Lavin et al., 2017; [169]Zhang et al.,
   2019; [170]Chen et al., 2020; [171]Nguyen et al., 2021). In addition,
   the difference in TME patterns was found to be correlated to tumor
   heterogeneity and treatment diversity ([172]Jia et al., 2018;
   [173]Vitale et al., 2021). Considering the individual heterogeneity of
   the immune milieu, it was demanded to quantify the TME patterns of
   individual tumors. Here, using the “CIBERSORT” algorithm, we identified
   22 human immune-cell phenotypes and generated an individualized
   TMEscore to assess TME patterns. Our study represents an essential step
   toward understanding the crosstalk between malignant cells and immune
   cells in LUAD.

   Our findings of the TCGA molecular mutations displayed significant
   differences in distributions across the TMEscore subgroups. The largest
   difference in mutations between subgroups was in TP53 mutations, which
   were more common in TME-high samples than in TME-low samples (53 vs.
   35%). TP53 mutation is not only the most common genetic event in NSCLC
   but also reported to be associated with poor prognosis in cancers,
   especially non-small cell lung cancer ([174]Ozaki and Nakagawara,
   2011). TP53 mutation could affect disease progression, tumor cell
   characteristics, and the therapeutic effect of different therapeutics
   ([175]Wu and Hwang, 2019). In addition, the more enrichment of KEAP1
   mutations in TMEscore high tumors than TMEscore low may be one
   potential explanation for the distinct performance of ICI efficacy in
   LUAD. KEAP1 mutations were reported to be enriched in patients with
   high TMB lacking T-cell infiltration and immunologically cold
   ([176]Marinelli et al., 2020), which have been associated with
   decreased efficacy of ICIs in NSCLC in published studies
   ([177]Papillon-Cavanagh et al., 2020; [178]Di Federico et al., 2021).
   The differences in their molecular characteristics between TMEscore
   subgroups might contribute to the diverse immunogenic features and
   consequently varied responses to immunotherapy.

   To acquire a deeper insight into the biological feature of the TMEscore
   subgroups, we further investigated enriched pathways and immune
   characteristics of different TMEscore subgroups. Patients with the
   low-TMEscore subtype, whose molecular traits, including an abundance of
   infiltration immune active cells, enhanced enrichment of immune-related
   pathways, such as interferon gamma response, complement, inflammatory
   response, were previously reported to predict the efficacy of
   pembrolizumab. In addition, we also observed elevated IL6/JAK/STAT3
   signaling pathway in the low-TMEscore group, modulating the
   IFN-γ-induced expression of PD-L1 ([179]Zhang et al., 2021).
   Collectively, the designated distinct TMEscore subtypes of LUAD were
   identified, and the crucial insights into the immunologic features of
   these subtypes were provided. Meanwhile, we proved that the TMEscore
   showed significant correlations with immune checkpoint genes (PD-L1,
   PD-1, and CTLA-4), TMB, and other biomarkers of immunotherapy,
   including IPS and TIDE, indicating that TMEscore possessed the
   potential to predict the response to immunotherapy. Previous studies
   have demonstrated that some biomarkers, such as TIDE and IPS, could
   predict patient response to immunotherapy. TIDE, a creative
   computational method to identify the induction of T-cell dysfunction in
   tumors with high infiltration of cytotoxic T lymphocytes and the
   prevention of T-cell infiltration in tumors with low-CTL levels
   ([180]Jiang et al., 2018), has been proven to predict the outcomes of
   cancers treated with ICIs ([181]Jiang et al., 2018). In addition, IPS
   was developed to quantitatively predict patients’ response to
   anti-PD-1/PD-L1 and anti-CTLA-4 therapies based on an 18-gene signature
   including genes that reflect an ongoing adaptive Th1 and cytotoxic CD8
   T-cell response ([182]Charoentong et al., 2017). Thus, the low-TMEscore
   patients presenting high IPS and low TIDE scores may have a better
   response to immunotherapy. However, both TIDE and IPS focused on the
   function and status of T cells, which could not fully reflect the
   complexity of the TME involved in the response to immunotherapy.
   Therefore, our scoring system exhibits promising clinical flexibility
   for the predictive value of anti-PD-(L)1 therapy.

   Furthermore, patients in the high-risk subgroup presented with a higher
   level of immune checkpoint molecules and showed higher immunogenicity.
   However, PD-L1 expression and TMB are neither the only nor the
   satisfying tool to identify NSCLC patients that might benefit from
   therapy with immune checkpoints inhibitors ([183]Klein et al., 2021).
   One critical obstacle impeding the extensive utility of TMB and PD-L1
   expression is the determination of feasible cut-off values. Moreover,
   these two predictors only focused on the intrinsic features of tumors
   and may not cover other situations involved in antitumor immune
   responses such as TME. Notably, we demonstrated that it is reasonable
   to combine TMEscore with PD-L1 or TMB together, and thus it might help
   make clinical decisions in LUAD. Patients with TMEscore-low PD-L1 high
   or TMB high should be preferentially recommended for ICI treatment,
   while patients with TMEscore-low PD-L1 low, or TMEscore-high PD-L1 high
   can optionally consider anti-PD- (L)1 therapy; however, patients with
   TMEscore high/low PD-L1 or low TMB should carefully choose anti-PD-(L)1
   therapy. Taking this step further, it is suggested that TMEscore can
   identify either potential sensitive patients with low PD-L1
   expression/low TMB who may benefit or patients who do not respond to
   ICIs despite having a high PD-L1 expression/high TMB. In addition, we
   also explored the stability of our TMEscore model. We found that
   patients with a lower TMEscore were more likely to respond to ICB and
   had improved overall survival in the IMvigor210 cohort treated with
   checkpoint blockade. Collectively, combinations of TMEscore, TMB, and
   PD-L1 could be applied not only as refined prognostic stratification
   tools but also as more reliable predictive biomarkers for personalized
   immunotherapy treatment.

   Our study provides a translational rationale for evaluating TME based
   on transcriptomic data and TMEscore as a biomarker for immunotherapy
   response in patients with LUAD. However, this study still has several
   limitations. First, while the composition of the TME has been
   recognized as a determinant of cancer progression and response to
   therapy, most analyses have focused on a limited proportion of cell
   types. Nonetheless, there are still numerous cellular and molecular
   mechanisms involved in immunotherapy, and our TMEscore may not cover
   the possible intra/extracellular situations involved in antitumor
   immune responses. Second, since this study was a retrospective
   analysis, the ability of the TMEscore in predicting survival and
   response to immunotherapy should be validated in a large-cohort,
   multi-center, and prospective study in the future. Third, all
   quantifications of gene expression are relative values, which makes it
   difficult to determine the absolute threshold and cut-off values for
   clinical application. Therefore, quantitative determinations of gene
   expression are also needed. Specifically, the underlying molecular
   mechanisms remain to be elucidated in LUAD in vivo and in vitro.

Conclusion

   In conclusion, our translational rationale for TME classification may
   help in distinguishing immune and molecular characteristics and
   predicting clinical outcomes of LUAD patients. These findings will
   further improve the implementation and utility of precisely
   personalized immunotherapeutic strategies in LUAD.

Data Availability Statement

   The original contributions presented in the study are included in the
   article/[184]Supplementary Material, further inquiries can be directed
   to the corresponding authors.

Author Contributions

   YG and JH designed the study. XW, ZX, ZL, WL, ZC, and XF collected the
   data. XW and ZX performed the data analysis and interpreted the data.
   XW and ZX drafted the manuscript. XW, ZX, ZL, WL, ZC, XF, YG, and JH
   revised the manuscript. All authors read and approved the final
   manuscript.

Funding

   This study was supported by the National Key R&D Program of China
   (2020AAA0109500), R&D Program of Beijing Municipal Education Commission
   (KJZD20191002302). National Natural Science Foundation of China
   (82122053, 82188102), the Beijing Municipal Science & Technology
   Commission (Z191100006619115), CAMS Initiative for Innovative Medicine
   (2021-I2M-1-012 2021-I2M-1-015), Non-profit Central Research Institute
   Fund of Chinese Academy of Medical Sciences (2021-PT310-001), Key-Area
   Research and Development Program of Guangdong Province
   (2021B0101420005), and Aiyou Foundation (KY201701).

Author Disclaimer

   The funders had no roles in study design, data collection and analysis,
   decision to publish, or preparation of the manuscript.

Conflict of Interest

   The authors declare that the research was conducted in the absence of
   any commercial or financial relationships that could be construed as a
   potential conflict of interest.

Publisher’s Note

   All claims expressed in this article are solely those of the authors
   and do not necessarily represent those of their affiliated
   organizations, or those of the publisher, the editors, and the
   reviewers. Any product that may be evaluated in this article, or claim
   that may be made by its manufacturer, is not guaranteed or endorsed by
   the publisher.

Supplementary Material

   The Supplementary Material for this article can be found online at:
   [185]https://www.frontiersin.org/articles/10.3389/fgene.2022.902577/ful
   l#supplementary-material
   [186]Click here for additional data file.^ (1MB, docx)
   [187]Click here for additional data file.^ (184KB, xlsx)

Abbreviations

   CI, confidence interval; DEGs, differential expression genes; ESTIMATE,
   Estimation of Stromal and Immune cells in Malignant Tumor tissues using
   Expression data; FDR, false discovery rate; GEO, Gene Expression
   Omnibus; GO, Gene Ontology; GSEA, gene set enrichment analysis; HR,
   hazard ratio; ICIs, immune checkpoint inhibitors; IPS,
   immunophenoscore; KEGG, Kyoto Encyclopedia of Genes and Genomes; LUAD,
   lung adenocarcinoma; PCA, principal component analysis; RNA-seq, RNA
   sequencing; TCGA, The Cancer Genome Atlas; TIDE, Tumor Immune
   Dysfunction and Exclusion; TIICs, tumor-infiltrating immune cells; TMB,
   tumor mutation burden; TME, tumor microenvironment.

References