Abstract The six transcriptomic immune subtypes (ISs) (C1 - C6) were reported to have complex and different interplay between TME and cancer cells in TCGA (The Cancer Genome Atlas) pan-cancer cohort. Our study specifically explored how the consequence of interplay determines the prognosis and the response to therapy in LUAD cohorts. Clinical and molecular information of LUAD patients were from TCGA and Gene Expression Omnibus (GEO). The immune cell populations and gene/pathway enrichment analysis were performed to explore the molecular differences among the C3 IS and other ISs in the LUAD population. The proportion of C3 inflammatory IS was identified as the most common IS in both TCGA (N = 457) and GEO (N = 901) cohorts. The C3 IS was also found to be the most accurate prognostic subtype, which was associated with significantly longer OS (p <0.001) and DFS (p <0.001). The C3 IS presented higher levels of CD8 T, M1 macrophage, and myeloid dendritic cells, while lower levels of M2 macrophages and cancer-associated fibroblast cells. Moreover, the C3 subtype was enriched in the antigen process and presenting, interferon-gamma response, T cell receptor signaling, and natural killer cell-mediated cytotoxicity pathways than C1/C2. In contrast, the C1/C2 presented greater activation of pathways related to the cell cycles, DNA repair, and p53 signaling pathways. The immune-related C3 IS had a great ability to stratify the prognosis of LUAD, providing clues for further pathogenic research. This classification might help direct precision medicine screenings of LUAD patients, thus possibly improving their prognoses. Keywords: lung adenocarcinoma, prognostic signature, TME, immune subtypes (ISs), precision medicine Introduction According to the cancer incidence report in GLOBOCAN 2018, lung cancer remains the leading cause of cancer incidence and mortality worldwide^1. And lung adenocarcinoma (LUAD) is the most frequent histological subtype of non-small-cell lung carcinoma (NSCLC), accounting for 55% ([45]1, [46]2). LUAD is a heterogeneous disease with variable clinical prognosis and drug response outcomes. However, the essential role of the immune system activating status in the development and progression of the tumor genome and heterogeneous has not been well characterized ([47]3, [48]4). Immune checkpoint inhibitors (ICIs) have been used to activate the anti-tumor T cell activation to obtain a durable cure for patients ([49]5–[50]7). But only 30% of LUAD patients could benefit from ICIs ([51]6). LUAD is a heterogeneous disease with complex clinical features, biological diversity, and dynamic nature. Therefore the molecular classifications and therapeutic implications remain to be further studied ([52]8). The expression level of PD-L1 protein on the tumor was reported as a predictive biomarker of poor prognosis of NSCLC ([53]9, [54]10) and clinically benefiting from ICIs ([55]11–[56]14). PD-L1 is rich in benefits but is an imperfect marker, for the best response rate is still less than 50% in PD-L1 high NSCLC patients. Recently, anti-tumor immunity of LUAD patients ([57]15–[58]17), tumor immunogenicity ([59]5, [60]6, [61]10–[62]12, [63]16, [64]18–[65]20), and tumor immune microenvironment (TME) ([66]7, [67]11, [68]14, [69]18, [70]19, [71]21) are also reported to modulate the therapeutic impact of ICIs. Several biomarkers, including tumor mutational burden (TMB) ([72]11), blood tumor mutational burden (bTMB) ([73]22), human leukocyte antigens (HLA) loss-of-heterozygosity (LOH) status ([74]20), and murine double minute 2/4 (MDM2/MDM4) amplification ([75]23), which could affect the adaptive immune response to the tumors, have been widely studied to be associated with the efficacy of ICI therapy in LUAD patients. Crosstalk between cancer cells and TME is sophisticated, comprising pro-tumorigenic and anti-tumorigenic manners. Therefore, it is still essential to further study the functional presentation of tumor neoantigen and the TME ([76]3, [77]4) to predict the prognosis more precisely and improve the clinical outcome. Thorsson et al. classified the patients of 33 cancer types, including NSCLC, into six immune subtypes (ISs) (C1 - C6) ([78]24). This study provided a resource for understanding tumor-immune interactions, with implications for identifying ways of advancing immunotherapy research. These six ISs were reported to be associated with overall survival (OS) and progression-free survival (PFS) ([79]24). The C3 IS (inflammatory) heralded the best prognosis, while the C2 (IFN-g dominant) and C1 (wound healing) subgroups indicated less favorable outcomes despite having a substantial immune component. Moreover, the more mixed-signature subtypes, C4 (lymphocyte depleted) and C6 (TGF-b dominant), had the least favorable outcome. However, different cancer types had unique distributions of six ISs and prognosis. Thus, further refinement of this classification that was precisely adjusted for LUAD is undoubtedly warranted. Individual cancer types had varied proportions of ISs and clinical features, and the distinct lung cancer cells and tumor microenvironment of six ISs had complex and different interplay. The consequence of the complex interplay determines the growth of tumor cells and the prognosis of patients ([80]24). In this context, the molecular characteristics describing tumor-immune effects remained unclear in LUAD. The investigation of molecular classifications at the multi-omics level could provide more insight into anti-tumor immunity and might acquire novel biomarkers. In this study, we first identified the C3 IS as a robust prognostic signature associated with significantly longer overall survival and progression-free interval time by multivariate and subgroup analyses in multiple cohorts (TCGA LUAD cohort and four GEO LUAD cohorts). Then we further evaluated the prediction accuracy of the C3 IS and compared the C3 IS with the other three reported immune-related signatures. According to the time-dependent concordance index, we found that the C3 IS outperformed the other three signatures with superior overall survival and DFS predictive performances. Finally, using IS in this patient population, we analyzed the composition and functional orientation of immune and stromal populations of the tumor microenvironment, specific genes, and pathways. In conclusion, we aimed to provide more in-depth insight into the prognostic stratification of patients with LUAD and to provide a tumor-immune interaction profile with great promise of the therapeutic implications of LUAD ([81] Figure 1A ). Figure 1. [82]Figure 1 [83]Open in a new tab Developments and validations of the C3 immune subtype (IS) (A) Flowchart of developments and validations of the C3 immune subtype (IS). Overall survival (OS) (B) and progression-free interval event (PFI) (C) by immune subtypes (C3 vs. Other ISs) in TCGA LUAD cohort (n = 457) to verify the relationship between the C3 IS and prognosis. P-value was calculated by log-rank test. Materials and Methods Study Population and Data Preprocessing The mRNA expression counts data (Workflow Type: HTSeq-Counts) and clinical profiles of TCGA – NSCLC cohorts were downloaded from the PanCancer Atlas consortium, available at the publication page ([84]https://gdc.cancer.gov/about-data/publications/pancanatlas) ([85]25). Gene expression, copy number variation, and gene mutations were obtained for this study for 457 LUAD and 480 lung squamous cell carcinoma (LUSC) participants, grouped into five ISs based on the reported methods ([86]24). Four GEO datasets (901 LUAD participants) sourced from GEO databases with complete information about transcriptomics OS were also included in the validation dataset in our study (detailed in Supplementary Materials and Methods). Validate the Predictive Value of the C3 IS in GEO LUAD Cohorts To further validate the predictive value of the C3 IS, Kaplan-Meier survival analysis and Cox proportional-hazard univariate and multivariate analyses were performed in four independent GEO LUAD data sets ([87]GSE31210, [88]GSE37745, [89]GSE50081, and [90]GSE68465) and, where five immune subtypes C1-C4, C6 were identified as the reported method. Comparison Between the C3 IS Prognostic Model and Other Three Reported Immune-Related Prognostic Signatures To compare the prediction accuracies of C3 and the other reported prognostic models ([91] Table S2 ), we used R package pec::cindex to calculate the concordance index (C-index) of all/each GEO independent LUAD cohort(s) ([92]GSE31210, 114 [93]GSE37745, [94]GSE50081, and [95]GSE68465) for detailed evaluations ([96]26). OS and DFS time-dependent C-index were both calculated and compared. Clinical and Molecular Character Analyses Associations of relevant known clinical and pathological prognostic factors and LUAD subtypes were assessed using Fisher’s exact test. Overall Survival (OS) was estimated according to the pairwise Kaplan-Meier method ([97]27). Estimating Tumor Immune Score and Microenvironment Immune Cellular Fraction The immune infiltration status of the tumor purity and immune components was computed using the Estimation of Stromal and Immune cells in MAlignant Tumours using Expression data (ESTIMATE) ([98]28). The relative fraction of immune cells was estimated using CIBERSORT ([99]http://cibersort.stanford.edu/) ([100]29) and subtypes were obtained from the supplementary of a published paper ([101]24). The cell estimated by the MCP-counter ([102]30) were downloaded from TIMER2.0 ([103]http://timer.cistrome.org/) (detailed in Supplementary Materials and Methods). Differentially Expressed Genes (DEGs) Analysis We got the normalized expression levels of genes in FPKM values of the LUAD cohort. Processing of all the above data was done by the R (version 3. 6. 1). We used the R limma package to calculate the fold changes (FC) of the C3 subtype versus C1/2 subtypes. The Benjamini-Hochberg (BH) method was used for the adjusted p-value of multiple testing. A gene was defined as differentially expressed between IS subtypes when its median expression differed by at least 2-fold and multiple hypothesis correction of FDR p< 0.05. Gene Set Enrichment Enrichment analysis was performed by cluster profile package. The version 7.1 Kyoto Encyclopedia of Genes and Genomes (KEGG) Genesets were obtained from the Molecular Signatures Database (MSigDB) ([104]http://software.broadinstitute.org/gsea/downloads.jsp). The single-sample gene set enrichment analysis (ssGSEA) of signatures (median z-scores) in the three predominant immune subtypes of the TCGA LUAD (C3 versus C1/C2 dominant) were selected for each analysis, respectively, and used for heatmap visualization. Statistical Analysis Kaplan-Meier survival analysis and Cox proportional-hazard univariate and multivariate analyses examined the significant difference between C3 and the other ISs in TCGA-LUAD and GEO cohorts. Associations of categorical variables and LUAD subtypes were assessed using Fisher’s exact test, and continuous variables and LUAD subtypes were compared through the Kruskal-Wallis or Wilcoxon signed-rank test. Results Prognostic Associations of Immune Subtypes in LUAD To investigate the distribution of ISs in the TCGA NSCLC patients and determine their association with survival, we categorized 457 LUAD patients and 480 LUSC patients into five immune subtypes based on the reported methods ([105]24). Only five ISs were identified in both the TCGA LUAD and LUSC cohort, with two predominant ones (the C2 IFN-γ dominant subtype [147 patients, 32.2%] and the C3 inflammatory IS [179 patients, 39.2%]) were present in TCGA LUAD patients. Moreover, the proportion of the C3 inflammatory IS (179 patients, 39.2%) was the most common IS observed in the TCGA LUAD cohort, whereas C1 was particularly dominant in LUSC (273 patients, 57.1%). Other ISs were less commonly encountered in the TCGA NSCLC cohort, such as the C1 Wound healing (83 patients, 18.2%), C4 Lymphocyte depleted (20 patients, 4.4%), and C6 TGF-β dominant (28 patients, 6.1%) subtypes in the LUAD cohort; C2 IFN-γ dominant (181 patients, 37.7%), C3 Inflammatory (four patients, 0.8%), C4 Lymphocyte depleted (seven patients, 1.5%), and C6 TGF-β dominant (14 patients, 2.9%) subtypes in the LUSC cohort ([106]24). While the IS distribution in LUAD and LUSC cohorts was inconsistent, the C5 subtype was not identified in both cohorts ([107] Supplementary Table S1 ). In our immune study focused on NSCLC, the association of overall survival among ISs was still significant in the TCGA LUAD cohort (Log-rank P = 0.011) ([108] Figure S1A ), as Thorsson et al. reported in the pan-immune study, and the C3 performed the best prognosis. In contrast, the association was not significant in the TCGA LUSC cohort (Log-rank P = 0.14) ([109] Figure S1B ), and the C3 performed worse OS than the C1. This difference may course by the distinct distribution of C3, for there were only eight patients in the C3 subtype in TCGA LUSC cohort. Importantly, patients of the C3 IS had significantly longer OS and PFS in the TCGA LUAD cohort (Log-rank P < 0.001, HR = 0.57; P = 0.009, HR = 0.67, respectively) ([110] Figures 1B, C ). But the association of disease-free time (DFS) was not significant owing to the censored patients in the TCGA LUAD cohort (Log-rank P = 0.094, HR = 0.67) ([111] Figure S1C ). Univariate Cox regressions showed that pathological stage and the C3 IS were significantly associated with OS in the TCGA LUAD cohort (the Pathologic_stage: HR = 2.515, 95% CI 1.803-3.509, P = 0.001; the C3 IS: HR = 0.566, 95% CI 0.402-0.798, P = 0.001; respectively) ([112] Table 1 ). Further, multivariate Cox regressions demonstrated that pathological stage and the C3 IS were independent prognostic factors (the Pathologic_stage: HR = 2.518, 95% CI 1.804-3.516, P < 0.001; the C3 IS: HR = 0.566, 95% CI 0.402-0.798, P = 0.001; respectively) ([113] Table 1 ). Table 1. Univariate and multivariate Cox analyses of risk factors for survival prediction in TCGA LUAD cohorts. Univariate analysis Multivariate analysis HR (95% CI) P Value HR (95% CI) P Value Age (>=median VS. =median VS. =median VS.