Abstract

Background

   Immune checkpoint inhibitors (CPIs) have revolutionized cancer therapy
   for several tumor indications. However, a substantial fraction of
   patients treated with CPIs derive no benefit or have short-lived
   responses to CPI therapy. Identifying patients who are most likely to
   benefit from CPIs and deciphering resistance mechanisms is therefore
   essential for developing adjunct treatments that can abrogate tumor
   resistance.

Patients and methods

   In this study, we used a machine learning approach that used the
   US-based nationwide de-identified Flatiron Health and Foundation
   Medicine non-small cell lung carcinoma (NSCLC) clinico-genomic database
   to identify genomic markers that predict clinical responses to CPI
   therapy. In total, we analyzed data from 4,433 patients with NSCLC.

Results

   Analysis of pretreatment genomic data from 1,511 patients with NSCLC
   identified. Of the 36 genomic signatures identified, 33 exhibited
   strong predictive capacity for CPI response (n=1150) compared with
   chemotherapy response (n=361), while three signatures were prognostic.
   These 36 genetic signatures had in common a core set of four genes
   (BRAF, BRIP1, FGF10, and FLT1). Interestingly, we observed that some
   (n=19) of the genes in the signatures (eg, TP53, EZH2, KEAP1 and FGFR2)
   had alternative mutations with contrasting clinical outcomes to CPI
   therapy. Finally, the genetic signatures revealed multiple biological
   pathways involved in CPI response, including MAPK, PDGF, IL-6 and EGFR
   signaling.

Conclusions

   In summary, we found several genomic markers and pathways that provide
   insight into biological mechanisms affecting response to CPI therapy.
   The analyses identified novel targets and biomarkers that have the
   potential to provide candidates for combination therapies or patient
   enrichment strategies, which could increase response rates to CPI
   therapy in patients with NSCLC.

   Keywords: Lung Cancer, Biomarker, Genome, Immune Checkpoint Inhibitor
     __________________________________________________________________

WHAT IS ALREADY KNOWN ON THIS TOPIC

     * Checkpoint inhibitors (CPIs) have transformed cancer treatment,
       particularly for non-small cell lung cancer, by achieving prolonged
       responses in many patients. However, a significant portion of
       patients either do not respond or develop resistance to CPI
       therapy.

WHAT THIS STUDY ADDS

     * Among the 36 genomic signatures identified, 33 demonstrated strong
       predictive capacity for CPI response relative to chemotherapy
       response, while three signatures (hotspot granular signatures 3, 4,
       and 9) did not meet the predictive threshold. The study revealed
       core genes such as BRAF, BRIP1, FGF10, and FLT1 that are crucial
       for CPI outcomes. The findings also highlight that different
       mutations within the same gene can lead to contrasting responses to
       CPI therapy. Moreover, the study uncovers multiple biological
       pathways, including MAPK, PDGF, IL-6, and EGFR signaling,
       associated with CPI response, providing new insights into the
       mechanisms of resistance and potential targets for combination
       therapies.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

     * The identification of specific genetic signatures and pathways
       associated with CPI response can guide the development of new
       predictive biomarkers and targeted therapies, potentially improving
       patient stratification and treatment outcomes. These insights could
       lead to personalized treatment strategies and the development of
       combination therapies to overcome CPI resistance, ultimately
       enhancing clinical practice and informing policy decisions on
       cancer treatment protocols.

Introduction

   Lung cancer is the leading cause of cancer-related mortality worldwide,
   and approximately 85% of these tumors are of non-small cell lung cancer
   (NSCLC) histology.[41]^1 2 The discovery and subsequent regulatory
   approval of checkpoint inhibitors (CPI) has revolutionized treatment of
   diverse cancer types including lung cancer, by achieving prolonged
   responses in many patients.[42]3,[43]7 Despite the unprecedented,
   prolonged response rates to CPIs, a substantial fraction of patients
   become resistant to CPI therapy.[44]^8

   There is, therefore, an urgent and unmet need to advance our
   understanding of CPI resistance mechanisms to promote the development
   of predictive biomarkers and identify potential targets for novel
   therapies. Extensive efforts are underway to identify predictive
   biomarkers using various omics, histopathologic, clinical, and
   computational approaches.[45]9,[46]12 Presently, the only approved
   biomarkers for CPI therapy are programmed death ligand-1 (PD-L1),
   microsatellite instability and tumor mutational burden (TMB).[47]^13
   However, these approved biomarkers do not provide the much-needed
   mechanistic insight behind CPI response. To gain insight into CPI
   resistance mechanisms in NSCLC and discover predictive genomic
   biomarkers, we used an interpretable machine learning (ML) framework
   that uses a genetic algorithm (GA) for feature selection. The patient
   data was obtained from the nationwide (US-based) de-identified Flatiron
   Health and Foundation Medicine NSCLC clinico-genomic database (FH-FMI
   CGDB). Using our GA framework, we discovered tumor intrinsic genetic
   signatures that can predict CPI response. We discovered that different
   mutations within the same gene may lead to opposing clinical responses.
   Finally, we used several network-based algorithms to discover pathways
   associated with CPI response. Our findings provide insights into
   potential mechanisms underlying CPI response in NSCLC, which are
   essential for developing new therapeutics that target CPI resistance
   and discovering biomarkers for the prediction of CPI response.

Results

Patient cohorts and characteristics

   Among 7,360 patients with NSCLC in the FH-FMI CGDB, we found 1,150
   advanced stage patients who had received monotherapy with
   anti-programmed cell death protein 1 (PD-1) or anti-PD-L1 antibodies
   (referred to hereafter as CPI) and had either response or resistance to
   CPI ([48]figure 1A,B). The de-identified longitudinal patient data
   originated from approximately 280 US cancer clinics (~800 sites of
   care). These patients had their pretreatment tumor specimens sequenced
   by the Foundation Medicine panels.[49]^14 We selected patients treated
   with CPIs as monotherapy to eliminate the confounding effect of
   combination therapy on CPI response.

Figure 1. Definition, selection and characteristics of patient outcome
groups. (A) Schematic representation of definition for response and
resistance patient groups. The long blue horizontal arrows represent the
patient’s journey over time. The study duration (270 days), in which
treatment outcomes were investigated, is marked by a green dotted open box.
Green vertical arrows represent the first time a patient was treated with a
CPI, and the greyed-out area represents buffer time (14 days) during which
treatment outcomes were not considered because they were likely influenced by
earlier treatments. The green circle represents real-world response from CPI
therapy, while the red X represents disease progression. Patients in the
response group exhibited durable clinical benefit without evidence of disease
progression during the 270-day study period. In contrast, patients with
innate resistance—defined as those who did not benefit from CPI and
experienced disease progression within the same time frame—were included in
the analyses. Patients with acquired resistance, characterized by an initial
response followed by disease progression within the 270-day study period,
were excluded. (B) Schematic representation of the number of selected
patients in the CPI treatment cohort, based on the exclusion criteria
depicted in the filter box in the scheme (more details in methods). (C)
Clinical characteristics of the patient cohorts divided into chemotherapy
(depicted as chemo) and checkpoint inhibitors (depicted as CPI). CPI,
checkpoint inhibitors; NSCLC, non-small cell lung cancer; TMB, tumor
mutational burden.

   [50]Figure 1
   [51]Open in a new tab

   The patients were categorized into the two response groups using
   clinical outcome data[52]^15 16 available in CGDB. Almost all patients
   with at least one line of therapy had real-world progression
   (rwP)[53]^15 data, and about 84% had real-world response (rwR)[54]^16
   data. The first response group consisted of 574 patients who had
   durable rwR and did not show rwP during the entire study duration of
   270 days were considered to have response (referred to as “response”
   patients).[55]^12 15 The second group had 576 patients who did not have
   rwR and showed rwP within the study and were considered to be
   CPI-resistant (referred to as “resistance” patients) ([56]figure 1A,B).
   Comparison of the clinical characteristics of response and resistance
   patients revealed no significant difference in gender, race, or disease
   stage ([57]figure 1C), but did show significant differences in age,
   TMB, smoking status, and histology. These differences are consistent
   with previous reports which show the frequency of response is greater
   in older patients,[58]^17 patients with higher TMB,[59]^18 different
   histology, as well as patients with a history of smoking.[60]^12 19 The
   majority of these patients received CPI as their first-line therapy
   (89% 1 L, 10% 2 L).

Mutational landscape of patients with NSCLC

   We characterized the mutational landscape of our patient group
   ([61]figure 2A). The five most frequent short variants (single
   nucleotide variants and short insertions/deletions) were in TP53, KRAS,
   KMT2D, KEAP1 and STK11 ([62]figure 2B). The five most frequent
   amplifications were of SOX2, SDHA, RICTOR, KLHL6 and PIK3CA ([63]figure
   2C), while the top five most frequent deletions were of CDKN2A, CDKN2B,
   PTEN, STK11 and TEK ([64]figure 2D). The five most frequent
   rearrangements were in CDKN2A, FGFR1, LRP1B, EGFR and TP53 ([65]figure
   2E). A comparison of the top 12 altered genes between response and
   resistance patients did not reveal any obvious differences in mutation
   frequencies between the two cohorts, with the exceptions of STK11
   mutations, which occurred at higher frequency in resistance patients
   (14% vs 22%) ([66]online supplemental figure S1C). Cumulatively, the
   mutational landscape analysis confirmed high-frequency mutations known
   in NSCLC but did not reveal gross differences between response and
   resistance patients.

Figure 2. Mutational landscape of the CPI cohort. (A) Oncoplot of the CPI
cohort (n=1,150), depicting the top 20 altered genes and some clinical
characteristics. Columns represent patients, rows represent genes. The top
bar plot shows mutation count and type (missense, splice site, frame shift,
etc) per patient. The middle section indicates mutations in the specified
gene in patients. The bottom heatmap shows clinical characteristics (outcome,
histology, smoking status, and gender) with a color scheme depicted at the
bottom of the figure. The right-side bar graph summarizes the total count and
proportion of mutation types for each gene. (B) Bar graph of top short
variants in the CPI cohort (n=1,150) with color stacking representing
mutation types within each gene (same color scheme as in A). (C) Bar graph
showing prevalence (%) of the top 10 amplified genes. (D) Bar graph showing
prevalence (%) of the top 10 deleted genes. (E) Bar graph showing prevalence
(%) of the top 10 rearrangements. CPI, checkpoint inhibitor.

   [67]Figure 2
   [68]Open in a new tab

Mutations affecting response to CPIs

   The conventional approach to correlating mutations with an observable
   characteristic involves testing for statistical significance of the
   occurrence of each mutation in its corresponding group. This type of
   analysis is useful but does not provide sufficient insight into the
   complex biology of response to CPI. Moreover, different mutations
   within the same gene can result in distinct biological
   outcomes,[69]20,[70]23 which can also be influenced by tissue type,
   tumor microenvironment, and/or xenobiotics.[71]^24 With the above
   considerations in mind, we categorized the mutational data using three
   different approaches (Methods). The first approach was the traditional
   classification which notates if the gene is mutated or wild-type (WT),
   and is referred to hereafter as binary classification/input. The second
   approach divides gene mutations into high-frequency mutations (HFM),
   and low-frequency mutations (LFM) (eg, TP53 HFM or TP53 LFM), and is
   referred to after this as HFM/LFM classification/input. The third
   approach further separates HFM mutations and adds the location of the
   amino acid change (eg, TP53 HFM 178, TP53 HFM 248), and is referred to
   after this as hotspot granular classification/input. Classifying the
   mutations into HFM/LFM and hotspots allowed us to further delineate the
   underlying biological mechanism as HFM tend to be gain of function
   mutations and LFM tend to be loss of function mutations.[72]^25 We used
   the Fisher’s exact test within each set (ie, binary, HFM/LFM and
   hotspot granular) to check for mutation enrichment in response or
   resistance patients. The analysis revealed that in the binary
   classification set, BRAF, BRIP1 and ASXL1 mutations were associated
   with response (false discovery rate [FDR]=0.053), while STK11 and
   CDKN2B mutations were associated with resistance (FDR=0.053 and 0.056,
   respectively; [73]figure 3A, [74]online supplemental table S1A).
   Consistent with the BRAF effect on CPI response in NSCLC, we obtained
   similar data when analyzing an independent melanoma cohort (data not
   shown). In the HFM/LFM classification, we found that BRAF_HFM and
   BRIP1_LFM were associated with response (FDR=0.046 and 0.068,
   respectively; [75]figure 3B, [76]online supplemental table S1B), and
   CDKN2B_LFM (FDR=0.1) was associated with resistance. In the
   hotspot-granular set, we found no statistically significant results
   after multiple testing adjustments ([77]figure 3C, [78]online
   supplemental table S1C). Since mutations tend to co-occur with other
   genomic alterations, we examined if there were any significant pairs or
   triples of co-occurring mutations associated with CPI response. In
   binary classification, we found that STK11 and KRAS co-occurrence was
   the only significant pair (FDR=0.023) associated with resistance, and
   co-occurring mutations in KEL and TP53 were only marginally associated
   with response (FDR=0.096; [79]figure 3A, [80]online supplemental table
   S2A). No significant co-occurring mutations in HFM/LFM or hotspot
   classifications were observed, but the top result of co-occurring
   mutations in KRAS_HFM and TERT_HFM (FDR=0.22) and KRAS_HFM_12 and
   PIK3C2G_HFM (FDR=0.25) were associated with response ([81]figure 3B,C,
   [82]online supplemental table S2B-C). Additionally, we found no
   significant triplet mutations, but top results included co-occurring
   mutations in KRAS-NF1-TP53 (FDR=0.3) and ARID1A_LFM-CDKN2A_LFM-TP53_LFM
   (FDR=0.33) ([83]figure 3A,B, [84]online supplemental table S3A-B).

Figure 3. Mutation association with outcome in the 1,150-patient cohort.
(A–C) Forest plot of the top-ranked genes of binary (A), HFM/LFM (B) and
hotspot (C) mutations by Fisher’s exact test comparing mutation rates between
patients in response and resistance groups. Each plot shows the top 15 single
gene and top 2 double and triple gene mutations co-occurrences separated by
dashed horizontal lines. Each plot shows the gene name (Gene) and the number
of patients who responded (N_resp) or were resistant (N_resist) having that
single or combination mutation, with the x-axis representing log-scaled OR.
OR>1 indicates association of the mutation with response to CPI, and <1
indicates association with resistance to CPI. CPI, checkpoint inhibitors;
HFM, high frequency mutations; LFM, low frequency mutations.

   [85]Figure 3
   [86]Open in a new tab

BRAF, BRIP1, STK11, ASXL1, CDKN1B, and co-occurring STK11-KRAS mutations
affect overall survival in patients treated with CPIs

   While we found a significant association ([87]figure 3) between several
   mutations and response to CPI, we wanted to investigate if mutations in
   the genes listed above, affect overall survival (OS) in the 1,150
   patient cohort. Additionally, we wanted to explore if the mutations
   could be predictive for CPI-specific response or prognostic
   (independent of treatment). We therefore compared OS of patients
   treated with CPI monotherapy with 361 patients treated with
   chemotherapy (patient characteristics are in [88]online supplemental
   figure S1A, complete analysis results are in Table S4). OS analysis
   revealed significantly greater OS among CPI-treated patients with
   mutations in BRAF and a trend for BRIP1 (p=0.09), and significantly
   lower OS with mutations in CDKN2B and STK11 (p=0.02) ([89]figure 4A).
   No significant change in survival was observed in patients with ASXL1
   mutations (p=0.5) ([90]figure 4A). A significantly lower OS among
   CPI-treated patients was also observed for patients with co-occurring
   mutations in KRAS and STK11 (p=0.0013) ([91]figure 4B). No significant
   change in survival was observed in chemo-treated patients with these
   gene mutations, indicating that the survival benefit is CPI-specific
   ([92]figure 4A,B). Of note, we did observe a sharp decrease in survival
   among chemo-treated patients with STK11 or STK11 and KRAS co-occurring
   mutations ([93]figure 4A,B), but since the survival lines crossed, it
   was not statistically significant, though it might be clinically
   relevant. Although we did not find statistically significant triplet
   mutations in the response data analysis ([94]figure 3A), the top
   triplet mutations in KRAS, NF1 and TP53 showed a significant
   (p=0.00041) increase in OS ([95]figure 4C). The statistically
   non-significant result in the response data analysis might be due to
   the small sample sizes (n=20 vs 2).

Figure 4. Overall survival of the 1,150-CPI patient cohort and 361-chemo
patient cohort with mutations found to be significantly associated with CPI
outcome. (A–C) Kaplan-Meier survival curves of overall survival in patients
treated with CPI (left plots) or chemo (right plots) with wild-type (wt; blue
lines) or mutated (mut; red lines) gene in top single (A), double (B) or
triple (C) gene mutations demonstrating a trend for statistical significance
in [96]figure 3. The bottom table in each Kaplan-Meier plot shows the number
of patients at risk at each time point (in month). P values indicated inside
each graph were from the log-rank test comparing patient groups within each
plot. CPI, checkpoint inhibitors.

   [97]Figure 4
   [98]Open in a new tab

   We further investigated the correlation of the OS to the mutations in
   the genes described above, with an extended cohort of patients. This
   extended cohort of 4,433 patients was selected from the 7,360 patients
   with NSCLC in the CGDB the same way as the 1,150-patient cohort
   ([99]figure 1B). The only difference was that there was no outcome
   definition criteria filter for the larger cohort (ie, regardless of
   drug response data availability) and regardless of their death status.
   These included 2,699 patients treated with CPI monotherapy and 1,734
   chemo-treated patients. Examining the effect of the mutations analyzed
   in [100]figure 4, on survival in the extended cohort confirmed that
   CPI-treated patients with mutations in BRAF, BRIP1 and ASXL1 showed
   significant increases in OS when compared with WT patients (p=0.011,
   p=0.038, p=0.04, respectively) ([101]online supplemental figure S2A).
   Notably, there was no significant difference in OS in chemo-treated
   patients with these mutations versus those without ([102]online
   supplemental figure S2A). CPI-treated patients with STK11 or CDKN2B
   mutations, and co-occurring mutations in STK11 and KRAS, showed a
   significant reduction in OS (p=0.0047, p=0.00013 and p=0.0015,
   respectively; [103]online supplemental figure S2A,B). We did, however,
   observe that chemo-treated patients with these mutations also showed
   significant decreases in OS (p<0.0001, p=0.0068 and p=0.0018,
   respectively), suggesting these mutations may not have a CPI-specific
   effect but are prognostic ([104]online supplemental figure S2A,B).
   Although the top triplet mutations in KRAS, NF1 and TP53 showed a
   significant substantial increase in OS (p=0.00041) in the 1,150-patient
   cohort, the increase was only marginally significant (p=0.056) in the
   larger CPI-treated cohort ([105]figure 4C and [106]online supplemental
   figure S2C). The same trend was observed for co-occurring mutations in
   KEL and TP53 ([107]figure 3 and data not shown). Cumulatively, the
   results of [108]figures3 [109]4 are in concordance, and the effect of
   the mutations on OR of response and resistance presented in [110]figure
   3 and OS presented in [111]figure 4 followed the same pattern and
   directionality.

Machine learning workflow identifies genetic markers of CPI outcomes

   We used a GA to evolve virtual individuals, each represented by a set
   of mutated genes, over multiple generations. The predictive power of
   each virtual individual for CPI response was assessed using Naive Bayes
   models, with cross-validation (CV) accuracy scores serving as the
   evaluation metric. The fittest individuals were selected iteratively
   for subsequent generations. From the top 5% of these individuals, based
   on their accuracy scores, we identified 12 distinct clusters using
   k-means clustering. Within each cluster, the set of mutated genes that
   appeared in more than 50% of the members was identified as the genetic
   signature for that cluster ([112]online supplemental figure S3,
   [113]figure 5A). We identified several mutations affecting outcomes to
   CPI including co-occurring mutations, which suggests that these genes
   in tandem or as a single gene may mediate outcomes to CPI. This
   suggested that decoding the biological mechanism(s) behind CPI outcomes
   requires broadening the analysis beyond one or two genes in isolation.
   Such a broad analysis aims at identifying a genetic signature(s) (i.e.,
   a collection of tumor mutations) that can be used to predict response
   to CPI. TMB is an established and important biomarker that correlates
   with response to CPI therapy,[114]^26 which we recently validated for
   predicting response in NSCLC.[115]^12 We therefore used TMB as a
   benchmark for our subsequent results. The TMB model trained on our
   NSCLC cohort showed a receiver operating characteristic (ROC) with an
   area under the curve (AUC) score of 0.6 ([116]figure 5B–E). Using an ML
   analysis workflow (described in Methods), we examined 977 out of the
   1,150 patients as the initial training set. The results yielded 36
   genetic signatures consisting of 12 signatures for each input: binary,
   HFM/LFM and hotspot (Table S5). The average CV ROC AUC scores of these
   signatures was 0.7±0.02 (Table S6). We then used an independent test
   (validation) set of 173 patients to examine the association of the 36
   genetic signatures to CPI response. The average resultant ROC AUC
   scores were 0.6±0.02 compared with 0.6 with TMB ([117]figure 5B–D,
   Table S6), which underscores the predictive power and generalizability
   of these genetic signatures. The average AUC scores for predicting
   response in 361 chemo-treated patients using the 36 genetic signatures
   was 0.5±0.02, which is consistent with random chance, confirming that
   the genetic signatures were CPI response-specific. These results are
   consistent with previously published AUC values for TMB[118]^27 28 and
   provide evidence that these genetic signatures perform as well as the
   established TMB biomarker in prediction of CPI response. However, these
   results provide an additional, important advantage since the predictive
   mutations can be explored for mechanistic understanding of outcomes to
   CPI, which is not possible with TMB.

Figure 5. Machine learning algorithm identifies 36 predictive genetic
signatures with a core of shared genes. (A) Schematic depiction of the
machine learning algorithm. (B–E) Plots showing the area under the ROC curve
(AUC) from the independent validation set (n=173 for CPI and n=361 for
chemo-treated patients) using the genetic signatures from different input
types. (B) Binary, (C) HFM/LFM and (D) hotspot granular and (E) AUC scores
for the three inputs with all unique gene mutation features (82 for binary,
130 for HFM/LFM, and 123 for hotspot granular) from the 12 models combined
into one model for each input type. For each graph, the blue dashed
horizontal line represents the score for the TMB model, and the red dashed
line represents the 50% AUC (depicted as random chance). (F) The top Venn
diagram represents the gene overlap between the 36 individual genetic
signatures across binary, HFM/LFM and hotspot granular inputs, showing that
four genes are included in every single genetic signature of the 36
signatures. The bottom Venn diagram represents overlap across the unique
genes in the 12 signatures of the three inputs (82 binary, 130 HFM/LFM, and
123 hotspot). Detailed information about the overlap is shown in [119]online
supplemental figure S4. CPI, checkpoint inhibitors; HFM, high frequency
mutations; LFM, low frequency mutations; ROC, receiver operating
characteristic; TMB, tumor mutational burden.

   [120]Figure 5
   [121]Open in a new tab

Characterization of 36 genetic signatures revealed a core set of genes most
frequently associated with CPI response

   Since each of the three ML inputs generated 12 separate predictive
   genetic signatures (Table S5), we first examined if each genetic
   signature within the inputs is diverse. Indeed, the heatmap analysis in
   [122]online supplemental figure S4 shows that the 36 signatures are
   diverse with some overlaps between them ([123]online supplemental
   figure S4). Next, we assessed the frequency of each gene mutation
   feature across the 36 genetic signatures that might indicate their
   utility for CPI outcomes. We found that within each of the binary,
   HFM/LFM and hotspot inputs, there was an overlap of 13, 17, and 20
   genes, respectively, within the 12 genetic signatures ([124]online
   supplemental figure S4 and Table S7A). These genes were present in all
   12 genetic signatures within each input, suggesting that these are
   necessary in generating predictive genetic signatures and play an
   important role in CPI outcomes. Among these overlaps, four genes
   (BRIP1, FLT1, FGF10 and BRAF) were shared ([125]figure 5F top,
   [126]online supplemental figure S4 and Table S7B), representing a core
   set of mutations most important for predicting CPI outcomes. Next, we
   identified 82, 130, and 123 unique gene mutation features contained in
   each of the binary, HFM/LFM and hotspot inputs, respectively (Table
   S7C). Using these gene mutation features, we built three models, one
   for each input type that have ROC AUC scores of 0.61–0.66 ([127]figure
   5E and Table S7D). Among the unique gene mutation features, 58 genes
   were overlapping, further distinguishing a core set of mutations
   central to predicting CPI outcomes ([128]figure 5F, Table S7E).
   Overall, this ML approach revealed a significant number of genes, many
   of which are previously unreported, which may play a role in outcomes
   to CPI therapy.

Relative contributions of individual gene mutations to overall CPI response

   To interpret the biological impact of these mutations, we used the
   linear Naive Bayesian model. It enabled us to: (1) quantify the
   relative contribution of each gene mutation feature to CPI outcomes,
   and (2) associate each mutation with response or resistance.
   Accordingly, we were able to assign a numerical rank to each gene
   mutation feature by their importance to CPI outcomes ([129]figure 6A,
   Table S8A). In binary genetic signatures, we found that CTNNA1 was the
   top contributor to response and RARA was the top contributor to
   resistance. In HFM/LFM genetic signatures, we found that ALOX12B LFM
   mutations were the top contributor to response and MYC LFM mutations
   were the top contributor to resistance. In hotspot genetic signatures,
   we found that ALOX12B LFM was the top contributor to response and EGFR
   mutation at amino acid 746 was the top contributor to resistance.
   CTNNA1 and RARA and other mutations that were prominent in the binary
   signature were also highly ranked in HFM/LFM and hotspot indicating
   consistency between inputs of the impact of these genes on response to
   CPI.

Figure 6. ML identifies genetic markers associated with outcomes to CPI. (A)
Waterfall plot of the top 10 genes of the 36 ML-derived genetic signatures
ranked by feature importance using the three different inputs (left panel for
binary, middle for HFM/LFM and right for hotspot). Positive values indicate
association with response (green bars), and negative values with resistance
(red bars) to CPI. (B) Plot representing feature importance in genes in which
HFM and LFM mutations within the same gene have divergent contribution to CPI
outcome, with red associated with resistance and green with response. CPI,
checkpoint inhibitors; HFM, high frequency mutations; LFM, low frequency
mutations; ML, machine learning.

   [130]Figure 6
   [131]Open in a new tab

   Remarkably, we observed a set of genes that had opposite effects on CPI
   response, depending on the type of mutation ([132]figure 6B, Table
   S8B). For example, in HFM/LFM input, we found that LFM mutations in
   FGFR2 were associated with response, while HFM mutations were
   associated with resistance ([133]figure 6B). Furthermore, in hotspot
   inputs, we found that certain TP53 HFM mutations were associated with
   response (TP53_HFM_331), while other TP53 HFM mutations were associated
   with resistance (TP53_HFM_152) (Table S8A). These findings suggest that
   different mutations within the same gene can lead to opposite outcomes
   in CPI-treated patients.

Topology/network-based pathway enrichment analysis reveals pathways
associated with CPI response

   While it is important to identify individual mutations (or
   combinations) that affect CPI outcomes, there is a need to understand
   the biological processes affected by these unique gene mutations and
   how they affect CPI response. To explore the biological processes
   influenced by the genetic signatures, we investigated whether these
   signatures fall into meaningful biological processes. We applied a
   network/topology-based pathway analysis that takes the upstream and
   downstream gene–gene or protein–protein interactions into account. We
   used six different algorithms to perform topology-assisted pathway
   analysis (please see Methods), using the Computational Biology Methods
   for Drug Discovery (CBDD) R package[134]^22 (please see Methods) which
   facilitated the discovery of additional pathways that would otherwise
   not be detected with an enrichment-only approach. With binary input,
   the top pathways across six algorithms were associated with PDGF,
   oxidative stress, prolactin, thrombopoietin, AGTR1, IL-3 and YAP/TAZ
   and IL-6 signaling ([135]figure 7A, Table S9A). For HFM/LFM input, the
   uppermost pathways were associated with androgen, IL-6, PDGF,
   epigenetics, oxidative stress, ErbB2 and EGFR signaling ([136]figure
   7B, Table S9B). For hotspot input, we found the uppermost pathways to
   be associated with PDGF, androgen, IL-6, AGTR1, HGF, EGFR and IL-2
   signaling ([137]figure 7C, Table S9C). The enrichment of PDGF and IL-6
   as well as other pathways was shared across the binary, HFM/LFM and
   hotspots inputs, indicating consistency between these inputs to enable
   identification of pathways relevant for response to CPI ([138]figure
   7D, Table S9D). Since there was an overlap of 58 genes between the
   three ML inputs ([139]figure 5F), we examined the pathways that were
   enriched in this overlap. This approach revealed pathways associated
   with IFN alpha/beta, PDGF, MAPK, oxidative stress, ALK, ErbB2, IL-6,
   leptin, prolactin and IL-2 signaling ([140]figure 7E, Table S9E). The
   overall highest-ranking pathways among all inputs ([141]figure 7A–C)
   were associated with PDGF, oxidative stress, androgen, MAPK, prolactin,
   AGTR1, IL-6, EGFR and YAP/TAZ signaling ([142]figure 7E, Table S9E). In
   summary, our pathway analysis reveals several tumor intrinsic pathways
   like oxidative stress and PDGF involved in outcomes to CPI as well as
   several previously reported immune response-related pathways like
   interleukin (IL)-6 and IL-2 to be involved in outcomes to CPI.

Figure 7. Pathway analysis of predictive genetic signatures associated with
CPI outcomes. This figure illustrates the immune response and other
biological pathways associated with CPI outcomes, identified through pathway
analysis of predictive genetic signatures. The top 15 pathways were derived
using the Computational Biology Methods for Drug Discovery pathway analysis,
which used six different network-based algorithms. Pathways are ranked by the
lowest p values (derived from the hypergeometric test) across the algorithms
and are sorted according to the composite score (see Methods for details).
(A–C) The top pathways identified from the (A) binary, (B) HFM/LFM, and (C)
hotspot inputs. (D) The top pathways from A to C (cross input pathways). (E)
Pathways enriched using the 58 overlapping genes among the genetic signatures
from the three inputs, as shown in the bottom Venn diagram of [143]figure 5F
(overlap pathways). Cross input pathways (D) refer to the pathways that are
consistently identified across all three input types (binary, HFM/LFM, and
hotspot), indicating their broad relevance. Overlap pathways (E) refer to the
pathways identified from the subset of genes that are common across all three
genetic signature inputs, highlighting specific pathways that are crucial due
to the shared gene set. CPI, checkpoint inhibitors; HFM, high frequency
mutations; LFM, low frequency mutations.

   [144]Figure 7
   [145]Open in a new tab

Higher IL-6 serum levels are associated with progressive disease in NSCLC

   Since IL-6 signaling appeared in the results of analyses in multiple
   pathways, we investigated its importance in a cohort of patients with
   NSCLC treated with atezolizumab. We found that serum levels of IL-6
   were elevated in both stable disease (SD) and progressive disease (PD)
   patients when compared with partial response (PR) patients using either
   RECIST (Response Evaluation Criteria in Solid Tumors) ([146]figure 8A)
   or irRC (immune-related response criteria) (data not shown).
   Furthermore, OS analysis of patients with NSCLC treated with
   atezolizumab revealed a significantly longer survival in patients with
   lower IL-6 serum levels, which is consistent with previous and more
   recently published reports[147]29,[148]31 ([149]figure 8B). The
   involvement of IL-6 in resistance to atezolizumab validates our ML
   algorithm, which allows greater delineation of the biological pathways
   and mechanism(s) behind resistance or response to CPI.

Figure 8. IL-6 validation in the atezolizumab study shows high serum levels
correlate with disease progression. (A) Boxplot depicting baseline (cycle
1 day one pre-dose) serum IL-6 levels in patients whose response to
atezolizumab was assessed by the RECIST criteria (CR, PR, SD or PD). Unpaired
two-sided t-test p values were shown above the boxes. (B) Kaplan-Meier curves
of overall survival of patients treated with atezolizumab (trial PCD4989G)
comparing patients with high (blue) and low (red) IL-6 serum levels. The
number of patients at risk is given below the Kaplan-Meier plot. P value
indicated inside the graph was from the log-rank test comparing the two
patient groups. CR, complete response; IL, interleukin; PD, progressive
disease; PR, partial response; RECIST, Response Evaluation Criteria in Solid
Tumors; SD, stable disease.

   [150]Figure 8
   [151]Open in a new tab

Discussion

   In this study, we report a comprehensive analysis of tumor genomic
   determinants that affect outcomes of patients with NSCLC to CPI.

   We classified patients by outcomes over an extended period, which
   enabled a different perspective compared with objective response rate,
   with the extra length of time in our analyses better representing
   real-world outcomes of CPI therapy. We focused on innate resistance
   rather than acquired resistance. By targeting innate resistance, we
   aimed to identify genomic mechanisms intrinsic to tumors that drive
   resistance to CPI therapy. This distinction is crucial for
   understanding pre-existing actionable resistance pathways. Finally, the
   exclusion of early death events in our analyses increases the focus on
   drug response and excludes patients who had extremely advanced disease
   and morbidities, allowing us to focus on predictive rather than
   prognostic effects.[152]^12 This approach enables more focused data
   analysis for biomarkers of response. The biomarkers identified from
   this analysis included those (eg, STK11 and STK11-KRAS), which have
   been previously reported to be associated with response to CPI.[153]^32
   33

   Using a traditional statistical test comparing response and resistance,
   we found that mutations in single genes (BRAF, BRIP1, ASXL1, and
   CDKN2B) as well as co-occurring mutations (TP53-KEL and TP53-KRAS-NF1)
   correlate with CPI response, including OS. CDKN2B deletion correlated
   with decreased OS in CPI-treated patients, which is in keeping with a
   previous report in advanced urothelial carcinoma.[154]^34 35 However,
   CDKN2B deletion, in our studies with the larger cohort of patients,
   correlated with a decrease in OS in chemo-treated patients as well,
   suggesting this to be a prognostic rather than predictive marker for
   CPI outcomes. We did not observe any significant OS differences between
   patients with mutations in TP53 alone but found a significant
   difference if TP53 mutations co-occurred with KRAS and NF1. This data
   from the co-occurring mutations suggests that specific combinations of
   mutations may affect outcomes, and provides insight beyond conventional
   single-gene analyses.

   We classified the mutations as HFM and LFM, where HFM and LFM tend to
   have gain of function and loss of function characteristics,
   respectively.[155]^25 We found that not all HFM and LFM or hotspots
   mutations within the same gene lead to similar clinical outcomes with
   CPI, including TP53[156]^36 and KEAP1.[157]^32 33 36 KEAP1 was
   previously reported to be associated with resistance to CPI
   therapy.[158]^32 33 However, our analysis showed that not all mutations
   in KEAP1 lead to resistance, and while KEAP1 HFM 320 was associated
   with resistance, LFM were associated with response. These divergent
   therapy outcomes to different mutations within the same gene were also
   found in other genes. The number of genes indicates that this
   observation is not anecdotal, and that binary classification (mutated
   vs WT) of genes is insufficient for exploring the role of mutations in
   outcomes to CPI, as two different mutations in the same gene may lead
   to opposite clinical consequences. The interpretation of how different
   mutations, in the same gene, lead to opposite outcomes requires
   experimental investigation as it can be due to different active
   mechanisms or non-causal factors. These divergent outcomes from
   different mutations in the same gene also suggest that it is necessary
   to assess each individual mutation and its effect on CPI outcome, but
   this is presently not feasible due to the high number of these variants
   that results in limitations in statistical power.

   These signatures, as well as a shared signature identified from the
   common gene set, could inform a patient enrichment strategy for
   patients with NSCLC for treatment with CPI. The signature genes were
   analyzed with network algorithms to identify the biological pathways
   influencing CPI response. This identified several pathways
   significantly associated with CPI outcomes. Notably, some of these
   pathways were previously reported to be involved with CPI outcomes like
   IFN-alpha/beta,[159]^37 38 EGFR[160]^39 40 and IL-6,[161]^29 41 42
   further credence to our analysis. The association of these pathways
   with CPI outcomes may have therapeutic implications, as there are
   already several Food and Drug Administration-approved drugs that can
   target these pathways being tested in combination with CPIs to increase
   response rate.[162]^43 Additionally, using data available from a phase
   I clinical study, we confirmed IL-6 involvement in CPI resistance and
   showed that higher baseline serum levels of IL-6 in CPI-treated
   patients were associated with disease progression and poor OS. This is
   in line with the reports on higher IL-6 serum levels leading to worse
   outcomes in metastatic urothelial bladder cancer and in metastatic
   renal cell carcinoma treated with anti-PD-L1,[163]^29 44 45 implying
   that IL-6 signaling might be an important target in several cancers and
   is not limited to NSCLC. Furthermore, our pathway analysis showed that
   WNT/Beta-catenin, PTEN and LKB1/STK11-related pathways may affect CPI
   outcomes, consistent with previous reports,[164]^32 44 46 47 further
   sustaining our observations. Additionally, the BRAF/MAPK pathway has
   been previously reported in preclinical and clinical settings to be of
   central importance in inhibiting T cell-dependent antitumor activity,
   and inhibiting BRAF/MEK concurrently with CPI has been demonstrated to
   have synergistic effects in bringing about tumor
   regression.[165]48,[166]50

   In addition to these previously reported pathways, we identified
   several resistance pathways that were not previously highlighted to
   affect CPI outcomes, with the most significant pathways among these
   being related to PDGF, ESR1, YAP/TAZ, EGFR, IGF, and AGTR1 signaling.
   Our pathway analyses can potentially facilitate biomarker selection.
   There are only a limited number of biomarkers presently available for
   CPI (eg, PD-L1, TMB and mismatch repair deficiency) all sharing the
   characteristic of being related to TMB/immunogenicity.[167]^2
   IdentificationThis of novel biomarkers can improve patient selection
   and guide better outcomes. This includes potential novel biomarkers in
   pathways such as PDGF, Leptin, IGF, Estrogen, and HGF that can be
   measured in either the tumor samples or potentially using non-invasive
   serum samples.

   Our analyses provide a tumor-genome intrinsic view, as we used the
   baseline (pre-CPI treatment) tumor genetic profile to predict outcomes
   of CPI therapy. The pathway analyses inform on the processes that are
   altered in the tumor microenvironment, but an extended analysis of the
   processes can be obtained from RNA sequencing analysis that measures
   the messenger RNA expression in heterogeneous (immune, fibroblasts,
   endothelial and tumor) cell populations. There are several other
   aspects to consider when interpreting the data from the current
   analysis. This includes the limited data set of 286 genes, variations
   in tissue source (primary tumor vs metastatic site) and lines of
   therapy as well as the possibility that these mutations could have been
   selected at least in part due to their increased immunogenicity as
   targets and not because of their functional biological
   contribution.[168]^51 This latter possibility, however, is rather low,
   because it would require that such mutations create a de novo epitope
   presented and recognized by the immune system in all affected patients.
   Furthermore, the genetic data used in pathway analysis cannot reveal
   whether the pathway is inhibited or activated, which requires further
   experimental studies or previously reported differentiation in a
   relevant context to interpret the direction of the impact.[169]^42
   Notwithstanding these caveats, it is encouraging that with these
   limited sets of genes, the models have an efficiency similar to TMB and
   could account for up to 65% of clinical outcomes. As more whole
   exome/genome sequenced patients become available, it might be possible
   to further refine our predictive model and deepen the biological
   understanding of CPI outcomes with this approach.

   Of note, the adjusted FDR values from multiple Fisher’s exact tests may
   have underestimated the significance of some mutations or co-occurring
   mutations, as they assume independence. It is accepted, however, that
   mutations are not independent of each other,[170]^52 and as we
   demonstrated above, some combinations of mutations can affect outcomes.

   Patients can be stratified based on mutations identified in the
   signatures, and these signatures could potentially be combined with TMB
   and PD-L1 to create a model for patient stratification. While testing
   these approaches is beyond the scope of this paper, four alternative
   strategies with increased complexity can be used to facilitate the
   evaluation and use of the identified signatures in clinical trial
   design. First, patients can be stratified based on the four genes
   (BRIP1, FLT1, FGF10, and BRAF) identified in the signatures
   ([171]online supplemental figure S4). Second, patients can be
   stratified based on one or a combination of the pathways identified,
   such as IL-6, or a combination of pathways ([172]figure 7). Third, a
   predictive model based on the 36 predictive signatures ([173]figure 5)
   can be used to stratify the patients. Fourth, our earlier data
   demonstrated that combining PD-L1 and TMB[174]^12 can increase the
   likelihood of response. Accordingly, the identified mutations,
   pathways, and predictive signatures could potentially be combined with
   TMB and PD-L1 to create a more advanced predictive model for patient
   stratification.

   There are presently no approved therapies to circumvent or reverse
   resistance to CPI, which is in part due to the lack of understanding
   the mechanism behind this resistance. In this report, we expand our
   understanding of CPI outcomes by using a GA framework, which resulted
   in the identification of multiple predictive genes, mutations and
   biological pathways with several models that can predict a patient’s
   outcome to anti PD-(L)1 therapy using pretreatment data. The
   identification of mutations, pathways and potential biomarkers also
   provide testable hypotheses for the mechanism behind CPI resistance.
   Our identification of these novel targets can be used to explore
   predictive biomarkers, treatment selection, and patient enrichment
   strategies, all possibly leading to better outcomes in CPI-treated
   patients.

Methods

Research design and patient cohorts

   We used integrated FH-FMI CGDB which consists of longitudinal clinical
   data derived from electronic health records curated via
   technology-enabled abstraction and were linked to genomic data from the
   same patient who underwent FMI comprehensive genomic profiling tests by
   de-identified, deterministic matching.[175]^53 The study included 7,360
   patients diagnosed with NSCLC from January 2003 to May 2021 who
   underwent FoundationOne (F1) or FoundationOne CDx (F1CDx) tests (ie,
   only patients with T7 or DX1 baitset data) and had treatment data. We
   used rWR data[176]^15 (CR, PR, SD, PD or not evaluable (NE) on a
   specified date) and rwP data (progressed or not progressed at a
   specified date)[177]^16 for defining the clinical outcome groups. To
   study the clinical benefit of CPI therapy and to have a more balanced
   number of patients in each of these outcome group, CR, PR and SD from
   the rwR data were considered as having tumor response. Progressive
   disease and rwP, death or a change to a non-CPI treatment line within
   the study duration was considered to be disease progression. A patient
   was considered to have a “durable clinical benefit”, hereafter referred
   to as “response”, if there was tumor response and no disease
   progression starting 14 days after initiation of CPI to the end of the
   study duration (270 days). A patient was considered to have “innate
   resistance”, hereafter simply referred to as “resistance” if there was
   disease progression without any tumor response during the study
   duration. Acquired resistant patients, defined as having response
   followed by disease progression within the 270-day study duration, were
   not included. Using these definitions, 574 patients who had rwP were
   assigned to the response group and 576 patients with disease
   progression, as defined above, were assigned to the resistance group.
   Genetic variations in these two outcome groups were then investigated
   by ML methods to identify key variables and pathways that might explain
   the drug response. No explicit cohort balancing of the two outcome
   groups was performed. However, baseline covariates (age, race, sex,
   histology, smoking history, initial disease stage) were examined and
   found to have no correlation with the outcomes after FDR correction.
   For determining the study duration, sensitivity analysis using periods
   ranging from 120 to 365 days, in~2–3 months increments, was performed.
   The study duration of 270 days resulted in an optimal balance in
   patient number in each outcome group and is clinically relevant.
   Disease progression within the first 14 days after CPI treatment was
   not counted, since it may not reflect the effect of the current
   treatment (recommendation by Flatiron Health).

   To better distinguish CPI-specific predictive effects of biomarkers
   from prognostic effects (independent of treatment type), we also
   analyzed standard-of-care chemotherapy cohorts. We also removed
   patients with early deaths (patients who died in the first 18 weeks
   after treatment start) to better delineate the two effects. This is
   because it has been reported that the CPI and chemotherapy survival
   curves did not differentiate well until after about 18 weeks.[178]^5 12
   To maximize the number of patients with NSCLC with FMI (Foundation
   Medicine) targeted DNA sequencing data, we selected patients with data
   from the FoundationOne (F1; T7 baitset) or FoundationOne CDx (F1CDx;
   DX1 baitset) panels for solid tumors. We then focused on the 288 genes
   profiled on both panels.

Treatment data

   CPI analysis included monotherapies nivolumab (~31%), pembrolizumab
   (~62%), atezolizumab (~5%), durvalumab (~2%). Chemotherapy patients
   included patients with cytotoxic monotherapy or combinations with other
   cytotoxics or targeted therapies who did not have “immunotherapy” in
   the patient’s history in the database. For patients who had multiple
   lines of CPI or chemotherapy, their first CPI or chemotherapy records
   were used for analysis.

Tumor mutation burden

   TMB (number of mutations per Mb) calculated from targeted DNA
   sequencing using F1 or F1CDx panels[179]^53 on tumor biopsies from all
   analyzed patients as provided by FMI. TMB data from the most recent
   specimens collected before the start of treatment was used. Research
   Use Only calculations based on FMI’s research algorithm used at the
   time of collection were analyzed.[180]^54 The TMB-only model was
   trained using logistic regression on the same train/validation patient
   split used for the mutation signature models.

IL-6 analysis

   Plasma IL-6 protein levels measured by Luminex multiplex assay were
   from the atezolizumab phase I trial ([181]NCT01375842).[182]^55 Samples
   from cycle 1 day one predose time point from patients with NSCLC were
   analyzed. The survival curves were performed using the Kaplan-Meier
   method.

Mutation data preparation

   No explicit functional filters were applied to mutations during the ML
   analysis, allowing the inclusion of all mutation types (eg, short
   variants, copy number variations, and rearrangements) to
   comprehensively identify genomic signatures associated with CPI
   resistance or response. To ensure data quality, technical filters were
   implemented. These included restricting the analysis to mutations
   identified using validated bait sets (T7 or DX1) from Foundation
   Medicine and excluding synonymous mutations to focus on variants with
   potential functional implications. This approach enabled robust
   identification of clinically relevant genomic features while minimizing
   the inclusion of technical or biologically irrelevant artifacts. After
   associating a CPI-resistance label to each patient, the mutations for
   each CPI patient were filtered to remove synonymous mutations and then
   used as three different sets of inputs in ML with different
   aggregations of mutations: (1) binary input (all mutations in a gene
   are aggregated into one feature per gene, ie, whether a gene is mutated
   or not), (2) HFM or LFM (high/low frequency mutations; see the next
   paragraph for definition) inputs (all HFM in a gene are aggregated into
   one feature and LFM into another; 1–2 features per gene), and (3)
   hotspot granular input (the HFM are further separated into individual
   mutations). In the 1,150 patients (March 2015 to May 2021) used for
   analysis in this study, there were 288 binary inputs, 460 HFM/LFM
   inputs, and 528 hotspot granular inputs. The input data set was
   randomly split into training and validation subsets, stratified by
   CPI-resistance label, leaving 977 training patients and 173 validation
   patients.

   HFM/LFM input categorization was performed as follows: (1) we
   calculated the total number of amino acids mutated per gene as X (2) we
   calculated the frequency of mutations in each gene (ie, how many
   patients had any mutation in that gene) as Y. (3) From 1 to 2 we
   calculated the average amino acid mutation rate:
   [MATH: <mstyle displaystyle="true"
   scriptlevel="0"><mrow><mi>a</mi><mi>v</mi><mi>e</mi><mi>r</mi><mi>a</mi
   ><mi>g</mi><mi>e</mi><mtext> </mtext><mi>a</mi><mi>a</mi><mtext> </mtex
   t><mi>m</mi><mi>u</mi><mi>t</mi><mi>a</mi><mi>t</mi><mi>i</mi><mi>o</mi
   ><mi>n</mi><mtext> </mtext><mi>r</mi><mi>a</mi><mi>t</mi><mi>e</mi><mo>
   =</mo><mtext> </mtext><mfrac><mrow><mi
   mathvariant="normal">%</mi><mtext> </mtext><mi>p</mi><mi>a</mi><mi>t</m
   i><mi>i</mi><mi>e</mi><mi>n</mi><mi>t</mi><mi>s</mi><mtext> </mtext><mi
   >w</mi><mi>i</mi><mi>t</mi><mi>h</mi><mtext> </mtext><mi>a</mi><mi>n</m
   i><mi>y</mi><mtext> </mtext><mi>m</mi><mi>u</mi><mi>t</mi><mi>a</mi><mi
   >t</mi><mi>i</mi><mi>o</mi><mi>n</mi><mtext> </mtext><mi>i</mi><mi>n</m
   i><mtext> </mtext><mi>t</mi><mi>h</mi><mi>e</mi><mtext> </mtext><mi>g</
   mi><mi>e</mi><mi>n</mi><mi>e</mi><mtext> </mtext><mrow><mo>(</mo><mi>Y<
   /mi><mo>)</mo></mrow></mrow><mrow><mi>t</mi><mi>o</mi><mi>t</mi><mi>a</
   mi><mi>l</mi><mtext> </mtext><mi>n</mi><mi>u</mi><mi>m</mi><mi>b</mi><m
   i>e</mi><mi>r</mi><mtext> </mtext><mi>o</mi><mi>f</mi><mtext> </mtext><
   mi>a</mi><mi>a</mi><mtext> </mtext><mi>m</mi><mi>u</mi><mi>t</mi><mi>a<
   /mi><mi>t</mi><mi>e</mi><mi>d</mi><mtext> </mtext><mi>i</mi><mi>n</mi><
   mtext> </mtext><mi>t</mi><mi>h</mi><mi>e</mi><mtext> </mtext><mi>g</mi>
   <mi>e</mi><mi>n</mi><mi>e</mi><mtext> </mtext><mrow><mo>(</mo><mi>X</mi
   ><mo>)</mo></mrow></mrow></mfrac></mrow></mstyle> :MATH]

   HFM was assigned to any mutation that had 2× the average mutations per
   that specific amino acid and had more than/equal to nine mutations in
   that gene. LFM was assigned to any mutation that had lower than 2× the
   average mutations per that specific amino acid and/or had less than
   nine mutations. Amplifications were considered as HFM and deletions as
   LFM; the rationale behind this was that LFM tend to be loss of function
   and high-frequency mutations tend to be gain of function. Hotspot
   granular classification was the same as in HFM/LFM, but the amino acid
   mutation location to any HFM was added (eg, TP53_178, meant HFM in TP53
   in amino acid 178). Any HFM that lacked the information about amino
   acid location was considered an amplification mutation. To assign HFM,
   LFM and hotspot mutations 4,433 patients with NSCLC from the CGDB were
   used.

Pathway analysis using CBDD

   The predictive genes in the genetic signatures from the ML output were
   used to identify extended biological network member genes using six
   different network-based algorithms (CBDD R package,[183]^56 which used
   the Metabase network and pathway data (Thomson Reuters)). The
   algorithms used were Network Propagation,[184]^57
   Interconnectivity,[185]^58 Overconnectivity[186]^59 Hidden
   nodes[187]^60 GeneMania,[188]^61 and Causal Reasoning.[189]^62 63 The
   top 100 genes resulting from each algorithm were then used to run a
   pathway enrichment analysis on the Metabase pathways. Each pathway then
   got an enrichment p value from each of the six network algorithms.

   A composite score was then calculated for each pathway using the six
   enrichment p values. A common way of using p values for scoring is the
   “−log[10]P” transformation, in which higher value indicates more
   significance. However, the relationship between statistical and
   biological significance is not linear. For example, a p value of 1e−50
   (−log[10]p=50) is not 10 times more significant than 1e−5
   (−log[10]p=5). They both mean a high level of significance. To more
   closely capture the biological and statistical significance reflected
   by p values, we performed a transformation on −log[10]P (Equation 1)
   such that the final score better reflected the level of significance,
   which increased and then plateaued as −log[10]P increases (Table S10
   and see the interactive transformation curve on [190]this page).
   [MATH: <mstyle displaystyle="true"
   scriptlevel="0"><mrow><mi>s</mi><mi>c</mi><mi>o</mi><mi>r</mi><mi>e</mi
   ><mo>=</mo><mn>10</mn><msqrt><mfrac><mi>x</mi><mrow><mn>12</mn><mo>+</m
   o><mi>x</mi></mrow></mfrac></msqrt><mtext> </mtext><mtext> </mtext><mte
   xt> </mtext><mtext> </mtext><mtext> </mtext><mtext> </mtext><mtext> </m
   text><mi>x</mi><mo>=</mo><mo>−</mo><mi>l</mi><mi>o</mi><msub><mi>g</mi>
   <mrow><mn>10</mn></mrow></msub><mi>P</mi></mrow></mstyle> :MATH]
   (1)

   We then scaled the scores from each algorithm and calculated the
   composite score per pathway using the harmonic sum of scores from the
   different algorithms (Equation 2), where k is the rank of the algorithm
   score per pathway.
   [MATH: <mstyle displaystyle="true" scriptlevel="0"><mrow><mstyle
   displaystyle="true"
   scriptlevel="0"><munderover><mo>∑</mo><mrow><mi>k</mi><mo>=</mo><mn>1</
   mn></mrow><mrow><mi
   mathvariant="normal">∞</mi></mrow></munderover><mfrac><mn>1</mn><mi>k</
   mi></mfrac></mstyle></mrow></mstyle> :MATH]
   (2)

Genetic algorithm feature selection

   Our ML approach applied Naive Bayes models with a Bernoulli prior as
   the model architecture, embedded in a GA for feature selection to
   reveal predictive genetic signatures.[191]^64 65 We used GA because of
   the large set of available features (binary:288, HFM/LFM:460,
   hotspot:528) relative to the number of patients (n=977 in the training
   set), as it can efficiently find the most predictive feature
   combinations (out of binary:2^288, HFM/LFM:2^460, hotspot:2^528) while
   including multifeature interactions.[192]^64 65 Naive Bayes models were
   chosen for the simplicity of their internal state, resistance to
   overtraining, and interpretability.[193]^66 Random Forest and other
   ensemble-based methods were attempted but found to require heavy
   hyperparameter tuning to avoid overtraining during the GA search. The
   Bernoulli prior is appropriate for binary input data and includes a
   penalty term for the feature absent of mutation, differentiating it
   from a multinomial prior. The details are outlined in the subsequent
   paragraphs, accompanied by a schematic diagram in [194]online
   supplemental figure S3. Python scripts ([195]online supplemental file
   3) for implementation and performance evaluation are also provided.

   In this study, we defined a GA individual as a subset of the available
   input features of the data set, represented as a binary mask (mutated
   or not) over all features. The fitness of each individual is calculated
   based on the predictive power of a model which uses only the features
   contained in the subset and is trained to predict the response to CPI
   category, “response” or “resistance”.

   For each GA individual, the fitness was the CV accuracy score of a
   Naive Bayes model on the training set. The accuracy score was
   class-balanced to avoid favoring “response” over “resistance,” or vice
   versa. The CV uses stratification to keep the same fraction of
   “response” and “resistance” patients in each fold. The number of folds
   used was 5, a compromise to keep the number of patients in each fold
   high while keeping the number of folds high enough to be confident in
   the result. The training data was shuffled for each individual before
   CV to avoid overfitting on CV folds during GA optimization.

   In each generation of the GA, individuals were selected for mutation or
   crossover using fitness proportionate selection,[196]^67 which samples
   individuals probabilistically based on their fitness in order to
   maintain diversity in the GA breeding pool. The breeding pool of each
   generation is supplemented by a set of the highest fitness individuals
   from all previous generations. Mutation occurs by randomly removing or
   adding features to the subset while conserving the average number of
   features. The average number of features is conserved during mutation
   by partitioning the probability of mutation between adding features and
   removing features in order to retain (on an average) the same number of
   features removed and added. Without this partition, mutation would tend
   to increase the size of models with less than 50% of the total features
   used and decrease the size of models with more than 50% of the features
   used, regardless of the fitness of the result. Crossover occurs by
   randomly selecting each feature flag of the binary mask from two
   previous individuals and does not include further correction: crossover
   on average will produce offspring with the number of features midway
   between the parents.

   The GA procedure was run with the following parameters set. The GA was
   run for 200 generations, each with a population of 1,000 individuals.
   Larger populations and larger numbers of generations were not found to
   produce different results, as the GA was able to find an optimum within
   this time. The first generation was generated randomly such that on
   average, 10% of the features were included in each individual. This
   fraction was chosen to correspond roughly to the number of features at
   the end of the GA. During mutation, on average, 1% of features were
   removed or added to an individual. The top 200 individuals over all
   generations were added to the breeding pool for each generation for a
   total breeding pool size of 1,200 in each generation after the first.
   To generate each successive generation, 600 individuals were formed by
   mutating an individual from the breeding pool while the remaining 400
   of each generation were formed by crossover of two previous
   individuals, a relative fraction chosen to slightly favor mutation in
   order to increase diversity in the population.

   The GA was run 10 times for each mutational input categorization
   (binary per-gene, gain or loss of function, hotspot). Since each run of
   the GA tended to find separate local optima, these 10 runs along with
   the clustering technique described below were used to identify multiple
   local optima that were too distant for a single GA run to identify.

   After 10 runs of 200 generations with a population size of 1,000 there
   were 2,000,000 GA individuals which are available. The top 5% (100,000)
   of all individuals, based on their CV score, were selected and then
   clustered according to the similarity of the features they contain. The
   clustering was done using a simple k-means clustering with 12 clusters
   to account for the expected 10 separate local optima (one from each run
   of the GA) plus some leeway for outliers.

   In each cluster, the set of features that appeared in more than 50% of
   cluster members was considered the characteristic set of features for
   that cluster. A final model was then trained on each of the 12
   characteristic sets and evaluated on the validation set.

Feature importance calculation

   In order to visualize the importance of individual mutations on the
   prediction outcomes, the internal state of the 36 Bernoulli Naive Bayes
   models were converted into their equivalent linear (logistic
   regression) coefficients^[197]68 . This conversion for each gene
   mutation feature is outlined below.
   [MATH: <mi>c</mi><mi>o</mi><mi>e</mi><mi>f</mi><mfenced
   separators="|"><mrow><mi>r</mi><mi>e</mi><mi>s</mi><mi>p</mi></mrow></m
   fenced><mo>=</mo><mi>l</mi><mi>n</mi><mi></mi><mfenced
   separators="|"><mrow><mfrac><mrow><mi>P</mi><mfenced
   separators="|"><mrow><mi>m</mi><mi>u</mi><mi>t</mi><mo>|</mo><mi>r</mi>
   <mi>e</mi><mi>s</mi><mi>p</mi></mrow></mfenced></mrow><mrow><mn>1</mn><
   mo>-</mo><mi>P</mi><mfenced
   separators="|"><mrow><mi>m</mi><mi>u</mi><mi>t</mi><mo>|</mo><mi>r</mi>
   <mi>e</mi><mi>s</mi><mi>p</mi></mrow></mfenced></mrow></mfrac></mrow></
   mfenced><mi></mi> :MATH]
   [MATH: <mi>c</mi><mi>o</mi><mi>e</mi><mi>f</mi><mfenced
   separators="|"><mrow><mi>r</mi><mi>e</mi><mi>s</mi><mi>i</mi><mi>s</mi>
   <mi>t</mi></mrow></mfenced><mo>=</mo><mi>l</mi><mi>n</mi><mi></mi><mfen
   ced separators="|"><mrow><mfrac><mrow><mi>P</mi><mfenced
   separators="|"><mrow><mi>m</mi><mi>u</mi><mi>t</mi><mo>|</mo><mi>r</mi>
   <mi>e</mi><mi>s</mi><mi>i</mi><mi>s</mi><mi>t</mi></mrow></mfenced></mr
   ow><mrow><mn>1</mn><mo>-</mo><mi>P</mi><mfenced
   separators="|"><mrow><mi>m</mi><mi>u</mi><mi>t</mi><mo>|</mo><mi>r</mi>
   <mi>e</mi><mi>s</mi><mi>i</mi><mi>s</mi><mi>t</mi></mrow></mfenced></mr
   ow></mfrac></mrow></mfenced><mi></mi> :MATH]
   [MATH:
   <mi>c</mi><mi>o</mi><mi>e</mi><mi>f</mi><mo>=</mo><mi>c</mi><mi>o</mi><
   mi>e</mi><mi>f</mi><mfenced
   separators="|"><mrow><mi>r</mi><mi>e</mi><mi>s</mi><mi>p</mi></mrow></m
   fenced><mo>-</mo><mi>c</mi><mi>o</mi><mi>e</mi><mi>f</mi><mo>(</mo><mi>
   r</mi><mi>e</mi><mi>s</mi><mi>i</mi><mi>s</mi><mi>t</mi><mo>)</mo>
   :MATH]

   Where “P(mut│resp)” is the probability of mutation for the response
   group, “P(mut│resist)” is the probability of mutation for the
   resistance group. “Coef(resp)” is the linear coefficient for the
   feature in the response group, while “coef(resist)” is the linear
   coefficient for the feature in the resistance group. “coef” is the
   final linear coefficient of the feature. “coef” > 0 indicates
   association of the feature with response while “coef” < 0 indicates
   association with resistance. The sum of coefficients of all the
   features determines the prediction of response or resistance for a
   patient.

   When
   [MATH: <mstyle displaystyle="true"
   scriptlevel="0"><mrow><munder><mo>∑</mo><mrow><mi>m</mi><mi>u</mi><mi>t
   </mi></mrow></munder><mi>c</mi><mi>o</mi><mi>e</mi><mi>f</mi><mo>></mo>
   <mi>t</mi><mi>h</mi><mi>r</mi><mi>e</mi><mi>s</mi><mi>h</mi><mi>o</mi><
   mi>l</mi><mi>d</mi></mrow></mstyle> :MATH]
   , response is predicted over resistance.

   Where the threshold is the intercept of the model, which is a trained
   parameter of the model in addition to the coefficients.

Mutation co-occurrence analysis

   To assess potential relationships between co-occurrence of mutation
   pairs/triples and CPI response, Fisher’s exact tests were performed,
   unless there was zero count(s) in the 2×2 contingency table, then the
   Mantel-Haenszel method with continuity correction of 0.5 was used.
   Multiple testing adjustment of p values used the false discovery rate
   method. The contingency table structure for the number of patients with
   mutation pairs/triples is as below:
                                         Response Resistance
   All gene mutation features mutated
   Not all mutated (any or none mutated)
   [198]Open in a new tab

   Each row/column sum of the above table had to be at least 5 to remove
   very rare mutational combinations.

supplementary material

   online supplemental file 1
   [199]jitc-13-3-s001.docx^ (562.5KB, docx)
   DOI: 10.1136/jitc-2024-009092
   online supplemental file 2
   [200]jitc-13-3-s002.pdf^ (180.8KB, pdf)
   DOI: 10.1136/jitc-2024-009092
   online supplemental file 3
   [201]jitc-13-3-s003.zip^ (11.9KB, zip)
   DOI: 10.1136/jitc-2024-009092
   online supplemental table 1
   [202]jitc-13-3-s004.xlsx^ (1.6MB, xlsx)
   DOI: 10.1136/jitc-2024-009092

Acknowledgements