Graphical abstract

   graphic file with name fx1.jpg
   [56]Open in a new tab

Highlights

     * •
       Immune-related gene sets (irGS) identified via factorization of
       RNA-seq database
     * •
       The irGS refines pan-cancer immune subtypes
     * •
       The irGS enhances immunotherapy response classification
     * •
       The irGS improves functional annotation of spatial transcriptomic
       data
     __________________________________________________________________

   Bioinformatics; Cancer; Expression study; Immunity; Molecular network;
   Transcriptomics

Introduction

   Despite recent breakthroughs in immunotherapy such as immune checkpoint
   blockade (ICB)[57]^1 and adoptive cell therapy (ACT),[58]^2 only a
   limited fraction of patients benefits from these biomedical
   advancements. Challenges related to immune-related adverse events and
   resistance remain and hinder positive outcomes, especially in solid
   tumors.[59]^3 It is essential for the field of immunotherapy to
   understand molecular mechanisms underlying the treatment resistance and
   responses better predict treatment outcomes and develop targeted
   therapeutics. A systematic way to address these challenges is through
   statistical analysis of transcriptional programs of human patient
   samples,[60]^4 which can uncover useful gene expression patterns
   associated with immunotherapeutic responses. Interpretation of
   transcriptional programs and their functional significance, however,
   relies on utilizing previously established gene signature
   knowledgebases[61]^5^,[62]^6^,[63]^7^,[64]^8 such as Molecular
   Signature Database (MSigDB) Hallmarks (Hallmark),[65]^9 Kyoto
   Encyclopedia of Genes and Genomes (KEGG),[66]^10 Gene Ontology
   (GO),[67]^11 etc. Unfortunately, these knowledgebases are largely
   cell-type agnostic and lack granularity or immune specificity.[68]^7
   Such biases severely hamper data interpretation in immunological
   studies utilizing high-throughput RNA profiling, resulting in many
   missed opportunities. To further advance cancer immunotherapy, there is
   an urgent need to discover and expand immune-related gene
   signatures/sets (irGSs) in current knowledgebases to comprehensively
   represent diverse immune-specific transcriptional states and to apply
   them to interpret molecular functions.

   A closer inspection of the aforementioned knowledgebases reveals
   important limitations that are inherent to each source. For example,
   Hallmark[69]^9 contains only 7 immune-related pathways, with an average
   of 160 member genes each. The large gene sets in this database can
   potentially inflate biological significance, without genuine and
   specific relevance to the designated functions. On the other hand,
   KEGG[70]^10 contains mostly metabolism related pathways that are not
   specific to immune systems. The GO[71]^11 terms, with its hierarchical
   nature, result in collective enrichment of general terms with
   overlapping functions, diminishing statistical power to detect specific
   terms. Moreover, although there are some immune-related pathways in the
   aforementioned knowledgebases, they are too broadly defined. For
   example, in KEGG, the term T cell receptor (TCR) signaling does not
   distinguish receptor formation by T cells from antigen presentation by
   antigen-presenting cells, two complimentary functions in TCR signaling.
   In Hallmark, “inflammatory response” is too broad to inform detailed
   molecular programs or mechanisms. Therefore, there is a need to define
   more specific gene sets and refined annotations to achieve higher
   granularity.

   To address this knowledge gap, some immune specific knowledgebases have
   been constructed. In particular, ImmuneSigDB,[72]^12 consisting of over
   5,000 irGSs, was derived from RNA-expression datasets. These RNA
   expression datasets are very rich, derived from nearly 400
   noncancer-specific immunological experiments, containing samples
   challenged with pathogen infections, cytokines or immunological
   perturbations of different kinds and magnitudes, thus possessing rich
   immunological states. Some immunological states may be shared by a
   variety of conditions. For example, the severe immune dysregulation and
   inflammation,[73]^13^,[74]^14 observed in sepsis samples may inform
   immune cell proliferation, inflammatory signaling, and cellular
   dysfunction and exhaustion in tumor immune microenvironment (TIME).

   However, the ImmuneSigDB study, conducted nearly a decade ago, applied
   empirical approaches that may have resulted in some limitations. First,
   the study examined sets of genes that are differentially expressed
   (DEGs) across different comparisons of interest, and these comparison
   groups were subjectively decided by researchers as important and
   immune-relevant.[75]^12 Second, differential gene expression analysis
   tends to output large numbers of DEGs (200 ± 70, max = 1,992), with 50%
   of the gene set size ranges from 190 to 196 member genes. Third, the
   derived irGSs tend to have limited translational implications as they
   can only be used to differentiate major states (such as cell lineages)
   instead of more granular substates. Fourth, given that these comparison
   groups are defined based on empirical knowledge, ImmuneSigDB irGSs are
   likely incomplete. Lastly, the ImmuneSigDB irGSs are simply annotated
   using the GEO ID of the dataset and the comparison groups, thus
   containing limited annotations of immune functions, hampering results
   interpretation.

   In this paper, we aim to construct concise and well-annotated
   immune-specific gene sets using non-negative matrix factorization
   (NMF),[76]^15 a powerful factor analysis
   technique[77]^16^,[78]^17^,[79]^18^,[80]^19^,[81]^20 that has not yet
   been applied to ImmuneSigDB data. By decomposing gene expression of
   samples from ImmuneSigDB into composite programs, followed by addition
   statistical analysis to consolidate and annotate these programs, we
   have identified 28 irGSs describing various immunological processes.
   These irGSs can be applied to study and characterize various
   immune-related transcriptional states such as those in play in TIMEs.
   We have systematically validated gene set annotations on 12 different
   datasets and explored their translational implications on 15 different
   datasets (see [82]STAR Methods Data Sources). These datasets include
   various modalities, CITE-seq, BulkRNAseq, scRNA-seq, and spatial
   transcriptomics, and were profiled from a variety of immunological
   contexts, including but not limited to cytokine perturbation
   experiments, infectious disease, and treatment naive or
   immunotherapy-treated cancer patients. Through these exploratory
   analyses, we showed that these gene sets can better characterize
   pan-cancer tumor-immune subtypes, separate ICB response in melanoma,
   liver, and lung cancer and reveal spatial heterogeneity in ovarian,
   intestinal, and breast cancer samples.

Results

Twenty eight gene sets covering diverse immunological functions in the TIME

   The ImmuneSigDB datasets comprise 83 human bulk gene expression
   profiles with 1,826 RNA samples from studies published in leading
   immunology journals ([83]Figure S1A). The datasets included in the
   ImmuneSigDB were derived from patients affected by severe infections
   and sepsis or from cell lines challenged with experimental
   manipulations[84]^12: gamma delta T cells activated by IL2,[85]^21 and
   CD14^+ monocytes[86]^22 cultured with IFN-gamma and stimulated with
   Toll-like receptor 2 ligand. Such a large collection of datasets
   entails transcriptional profiles from diverse cell states, making them
   suitable to derive generalizable irGSs.

   The datasets encompass gene expression data obtained from human and
   mouse samples ([87]Figure S1B) that originated from samples deriving
   from different cell types ([88]Figure S1C). Among datasets profiled
   from Homo sapiens, the number of genes and samples profiled varies
   among datasets ([89]Figures S1D and S1E). We partitioned the 83
   ImmuneSigDB human datasets into 47 lymphoid-derived and 36
   myeloid-derived datasets.

   To obtain unique irGSs, we carried out the analytic workflow,
   illustrated in [90]Figure 1A and detailed in [91]Figure S1F and
   [92]STAR Methods, separately for the lymphoid and the myeloid data. We
   applied NMF on each qualifying dataset (see [93]STAR Methods) and
   obtained the top 50 genes from each latent factor based on the NMF
   loadings, producing a total of 529 gene sets. To ensure
   generalizability, gene sets derived from one dataset were kept only if
   they overlap at least 20% with at least one gene set derived from
   another dataset. To ensure non-redundancy, we excluded gene sets with
   over 20% overlap with gene sets from the same dataset, resulting in 115
   gene sets. To further reduce redundancy, we employed the algorithm
   proposed by Tirosh et al.,[94]^16 whereby gene sets are progressively
   merged into meta programs (MP) based on Jaccard distances (see [95]STAR
   Methods).

Figure 1.

   [96]Figure 1
   [97]Open in a new tab

   Analytic workflow and constructed gene sets

   (A) Gene set construction analysis workflow: we downloaded 83
   immunology relevant human studies and performed NMF on each of the
   qualifying dataset. We curated robust NMF programs and merged them into
   meta gene sets. We performed over-representation enrichment analysis
   for each meta gene set using KEGG pathways, Hallmarks, and biological
   process terms from GO. The name and annotation of each meta gene set
   was determined based on counts and false discovery rate (FDR) adjusted
   p values of the highly enriched terms. We seek relevant multimodal
   datasets to functionally name and validate each gene set, and further
   examined the translational utilities of these gene sets.

   (B) The roles of the constructed gene sets in central immunity:
   constructed gene sets overlaid near the most relevant processes during
   central immune activities in tumor-immune microenvironment. Created
   with [98]BioRender.com.

   Using the workflow described previously, we derived 19 immune-related
   MPs (irMPs) from lymphoid samples (L_MP1-19) and 9 irMPs from myeloid
   samples (M_MP1-9) ([99]Table 1). To annotate each irMP with
   immune-relevant functions, we performed over-representation analysis
   (ORA) using biological processes from GO, KEGG, and Hallmark and sorted
   the enrichment results first by counts of core enrichment then by
   ascending adjusted p values. Enrichment results and constituent genes
   for each L_MP and M_MP were presented in [100]Tables S1, [101]S2,
   [102]S3, and [103]S4. We named the irMPs based on top enriched terms
   and refined the annotations by going over the constituent genes and the
   stringDB[104]^23 protein-protein interaction (PPI) network, followed by
   meticulous scrutiny by multiple immunologists (Matthew Gubin Ph.D.;
   Hind Rafei M.D.; Rafet Basar M.D.; Katy Rezvani M.D/Ph.D.; and Weiyi
   Peng M.D/Ph.D.) from MD Anderson Cancer Center. Rationales for each
   gene set annotation have been documented in supplementary notes.

Table 1.

   Annotations for the 9 myeloid-derived irMPs and 19 lymphoid-derived
   irMPs
   Myeloid-derived gene sets Lymphoid-derived gene sets
   1 Antiviral defense network Cell cycle progression
   2 Antigen processing TCR anchoring
   3 Cytokine production Histone-associated lipid antigen presentation
   4 Cell adhesion 1 Interleukin-induced Tregs
   5 Modulation of cell migration Antigen presentation
   6 Cell adhesion 2 Cell cycle mitosis
   7 Eosinophil chemotaxis Chemokine-mediated T cell activation
   8 MHC2-mediated lymphocyte activation MHC-mediated immunity
   9 Chemokine activity Metabolic process
   10 Lipid localization TCR synapse
   11 Cell cycle immune response
   12 Lymphocyte activation
   13 Interferon induced antiviral defense
   14 Type-I interferon
   15 Cell adhesion 1
   16 Cell adhesion 2 (CD52^+)
   17 IFN-gamma-induced cytotoxicity
   18 Cytokine receptor signaling
   19 T cell activation
   [105]Open in a new tab

   Overall, the L_MPs are highly specific to immune functions, consisting
   of pathways such as lymphocyte activation, antigen processing, TCR
   anchoring, and cytotoxic functions. M_MPs are less diverse, including
   pathways related to cell migration, adhesion, and antiviral response
   ([106]Figure 1B; [107]Table 1). A few MPs are related to core cellular
   processes, such as cell cycle and metabolism.

   To benchmark our irMPs with existing database, we compared and
   quantified their constituent genes overlap (see [108]STAR Methods) with
   231 manually curated Spectra immune gene sets,[109]^24 5,000
   ImmuneSigDB pathways, 50 Hallmark pathways, and 29 immune-related KEGG
   (irKEGG). We observed significant distinction (false discovery rate
   [FDR]-adjusted p value < 0.05) in 99.5%, 98%, 99%, and 87% of the
   comparison with Spectra gene sets ([110]Figure S2), ImmuneSigDB
   ([111]Figures S3 and [112]S4), Hallmark ([113]Figure S5), and irKEGGs
   ([114]Figure S6), respectively.

Functional characterization of the irMPs

   To validate the functional annotation of irMPs and demonstrate their
   general applicability, we compiled 12 publicly accessible gene
   expression datasets encompassing a diverse range of immune contexts,
   captured through various sequencing techniques (see [115]STAR Methods
   Data Sources). These datasets were derived from well-designed
   immunological experiments with well-defined phenotypes, providing
   independent and orthogonal evidence required to validate annotations of
   irMPs.

   Compared to RNA expression, protein expression is often a more direct
   indicator of biological activity and reflects real physiological
   processes. We, therefore, leveraged two CITE-seq
   datasets,[116]^25^,[117]^26 in which the whole transcriptome and ∼200
   surface antibodies were simultaneously profiled in the peripheral blood
   mononuclear cells (PBMCs) of healthy human and COVID-19 patients,
   respectively. irMPs were scored for each cell in the RNA level using
   single sample gene set enrichment analysis (ssGSEA)[118]^27 (see
   [119]STAR Methods). For each irMP of interest, cells were dichotomized
   into MP-high or MP-low (see [120]STAR Methods) and the abundance of
   proteins indicative of the function described by the irMP was compared
   between MP-high and MP-low groups to show consistency between the irMP
   and proteomic measurement. For irMP without relevant proteins in the
   CITE-seq data, we defined highly relevant phenotypes in appropriate
   scRNA-seq data or The Cancer Genome Atlas (TCGA) pan-cancer data and
   checked for gene set enrichment in a pre-defined phenotype. In
   addition, we utilized PPI network constructed by the StringDB
   database[121]^28 to elucidate the activity cascade of each irMP
   ([122]Figures S7 and [123]S8).

   Interestingly, the irMPs we obtained can be broadly classified into
   categories that align with the sequential progression of a typical
   immune response: cytokine/chemokine production, cell adhesion, cell
   chemotaxis, antigen processing, antigen presentation, lymphocyte
   activation, and cytotoxicity ([124]Figure 1B), indicating the key
   importance of these functional components and the rich molecular
   programs that have been under-explored.

   In the following, we describe the function of each irMP and evidence
   from orthogonal validations.

Cytokine/chemokine production

   There are two cytokine/chemokine related irMPs from myeloid data (M_MP3
   and M_MP9) and two from lymphoid data (L_MP7 and L_MP18). To validate
   their annotations, we examined scRNA-seq data profiled from two severe
   COVID-19 patients, with symptoms consistent with inflammatory cytokine
   storms (ICS).[125]^29 If the irMPs are in fact related to pathogenic
   cytokine/chemokine activities, they should be enriched in cells
   deriving from patients rather than health controls. As myeloid cells
   were concluded to be the source of ICS,[126]^29 we extracted myeloid
   cells and performed gene set enrichment analysis (GSEA) on fold-changes
   of genes expression between ICS and healthy controls. Indeed, we found
   that all four irMPs were significantly enriched in ICS patients
   relative to healthy controls ([127]Figure 2A).

Figure 2.

   [128]Figure 2
   [129]Open in a new tab

   Functional characterization of the generic immune gene sets

   (A) Barcode enrichment plot of 4 cytokine/chemokine release irMPs; all
   genes from the data are ranked along x axis according to enrichment
   score (y axis). There is a significant upregulation in all 4 irMPs in
   samples with severe COVID-19 patients with cytokine storm, and the
   ranked position of each gene within a signature is shown in the x axis.

   (B) Density plot comparing the normalized enrichment score of three
   antiviral irMPs in different immune cell types when stimulated with
   interferon vs. non-interferon.

   (C) Violin plot comparing cell cycle MP activities calculated using GS
   density across different cell types identified from COVID-19 single
   cell atlas. ANOVA test was performed for multi-group comparison.

   (D) UMAPs labeled with T cell activation stages, L_MP9 activity,
   glycolysis, and oxidative phosphorylation activities from Hallmark
   using GS density.

   Upon further examining the PPI network constructed from the genes in
   M_MP9 ([130]Figure S7.9), we found that M_MP9 has a central module of
   chemokine genes (CXCL10, CXCL9, CCL4, CCL2, CCRL2, CCL7, and CCL8),
   connected with antigen presenting cell (APC)-specific human leukocyte
   antigen (HLA) class II genes (HLA-DPA1, HLA-DMA, and HLA-DRA) by CD74,
   a chaperone responsible for antigen presentation.[131]^30 This further
   supports the hypothesis that M_MP9 captures the signaling cascade via
   chemokine secretion leading to the activation of APCs.

Interferon-signaling pathways

   There is one antiviral response irMP (M_MP1) from myeloid data and two
   interferon-mediated viral defense irMPs (L_MP13 and L_MP14) from
   lymphoid data. To validate these annotations, we used cytokine
   dictionary curated by Cui et al.,[132]^31 in which the authors have
   profiled scRNA-seq on 17 different immune cell types extracted from
   lymph nodes (LNs) of mice treated with 86 different cytokines in vivo
   or phosphate buffered saline (PBS) as control. As these gene sets are
   associated with interferon signaling, they should be induced upon
   stimulation by interferon rather than other cytokines. For each
   cytokine treatment in each cell-type, we performed differential
   expression relative to corresponding PBS controls followed by GSEA.
   Across cell-types, we observed that all three irMPs show a general
   trend of significantly higher enrichment under interferon treatments
   (IFN-α1, IFN-β, IFN-ε, IFN-κ, IFN-γ, and IFN-λ2) relative to other
   cytokine treatments ([133]Figure 2B), indicating that these three irMPs
   are all signaling pathways downstream of interferon stimulation.

Cell adhesion/cell migration

   There are two cell adhesion (M_MP4 and M_MP6) and two chemotaxis
   related (M_MP5 and M_MP7) irMPs from myeloid data, and two cell
   adhesion irMPs (L_MP15 and L_MP16) from lymphoid data. Cell adhesion
   and migration are essential steps in the immune response, as they
   facilitate downstream immune cell trafficking, activation, and effector
   function.[134]^32 In general, the three steps in cell adhesion are
   rolling, weak, and strong adhesion.[135]^33 Rolling involves selectin
   molecules (L-selectin, E-selectin, and P-selectin) loosely moving
   immune cells on endothelial surface. Weak adhesion occurs when integrin
   molecules (CD49d, CD29, CD11a, CD11b, and CD11c) weakly bind to ligands
   (ICAM) on endothelial cells, slowing down the immune cell movement.
   Strong adhesion occurs when integrin molecules firmly anchor the immune
   cells to the endothelial surface, allowing them to extravasate through
   the endothelium and into surrounding tissues. If the irMPs are
   associated with cell-adhesion and chemotaxis, their activity should be
   concordant with the expression of relevant protein markers discussed
   previously.

   To validate the functional annotation of these irMPs, we dichotomized
   cells in the PBMC CITE-seq data[136]^34 by their RNA-based MP activity
   levels and examined the abundance levels of relevant proteins (see
   [137]STAR Methods). We found that, among myeloid cells, the activity of
   M_MP4 and M_MP6 are associated with the abundance of cell adhesion
   proteins related to both rolling and weak adhesion ([138]Figure S9A).
   In particular, myeloid cells with higher M_MP4 have the highest CD62L
   abundance, suggesting that M_MP4 is driven by L-selectin, a cell
   adhesion molecule on the surfaces of leukocytes. In contrast, the
   activity levels of M_MP5 and M_MP7 that are associated with cellular
   migration are associated with lower abundance of cell adhesion marker
   proteins ([139]Figure S9A), consistent with their contrasting
   functional annotation. Among lymphoid cells, we found that L_MP15 shows
   strong association with integrins (CD49d, CD29, CD11a, CD11b, and
   CD11c), and L_MP16 with selectins (CD62P and CD62L) ([140]Figure S9B).
   These two irMPs differ from myeloid-derived cell adhesion irMPs in that
   L_MP15 and L_MP16 are enriched in HLA genes, suggesting possible cell
   adhesion through antigen presentation.

Core cellular process

   Besides highly immune-specific gene sets, there are also gene sets
   related to core cellular processes. In particular, there are three
   irMPs (L_MP1, 6, and 11) annotated as cell cycle irMPs from lymphoid
   data. To validate their functions, we used a COVID-19 single cell gene
   expression atlas,[141]^35 where author identified a cluster of
   proliferative CD8^+ effector T cells. Since proliferative lymphocytes
   undergo clonal expansion, we expect cell cycle pathways to be
   upregulated in this cluster.[142]^28 Indeed, we observed significantly
   higher gene set activities[143]^26 for all three cell-cycle irMPs in
   the proliferative lymphocyte cluster ([144]Figure 2C) than in the other
   clusters.

   There is one metabolism-related gene set (L_MP9), enriched in
   glycolysis genes (ENO1, PGK1, ALDOA, PGAM1, TPI1, STMN1, LDHA, HMMR,
   ENO2, and PPIA). To confirm its relevancy to metabolism, we utilized a
   study in which authors perform in vitro stimulation of naive CD8^+
   T cells and characterized metabolic programs at different activation
   stages,[145]^36 showing activation of T cells is accompanied with
   increased metabolism. We observed that the activity of L_MP9 increases
   as T cells become more activated, which also coincided with the trend
   of glycolysis and oxidative phosphorylation ([146]Figure 2D).

   Besides the aforementioned gene sets that represent shared functions in
   both myeloid and lymphoid compartments, we have also discovered irMPs
   that are highly specific to a cell type or a contact interface between
   two immune cells.

Leukocyte activation

   Two of the lymphoid irMPs (L_MP12 and L_MP19) were annotated as
   lymphocyte activation and T cell activation, respectively. The
   activation of these pathways should coincide with high expression of
   protein markers reflective of lymphocyte activation. As COVID-19
   induces immune cell activation, we leveraged the COVID-19 PBMC CITE-seq
   data.[147]^26 We extracted natural killer (NK) and CD8 T cells based on
   SingleR[148]^37 annotation (see [149]STAR Methods) and observed that
   NK/T cells with higher L_MP12 activity have higher abundance of CD16,
   CD56, KLRG1, CD8, and CD69 ([150]Figure 3A), which are known to be
   expressed by activated T and NK cells. It is also known that CD8 can be
   expressed by cytotoxic NK cells.[151]^38 Cells with higher L_MP19
   activity have higher abundance of CD3, CD4, CD8, and CD69
   ([152]Figure 3A), which are known to be expressed by activated T cells.
   In addition, L_MP12 and L_MP19 appear to be associated with checkpoint
   proteins, TIGIT, CD152 (CTLA4), and CD279 (PD-1), which also indicate
   activation and regulation of immune homeostasis.[153]^39

Figure 3.

   [154]Figure 3
   [155]Open in a new tab

   Functional characterization for the specific irMPs

   (A) COVID-19 PBMC cells annotated as NK or CD8^+ T cells were
   dichotomized into two groups (high vs. low) based on the activity
   levels of 2 lymphoid-derived gene sets: L_MP12 and 19, estimated
   respectively from the RNA expression data. Shown in the heatmap table
   (and colored accordingly) are the average Z-transformed protein
   abundance of surface proteins: CD16, CD56, KLRG1, CD8, CD3, CD4, CD69,
   TIGIT, CD279, and CD152 in each of the respective cell groups.

   (B) Shown in violin plots are the abundance of 6 surface proteins:
   CD16, CD8, CD69, CD152, CD279, TIGIT in two dichotomized COVID-19 PBMC
   cells (annotated as NK or CD8^+ T cells) groups with respectively low
   and high L_MP17 activity levels, determined from the RNA expression
   data. In all the cases, dichotomization was performed using Gaussian
   mixture models. Activities between two groups were compared using t
   test with significance level 0.05.

   (C) Violin plot comparing two TCR-related irMPs across different T cell
   activation states using ssGSEA. Activities between two groups were
   compared using t test with significance level of 0.05.

   (D) Barcode enrichment plot of L_MP3 and L_MP5; All genes from the TCGA
   are ranked along x axis according to enrichment score (y axis). There
   is a significant upregulation in both L_MP3 and L_MP5 in samples with a
   high persistent tumor mutational burden, and the ranked position of
   each gene within a signature is shown in the x axis. Statistical
   significance was determined by a rank-based test.

Interferon-induced cytotoxicity pathway

   One lymphoid irMP (L_MP17) was annotated as IFN-gamma induced
   cytotoxicity. As COVID-19 induces cytotoxicity, we leveraged the
   COVID-19 PBMC CITE-seq data.[156]^26 Extracting NK and CD8 T cells
   based on SingleR annotation, we observed that cells with higher L_MP17
   activity have significantly higher abundance of CD8, CD16, and CD69
   (FDR adjusted p value < 0.05), which are protein markers for cytotoxic
   CD8 T cells and cytotoxic NK cells ([157]Figure 3B). In addition,
   higher abundance of TIGIT, CD152 (CTLA4), and CD279 (PD-1) in L_MP17
   high cells indicate activation, required for achieving cytotoxicity and
   regulating immune response to avoid over-activation.[158]^39 The PPI
   network for this MP suggests the following activity cascade
   ([159]Figure S7.17): Activated NK cells (NKG7 and GNLY) secrete IFNG,
   inducing the activation of granzyme-producing (GZMH, GZMB, GZMK, and
   GZMA) cytotoxic macrophages (LYZ). Moreover, downstream targets of
   leukocyte activation are also captured by L_MP17, specifically NFKBIA,
   KLF6, and nuclear receptor genes (NR4A2 and NR4A3), which work
   harmoniously to ensure T cell homeostasis and proper
   proliferation.[160]^40^,[161]^41

   To further validate the cytotoxicity nature of this irMP, we used a
   COVID-19 single cell atlas data.[162]^35 We observed that lymphoid
   cells with higher L_MP17 activities correspond to cells with higher
   cytotoxicity scores ([163]Figure S9C), calculated using GSDensity with
   the following genes: PRF1, GZMA, GZMB, GZMH, and GNLY.[164]^35 In
   addition to cytotoxicity, L_MP17 scored high in the CD8 effector T cell
   cluster, a cluster defined as highly proliferating T cells in the
   paper, suggesting that L_MP17 describes not only a cytotoxic but also a
   proliferative cell state. Indeed, immediately connected to the granzyme
   modules ([165]Figure S7.17) are TUBA4A and TUBA1C, which are known to
   be involved in forming and stabilizing microtubules in proliferating
   T cells during various stages of cell cycling, contributing to spindle
   formation during mitosis and ensuring proper chromosome segregation and
   accurate cell division.[166]^42

TCRs

   Two lymphoid irMPs (L_MP2 and L_MP10) were annotated as TCR-anchoring
   and lipid localization at TCR synapse, respectively. L_MP2 shows
   significant enrichment in antigen receptor-mediated signaling pathways
   ([167]Table S1.2). In addition, L_MP2 is enriched in genes related to
   T cell surface glycoprotein and transmembrane adapter (CD247, CD2,
   CD28, CD3D, TRAT1, CD52, and HLA-E) and genes that have key roles in
   T cell antigen receptor-linked signal transduction pathways (ICOS,
   SLAMF1, LCK, and LAT)[168]^43^,[169]^44^,[170]^45^,[171]^46
   ([172]Figure S7.2). As we expect TCR signaling to be upregulated when
   T cells are activated, we examined scRNA-seq from T cells isolated from
   bone marrow (BM) and LNs of healthy donors that were cultured in
   resting condition vs. activated by anti-CD3/anti-CD28
   antibodies.[173]^47 As expected, T cells showed higher L_MP2 and L_MP10
   activities, indicating cell-type specificity ([174]Figure 3C). Further,
   the activities of these gene sets were significantly higher in
   activated T cells indicating their relevance to TCR engagement upon
   T cell activation ([175]Figure 3C).

Antigen presentation and processing pathway

   Two of the 19 lymphoid irMPs (L_MP3 and L_MP5) are related to
   antigen-presentation mediating immunity. We named L_MP3 as
   histone-associated lipid antigen presentation and L_MP5 as antigen
   presentation. L_MP3 is enriched in chromatin organization histones
   genes with lipid antigen presentation genes connected by a V(D)J
   recombination gene RAG1. Histone genes play an important factor in
   V(D)J recombination and formation of antigen receptors on
   lymphocytes.[176]^48 The PPI network of L_MP3 ([177]Figure S7.3) shows
   a HISTONE gene module and a CD1 gene module connected by RAG1. L_MP5 is
   enriched in genes responsible for cell cycle regulation (DLGAP5, BUB1B,
   and CCNB1) and genome stability (TUBB, HMGB4, HNRNPA2B1, PTTG1, and
   HMGB1), each of which is an important element for V(D)J recombination
   fidelity[178]^49 (RAG1 and RAG2) ([179]Figure S7.5).

   To validate the antigen presentation function of L_MP3 and L_MP5, we
   used pan-cancer mutational and gene-expression data from TCGA.[180]^50
   Persistent tumor mutational burden (pTMB), defined as the number of
   single copy and multiple copy mutations,[181]^51 informs immune
   activation. Tumors with higher pTMB burdens are more likely to be
   visible to the immune system and are associated with sustained
   anti-tumor immune responses and improved response to ICB.[182]^51 These
   tumors should therefore have higher activity of antigen presentation
   pathways. We performed differential gene expression analysis (DGE)
   between TCGA samples with high vs. low pTMB, regardless of cancer type
   and stage (see [183]STAR Methods). GSEA with 19 lymphoid irMPs was
   performed. Both L_MP3 and L_MP5 are significantly upregulated among
   samples with high pTMB ([184]Figure 3D), confirming the antigen
   presentation function of these two irMPs. Notably, L_MP3 and L_MP5 are
   the only two irMPs (out of the 19) that are significantly enriched
   among high pTMB samples in the full GSEA profiles (FDR adjusted p
   value < 0.05) ([185]Figure S9D).

pMHC-TCR contact interface

   Among myeloid irMPs, M_MP2 was annotated as antigen processing.
   [186]Figure S10A depicts a typical antigen uptake by an APC, such as,
   dendritic cell (DC), in which an immature DC (iDC) transforms into a
   mature DC (mDC) upon detecting pathogen-associated molecular patterns
   (PAMPs). This activation process involves the DC recognizing PAMPs,
   leading to upregulation of major histocompatibility complex (MHC)
   molecules for antigen presentation and the co-stimulatory molecules
   CD80/CD86, crucial for T cell activation.[187]^52^,[188]^53
   Simultaneously, there is a change in cytoskeleton organization, notably
   the F-actin, to facilitate processing and presentation of the
   peptide-MHC (p-MHC) complexes on their surface for downstream T cell
   activation.

   Concordant with [189]Figure S10A, we observed that APCs (dendritic
   cells and macrophages, identified by SingleR[190]^37) with higher
   ssGSEA in M_MP2 also have significantly (p value < 0.05) higher
   abundance in MHC proteins in the COVID-19 CITE-seq data
   ([191]Figure S10A), such as HLA-F, HLA-A-B-C, and HLA-DR, suggesting
   APC activation. In addition, these cells with higher M_MP2 activity
   also have significantly (p value < 0.05) higher abundance in
   CD11a/CD18, which facilitate downstream T cell binding, and in CD86, a
   co-stimulatory molecule for T cells activation.

   Among lymphoid irMPs, L_MP8 was annotated as MHC mediated immunity.
   [192]Figure S10B depicts a typical interaction between an mDC and a
   T cell at the TCR synapse: Upon recognition of the p-MHC complex by
   TCR, the T cell upregulates CD69, indicating that the T cell has been
   successfully engaged and activated. Concordant with [193]Figure 3F, we
   observed that cells with higher M_MP2 activity also have significantly
   (p value < 0.05) higher abundance in MHC molecules, CD69, and TCRab
   from COVID-19 CITE-seq data, indicating successful TCR engagement at
   the immunological synapse.

   Among myeloid irMP, M_MP8 was annotated as MHC-II-mediated lymphocyte
   activation because it is enriched in MHC-II markers (HLA-DRB4,
   HLA-DQB1, HLA-DMA, and HLA-DMB) and TCR complex (TRAC and TRBC1),
   suggesting MHC-II dependent CD4 T cell activation (as MHC-II presents
   to helper T cells).[194]^54 To confirm if this gene set is describing
   functions at the interface between macrophages and CD4 T cells, we
   extracted macrophages and CD4 T cells based on SingleR annotation and
   observed that cells with high activity in M_MP8 have higher protein
   abundance in M2-like macrophages (CD163 and CD206), T cell activation
   (CD45RO and TCR), and MHC-II (HLA-DR) ([195]Figure S10C), suggesting
   CD4 T cell activation via MHC-II on M2-like macrophages.

Interleukin-induced Tregs

   There is one lymphoid irMP (L_MP4) annotated as interleukin induced
   Tregs because it is enriched in Treg activation markers (CTLA4, IL2RA,
   TIGIT, TNFRSF9, and IL1R2). [196]Figure S10D depicts one of the
   mechanisms for Treg survival: binding between IL2 and CD25 (IL2RA)
   promotes the growth and suppressive functions of Tregs, upregulating
   CD39[197]^55^,[198]^56, eliciting a more suppressive Treg phenotype,
   defined by upregulation of TNFRSF9 and checkpoint transcripts.[199]^56
   Using COVID-19 CITE-seq data, we extracted all Tregs based on SingleR
   annotation and found that Tregs with higher L_MP4 activity do have
   significantly (p value < 0.05) higher abundance in IL2 receptors (IL2RA
   and IL2RB), CD39, CTLA4, TIGIT, and TNFRS9 ([200]Figure S10D), which
   are all protein markers over-expressed in effector Tregs, suggesting
   possible interleukin-induced Treg activation.

irMPs redefine immune subtypes in TCGA pan-cancer data

   In TCGA pan-cancer immunity study, tumor samples were previously
   categorized into six major immune subtypes (C1: wound healing, C2:
   IFN-gamma dominant, C3: inflammatory, C4: lymphocyte depleted, C5:
   immunologically quiet, and C6: TGF-beta dominant), based on 160 immune
   signatures.[201]^57 We performed Uniform Manifold Approximation and
   Projection (UMAP) based on ssGSEA[202]^27 scores with the 28 irMPs
   ([203]Figure S11A). Since C4 and C6 are composed of less than 10% and
   2% of TCGA samples, respectively, we did not see a clear separation of
   these two clusters, but C4 is mostly distributed in the bottom left and
   C6 is mostly distributed in the top right of the UMAP
   ([204]Figure S11A). Nevertheless, the UMAP well assigned the tumor
   samples into C1+C2, C3, or C5 subtypes with significant (log rank p
   value < 0.0001) differential overall survival between C1+C2 and C3+C5
   ([205]Figures 4A and 4B), but the overall survival difference between
   C3 and C5 is insignificant (p value = 0.623), despite having
   significantly (p value < 2.2e−16) different levels of lymphocyte
   infiltration. Through k-mean clustering with k = 6 based on the same
   UMAP, we re-clustered the same set of TCGA samples into 6 new subtypes
   ([206]Figure 4C), each of which is enriched in different types of
   cancer ([207]Figure 4D).

Figure 4.

   [208]Figure 4
   [209]Open in a new tab

   irMPs redefine immune subtypes in TCGA pan-cancer data

   (A) UMAP of subsets of TCGA samples (n = 7,768) using single sample
   gene-set enrichment (ssGSEA) scores calculated from the 28 irMPs. Each
   dot is a TCGA sample colored by its immune subtypes defined by Thorsson
   et al.

   (B) Kaplan-Meier curves for overall survival between TCGA immune
   subtypes. Log rank test is performed between comparisons of interests,
   with corresponding lymphocyte infiltration scores compared on the top
   right (p value is calculated with one-way ANOVA test).

   (C) UMAP of all TCGA samples (n = 10,128) using ssGSEA scores
   calculated from the 28 irMPs. Each dot is a TCGA sample colored by
   clusters identified using k-mean clustering.

   (D) Stacked plot showing the proportions of cancer types across k-mean
   clusters.

   (E) Sankey River plot showing the shuffling of each TCGA immune subtype
   into each of the k-mean clusters.

   (F) Kaplan-Meier overall survival curves across the different k-mean
   clusters. Log rank test is performed between pairs of interests (2 vs.
   5, 2 vs. 4, 4 vs. 5). Risk table is shown below with clusters ordered
   from best survival (top) to worst survival (bottom).

   (G) Violin plot comparing tumor intrinsic characteristics between
   cluster 2 and the rest of the clusters by scoring programs with ssGSEA.
   Activities between two groups were compared using t test with
   significance level 0.05.

   (H) Violin plot comparing the gene expression levels of 3 classical
   activation and exhaustion markers across the k-mean clusters. Activites
   across multiple groups were compared using anova with significance
   level 0.05.

   Upon re-clustering, C3 and C5 are resolved into new clusters 2, 4, and
   5 ([210]Figure 4E) that well delineate patients by their overall
   survival ([211]Figure 4F), with cluster 2 having the best overall
   survival. To further study the molecular characteristics of cluster 2,
   we assessed selected TCGA signature scores on tumor
   characteristics[212]^57 (see [213]STAR Methods) across our clusters and
   found that cluster 2 has significantly (p value < 0.05) lower scores,
   compared to other k-mean clusters in terms of aneuploidy score,
   neoantigen loads, intratumor heterogeneity, and hallmark pathways
   indicative of proliferation (mitotic spindle, G2M checkpoint, and MYC
   targets V1) ([214]Figure 4G), suggesting that cluster 2 contains more
   indolent tumors, corroborated with the observed low cell proliferation.
   In addition, cluster 2 has significantly lower expression in cytotoxic
   markers (GZMK, PRF1, and CD8A) and exhaustion markers (CTLA4, PDCD1,
   and TIGIT) ([215]Figure 4H), suggesting less immunogenic environment.
   Cluster 2 is also enriched in lower grade glioma, prostate
   adenocarcinoma, and thyroid carcinoma ([216]Figure 4D), which are often
   lower grade cancers with indolent to slower growth[217]^58 and are less
   likely to invade nearby tissues or metastasize to distant organs. These
   tumors generally have better prognosis[218]^59 compared to higher-grade
   cancers. Therefore, the superior survival of cluster 2 is mainly driven
   by its simple tumor composition, low proliferation, and low
   immunogenicity, showing that our irMPs can also discern tumor intrinsic
   differences via their associated immunological states. Clusters 4 and 5
   are both characterized by low tumor proliferation and high tumor
   development ([219]Figure S11B). In term of immune landscape, cluster 4
   has significantly (p value < 0.05) higher immune infiltration (Th1
   cells and Gamma-Delta T cells) than clusters 2 or 5 and cluster 5 has
   significantly (p value < 0.05) higher level of activated NK cells
   compared to clusters 2 or 4 ([220]Figures S11B and S11C).

   The k-mean algorithm also redistributed C1 and C2 into clusters 1, 3,
   and 6 ([221]Figure 4E). Specifically, we identified a transitional
   subtype between C1 and C2 (cluster 3 in
   [222]Figure 4C)[223]^30^,[224]^31 characterized by in-between
   expression levels of characteristic IFNG-subtype (C2) markers (IFNG,
   JAK1, and STAT1)[225]^60 and classical wound-healing (C1) markers
   (S100A1, E2F1, and APC)[226]^61 ([227]Figure S11D). Cluster 3 has
   significantly (p value < 0.05) worse overall survival compared to
   clusters 1 and 6, respectively ([228]Figure S11E), and it is
   characterized by high tumor proliferation, significantly (p
   value < 0.05) higher tumor glycolysis and oxidative phosphorylation
   than cluster 6 ([229]Figure S11B). Clusters 1 and 6 are both
   characterized by high tumor proliferation ([230]Figure S11B). In term
   of immune landscape, cluster 1 has significantly (p value < 0.05)
   higher infiltration of CD8 T cells and M1 macrophages
   ([231]Figure S11C) than the other clusters, suggesting high level of
   immune activation, which is related to the observed high expression of
   exhaustion markers[232]^62 ([233]Figure S11F). Cluster 6 is
   characterized by significantly (p value < 0.05) lower hallmark immune
   pathway scores than other clusters ([234]Figure S11B).

   In summary, our irGSs redefine TCGA subtypes with significantly
   distinct survival patterns, unveiling not only the intrinsic traits of
   tumors such as their proliferation and metabolic profiles, but also the
   dynamic immune activities within the tumor microenvironment. Compared
   with the immune archetypes defined by Combes et al.,[235]^63 we are
   defining differential functional states rather than cell type
   compositions. [236]Figures S11B and S11C contain detailed information
   on hallmark pathway scores and the relative abundance of immune cell
   infiltration for each cluster.

irMPs better distinguish ICB response

   We investigated if the irMP expression at baseline could be used to
   differentiate ICB responders and non-responders. We utilized four ICB
   melanoma cohorts,[237]^64^,[238]^65^,[239]^66^,[240]^67 details in
   [241]Table 2. To account for differences in sequencing depth and
   coverage, we calculated the fragments per kilobase of transcript per
   million mapped reads (FPKM) for each of the four RNA-seq data and
   obtained 104 RNA samples at baseline. We calculated ssGSEA score for
   each RNA expression sample in the integrated ICB cohort using (1) 50
   Hallmark pathways,[242]^9 (2) 29 immune-relevant KEGG (irKEGG)
   pathways,[243]^10 (3) lymphoid irMPs (L_MPs), (4) myeloid irMPs
   (M_MPs), or (5) lymphoid-myeloid combined irMPs. We randomly divided
   104 baseline samples into 70% training (n[train] = 73) and 30% testing
   (n[test] = 21). To avoid over-fitting, we fitted five generalized
   linear models (GLM) with LASSO regularization adjusting for the
   aforementioned gene set variables. Using the best model selected from
   10-fold cross-validation, we fitted the model on the test data and
   evaluated model performance by comparing classification accuracy, which
   is the average of sensitivity and specificity. To deal with randomness
   in data splitting, we performed the procedure 1,000 times and computed
   the range for classification accuracy from each of the five models (see
   [244]STAR Methods and [245]Figure 5A). Our gene sets resulted in the
   highest mean accuracy of 0.71, followed by Hallmark (0.69) and irKEGG
   (0.62), with most of the gene sets being negatively associated with
   response ([246]Figure S12A). The classification accuracy is comparable
   (p value = 0.13) between Hallmark and combined irMPs ([247]Figure 5A).
   However, Hallmark contains 50 gene sets with relatively large (146 ±
   67) gene set sizes, whereas the combined irMPs has only 28 gene sets
   with 50 genes each, possessing comparable predictive power using fewer
   parameters. We also compared the classification accuracies with the
   18-gene tumor inflammation signature (TIS),[248]^68 which has been
   approved for use as an investigational use-only (IUO) criteria to
   stratify patients based on potential to respond to ICB, and the
   classification accuracy with TIS only on the test data is 52.83%.

Table 2.

   Number of samples contained in each of the melanoma ICB cohort
   GEO ID Disease Treatment Pre (n) Post (n) Responder (n) Non-responder
   (n)
   91061 Metastatic melanoma Anti PD-1 51 58 60 49
   115821 Metastatic melanoma Anti PD-1 9 18 1 26
   78220 Metastatic melanoma Anti PD-1 25 0 13 12
   145996 Metastatic melanoma Anti PD-1 14 0 9 5
   Total (n) 99 76 83 92
   [249]Open in a new tab

Figure 5.

   [250]Figure 5
   [251]Open in a new tab

   irMPs better separate ICB responses

   (A) Distribution of ICB classification accuracy in each of the five
   model across 1,000 iterations. Distribution of accuracies was compared
   using t test with significance level 0.05.

   (B) The average fitted coefficients for top 10 selected irMPs across
   1,000 iterations.

   (C) UMAP derived from 10 most selected irMPs activities (left), 50
   Hallmarks (middle), and 29 irKEGGs (right) in lung (top), melanoma
   (middle), and liver (bottom) cancer. Color represents different
   responses.

   (D and E) Baseline cell-cell communications for responders (D) and
   non-responders (E) inferred from cell chat. Nodes are the 11 cell types
   annotated in the original single cell data, edges represent significant
   interactions (adjusted p value < 0.05) and the thickness of the edges
   represents the probability of interactions.

   To corroborate if our gene sets can better separate ICB response with
   baseline gene expression only, we extracted the top 10 most selected
   irMPs across the 1,000 LASSO models ([252]Figure 5B) and calculated
   their activities at baseline using BulkRNAseq profiled from lung
   cancer,[253]^69 melanoma,[254]^70 and liver cancer.[255]^71 We again
   observed that these 10 irMPs can better separate ICB response in
   comparison to irKEGGs or Hallmarks ([256]Figure 5C). These data
   indicate that the irMPs can be useful to develop tumor type agnostic
   general prognostic signatures to predict ICB response.

   To understand what these irMPs explain, we calculated the ssGSEA for
   exhaustion, cytotoxicity, and tissue resident memory (T[RM]) using
   their respective marker gene expression (full marker list detailed in
   [257]STAR Methods). Computing the correlation between irMPs activity
   and these signature expressions at baseline ([258]Figure S12B), we
   observed significant (FDR adjusted p value < 0.05) positive correlation
   between most of the irMPs and exhaustion, cytotoxicity, and T[RM],
   suggesting that at baseline non-responders show phenotype of activation
   induced exhaustion, a result of prior antigen stimulation. To confirm
   our observation, we deconvoluted each of the bulk RNA sample into
   exhausted (T[EX]), tissue resident memory (T[RM]), tissue effector
   memory (T[EM]), naive (T[N]), and effector T cells (T[EFF]) using
   CibersortX[259]^72 and pan-cancer T cell signature matrix[260]^73 as
   reference (see [261]STAR Methods). Indeed, non-responders show higher
   T[EX] and T[RM] and lower T[EM] and T[N] at baseline
   ([262]Figure S12C). However, when we fit the model with only
   exhaustion, cytotoxicity and T[RM] score, the classification accuracy
   is only 54%, suggesting that irMPs have captured addition factors that
   dictate response to ICB in addition abundance of T cell states.

   As cellular communication plays an important role in orchestrating the
   immune response,[263]^74^,[264]^75 we hypothesize that irMPs have also
   captured differential cellular interactions between responders and
   non-responders. Utilizing an independent scRNA-seq dataset[265]^76 from
   baseline melanoma patients later treated with anti-PD1, we converted
   the data into pseudo-bulk expression (see [266]STAR Methods),
   calculated activity for selected irMPs (top 10 most frequently selected
   irMPs from 1,000 LASSO regressions in the previous analysis), Hallmarks
   and irKEGGs pathways, and projected the 12 samples onto a UMAP. Once
   again, we observed that irMPs more effectively separated out responders
   and non-responders ([267]Figure S12D) and these 10 irMPs are also
   associated with exhaustion, cytotoxicity and T[RM] ([268]Figure S12E).
   We then performed CellChat[269]^77 analysis (see [270]STAR Methods) to
   infer cell-cell interactions based on ligand-receptor expression and
   observed that responders are enriched in interactions surrounding
   memory T cells, while non-responders are enriched in interactions
   surrounding exhausted T cells at baseline ([271]Figure 5D). To offer
   additional evidence of these differential interactions, we used
   scrublet[272]^78 to identify doublets and double-annotated the doublets
   (see [273]STAR Methods). Although doublets in droplet-based scRNA-seq
   data may result from multiple sources, they have been shown to reflect
   true physical interactions between cell types.[274]^79 Responders were
   characterized by a limited repertoire of doublet phenotypes with a
   marked enrichment for doublets with memory T cells. In contrast,
   non-responders presented with a more varied doublet repertoire with
   frequent doublets involving various exhausted lymphocytes
   ([275]Figure S12F). These trends mirror results of the cell-cell
   communication analysis ([276]Figures 5D and 5E). To confirm whether our
   irMPs have captured these interactions, we calculated the activities of
   the 10 important irMPs in each type of doublets at baseline. We
   observed that the top 10 selected irMPs all have high activities in
   doublets that are enriched in non-responders at baseline
   ([277]Figure S13). Taken together, the irMPs may capture interactions
   between exhausted lymphocytes and other cell types at baseline, which
   have been associated with poor response in anti-PD1
   immunotherapy.[278]^80^,[279]^81^,[280]^82

irMPs granularly segment spatial transcriptomics data

   irMPs can potentially enhance segmentation and phenotyping of spatially
   resolved transcriptomic (SRT) data. For illustration, we obtained 10×
   Visium data from an FFPE human breast tissue section.[281]^83 The H&E
   slide from the tissue section was segmented by pathologists into major
   cell types ([282]Figure 6A). We calculated the activity scores of the
   irMPs from the SRT sample and observed that irMPs consistently isolate
   out the tumor regions (dark blue), aligning with pathologist’s
   annotation ([283]Figures 6B and [284]S14). Besides identifying tumor
   regions, our irMPs can also be characterized by the infiltrated
   immunological patterns on the H&E image. We used CellTrek[285]^84 to
   project single cells from a reference breast cancer patient profiled by
   scRNA-seq in Tumor Immune Single-cell Hub 2 (TISCH2)[286]^85 onto the
   spatial spots[287]^84 (see [288]STAR Methods).[289]^83 We then
   visualized the average irMP activity score for each projected cell type
   in [290]Figure 6C using a clustered dot plot and observed that
   clustering of irMPs was largely driven by lineages (3 clusters from top
   to bottom): myeloid, lymphoid, and cycling. On one hand, some irMPs are
   strongly lineage specific. For example, the “TCR anchoring” score is
   prominent among T cells, nearly absent in other cell types
   ([291]Figure 6C). On the other hand, some irMPs also provide functional
   insight across the TIME, not limited to a cell type. For example,
   “IFN-gamma induced cytotoxicity” exhibits the highest score in T cells
   and a notable score in macrophages ([292]Figure 6C), suggesting a
   co-enforcement feedback loop between activated T/NK cells and cytotoxic
   macrophages mediated by potent interferon gamma, a hub gene in the PPI
   network ([293]Figure S7.17), which aligns with the annotation of this
   irMP. Benchmarking with immune-related KEGG pathways and Hallmarks, we
   observed that Hallmark pathways are mostly specific to cancer and
   endothelial regions ([294]Figure S15A) and irKEGG pathways are
   myeloid-specific ([295]Figure S15B). To examine the generalizability of
   our results, we repeated the aforementioned experiments on an FFPE
   human ovarian tumor tissue and an FFPE human intestine tumor tissue
   using ovarian cancer patients[296]^86 and colorectal cancer[297]^87
   patients as single cell references, respectively. We obtained coherent