Abstract

The immune system is a highly complex and dynamic biological system. It operates through intracellular molecular networks and intercellular (cell–cell) interaction networks. Systems immunology is an emerging discipline that applies systems biology approaches, integrating high-throughput multi-omics measurements with computational network modeling, to better understand immunity at various scales. In this review, we summarize key omics technologies and computational approaches used for immunological studies at both population and single-cell levels. We highlight hidden driver analysis based on data-driven networks and comment on the potential of translating systems immunology discoveries to immunotherapy of cancer and other human diseases.

Keywords: Systems immunology, scRNA-seq, gene regulatory network, cell–cell communication, hidden driver analysis

Introduction

The immune system, one of the most complex and dynamic biological systems in mammals, is composed of diverse cell types with varying functional states. Of the two major arms of the immune system, the innate immune system, composed of macrophages, dendritic cells, neutrophils and other cells, serves as the first line of defense by mounting immediate and potent immune and inflammatory responses against invading pathogens and other immunological insults. Cells in the adaptive immune system, including T and B cells, have a more specialized role in immune reactions and are characterized by antigen specificity and long-term memory. These different elements interact as an integrative system to give rise to proper immune responses and regulation, and play crucial roles in protecting host health against viruses, bacteria, parasites and tumors. Dysfunctions in the immune network may lead to autoimmune, malignant, and inflammatory diseases.
Characterizing these diverse cell types, their unique molecular features, and their interactions is key to successfully manipulating the immune system for therapeutic applications. Advances in high-throughput profiling technologies, particularly the emerging single-cell omics platforms, enable comprehensive characterization of the immune components at multiple scales. However, immunity is not merely a sum of its components, and its behavior cannot be explained or predicted solely by examining individual components. Therefore, systems biology approaches are essential for decoding the cellular complexity, plasticity, and functional diversity of the immune system. This need has given rise to the emerging field of systems immunology, which seeks to understand how the immune system works as a whole in health and disease.

Gene regulatory networks function as the crucial molecular determinants of cell fate and state by governing gene expression programming and reprogramming in immune development and homeostasis [1]. Signaling and epigenetic factors are also crucial drivers of immunological functions and are likely druggable, making them promising therapeutic targets. However, many of these drivers are difficult to identify (hence known as “hidden drivers”), because they may not be genetically altered or differentially expressed at the mRNA or protein levels but, rather, are altered by posttranslational modifications (PTMs; e.g., phosphorylation) or other mechanisms. Moreover, immune responses are mediated by both intracellular gene networks and crosstalk between many types of immune cells in specific tissues and microenvironmental contexts, and dysregulation of either can lead to diseases, including cancer and inflammatory disorders.
Therefore, molecular and cellular networks and their drivers, including “hidden” drivers that cannot be easily detected by conventional approaches, must be systematically dissected to develop effective and curative immunotherapies for diseases such as cancer [2].

Omics technologies for immunology research

Technological advances in high-throughput and high-bandwidth profiling, phenotyping and perturbation assays have contributed to rapid advances in systems immunology. A variety of omics technologies at both population and single-cell levels have played important roles in improving our understanding of the immune system (Figure 1). Each technology has its advantages and limitations, and understanding these factors is essential to devise effective and reliable systems approaches that address immunological questions at the appropriate resolution. In this section, we discuss these technologies by summarizing the essential aspects of their proper use, with example applications and certain limitations.

Figure 1. Overview of the omics profiling technologies used to characterize the human and mouse immune systems at population and single-cell levels.

Population level

Transcriptome profiling by microarray or RNA-based next-generation sequencing (RNA-seq) is the most widely used omics method in immunology research. Transcriptome analysis has provided instrumental insights into the mechanisms of immune system development and homeostasis under steady state, and into transcriptional dynamics during the immune response to antigens or pathogens, including the identification of diverse immune cell types and functional states [3].
As the cost of sequencing decreases, RNA-seq, particularly bulk RNA-seq, has become the more prevalent technology for gene expression profiling, with several advantages over microarray technology: high coverage and sensitivity (detecting low-abundance transcripts); detection of splicing events, gene fusions, and small RNAs; low background noise and batch effects; and the ability to handle low RNA input (down to 10 pg). Community-driven and publicly available databases of gene expression profiles, such as Gene Expression Omnibus (GEO), have enabled data mining across platforms, studies, and species. A few curated immune-specific databases with analysis and visualization tools have provided valuable resources for immunology researchers, including ImmGen [4], ImmPort [5], ImmuneSpace [6], and 10K Immunomes [7].

The expression levels of mRNA and protein can differ substantially for many genes [8], especially during dynamic transitional states when there is a temporal delay between transcription and translation [9]. Moreover, posttranslational modifications, such as phosphorylation, are crucial regulators of protein functions and signaling, but are poorly correlated with mRNA or total protein expression. With the recent advances in mass spectrometry (MS) analytical technologies [10], in-depth proteomic profiling can now identify more than 10,000 proteins (whole proteomics) and 30,000 phosphopeptides (phosphoproteomics) across multiple samples simultaneously [11**,12,13]. Tandem mass tagging (TMT) [14] and label-free quantitation (LFQ) are two common proteomic methods to quantify the differential abundance of expressed proteins, with the TMT method recently shown to have higher precision and coverage than the LFQ method [15,16].
Despite the challenge of covering the entire proteome and PTM landscape, current MS-based proteomic technologies are capable of providing comprehensive characterizations of proteome dynamics and biological insights into gene regulation and signaling circuits in immunology, such as T-cell activation [11**,17] and host-pathogen interaction [18]. Proteomics by affinity purification-mass spectrometry (AP-MS) is also commonly used to identify protein-protein interactions (PPIs) [19**] that help dissect the molecular mechanisms of crucial immunological modulators, for example, Mst1 signaling in regulatory T cells [20]. Recently, advanced MS-based platforms have been developed to profile and explore the metabolome that may shape the functions of immune cells (e.g., metabolomics [21] and lipidomics [22,23]).

DNA-based next-generation sequencing (NGS) has revolutionized many fields in biology, including immunology [24**]. Whole-genome or -exome sequencing and targeted DNA sequencing are now routinely used to identify somatic genetic alterations associated with cancer and other diseases, spurring the advent of precision medicine [25]. In basic immunology research, NGS is commonly used to dissect protein-DNA interactions (ChIP-seq) [26], protein-RNA interactions (CLIP-seq) [27], DNA methylation (Bisulfite-seq) [28], chromosomal interactions (Hi-C) [29], and chromatin accessibility (ATAC-seq) [30].

The revolutionary CRISPR/Cas-based genome engineering technologies enable the use of genome-wide functional perturbation screening [31] to systematically interrogate novel players and circuits that regulate or modulate immune development, homeostasis and response [32,33**]. CRISPR and conventional RNAi screens perform comparably for identifying essential genes [34].
Novel CRISPRi/a technologies provide a complementary but superior approach to RNAi by repressing or activating gene expression at the transcriptional level, whereas RNAi represses gene expression at the mRNA level [35].

Single-cell level

The immune system encompasses various cell types and functional states. Population- or bulk-based profiling, which averages results from thousands of cells of distinct types, presents an inherent heterogeneity problem for data analysis and interpretation. However, the advent of single-cell technologies to profile the transcriptome (scRNA-seq) [36], proteome (mass cytometry or CyTOF, NanoLC-MS) [37,38], genome [39], and chromatin accessibility or epigenome (scATAC-seq, scChIP-seq, scBS-seq, scHi-C) [40,41,42,43,44] has provided an unprecedented opportunity to overcome this challenge by simultaneously quantifying molecular features at single-cell resolution. Indeed, single-cell technology was recognized as the breakthrough of the year for 2018. In immunology research, scRNA-seq [45**] and mass cytometry [37] are widely adopted.

In the last few years, advances in cell suspension, automation, and microfluidics technologies, together with the implementation of unique molecular identifiers, have boosted the scRNA-seq field by improving the throughput (the number of cells), sensitivity (the number of uniquely detected genes), precision (level of noise), and reproducibility [46**]. The scRNA-seq technology has been widely used in immunology to reveal immune cell heterogeneity and dynamics in healthy and malignant conditions [47*,48*,49*]. Significant efforts have been invested to profile the entire human and mouse cell atlas [50,51,52].
Because of their high cell throughput, droplet-based scRNA-seq platforms, including 10X Genomics Chromium [53], inDrop [54], and Drop-seq [55], are becoming more popular than FACS- or plate-based protocols for immunology studies [56]. However, plate-based methods have no sequencing bias toward the 5′ or 3′ end of transcript tags and capture more molecules than droplet-based platforms [57]. The combined use of both platforms can provide more comprehensive and in-depth information [58]. Imaging-based, single-molecule fluorescence in situ hybridization (smFISH) [59,60] is another powerful, emerging technology for high-throughput single-cell transcriptomics that integrates additional spatial information, but it has yet to be applied to the immune system.

Flow cytometry uses fluorescent antibodies to simultaneously profile multiple proteins per cell and has been the mainstay of immune phenotyping. Mass cytometry overcomes the limitation associated with the spectral overlap of fluorophores in flow cytometry by using metal-conjugated antibodies, which increases the number of parameters that can be measured per cell [37]. It has enabled the identification and characterization of a variety of immune cell types and states in the mammalian immune system, with emerging applications in the clinic [61]. However, this technology is limited to a small number of pre-defined parameters (e.g., surface markers), and the profiling of these parameters depends on the availability of protein-specific antibodies. More recently, a multiplexed immunofluorescence method has been developed to obtain 40 protein readouts of thousands of cells in situ [62], which may also be adopted in immunology.

To understand cellular behaviors in depth, strategies have emerged that integrate multiple single-cell omics technologies, or combine them with population-based profiling, to simultaneously profile various dimensions of biological information from the same cell [63].
For instance, recent studies have combined profiles of single-cell and bulk transcriptomes [64]; transcriptomes and chromatin states [65,66*]; transcriptomes and protein epitopes [67,68,69]; transcriptomes obtained by scRNA-seq and those obtained by smFISH [70*]; epigenomes and protein epitopes [71]; transcriptomes and functional genomes [72,73,74]; and genome, transcriptome, and methylome data [75]. Application of these cutting-edge integrative technologies to immunological questions will likely provide new insights into the immune system.

Computational approaches for systems immunology

Multi-omics technologies providing population– and single-cell–level information give rise to remarkably rich and complex datasets with which to tackle immunological questions. However, interpretation and integration of such “big” data remain a challenge and a barrier to broad implementation of systems approaches in immunological studies. In this section, we review common computational algorithms and strategies for in-depth analysis and integration of multi-omics data in systems immunology (Figure 2). We start with immune cell deconvolution to identify proportions of cells within heterogeneous populations, and then discuss various systems biology strategies to dissect the molecular pathways or features associated with immune cell identity, function and response, with an emphasis on hidden driver analysis based on data-driven networks to decode regulatory mechanisms of the immune system.

Figure 2. Overview of common computational analyses and algorithms in systems immunology.

Deconvolution of the immune cellular heterogeneity

One of the most frequent analyses in immunology is immune-cell phenotyping because extensive cellular heterogeneity underpins the functional diversity of the immune system.
For bulk microarray or RNA-seq gene expression profiles, linear regression-based deconvolution algorithms [76,77,78] have been developed to predict the frequency of diverse cell subsets based on predefined signatures. However, these approaches rely on prior knowledge of existing immune cell types. In contrast, the widespread adoption of single-cell profiling enables unbiased identification of known and unknown subsets of immune cells. Several algorithms for clustering analysis, cell-type identification, and visualization from single-cell transcriptomics data have emerged. For instance, SC3 employs a consensus k-means clustering method with a combination of various distance metrics and initial conditions, which improves the accuracy and robustness of clustering in comparison with previous approaches [79]. For more detailed discussion of clustering algorithms for scRNA-seq, we refer readers to other comprehensive reviews [80**,81]. However, more advanced and efficient algorithms for scRNA-seq analysis are still needed to capture the nonlinear cell–cell correlations, to reduce noise from the “dropout” effects, and to handle datasets with millions of cells.

Gene signature and pathway enrichment analysis

Genome-wide transcriptomic and proteomic profiles of immune cells following treatments, stimulations or genetic perturbations provide valuable insights into molecular signatures and pathways that define cell identity, gene regulation, and immune responses. Differential gene expression analysis is the mainstream strategy to define a gene signature, followed by functional or pathway enrichment by hypergeometric test or gene set enrichment analysis (GSEA)-type approaches [82,83,84,85]. However, signature analysis may be limited by poor correlation between different studies, as signature genes derived from independent experiments may not be entirely consistent.
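As a worked illustration of the hypergeometric enrichment test mentioned above, the following minimal Python sketch computes the probability of observing at least a given overlap between a signature and a pathway gene set by chance, i.e. P(X ≥ k) = Σ_{i≥k} C(K,i)·C(N−K,n−i)/C(N,n). The function name, argument names, and example counts are hypothetical and chosen only for illustration; multiple-testing correction, which enrichment tools normally apply across pathways, is omitted.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_signature, n_overlap):
    """Upper-tail hypergeometric p-value.

    Probability of seeing at least `n_overlap` pathway genes when
    `n_signature` genes are drawn without replacement from a background
    of `n_background` genes, `n_pathway` of which belong to the pathway.
    """
    max_overlap = min(n_pathway, n_signature)
    total = comb(n_background, n_signature)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_signature - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, a 100-gene pathway,
# a 200-gene signature, and 10 genes shared between the two.
p_value = hypergeom_enrichment_p(20000, 100, 200, 10)
```

Exact integer arithmetic via `math.comb` keeps the sketch dependency-free; for genome-scale screens over many pathways, a statistics library would usually be the more practical choice.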
Additionally, the pathway databases may lack context-specific information and are limited by incomplete or inaccurate prior knowledge.

Immune cell deconvolution gives the proportion of heterogeneous cell types, while functional enrichment analysis defines the molecular features in each cell type. A combination of these two approaches facilitates downstream analysis of intracellular and intercellular interactions.

Intracellular gene network inference

The availability of large-scale profiling platforms enables the study of relationships among the molecular elements (i.e., intracellular gene networks) in the immune system. Most network reconstruction methods are based on gene expression profiles of perturbation experiments (e.g., gene silencing, deletion or overexpression), as previously reviewed [86,87]. Here, we highlight two common network inference strategies that use baseline transcriptomic data. One is co-expression network analysis by WGCNA [88], based on Pearson or Spearman correlations. However, co-expression networks usually contain a large number of redundant interactions that lack biological relationships. To overcome this problem, ARACNe [89] uses mutual information to capture nonlinear gene–gene relationships and applies the data processing inequality to remove redundant edges. It has been widely used to infer transcription factor (TF) regulatory networks from gene expression data. Recently, SJARACNe [90] was developed to scale up and extend ARACNe to infer both TF regulatory and signaling networks from large-input datasets, including scRNA-seq data. For example, SJARACNe was used to reverse-engineer the signaling interactome of dendritic cells (DCs), leading to novel molecular insights into the functions of DC subsets [91**]. For scRNA-seq data, SCENIC utilizes TF motif databases to reconstruct regulatory networks, an approach that improves clustering and reduces batch effects [92].
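The two ideas attributed to ARACNe above, mutual information as the edge score and the data processing inequality (DPI) as the pruning rule, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the ARACNe or SJARACNe implementation: it uses a plain equal-frequency binning estimator, omits the statistical significance testing and bootstrapping those tools rely on, and all function names, the threshold parameter, and the three-gene toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def discretize(values, n_bins=3):
    """Equal-frequency binning: assign each value a bin index by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def mutual_information(x, y, n_bins=3):
    """Plug-in mutual information estimate (in nats) from binned data."""
    bx, by = discretize(x, n_bins), discretize(y, n_bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(
        (c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items()
    )


def infer_network(expr, mi_threshold=0.0, n_bins=3):
    """Keep gene pairs above an MI threshold, then apply the DPI:
    in every fully connected gene triplet, drop the weakest edge."""
    genes = sorted(expr)
    mi = {
        frozenset(pair): mutual_information(expr[pair[0]], expr[pair[1]], n_bins)
        for pair in combinations(genes, 2)
    }
    edges = {edge for edge, value in mi.items() if value > mi_threshold}
    removed = set()
    for trio in combinations(genes, 3):
        tri_edges = [frozenset(pair) for pair in combinations(trio, 2)]
        if all(edge in edges for edge in tri_edges):
            removed.add(min(tri_edges, key=mi.get))
    return edges - removed


# Toy chain A -> B -> C: C resembles B more closely than it resembles A,
# so the indirect A-C edge is the weakest in the triplet and is pruned.
expr = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "B": [1, 2, 4, 3, 5, 6, 7, 8, 9],
    "C": [1, 2, 4, 3, 5, 7, 6, 8, 9],
}
network = infer_network(expr)
```

In this toy example the direct edges A–B and B–C survive while the indirect A–C edge is removed, which is the behavior the DPI is meant to produce for a regulatory chain.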
Other modeling approaches, including Bayesian networks, Boolean networks, and diffusion or differential equation–based network approaches, are used for inference of small-sized networks [93]. For example, a Bayesian network was used to identify causal relationships between molecular and clinical features of Alzheimer’s disease [94]. However, it remains challenging to scale up these approaches for genome-wide networks, because of the large number of parameters and the limited number of samples [95].

To complement and improve networks predicted in silico, experimental approaches are also used to directly infer subnetworks of proteins of interest (e.g., TF regulatory networks by ChIP-seq [96], post-transcriptional networks by CLIP-seq [27], PPIs by AP-MS [19], enzyme-substrate networks by PTM-enriched proteomics [97], and metabolic networks by metabolomics [21]). However, these networks are limited to selected proteins and lack generalizability.

Network-based integrative analysis

Integration of multi-tier omics data increases the sensitivity and reliability of discoveries in the immune and other complex biological systems by aggregating information across multiple layers to increase the signal-to-noise ratio [93,98]. This approach is particularly important in understanding immune system function, given the high complexity of cellular components and molecular circuits in the immune system. However, different omics platforms have distinct features and dimensions, making the meta-analysis challenging. The most popular strategy is to superimpose co-expression or regulatory networks, constructed from transcriptomes and/or knowledge-based network databases (e.g., MSigDB, PPI, TF-target, kinase-substrate), on various omics data to identify network modules that control immune cell development and response [99].
For example, this strategy has been applied to integrate temporal transcriptome, proteome, and phosphoproteome data, leading to the identification of novel signaling circuits and bioenergetics pathways that mediate T-cell quiescence exit [11]. Additionally, PARADIGM [100] integrates genomic and transcriptomic alterations to identify dysregulated pathways. NetGestalt [101] defines the hierarchical architecture in the network of omics data clustering. CellNet [102] utilizes co-expression networks to determine the cell identity and master regulators of cell types/states. A PageRank-based analysis combines ATAC-seq and transcriptomic datasets to identify master regulators of T-cell residency in non-lymphoid tissues and tumors [103**]. Both VIPER [104] and NetBID [91**] use ARACNe/SJARACNe-derived regulatory networks to infer protein activities in individual samples and master regulators associated with phenotypes. While VIPER is focused primarily on gene expression data, NetBID uses a distinct activity inference algorithm and a Bayesian framework to integrate multiple omics data.

Hidden driver analysis based on data-driven and context-specific networks

In addition to transcription factors, which are the focus of most network-based algorithms, signaling and epigenetic factors are also crucial drivers of immunological functions. However, many of these factors are hidden drivers, because their activities are associated with PTMs but not with genetic alterations or expression abundance. Direct measurement of protein activities by PTM proteomics is technically challenging. Here, we highlight NetBID [91] (Figure 3), a recently developed algorithm to identify hidden drivers from multi-omics data by using data-driven networks and Bayesian inference.
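The notion of a hidden driver can be made concrete with a toy calculation: a regulator whose own expression is unchanged between two conditions, but whose network targets shift coherently. The Python sketch below scores a driver by the average signed change of its targets. This is a deliberately simplified stand-in and does not reproduce NetBID's activity inference algorithm or its Bayesian integration; the gene names (`K`, `T1` to `T3`), the target signs, and the expression values are all hypothetical.

```python
from statistics import mean


def log_fold_changes(case, control):
    """Per-gene difference of mean log-expression (case minus control)."""
    return {gene: mean(case[gene]) - mean(control[gene]) for gene in case}


def driver_activity(lfc, targets):
    """Average signed change of a driver's network targets.

    `targets` maps each target gene to +1 (activated by the driver)
    or -1 (repressed by the driver).
    """
    return mean(sign * lfc[gene] for gene, sign in targets.items())


# Hypothetical log-expression values for a kinase K and three targets.
case = {"K": [5.0, 5.1], "T1": [7.0, 7.2], "T2": [6.5, 6.7], "T3": [3.0, 3.2]}
control = {"K": [5.1, 5.0], "T1": [5.0, 5.2], "T2": [4.5, 4.7], "T3": [5.0, 5.2]}

# Assumed subnetwork of K: T1 and T2 are activated, T3 is repressed.
k_targets = {"T1": +1, "T2": +1, "T3": -1}

lfc = log_fold_changes(case, control)
k_expression_change = lfc["K"]                # essentially zero
k_activity = driver_activity(lfc, k_targets)  # clearly positive
```

Here `K` would be missed by differential expression analysis, yet the coordinated behavior of its targets points to increased activity, which is the pattern network-based hidden driver analysis is designed to capture.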
In our study of DC subset functions [91], NetBID superimposed a DC-specific signaling interactome, which was computationally reconstructed from a set of transcriptomic profiles of total DCs, onto multi-layer omics datasets (transcriptome, proteome, phosphoproteome) to infer activities of signaling proteins in CD8α+ and CD8α− DCs. A Bayesian approach then integrated information at all levels, leading to the identification of putative hidden drivers that selectively modulate functions of DC subsets. In particular, NetBID identified the Hippo kinase Mst1/Stk4 as a hidden driver, selectively active in CD8α+ DCs, which was further validated by genetic and functional experiments. Of note, there is no differential expression of Mst1 at the mRNA level, while Mst1 protein expression is even lower in CD8α+ than in CD8α− DCs. One reason NetBID successfully captured Mst1 is that the Mst1 subnetwork inferred in silico is enriched in its putative downstream targets as defined by perturbation experiments, enabling inference of its true functional activity. NetBID currently relies on bulk omics data. An improved version that handles single-cell omics data to infer cell-type–specific hidden drivers remains to be developed.

Figure 3. Hidden driver analysis by NetBID. (A) The overview flowchart of NetBID analysis to identify hidden drivers of phenotype case vs. control. (B) An illustration of an example hidden driver (HD) that has no differential expression but has network enrichment and activity. Diff-exp, differential expression.

Intercellular network inference

Cell–cell communication fundamentally regulates how the immune system operates as a network to effectively respond to infection and other insults. Systematically decoding intercellular networks that modulate immunity has been a longstanding challenge.
Recently, an algorithm developed by text mining the literature has predicted previously unappreciated cell–cytokine interactions [105*], but the attempt is limited by the inherent bias of existing knowledge. Single-cell technologies have provided a unique opportunity to tackle this challenge in a more unbiased manner. Systematic inference of intercellular communications is still in early development, with a few limited examples based on scRNA-seq to date [106,107*,108,109**,110].

Closing remarks: towards translational systems immunology

Technological advancement is driving fundamental discoveries in immunology. Recent advances in single-cell technologies enable the study of immunological diversity and complexity at an unprecedented resolution. Next-generation, single-cell omics methods are able to simultaneously capture additional information, such as spatial organization [111], dynamic clonality via lineage barcoding [112], and immune receptor repertoire [113*,114,115]. An equally important and complementary effort is to develop sophisticated computational algorithms to analyze and integrate high-throughput multi-omics and multi-sourced data [116].

The importance of translational immunology is illustrated by the remarkable success of cancer immunotherapies that demonstrate durable responses in the clinic, including CAR-T-cell therapies [117] and checkpoint-blockade therapies [118], which were recently recognized by the Nobel Prize [119]. For example, tumor cells escape immune surveillance by up-regulating PD-L1, which interacts with the PD-1 receptor on T cells to elicit the immune checkpoint response. Therefore, blocking the crosstalk between PD-L1 on tumor cells and PD-1 on T cells can reactivate cytotoxic T cells to kill tumor cells.
However, immunotherapies are efficacious for only a fraction of patients, and existing biomarkers based on tumor mutation burden and single protein expression (e.g., PD-L1) have limited predictive power. The emerging systems immunology approaches could be translated to tackle pressing issues in the clinic [98] by dissecting the heterogeneity and interactions of the tumor and immune microenvironment [116]. For instance, integrative systems biology analysis of bulk omics data from over 10,000 patient samples of 33 cancer types has provided instrumental insights into the immune landscape of cancer [120]. More recently, scRNA-seq and high-dimensional flow cytometry analyses of human tumors have revealed a unique CD8 T cell subset that infiltrates tumors and responds to checkpoint blockade immunotherapy to mediate effective tumor immunity [121,122], and this control mechanism is also observed and validated in murine tumor models [123,124].

State-of-the-art technologies are enabling comprehensive molecular characterization of tumor cells and their microenvironment from large cohorts of patient samples at single-cell resolution. The development of immune-competent and humanized mouse models is facilitating immune-related functional and mechanistic studies. We envision that network-based systems immunology analysis of multi-omics data, from both human and mouse model systems, will enable identification of hidden drivers of resistance to existing cancer immunotherapies, novel predictive biomarkers to better stratify patients, and novel therapeutic targets and combination strategies to overcome drug resistance and develop more precise immunotherapy. These strategies may also open therapeutic opportunities for other immune-related disorders, including autoimmune, inflammatory and neurodegenerative diseases.

Highlights
* Systems immunology is emerging with omics tools at population and single-cell levels
* Integrative analysis of multi-omics data has revealed novel insights in immunology
* Single-cell sequencing technology is driving immunology research
* Data-driven and context-specific networks enable capture of hidden drivers

Acknowledgements