Abstract

   Objective: Septic shock is the severe complication of sepsis, with a
   high mortality. The inflammatory response regulates the immune status
   and mediates the progression of septic shock. In this study, we aim to
   identify the key immune-related genes (IRGs) of septic shock and
   explore their potential mechanism.

   Methods: Gene expression profiles of septic shock blood samples and
   normal whole blood samples were retrieved from the Gene Expression
   Omnibus (GEO) and Genotype-Tissue Expression Portal (GTEx). The
   differential expression genes (DEGs) and septic shock-specific
   immune-related genes (SSSIRGs) were evaluated and identified, along
   with the immune components by “cell type identification by estimating
   relative subsets of RNA transcripts (CIBERSORT, version x)” algorithm.
   Additionally, in order to explore the key regulatory network, the
   relationship among SSSIRGs, upstream transcription factors (TFs), and
   downstream signaling pathways were also identified by Gene Set
   Variation Analysis (GSVA) and co-expression analysis. Moreover, the
   Connectivity Map (CMap) analysis was applied to find bioactive small
   molecules against the members of regulation network while Chromatin
   Immunoprecipitation sequencing (ChIP-seq) and Assay for Targeting
   Accessible-Chromatin with high-throughput sequencing (ATAC-seq) data
   were used to validate the regulation mechanism of the network.

   Results: A total of 14,843 DEGs were found between 63 septic shock
   blood samples and 337 normal whole blood samples. Then, we identified
   septic shock-specific 839 IRGs as the intersection of DEGs and IRGs.
   Moreover, we uncovered the regulatory networks based on co-expression
   analysis and found 28 co-expression interaction pairs. In the
   regulation network, protein phosphatase 3, catalytic subunit, alpha
   isozyme (PPP3CA) may regulate late estrogen response, glycolysis and
   TNFα signaling via NFκB and HLA; Kirsten rat sarcoma viral oncogene
   homolog (KRAS) may be related to late estrogen response and HLA; and
   Toll-like receptor 8 (TLR8) may be associated with TNFα signaling via
   NFκB. And the regulation mechanisms between TFs and IRGs (TLR8, PPP3CA,
   and KRAS) were validated by ChIP-seq and ATAC-seq.

   Conclusion: Our data identify three SSSIRGs (TLR8, PPP3CA, and KRAS) as
   candidate therapeutic targets for septic shock and provide constructed
   regulatory networks in septic shock to explore its potential mechanism.

   Keywords: septic shock, immune-related genes, TLR8, PPP3CA, KRAS

Introduction

   Septic shock is the most severe complication of a severe microbial
   infection (sepsis), with a short treatment time-window and a high death
   rate ([33]Sharawy and Lehmann, 2015). Pathogenetically, pathogens
   invade microorganisms; induce intracellular events in immune,
   epithelial, and endothelial cells (ECs); and trigger an uncontrolled
   systemic inflammatory response ([34]Annane et al., 2005). In the
   process, the inflammatory response associated with “cytokine storm”
   leads to the damages to host tissues and organs, while the
   anti-inflammatory response reprograms the immune microenvironment and
   regulates the immune status ([35]Russell and Walley, 2010). Thus, the
   regulation and dysregulation of immune microenvironment may play
   important roles in the development, progression, and prognosis of
   septic shock.

   Current therapeutic strategies often focus on controlling the infection
   source by antibiotic therapy and restoring hemodynamic homeostasis by
   norepinephrine ([36]Kumar, 2014). In addition, septic shock patients
   may also benefit from some drugs such as corticosteroids or activated
   protein C ([37]Seam, 2008). Despite unceasing advances in management,
   septic shock still represents a major healthcare problem worldwide.
   Based on the crucial roles of immune microenvironment in septic shock,
   we suppose that targeting septic shock-specific immune-related genes
   (SSSIRGs) may be novel therapeutic options ([38]Walley et al., 2014;
   [39]Francois et al., 2018).

   In this study, in order to identify the SSSIRGs, the public databases
   were searched, and the RNA profiles of blood samples in septic shock
   and healthy patients were downloaded. The differentially expressed
   SSSIRGs were identified between them. In addition, the algorithm “cell
   type identification by estimating relative subsets of RNA transcripts
   (CIBERSORT)” was used to evaluate the cell fraction. Moreover, we also
   constructed regulatory networks based on upstream transcription factors
   (TFs), SSSIRGs, immune cells, and downstream signaling pathways to
   decipher the potential mechanism of septic shock and identify candidate
   targets.

Materials and Methods

Data Extraction and Differential Expression Analysis

   The gene expression profiles of 63 septic shock blood samples and 337
   normal whole blood samples were downloaded from the Gene Expression
   Omnibus (GEO^[40]1) ([41]Braga et al., 2019) and the Genotype-Tissue
   Expression Portal ([42]GTEx, 2015^[43]2), respectively. The gene
   expression profiles were retrieved in formats of HTseq-counts and were
   normalized to Transcripts Per Million (TPM) based on gene length.
   Clinical data including demographics (age and gender) and blood
   collection time points (T1: within 16 h of ICU admission; T2: 48 h
   after study enrollment; T3: on day 7 from ICU admission or before
   discharge from the ICU) were also extracted from the database.

   Samples and patients with incomplete clinical information were
   excluded, and conformers who met the inclusion and exclusion criteria
   were extracted for further analysis. In order to reduce the error
   caused by the different length and sequencing depth of each single gene
   in the sequencing process, firstly, raw counts (RCs) of RNA-seq data
   obtained from GTEx and [44]GSE131411 were quantified, respectively,
   which were then standardized using voom function in limma (Linear
   Models for Microarray Analysis) package. To eliminate the batch effect,
   these two batches of RNA-seq data were corrected using the
   normalizeBetweenArrays function in the limma package and then merged
   for differential expression analysis ([45]Smyth, 2004).

   Differential expression genes (DEGs) between 63 septic shock blood
   samples and 337 normal whole blood samples were determined using the
   Edge R package ([46]Smyth, 2004). The standard to define the DEGs was |
   log[2] Fold Change (FC)| > 1.0 along with False Discovery Rate (FDR) <
   0.05. Besides, the Gene Oncology (GO) and Kyoto Encyclopedia of Genes
   and Genomes (KEGG) items of enrichment analysis were performed to
   explore the biological processes (BPs), cellular components (CCs),
   molecular functions (MFs), and KEGG pathways that septic shock-specific
   DEGs enriched.

Identification of Septic Shock-Specific Immune-Related Genes in Whole Blood

   A total of 2,498 immune-related genes (IRGs) were retrieved from the
   ImmPort database^[47]3 and Molecular Signatures Database (MSigDB)
   v7.1,^[48]4 respectively ([49]Liberzon et al., 2015). The intersection
   of the 2,498 IRGs and DEGs was filtered by further non-parametric test
   (Kruskal–Wallis H test) to identify SSSIRGs.

Identification of Immune Components Potentially Regulated by Septic
Shock-Specific Immune-Related Genes

   The “CIBERSORT, version x” algorithm was applied to estimate the
   fraction of 22 types of immune cell in each whole blood sample
   ([50]Newman et al., 2019). The non-parametric test (Mann–Whitney U
   test) was conducted to illuminate the different immune cell components
   between septic shock blood samples and normal whole blood samples. In
   addition, the fraction of 29 immunomodulating gene sets was quantified
   using single-sample Gene Set Enrichment Analysis (ssGSEA) ([51]Barbie
   et al., 2009).

   Correlation analysis was applied to SSSIRGs, 22 immune cells, and 29
   immune gene sets. The pairwise interactions between SSSIRGs and immune
   cells (| correlation coefficient| > 0.500, p-value < 0.001) along with
   SSSIRGs and immune gene sets (| correlation coefficient| > 0.600, p <
   0.001) were identified as immune components potentially regulated by
   SSSIRGs.

Identification of Key Transcription Factors and Pathways of the
Shock-Specific Immune-Related Genes

   Fifty hallmark gene sets (signaling pathways) and the 318 TFs were
   retrieved from the MSigDB v7.1 (see text footnote 4) and Cistrome
   database,^[52]5 respectively ([53]Liberzon et al., 2015; [54]Zheng et
   al., 2019). Differential expression and correlation analyses were
   conducted to determine upstream TFs, while hallmark pathways of SSSIRGs
   were absolutely quantified by Gene Set Variation Analysis (GSVA)
   ([55]Hanzelmann et al., 2013). Next, the correlation analysis was
   applied to key SSSIRGs, TFs, and aforementioned signaling pathways.
   Pairwise interactions between TFs and SSSIRGs (| correlation
   coefficient| > 0.900, p < 0.001), SSSIRGs, and the 50 hallmark pathways
   (| correlation coefficient| > 0.600, p < 0.001) were extracted for
   further analysis. All above significant interaction pairs were
   integrated to construct the regulation network among SSSIRGs and their
   upstream TFs and downstream pathways/immune cells/immune gene sets.

Integrated Analysis of Clinical Correlation Based on Length of Hospital Stay

   In order to better understand and validate the relationship between
   SSSIRGs and prognosis of patients with septic shock, patients were
   categorized into three groups (T1–T3) based on blood sample collection
   time points (T1: within 16 h of ICU admission; T2: 48 h after study
   enrollment; T3: on day 7 from ICU admission or before discharge from
   the ICU). Further differential expression analysis based on different
   time points was carried out, respectively, using the edge R package.

   In addition, gene sets over-representation analysis (GSORA) was
   performed to evaluate the fraction of genes in tested gene clusters
   ([56]Pomyen et al., 2015). In this study, 54 septic shock-related gene
   sets downloaded from MSigDB were divided in nine subclusters on the
   basis of function features, which included C1–C8 and the hallmark
   pathway cluster (H) ([57]Liberzon et al., 2015). In addition, ORA was
   carried out to determine the enrichment level of these gene sets.^[58]6

   Based on the key SSSIRGs obtained previously and different lengths of
   hospital stay (LOS) of septic shock patients, non-parametric test and
   clinical correlation analysis were performed, and the intersection of
   the results from these two analyses which were statistically
   significant was extracted for further analysis. Specifically, LOS
   served as an important sign in clinical judgment of the severity of
   septic shock.

Connectivity Map Analysis

   Here, we applied the Connectivity Map (CMap, build 02) to find
   potential inhibitors that may target key SSSIRGs. In total, 6,100 gene
   expression cases covering 1,309 drugs were obtained from the CMap
   database^[59]7 ([60]Lamb et al., 2006), that is, a potential drug might
   correspond to various gene expression cases. Genes in each case were
   ranked via taking differential expression values between drug-untreated
   and drug-treated cell lines, and 6,100 gene lists associated with drugs
   were then generated. On the basis of identified key SSSIRGs involved in
   septic shock and 6,100 drug-related cases, we conducted a
   non-parametric test to explore the relationship between drugs and
   septic shock.

   Information of targeting compounds is available in the mechanism of
   actions (MoAs)^[61]8 that includes transcriptional responses of various
   human cell lines to perturbagens, structural formulas, and various
   targets. On the basis of MoA, compounds which may target to septic
   shock-related SSSIRGs in this study were extracted.

Validation of Chromatin Immunoprecipitation Sequencing and Assay for
Targeting Accessible-Chromatin With High-Throughput Sequencing

   Chromatin Immunoprecipitation sequencing (ChIP-seq) and Assay for
   Targeting Accessible-Chromatin with high-throughput sequencing
   (ATAC-seq) data were used to validated the regulation mechanism of the
   network. ChIP-seq and ATAC-seq data were obtained from Cistrome
   database (see text footnote 5) and [62]GSE139099,^[63]9 respectively
   ([64]Zheng et al., 2019; [65]Grubert et al., 2020). The WashU Epigenome
   Browser and Gviz package were used to visualize the binding peaks
   ([66]Hahne and Ivanek, 2016; [67]Li et al., 2019).

Processing of Single-Cell RNA Sequencing Data

   Many studies supported that COVID-19 severe forms share clinical and
   laboratory aspects with various pathologies such as hemophagocytic
   lymphohistiocytosis, sepsis, or cytokine release syndrome. Fatal
   COVID-19 patients, as would be expected with septic shock patients,
   produce inflammatory reaction and cytokine storm ([68]Bost et al.,
   2021). Therefore, single-cell RNA sequencing (scRNA-seq) data of
   peripheral blood mononuclear cells (PBMCs) from fatal COVID-19 patients
   were obtained from [69]GSE150728 to determine the specific cellular
   location of key SSSIRGs and cell cycle regulation in this study, which
   can further explain the molecular pathogenesis of septic shock
   spatiotemporally.

   The preprocessing of scRNA-seq data of seven COVID patients and six
   healthy samples was conducted using 10x Genomics Chromium^[70]10
   ([71]Chen et al., 2020). After demultiplexing, the sequencing results
   were divided into two paired-end reads fastq files that were then
   trimmed to eliminate the template switch oligo (TSO) sequence and polyA
   tail sequence. In addition, clean reads were aligned with the GRCh38
   (Version: 100) genome assembly, which were quantified using the Cell
   Ranger Software (Version 1.0.0).^[72]11

   The quantitative gene expression matrices (the row names of matrices
   were genes and col names were barcodes) were analyzed with Seurat
   pipeline (Version: 3.2.2) for further analysis ([73]Butler et al.,
   2018). Only cells with less than 10% mitochondrial gene mapped and
   expressing more than 100,000 transcripts were extracted for further
   analysis. In addition, genes expressed in over three single cells were
   included in follow-up analyses. After the completion of quality control
   (QC), all samples were integrated into one Seurat object with the
   function of “IntegrateData,” which were then scaled and normalized with
   the function of “ScaleData.” The “vst” method was utilized to determine
   the top 1,500 variable genes. To reduce model dimensionality, principal
   component analysis (PCA) was initially conducted, and the top 20 PCs
   were incorporated as input file for Uniform Manifold Approximation and
   Projection for Dimension Reduction (UMAP) analysis. UMAP plots
   illustrating cell subclusters were constructed using the “DimPlot” and
   “RunUMAP” function.

Differential Expression Analysis and Cell Type Annotation

   Genes with significantly differential expression patterns from the top
   1,500 variable genes were defined as DEGs with the “wilcox” method
   using the “FindAllMarkers” function.

   To identify the cell type of each unsupervised cluster, DEGs in all
   subclusters were applied as potential references, which were combined