Graphical abstract

   graphic file with name fx1.jpg
   [117]Open in a new tab

Highlights

     * •
       Across all organs, fibroblast, and immune cell populations increase
       in COVID-19 patients
     * •
       Organ-specific cell types and functional markers are lost in all
       COVID-19 tissue types
     * •
       Lung compartment identity loss correlates with SARS-CoV-2 viral
       loads
     * •
       COVID-19 uniquely disrupts co-occurrence cell type clusters
       (different from IAV/ARDS)
     __________________________________________________________________

   Park et al. report system-wide transcriptome damage and tissue identity
   loss wrought by SARS-CoV-2, influenza, and bacterial infection across
   multiple organs (heart, liver, lung, kidney, and lymph nodes) and
   provide a spatiotemporal landscape of COVID-19 in the lung.

Introduction

   In March 2020, the World Health Organization (WHO) declared a novel
   pandemic of coronavirus disease 2019 (COVID-19), an infection caused by
   the betacoronavirus severe acute respiratory syndrome coronavirus 2
   (SARS-CoV-2), which is currently attributed to over 320 million cases
   and over 5.5 million deaths globally
   ([118]https://coronavirus.jhu.edu).[119]^1 Since the presenting
   symptoms of COVID-19 resemble those of common viral respiratory
   infections, a molecular diagnosis is required to distinguish a
   SARS-CoV-2 infection from influenza and other respiratory
   illnesses,[120]^2^,[121]^3 and ongoing questions remain about the host
   responses to SARS-CoV-2 relative to other respiratory pathogens. As
   severe illness and death continue to impact a segment of
   COVID-19-positive individuals, urgent questions persist about the
   molecular drivers of morbidity and mortality associated with SARS-CoV-2
   infection. This knowledge could lead to improvements in both the acute
   treatment and the long-term management of pathological changes in
   multiple organs.

   Prior work has shown that COVID-19 leads to high systemic interferon
   responses (both alpha and gamma) and that co-infection with other
   pathogens is relatively rare (3%–15%).[122]4, [123]5, [124]6 Yet there
   are limited data for discriminating the molecular response between
   different kinds of respiratory infections or pulmonary conditions
   (e.g., influenza A (IAV) versus SARS-CoV-2 infections) and almost no
   data on the variegated impact of different pathogens across different
   human tissues other than clinical observations from intensive care unit
   (ICU) patients.[125]^7 Delineation of pathogen- and tissue-specific
   differences is critical for understanding the molecular determinants of
   mortality associated with COVID-19 and for developing novel diagnostics
   and therapeutic interventions.

   To address this knowledge gap, we used shotgun metatranscriptomics
   (total RNA sequencing [RNA-seq]) to comprehensively profile human
   tissues in 39 patients who died from COVID-19 (185 total autopsy
   samples), including heart, liver, lung, kidney, and lymph nodes,
   analyzed gene expression, and assessed the system-wide impact of
   SARS-CoV-2 infection. We also used a spatial protein and transcript
   mapping platform (GeoMx) to visualize the cartography of the infection
   in these tissues and to discover disruption of regional and
   cell-type-specific expression. The spatial transcriptomics data
   examined 357 total regions of interest (ROIs), which were selected from
   13 patients who had SARS-CoV-2, influenza, or bacterial infections and
   from 3 normal patients as controls, revealing the cellular and
   regulatory signatures that define these distinct pathological states.
   Finally, to provide context to earlier stages and sites of infection,
   we compared these in-depth spatial and tissue-specific transcriptome
   maps with an independent cohort of nasopharyngeal (NP) swabs from 216
   COVID-19-positive patients and 430 COVID-19-negative controls, which
   revealed a significant and distinct disruption of cellular and
   transcriptional programs induced by SARS-CoV-2 infection in the
   patients who unfortunately succumbed to the disease. As a resource for
   the field, these data have also been placed in an online portal
   ([126]https://covidgenes.weill.cornell.edu/) for additional data mining
   and visualization.

Results

System-wide host responses and transcriptome changes by COVID-19

   We first used shotgun metatranscriptomics (total RNA-seq) for host and
   viral profiling on 39 patients who died from COVID-19, including 185
   organ-specific tissue samples from the respective autopsies and 22
   healthy control samples from organ donor remnant tissues
   ([127]Figure 1A; [128]Table S1). We examined the COVID-19-specific host
   responses and transcriptome changes across various organs (heart,
   kidney, liver, lung, and lymph nodes) to ascertain the differentially
   expressed genes (DEGs) between COVID-19 and control sample sets (q <
   0.01, expression fold change > 1.5 by the differential expression
   analysis method [DESeq2]; [129]Figure 1B; [130]Table S2). Separately,
   SARS-CoV-2 RNA reads were aligned to the SARS-CoV-2 genome and the
   number of reads was discerned across multiple tissues including the
   lung, lymph node, kidney, liver, and heart ([131]Figure S1) but mostly
   in the lung. Viral reads were robustly detected in NP swab samples from
   COVID-19 patients, consistent with previous reports.[132]^4^,[133]^8
   Normalized coverage values in SARS-CoV-2-positive autopsy tissue
   samples revealed a detection bias toward the SARS-CoV-2 3′ end
   sequences, which is consistent with the known viral transcript
   abundance ([134]Figures S1A and S1B).[135]^9 Reconstruction of the
   viral genomes revealed known and unknown variants common to many
   patients and some evidence of intra-host variability ([136]Figure S1C).
   Coverage limitations notwithstanding, we generally saw evidence of the
   same viral strain across the organ systems within each patient. We also
   assessed the variability of the COVID-19 patient samples by clustering
   based on the SARS-CoV-2 viral loads (and noted as SC2 high and SC2
   low), and the viral loads were inversely correlated with the duration
   of disease and independent of factors such as race, age, or gender
   ([137]Figures S1D and S1E). In addition to the annotated clinical
   metadata table ([138]Table S1), we summarized the clinical courses of a
   few representative COVID-19 patients from hospitalization to intubation
   to death ([139]Figure S1D).

Figure 1.

   [140]Figure 1
   [141]Open in a new tab

   Tissue- and duration-specific dysregulation of gene expression from
   SARS-CoV-2 infection

   (A) Sample overview by tissue for RNA-seq and GeoMx experiments. Number
   of regions of interest (ROIs) and patient numbers (n) are summarized,
   and a representative tissue slide image for GeoMX spatial profiling is
   presented along with the tissue types.

   (B) Volcano plots of the COVID-19-positive versus normal tissues are
   shown for the five tissues from autopsy: heart (n = 41), kidney (n =
   27), liver (n = 40), lung (n = 40), and lymph node (n = 27).
   Differentially expressed genes (>1.5-fold, q < 0.05, DESeq2) are shown
   as purple spots (downregulated in COVID-19) and orange (upregulated in
   COVID-19).

   COVID-19 pathway enrichment analysis revealed significant changes (q <
   0.01) in pathways for viral infection (regulation of viral genome
   replication and viral entry into host cell) and immune response
   (regulation of type 1 interferon response and regulation of tyrosine
   phosphorylation of Stat protein, Gene Ontology (GO) Regulation of
   toll-like receptor signaling pathway), and the enrichment varied as a
   function of the viral load ([142]Figure 2A; [143]Table S2). We observed
   that each tissue showed its own distinct transcriptional disruption in
   response to SARS-CoV-2 infection, with the lymph node exhibiting the
   greatest number of DEGs when compared with controls (in both high- and
   low-viral-load groups). Of note, both tissue-specific and pan-tissue
   disruptions of normal-expression programs were observed
   ([144]Figure 1B), and these were then summarized using gene set
   enrichment analysis (GSEA; [145]Figures 2A and [146]S2A). Some pathways
   were consistently dysregulated in all tissues during early infection
   (SARS-CoV-2 high), such as the G2M checkpoint (q values of 6.4 ×
   10^−19, 3.0 × 10^−5, 1.2 × 10^−8, 9.7 × 10^−8, and 0.002 for lung,
   liver, kidney, lymph node, and heart, respectively; [147]Table S2), E2F
   targeting (q values of 2.8 × 10^−26, 0.00672, 1.9 × 10^−7, 9.9 × 10^−9,
   and 0.047, respectively), and epithelial mesenchymal transition (EMT;
   q values of 2.1 × 10^−22, 3.0 × 10^−4, 2.9 × 10^−7, 1.4 × 10^−7, and
   0.0287, respectively), but in the late infection (SARS-CoV-2 low), gene
   networks showed more inter-tissue heterogeneity in their disrupted
   pathways, including cytokine activity and inflammatory response.
   However, in both the SARS-CoV-2 high- and low-viral-load groups, the
   G2M checkpoint and E2F networks were consistently upregulated,
   indicating a core, persistent set of dysregulated cell-cycle regulation
   genes during early and late stages of infection.

Figure 2.

   [148]Figure 2
   [149]Open in a new tab

   Pathways and cell population changes of COVID-19

   (A) The pathways that show significant differences in all or the
   majority of the five tissue types are shown, with statistical
   significance from GSEA testing across five kinds of tissue (colors in
   legend), including heart (n = 41), kidney (n = 27), liver (n = 40),
   lung (n = 40), and lymph node (n = 27). The x axis shows the normalized
   enrichment score for COVID-19-positive versus control, SARS-CoV-2 high
   or low versus control, and SARS-CoV-2 high versus Low comparisons.

   (B) Cellular deconvolution distribution violin plots for the heart,
   kidney, liver, and lung. SARS-CoV-2 high (red), low (orange), and
   normal (blue) tissues are shown using a square root scale. The number
   of biological replicates is the same as in (A).

   The DEGs and GSEA results were then examined for the largest
   differences between the infection level and stage (SARS-CoV-2 high,
   early COVID-19 infection versus SARS-CoV-2 low, late COVID-19
   infection). Interestingly, the heart tissues showed the largest
   transcriptional differences, revealing that the later stage of the
   infection had a much greater impact on cardiac tissues
   ([150]Figure S2A). To place these results into a larger context and to
   compare them with the findings of other data sets, we compared DEGs
   from each tissue with RNA-seq data from NP swabs as well as RNA-seq
   data from a publicly available data set on monocytes from
   COVID-19-positive and -negative patients ([151]Figure S2B).[152]^4
   While the highest correlations were seen within the same tissue types,
   most of the tissues with a high viral load showed a statistically
   significant, positive correlation with the DEGs in the NP swab samples
   (q < 0.01) when compared with normal/negative patients. In contrast,
   the later infection (low-viral-load) patients' tissues showed a
   negative correlation with the NP swab samples, indicating that the
   systemic impact of SARS-CoV-2 can be missed when not considering the
   biological impact on different organs. Interestingly, when matched with
   disease severity (high-, medium-, and low-viral-load groups within NP
   swab samples), the difference was bigger in the low and medium groups
   than in the high groups.

   To create a more fine-grained analysis of the cellular gene expression
   states in each tissue, we used the cell deconvolution multiple signal
   classification algorithm (MuSiC) on each tissue’s RNA-seq data (see
   [153]STAR Methods). The MuSiC results showed distinct disruptions of
   the transcriptional programs for each type of tissue in the COVID-19
   patients and in the gain or loss of cell types ([154]Figure 2B).
   Consistent with previous reports, the lung showed a loss of the
   capillary intermediate cells and alveolar epithelial cell
   types.[155]^10^,[156]^11 Strikingly, we also found decreases in the
   major cell types in each organ type, suggesting a systemic disruption
   of the COVID-19 response. The kidney and liver showed a loss of
   proximal tubule in the kidney and of hepatocyte marker expressions in
   the liver but an increase in T cells in both organs. Furthermore, the
   heart showed a near-complete loss of the cell signatures for
   cardiomyocytes in both SARS-CoV-2 high and low viral loads
   ([157]Figures 2B and [158]3A), despite no obvious gross or histologic
   changes in the heart ([159]Figure 3B). This observation extends from
   previously reported cardiovascular involvements in COVID-19 and further
   defines the SARS-CoV-2-specific transcript and cellular changes in lung
   as well as other organs such as heart, liver, and
   kidney.[160]^12^,[161]^13

Figure 3.

   [162]Figure 3
   [163]Open in a new tab

   Cellular disruption and tissue identity loss from SARS-CoV-2 infection

   (A) Specific gene expression distributions for cardiomyocyte-related
   genes (from patient autopsy samples, n = 41).

   (B) Sample collection strategy for COVID-19 autopsy samples from
   complete adult cases.

   (A–J) Two cases with representative hematoxylin and eosin (H&E) images
   are shown: (A–E) Cases 58 and (F–J) 73. (A and F) Lung, (B and G)
   kidney, (C and H) heart, (D and I) liver, and (E and J) mediastinal
   lymph node (H&E stain). (B, D, E, I, and J) Original magnification ×50.
   (A, C, F, G, and H) Original magnification ×100.

   (C) Gene expression changes of representative functional markers for
   each organ: heart (n = 41), kidney (n = 27), liver (n = 40), and lung
   (n = 40). Five genes related to the organ function were chosen (from
   the Human Protein Atlas), along with five housekeeping genes on the
   rightmost side. Darker gray shaded area represents the ranges of the
   housekeeping gene changes (baseline noise) and error bars show standard
   error from DESeq2 output, relative to the uninfected control.

   We then looked at several markers related to the functions and
   processes of each organ ([164]Figure 3C). While the lung showed the
   biggest changes in response to COVID-19, the loss of functional markers
   was specific to organ type. For example, surfactant proteins (i.e.,
   SFTPA1, SFTPB, SFTPC), which are components of the alveolar lipoprotein
   complex crucial for gas exchange, were lost in the lung but increased
   in other organs. Similarly, markers associated with liver function
   (i.e., liver enzymes and proteins such as ALB, HAO1, and ALDOB) and
   solute carrier family proteins (i.e., SLC22A13, SLC34A1, uromodulin
   [UMOD]) were specifically lost in liver and kidney, respectively. In
   addition to the cardiomyocyte markers ([165]Figure 3A), we also found
   that markers such as phospholamban (PLN, calcium pump inhibitor), heart
   and neural crest derivatives expressed 1 (HAND1), and troponin cardiac
   type (TNNT2) were specifically lost in the heart. This observation
   suggests that in addition to cell type losses, the disruptions due to
   COVID-19 affect the biological processes and functions for which each
   organ system is responsible. Consistent with recent reports related to
   virus-induced senescence found in COVID-19 lung models,[166]^14 these
   data also show that each organ system (lung, heart, liver, and kidney)
   exhibits its own distinct inflammatory response and resultant change of
   tissue identity.

Spatial and expression profiling of high and low SARS-CoV-2 infection in the
lung

   For a deeper examination of COVID-19 lung tissue, we then used the
   GeoMx Digital Spatial Profiling (DSP) platform to perform multiplexed
   high-resolution spatial transcriptomic profiling of 357 lung tissue
   ROIs from 16 patients. The ROIs were selected from deceased patients
   with COVID-19 (n = 8), nonviral acute respiratory distress syndrome
   (ARDS) (n = 2), influenza-induced ARDS (n = 3), and healthy tissues
   from individuals without infections as controls (n = 3) using nCounter
   Multiplex Analysis incorporated with targets for SARS-CoV-2 (i.e.,
   IO360 panel plus COVID-19 Spike-in; [167]Figure 1A). Among COVID-19
   patients characterized with total RNA-seq, we identified patients that
   had high overall SARS-CoV-2 expression (SC2 high) or had low overall
   expression of SARS-CoV-2 (SC2 low) from their lung-tissue samples, and
   four representative samples from each group were selected for further
   downstream analysis. Serial sections were stained with an RNA scope
   probe against the viral Spike (S) gene, Syto13 (nuclear DNA),
   macrophages (CD68), immune cells (CD45), and epithelial cells
   (pan-cytokeratin) along with the GeoMx Cancer Transcriptome Atlas Panel
   (CTA, 1,811 targets) supplemented with 23 human genes associated with
   lung biology and two open reading frames (ORFs) from the SARS-CoV-2
   genome ([168]Figures 1A and [169]S1). We chose tissue ROIs that
   captured three structural components of the lung, including vascular,
   airway, and alveolar regions ([170]Figure S3).

   We observed significant differences across tissue types within the lung
   (DEGs with q < 0.05 and > 1 fold change by DESeq2). For example,
   vascular ROIs showed an increase in ACTA2 and FLNA, while alveolar
   regions exhibited a mixture of signals from macrophage and monocyte
   genes (CD163 and CD68) with collagen (COL1A1, COL1A2, COL6A3). In large
   airway tissues, genes related to the mucosal layer (MUC4, MUC5AC,
   MUC5B) as well as cytokine-mediated signaling pathway genes (CCL20,
   CXCL1, CXCL6, CXCL6, MMP1) and type I interferon genes (IFIT3, ISG15,
   STAT1, MX1) were upregulated in COVID-19. Regardless of the tissue
   type, we saw consistent increases in genes such as B2M, CD81, GNAS,
   HLA.B, HSPA1A, HSPB1, and SERPING1 in COVID-19 lungs ([171]Figure 4A;
   [172]Table S3). When we summarized these genes using GO terms,
   tissue-type-specific processes in response to COVID-19 were distinct
   from overall inflammation ([173]Table S3). While inflammatory responses
   were present in all tissue compartments, macrophage and monocyte
   activation were prominent in alveolar regions, while fibrosis occurred
   near the vascular region. Moreover, type I interferon- and
   cytokine-related pathways were found significant in large airway
   tissues, while complement activation was found in vascular tissues.

Figure 4.

   [174]Figure 4
   [175]Open in a new tab

   Spatial transcriptomics identifies tissue- and disease-specific
   differences

   (A) Venn diagram of tissue-specific COVID-19 DEGs (relative to normal,
   adjusted p values < 0.01, >1-fold) from 16 patients across 357 ROIs in
   total.

   (B) UpSet plot depicting intersections of disease-specific DEGs (p
   values < 0.05, fold change [|FC|] > 1), also from 16 patients across
   357 ROIs in total.

   (C) Volcano plot showing differences between normal (n = 3, 64 ROIs)
   and SARS-CoV-2 high-viral-load samples (n = 4, 86 ROIs) after
   accounting for compartmental differences. Top genes, in terms of p
   value or FC are indicated in gray, and COVID-19 Spike-in genes are
   labeled in black.

   (D) Ternary plot of a combined analysis of SARS-CoV-2 high (n = 4, 86
   ROIs), low (n = 4, 97 ROIs), and normal (n = 3, 64 ROIs), where genes
   are projected away from the center based on their marginal means. Genes
   upregulated in a single group approach that group’s corner. Top genes
   in terms of p value or FC are labeled. Genes with color were
   significant with p < 0.05. Genes with outlines are significant after
   correcting for multiple hypothesis testing.

   We also compared differences across injury sources (SARS-CoV-2
   infection, influenza, and nonviral ARDS) and found significant
   differences between normal and SARS-CoV-2-high samples, even after
   accounting for compartmental variability ([176]Figures 4B and [177]S4).
   These differences included a decrease in the expression of surfactant
   genes (SFTPA1, SFTPB, and SFTPC) associated with type 1 and type 2
   pneumocytes as well as an enrichment for genes associated with basal
   cells (TP63) and club cells (SCGB1A1), and several immune markers
   (e.g., HLA-B, HLA-E, p all < 0.05, mixed effects model) in
   SARS-CoV-2-high patients compared with both normal and SARS-CoV-2-low
   patients. Ternary plots of a combined analysis of normal,
   SARS-CoV-2-high and -low tissues—where transcripts are projected away
   from the center based on their marginal means—revealed upregulation of
   several genes enriched in each set of lung tissues. Enrichments
   included SFTPA1, SFTPB, and SFTPC (alveolar epithelial cell markers) in
   normal lungs, CLU (lung injury and repair) and S100A9 (enriched in
   activated macrophages) in SARS-CoV-2-low lungs, and TP63 (basal cell),
   ID1 (upregulated and a key regulator of lung injury and repair), and
   interferon-regulated genes including IFI6, IFI27, ISG15, and LY6E in
   SARS-CoV-2-high lungs ([178]Figures 4C and 4D). Enrichment of
   interferon-stimulated genes was observed only in SARS-CoV-2-high
   samples, which corresponds to early stages of the infection, by timing
   of disease onset and histopathology ([179]Figures S1D and S1E).
   Moreover, we observed enrichments of CASP3 and ID1, suggesting ongoing
   cellular injury and repair responses in SARS-CoV-2-high patients. In
   contrast, we find an enrichment of several markers of pulmonary
   fibrosis (CLU, COL1A1, COL1A2, and COL3A1) in SARS-CoV-2-low patients
   (these correspond to the later stages of infection), indicating that
   there are two distinct stages of infection.[180]^15

   We also examined the spatial transcriptome data for differences between
   the high- and low-SARS-CoV-2- and the IAV-infected lung samples. This
   analysis revealed a significant enrichment of THBS1 and NR4A1 in the
   influenza samples, which are both genes that have been shown to be
   engaged in response to influenza-induced lung injury ([181]Figure S4B).
   When lung tissues from SARS-CoV-2-high and -low samples were compared
   with those from nonviral ARDS, enrichment for S100A8 and S100A9 was
   observed, which is consistent with the significant enrichment of
   neutrophils in these samples and was previously suggested to be a
   driver of COVID-19 pathogenesis ([182]Figure S4A).[183]^3^,[184]^16
   Importantly, an enrichment for genes involved in lung injury and repair
   (ID1), as well as those involved in type I interferon responses
   including IFI6, IFI27, ISG15, and LY6E in SARS-CoV-2-high lungs, is
   observed even when compared with IAV infection of the lung and nonviral
   lung injury. Similarly, we find enrichment of several markers of
   pulmonary fibrosis (CLU, COL1A1, COL1A2, and COL3A1) in SARS-CoV-2-low
   samples as compared with nonviral and viral ARDS samples, exemplifying
   the profound lung injury and fibrosis during later stages of COVID-19.

COVID-19-specific heterogeneity and spatial tropism in the lung

   Regardless of specific lung-tissue types, the expression profiles and
   respective cell-type proportions were enough to distinguish and cluster
   normal versus COVID-19 (high and low SARS-CoV-2 viral loads) lung ROIs
   ([185]Figure 5A), which indicates that the SARS-CoV-2 infection is
   altering the cellular interaction landscape and composition of the lung
   tissue. To gain additional insights into the heterogeneity and spatial
   tropism of lung infection independent of ROI origins, we assessed how
   each ROI compares with the typical healthy tissue. Some ROIs
   (especially of alveolar regions) in COVID-19 showed significantly
   diminished similarity to those of healthy lungs ([186]Figures 5B,
   [187]S5A, and S5B). Interestingly, the loss of this similarity did not
   result in convergence to other tissue types we observed (i.e., loss of
   alveolar similarity score did not increase the vascular or large airway
   scores). To evaluate the cause of this loss of similarities, we
   discovered genes attributing disease- and tissue-specific clustering
   and identified 35 genes with the highest contrast ([188]Figure 5C). For
   example, some of these genes were related to regulatory T cell
   differentiation in COVID-19 high- and low-viral-load alveolar tissues
   (PLK1, ATM) and IL6/JAK/STAT3 signaling in SARS-CoV-2-low vascular
   regions (SOCS1, CXCL11; [189]Table S3). In addition to the known
   molecular responses to COVID-19, these genes can be used as a set to
   locate infection response within the lung and disease states.

Figure 5.

   [190]Figure 5
   [191]Open in a new tab

   Evaluation of tissue identity loss and heterogeneity from infection

   (A) Principal-component analysis (PCA) to see sample- and
   disease-clustering. Colors denote disease conditions (64 normal ROIs,
   86 and 97 SARS-CoV-2 high- and low-viral-load ROIs), while shapes show
   the tissue types (alveolar, large airway, and vascular regions).

   (B) Similarity score distribution when compared with a tissue-specific
   (278 alveolar, 45 large airway, and 34 vascular ROIs) healthy reference
   gene profile (ns, non-significant, ∗p ≤ 0.05, ∗∗p ≤ 0.01, ∗∗∗p ≤ 0.001,
   and ∗∗∗∗p ≤ 0.0001).

   (C) Genes identifying disease- and tissue-specific conditions.

   (D) Cell-type proportions in normal versus COVID-19. The median and
   quartiles are noted by the box plot inside. p value two-tailed t tests
   were done to compare the means (ns, ∗p ≤ 0.05, ∗∗p ≤ 0.01, ∗∗∗p ≤
   0.001, and ∗∗∗∗p ≤ 0.0001).

   (E) Enrichment of cell type- and COVID-19-specific gene signatures from
   11 patients across 247 ROIs.

   To pinpoint specific changes in different lung tissue types, count
   estimates of 15 distinct cell types were imputed based on gene
   expression profiles from the Human Cell Atlas (HCA) adult lung dataset,
   including neutrophil profiles derived from single-nucleus RNA-seq
   (snRNA-seq) of lung tumors (see [192]STAR Methods). Consistent with
   other studies,[193]^17^,[194]^18 we observed that COVID-19 was
   associated with an increase in tissue-infiltrating immune cells,
   including T cells, natural killer (NK) cells, monocytes, and
   macrophages ([195]Figures 5D and [196]S5C). Some immune cell types,
   such as monocytes, NK cells, and regulatory T cells, showed a
   statistically significant increase only in the SARS-CoV-2-high
   conditions. Also, while fibroblasts and endothelial cells increased in
   both high- and low-viral-load samples (52% and 65% increase,
   respectively), type 1 and type 2 alveolar epithelial cell proportions
   decreased (26% and 16% decrease, respectively), reflecting the ongoing
   tissue remodeling or selective epithelial cell death induced by
   infection ([197]Figure S5C). We validated these findings by comparing
   the expression levels with cell-type-specific COVID-19 gene signatures
   derived from snRNA-seq,[198]^19 which showed enrichments across
   different cell types ([199]Figure 5E; [200]Table S3). Using an
   orthogonal approach, we stained the lung tissues with Masson’s
   trichrome and observed a statistically significant increase (p < 0.01)
   in cellular collagen-rich areas, confirming the increase in lung
   fibroblasts ([201]Figure S3A).

   Given the observed changes in cell proportions that are induced during
   SARS-CoV-2 infection, we next examined the impact on the co-occurrence
   of these changes as a metric of intrapulmonary cellular heterogeneity
   ([202]Figure 6). Pairwise correlations of all detected cell types under
   five different conditions (SARS-CoV-2 high and low, IAV, nonviral ARDS,
   and control) were calculated and visualized ([203]Figure 6A). In normal
   lungs, we observed three clear cellular correlation clusters: (1)
   monocytes, fibroblasts, T cells, and NK cells, (2) neutrophils and
   ciliated cells, and (3) plasmacytoid dendritic cells (pDCs),
   macrophages, and B cells. While perturbations of these cellular
   clusters were observed across all injury conditions, the NK cell-T cell
   correlation was lost only in the SARS-CoV-2 high-viral-load patients
   and was not present in low-viral-load patients, with the low viral load
   corresponding to the later stages of infection ([204]Figures 6A and
   6B). We then quantified the correlation differences in the lungs'
   cellular landscapes between SARS-CoV-2-high and -low patients, which
   indicated that the greatest changes were in the monocyte-T cell
   correlations and dendritic-neutrophil correlations ([205]Figure S6A),
   further supporting the view that SARS-CoV-2-specific T cell activity
   may be disrupted. Of note, it may be possible that the changes in
   correlation reflect both the generation of long-term memory and
   T cell-mediated killing of infected epithelial cells.[206]^20

Figure 6.

   [207]Figure 6
   [208]Open in a new tab

   Cellular tropism and heterogeneity in response to COVID-19

   (A) Correlation matrix of pairwise cell-type correlations.
   Statistically insignificant correlations were not displayed (gray area,
   p < 0.05); from 16 patients across 357 ROIs in total, as shown in
   [209]Figure 1A.

   (B) Average proportion changes of the cell types relative to normal.
   The cell types were ordered by average increase in proportions across
   357 ROIs. Error bars indicate 0.5∗SD. On the left, stacked bar plot
   depicts the overall proportions by conditions.

   (C and D) Entropy estimates of the (C) tissue types and (D) cell types
   within the normal (n = 3, 64 ROIs), SARS-CoV-2-high (n = 4, 86 ROIs)
   and -low (n = 4, 97 ROIs), influenza (n = 2, 46 ROIs), and ARDS (n = 3,
   67 ROIs) conditions.

   To compare changes across the disease states (COVID-19, influenza,
   ARDS), we visualized the average proportions of each cell type.
   Macrophage and neutrophil population levels were found to be much
   higher in the COVID-19 lungs (both high and low viral loads), while
   T cell, monocyte, and epithelial cell numbers were much lower than
   normal. The entropy estimates of the given cell types ([210]Figure 6C),
   which are a function of variance, showed that macrophages and
   neutrophils commonly displayed increased heterogeneity (across ROIs and
   patients) across all injury conditions. Entropy in fibroblast,
   epithelial, vessel, and T cell populations was greater in
   SARS-CoV-2-high and -low lung tissue than in IAV infection and ARDS
   (121.5%, 104.7%, 111.9%, and 124.7% changes, respectively, when
   compared with the average of SARS-CoV-2 high/low and that of IAV/ARDS;
   [211]Figure 6D), suggesting an increase in cellular heterogeneity. When
   measuring and comparing cellular tropism within the tissue, nearly all
   COVID-19-positive ROIs showed an increase in heterogeneity of the cell
   populations, with the sole exception being vascular regions with high
   COVID-19 ([212]Figure 6D). The decrease in heterogeneity in the
   vascular regions of high-viral-load conditions is mainly from the
   decrease of fibroblasts and epithelial, T, and NK cells
   ([213]Figure S6B). The signals from B cells are also specific to large
   airway tissues, and this observation is consistent with the cell
   fraction increase in large airway ROIs of COVID-19 samples
   ([214]Figure S6B). Such changes in heterogeneity are related to
   specific damages and tissue dysregulations. Single-sample GSEA (ssGSEA)
   of macrophage, neutrophil, and T cell regulatory pathways showed
   enrichment in COVID-19, even when compared with influenza and ARDS,
   including macrophage activation and apoptotic process (1.3- and
   2.8-fold increase in averaged ssGSEA scores relative to normal, with p
   values of 2.91 × 10^−8 and 0.001) in patients with high viral loads
   ([215]Figures S6C and S6E).

Discussion

   In this study, we established a clinical analytical pipeline to collect
   and examine autopsy samples to elucidate and compare the spatial
   transcriptional landscape induced by SARS-CoV-2, IAV, and nonviral
   ARDS. By combining transcriptional profiles of 39 patient autopsy
   tissues from heart, liver, lung, kidney, and lymph nodes, we presented
   body-wide transcriptome changes in response to COVID-19. Across all
   tissues, we found system-wide disruptions of cellular and
   transcriptional pathways and matched the lung data with spatial protein
   and expression profiling (GeoMx, across 357 tissue sections from 16
   representative patient lung samples). We also identified
   tissue-compartment-specific damage (alveolar, vascular, and large
   airway compartments within the lung tissue) and the loss of tissue type
   identity caused by the SARS-CoV-2 infection, which correlated with
   viral loads (high versus low) and the clinical course of infection.

   Patient lung tissue samples containing significant levels of SARS-CoV-2
   showed enrichment for genes related to a variety of immune markers
   specific to certain immune cells and lung injuries as well as for
   interferon-stimulated genes (e.g., IFI27, IFITM1, and LY6E) and
   macrophage activation (S100A9, TYMP, and SERPING1). In contrast,
   patient lung tissue samples containing low levels of SARS-CoV-2 RNA
   show enrichment for COL1A1 and other markers of pulmonary fibrosis.
   Compared with other viral-related diseases (influenza), COVID-19 tissue
   samples still show significant enrichment for genes involved in lung
   injury and repair, interferon-signaling genes, and pulmonary fibrosis.
   Of note, COVID-19 (high-viral-load samples), influenza, and ARDS each
   show differential HLA-B and -C expressions, which are known mediators
   of NK and T cell activation[216]^21^,[217]^22 and which can mediate
   host risk of infection; one example is enrichment for HLA-DRB5, whose
   expression and specific gene polymorphisms are associated with
   pulmonary fibrosis and severity.[218]^23^,[219]^24 Compared across
   different disease types, all diseases, COVID-19 (low viral load),
   influenza, and ARDS showed enrichment for DMBT1, a gene known to be
   upregulated and dysregulated in pulmonary injury and
   fibrosis.[220]^25^,[221]^26 Virus-related diseases (SARS-CoV-2 high
   viral load and influenza) in particular showed significant changes on
   the expression of lung epithelial-cell-related transcripts (i.e., ACTB,
   C1R, and FN1); such changes are known markers of lung-injury gene
   signatures.[222]^27^,[223]^28

   The spatial analysis platform (Nanostring GeoMx) enabled us to
   investigate the impact of the disease by incorporating cellular and
   spatial organization. Consistent with recent reports from bulk cellular
   profiling, we observed an increase in immune cell types and fibroblasts
   in COVID-19 but a decrease in alveolar epithelial cells.[224]^10 In
   SARS-CoV-2 low-viral-load tissues, the proportions of some immune cells
   (i.e., monocytes, NK cells, or regulatory T cells) were normal, but
   fibroblasts and vessel cells still exhibited an increase similar to
   those observed in the high-viral-load samples. Some of these cell types
   form a “cellular correlation cluster,” and these co-occurrence clusters
   of cellular changes are uniquely disrupted in COVID-19 (relative to
   influenza and ARDS), particularly in the COVID-19 high-viral-load
   sample group. While macrophages and neutrophils showed an increase in
   entropy across all lung-related injury conditions, NK and T cells
   showed an increase only in COVID-19 samples. While few studies have
   interrogated the tissue environments, multiple studies have examined
   the changes occurring during COVID-19 infection in the peripheral blood
   and have identified poor T cell responses and T cell
   dysregulation.[225]^27^,[226]29, [227]30, [228]31, [229]32, [230]33
   Together, these findings highlight the robust and dynamic nature of
   SARS-CoV-2 engagement with tissue homeostatic processes and that the
   stage of COVID-19 infection impacts the pathophysiological landscape of
   the lung.

   When these spatial transcriptomics data were compared to the
   multi-organ bulk RNA-seq data from the autopsy issues, confirmatory as
   well as additional signatures of COVID-19 disease were found. First,
   fibroblasts and immune cells (i.e., macrophages) were increased in most
   tissue types, while tissue-specific cell types, such as alveolar
   epithelial types 1 and 2 cells in the lung and cardiomyocytes in the
   heart, showed a decrease in COVID-19 relative to controls. Increases in
   fibroblasts, endothelial cells, and immune cells may be impacted by a
   variety of immune activations within each organ, particularly as a
   long-term response to the infection. This observation may also be
   related to the decrease in characteristic transcriptomic signatures of
   the main cell type within each organ, which may contribute to the
   morbidity and mortality of COVID-19. For example, the reduction in the
   cardiomyocyte cell fraction correlated with a reduction in several
   transcripts encoding sarcomeric and contractile proteins (in both high
   and low viral loads),[231]^34^,[232]^35 representing a persistent
   transcriptional perturbation and potentially long-term,
   cardiac-specific impact of COVID-19.

   These data support the view of both tissue-specific and time-dependent
   biological responses to the stages of SARS-CoV-2 infection, and this is
   buttressed by orthogonal data. For example, DEGs observed from NP swabs
   showed a high correlation to tissue-specific DEGs in the early stages
   of infection but very little correlation with later infection. The
   observed correlations likely reveal monocyte migration and infiltration
   into tissues during SARS-CoV-2 infection, as seen by others.[233]^36 It
   has also been reported that monocyte depletion/migration is associated
   with kidney disease, inducing lupus-like symptoms,[234]^37 which could
   potentially explain the correlation with kidney tissue. For lymph
   nodes, there exists evidence in the literature that SARS-CoV-2 will
   impact the lymph nodes at an early stage of infection, potentially
   causing T cell lymphopenia and possibly being responsible for focal
   necrosis seen in the lymph nodes.[235]^38 Nonetheless, the lung, heart,
   and lymph nodes were the tissues most disrupted by infection.

   Overall, these data represent one of the largest autopsy series of
   COVID-19 disease and synthesize several orthogonal methods, including
   bulk transcriptomics, digital spatial transcriptomics, multiple imaging
   technologies, and computational analysis, to build a map of SARS-CoV-2
   pathophysiology. Given the ability to combine bulk transcriptomics data
   from multiple organ types, we find organ-specific changes to immune
   responses and the loss of tissue functions unique to COVID-19 disease,
   which can help additional studies and methods for mitigating the
   systemic damage caused by the SARS-CoV-2 virus across the body.

Limitations of the study

   While the size of the cohort we used for spatial profiling is smaller
   than that used for bulk transcriptomics, we believe an in-depth
   characterization of patient tissues is crucial in directly validating
   and supporting the findings from cellular and animal
   models.[236]^39^,[237]^40 Spatial profiling technology captures key
   aspects of COVID-19 that bulk data cannot, including the locations of
   transcriptional and cellular changes caused by the disease (especially
   in late infection) and spatial heterogeneity of cell types. The
   compartment-specific COVID-19 gene signatures would need further
   validation (e.g. chromatin states and additional validation), as we
   rely on computational methodologies and deconvolution techniques, and
   could also benefit from matched profiling of peripheral blood,[238]^41
   examination of strain type,[239]^42 and also differences in vaccination
   status.[240]^43 Nonetheless, this molecular map of COVID-19 represents
   a needed cellular and molecular atlas for the community, which can
   inform future studies into COVID-19 progression and SARS-CoV-2
   pathology.

STAR★Methods

Key resources table

   REAGENT or RESOURCE SOURCE IDENTIFIER
   Antibodies
     __________________________________________________________________

   Immune Cell Profiling Panel (Core) Nanostring Technologies, Inc
   GMX-PROCONCT-HICP-12, Item 121300101, Lot# 0474026
   10 Drug Target Panel Nanostring Technologies, Inc
   GMX-PROMODNCT-HIODT-12, Item 121300102, Lot# 0474029
   Immune Activation Status Panel Nanostring Technologies, Inc.
   GMX-PROMODNCT-HIAS-12, Item 121300103, Lot# 0474032
   Immune Cell Typing Panel Nanostring Technologies, Inc
   GMX-PROMODNCT-HICT-12, Item 121300104, Lot# 0474035
   Cell Death Panel Nanostring Technologies, Inc GMX-PROMOD-NCTHCD-12,
   Lot# 0474050
   MAPK Signaling Panel Nanostring Technologies, Inc
   GMX-PROMOD-NCTHMAPK-12, Lot# 0474047
   Pl3K/AKT Signaling Panel Nanostring Technologies, Inc
   GMX-PROMOD-NCTHPl3K-12, Lot# 0474053
   COVID-19 GeoMx-formatted Antibody Panel including (TMPRSS2, clone
   EPR3861; ACE2, clone EPR4436; Cathepsin L/V/K/H, clone EPR8011; DDX5,
   clone EPR7239; and SARS-CoV-2 spike glycoprotein, polyclonal) Abcam
   ab273594, Lot# GR3347471-1
   GeoMx Solid Tumor TME Morphology Kit Nanostring Technologies, Inc
   GMX-PRO-MORPH-HST-12; Item 121300310
   Alexa Fluor® 647 alpha-Smooth Muscle Actin Antibody, clone 1A4 Novus
   Bio IC1420R
     __________________________________________________________________

   Biological samples
     __________________________________________________________________

   Autopsy tissues Weill Cornell Medicine Department of Pathology
   [241]https://pathology.weill.cornell.edu/
     __________________________________________________________________

   Chemicals, peptides, and recombinant proteins
     __________________________________________________________________

   TRIzol Invitrogen Cat. #15596026
   10% neutral buffered Formalin Electron Microscopy Sciences Cat. #15712
   DNAse I Zymo Research Cat. #E1010
     __________________________________________________________________

   Critical commercial assays
     __________________________________________________________________

   Super-Script III Platinum SYBR Green One-Step qRT-PCR Kit Invitrogen
   Cat. #12594025
   BD Univeral Viral Transport Media System Becton, Dickinson and Company
   Cat. #220526
   QIAsymphony DSP Virus/Pathogen Mini Kit Qiagen Cat. #937036
   NEBNext® rRNA Depletion Kit v2 (Human/Mouse/Rat) with RNA Sample
   Purification Beads New England BioLabs Cat. #E7405
   NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina New
   England BioLabs Cat. #E7760
   TapeStation 2200 Agilent Technologies Cat. #G2964AA
   Kapa Biosystems Illumina library quantification kit Roche Cat.
   07960140001
   GeoMx DSP system Nanostring Technologies, Inc MAN-10088-03
     __________________________________________________________________

   Deposited data
     __________________________________________________________________

   Raw and analyzed RNA-seq data This paper dbGAP: accession #38851 and ID
   phs002258.v1.p1
   Analyzed Nanostring GeoMx data This paper GEO: [242]GSE169504
   Human reference genome NCBI build 38, Gencode Human Release 33
   (GRCH38.p13) Genome Reference Consortium
   [243]http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/
   Raw RNA-seq data Rother et al.[244]^44 [245]GSE159678
   Reference scRNA-seq data Travaglini et al.[246]^45; MacParland
   et al.[247]^46; Stewart et al.[248]^47; Wang et al.[249]^8
   [250]https://www.humancellatlas.org/
   Molecular Signatures for GSEA (MSigDB) Liberzon et al.[251]^48;
   Subramanian et al.[252]^49; Kuleshov et al.[253]^50; Sergushichev
   et al., 2016 [254]http://www.gsea-msigdb.org/gsea/
     __________________________________________________________________

   Oligonucleotides
     __________________________________________________________________

   Primers for RT-PCR; ACTB-Forward: CGTCACCAACTGGGACGACA This paper N/A
   Primers for RT-PCR; ACTB-Reverse: CTTCTCGCGGTTGGCCTTGG This paper N/A
   Primers for RT-PCR; SARS-CoV-2-TRS-L: CTCTTGTAGATCTGTTCTCTAAACGAAC This
   paper N/A
   Primers for RT-PCR; SARS-CoV-2-TRS-N: GGTCCACCAAACGTAATGCG This paper
   N/A
     __________________________________________________________________

   Software and algorithms
     __________________________________________________________________

   ImageJ Schneider et al., 2012 [255]https://imagej.nih.gov/ij/
   nf-core/rnaseq pipeline Ewels et al.[256]^51
   [257]https://nf-co.re/rnaseq
   FastQC Andrews[258]^52
   [259]https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
   Trim Galore! N/A [260]https://github.com/FelixKrueger/TrimGalore
   STAR Dobin et al.[261]^53 [262]https://github.com/alexdobin/STAR
   Salmon Patro et al.[263]^54
   [264]https://salmon.readthedocs.io/en/latest/salmon.html
   Picard N/A [265]https://github.com/broadinstitute/picard
   StringTie Kovaka et al.[266]^55
   [267]https://ccb.jhu.edu/software/stringtie/
   Samtools Li and Durbin[268]^56 [269]http://samtools.sourceforge.net/
   DESeq2 R package Love et al.[270]^57
   [271]https://bioconductor.org/packages/release/bioc/html/DESeq2.html
   MuSiC R package Wang et al.[272]^58
   [273]https://xuranw.github.io/MuSiC/articles/MuSiC.html
   quanTIseq R package Finotello et al.[274]^59
   [275]https://icbi.i-med.ac.at/software/quantiseq/doc/
   Cocor R package Diedenhofen and Musch[276]^60
   [277]https://cran.r-project.org/web/packages/cocor/cocor.pdf
   synRNASeqNet R package Luciano Garofano
   [278]https://github.com/cran/synRNASeqNet
     __________________________________________________________________

   Other
     __________________________________________________________________

   Resource page to visualize and explore autopsy RNA-seq data This paper
   [279]https://covidgenes.weill.cornell.edu
   [280]Open in a new tab

Resource availability

Lead contact

   Further information and requests for resources and reagents should be
   directed to the lead contact, Christopher E. Mason
   ([281]chm2042@med.cornell.edu).

Materials availability

   This study does not involve new unique reagents or materials.

Experimental model and subject details

IRB statement

   Tissue samples were provided by the Weill Cornell Medicine Department
   of Pathology. The Tissue Procurement Facility operates under
   Institutional Review Board (IRB) approved protocol and follows
   guidelines set by Health Insurance Portability and Accountability Act
   (HIPAA). Experiments using samples from human subjects were conducted
   in accordance with local regulations and with the approval of the IRB
   at the Weill Cornell Medicine. The autopsy samples are considered human
   tissue research and were collected under IRB protocols 20-04021814 and
   19-11021069. All autopsies have consent for research use from next of
   kin, and these studies were determined as exempt by IRB at Weill
   Cornell Medicine under those protocol numbers.

Patient sample collection

   All autopsies are performed with consent of next of kin and permission
   for retention and research use of tissue. Autopsies were performed in a
   negative pressure room with protective equipment including N-95 masks;
   brain and bone were not obtained for safety reasons. All fresh tissues
   were procured prior to fixation and directly into Trizol for downstream
   RNA extraction. Tissues were collected from lung, liver, lymph nodes,
   kidney, and the heart as consent permitted. For GeoMx, RNAscope,
   trichrome and histology tissue sections were fixed in 10% neutral
   buffered formalin for 48 hours before processing and sectioning. These
   cases had a post-mortem interval of less than 48 hours. For bulk
   RNA-seq tissues, post-mortem intervals ranged from less than 24 hours
   to 72 hours (with 2 exceptions - one at 4 and one at 7 days - but
   passing RNA quality metrics) with an average of 2.5 days. All deceased
   patient remains were refrigerated at 4°C prior to autopsy performance.

Method details

qRT-PCR

   Total RNA was extracted in TRIzol (Invitrogen) according to the
   manufacturer’s instructions. To quantify viral replication, measured by
   the expression of sgRNA transcription of the viral N gene, one-step
   quantitative real-time PCR was performed using SuperScript III Platinum
   SYBR Green One-Step qRT-PCR Kit (Invitrogen) with primers specific for
   the TRS-L and TRS-B sites for the N gene as well as ACTB as an internal
   reference. Quantitative real-time PCR reactions were performed on an
   Applied Biosystems QuantStudio 6 Flex Real-Time PCR Instrument (ABI).
   Delta-delta-cycle threshold (ΔΔCT) was determined relative to ACTB
   levels and normalized to mock infected samples. Error bars indicate the
   standard deviation of the mean from three biological replicates. The
   sequences of primers/probes are provided in the [282]key resources
   table.

RNA-seq analysis

   Patient specimens were processed as described in Butler et al.[283]^4
   Clinical metadata is summarized in [284]Table S1. Briefly,
   nasopharyngeal (NP) swabs were collected N using the BD Universal Viral
   Transport Media system (Becton, Dickinson and Company, Franklin Lakes,
   NJ) from symptomatic patients. Total Nucleic Acid (TNA) was extracted
   from using automated nucleic acid extraction on the QIAsymphony and the
   DSP Virus/Pathogen Mini Kit (Qiagen). Autopsy tissues were collected
   from lung, liver, lymph nodes, kidney, and the heart and were placed
   directly into Trizol, homogenized, and then snap frozen in liquid
   nitrogen. At least after 24 hours these tissue samples were then
   processed via standard protocols to isolate RNA.

   For RNA library preparation, all samples' TNA were treated with DNAse 1
   (Zymo Research, Catalog #E1010). Post-DNAse digested samples were then
   put into the NEBNext rRNA depletion v2 (Human/Mouse/Rat), Ultra II
   Directional RNA (10 ng), and Unique Dual Index Primer Pairs were used
   following the vendor protocols from New England Biolabs. Completed
   libraries were quantified by Qubit and run on a Bioanalyzer for size
   determination. Libraries were pooled and sent to the WCM Genomics Core
   or HudsonAlpha for final quantification by Qubit fluorometer
   (ThermoFisher Scientific), TapeStation 2200 (Agilent), and qRT-PCR
   using the Kapa Biosystems Illumina library quantification kit.

   NYGC RNA sequencing libraries were prepared using the KAPA Hyper
   Library Preparation Kit + RiboErase, HMR (Roche) in accordance with
   manufacturer's recommendations. Briefly, 50-200ng of Total RNA were
   used for ribosomal depletion and fragmentation. Depleted RNA underwent
   first and second strand cDNA synthesis followed by adenylation, and
   ligation of unique dual indexed adapters. Libraries were amplified
   using 12 cycles of PCR and cleaned-up by magnetic bead purification.
   Final libraries were quantified using fluorescent-based assays
   including PicoGreen (Life Technologies) or Qubit Fluorometer
   (invitrogen) and Fragment Analyzer (Advanced Analytics) and sequenced
   on a NovaSeq 6000 sequencer (v1 chemistry) with 2x150bp targeting 60M
   reads per sample.

Spatial transcriptomics analysis

   Gene Expression profiling of freshly extracted RNA from formalin fixed
   paraffin-embedded (FFPE) lung samples was performed using the
   NanoString PanCancer IO360 panel with custom probes for SARS-CoV-2
   viral genes. After normalization, high and low COVID-19 clusters were
   identified by unsupervised analysis, and samples from each cluster were
   selected for additional profiling. GeoMx Digital Spatial Profiling
   (DSP) was performed on these samples, and control samples from
   non-viral ARDS, influenza, and normal lung tissues following standard
   protocols using the COVID-19 Immune Response Atlas.[285]^61^,[286]^62
   Samples were stained with immunofluorescent antibodies for CD68, CD45,
   PanCK, and DNA (Syto-13). Regions profiled included vascular zone,
   large airway, alveoli zone, and IF-guided segments focused specifically
   on macrophages. Samples were sequenced on an Illumina NextSeq,
   processed and filtered for quality as described in [287]supplementary
   methods. Differential expression was assessed on the resulting
   normalized data using mixed effect models, accounting for intra-patient
   heterogeneity to assess differences between SARS-CoV-2 high and low
   viral load samples, and among distinct tissue structures profiled. Cell
   deconvolution of the GeoMx data was performed using the SpatialDecon R
   package.[288]^63 Gene set enrichment analysis (GSEA) was performed to
   qualify coordinate gene expression changes quantified during
   differential expression analysis.[289]^64

Quantification and statistical analysis

RNA-seq analysis

Differential gene analysis

   RNAseq data was processed through the nf-core/rnaseq pipeline.[290]^51
   This workflow involved quality control of the reads with
   FastQC,[291]^52 adapter trimming using Trim Galore
   ([292]https://github.com/FelixKrueger/TrimGalore), read alignment with
   STAR,[293]^53 gene quantification with Salmon,[294]^54 duplicate read
   marking with Picard MarkDuplicates
   ([295]https://github.com/broadinstitute/picard), and transcript
   quantification with StringTie.[296]^55 Other quality control measures
   included RSeQC, Qualimap, and dupRadar. Alignment was performed using
   the GRCh38 build native to nf-core and annotation was performed using
   Gencode Human Release 33 (GRCH38.p13). FeatureCounts reads were
   normalized using variance-stabilizing transform (vst) in DESeq2 package
   in R for visualization purposes in log-scale.[297]^57 Cell
   deconvolution was performed using MuSiC on single cell reference
   datasets for lung, liver, kidney, and heart.[298]^8^,[299]45, [300]46,
   [301]47, [302]58 Immune cell deconvolution was performed on lymph node
   samples using quanTIseq.[303]^59 Differential expression of genes was
   calculated by DESeq2. Differential expression comparisons were done as
   either COVID + cases versus COVID- controls for each tissue
   specifically, correcting for sequencing batches with a covariate where
   applicable, or pairwise comparison of viral levels from the lung as
   determined by nCounter data. In the volcano plot protein coding genes
   were plotted using Gencode classifications using -log10 (adjusted
   value) and log2 fold-change metrics. Genes with BH-adjusted p value <
   0.01 and absolute log2 fold-change greater than 0.58 (at least 50%
   change in either direction) were taken as significantly differentially
   regulated.[304]^65 Genes were ranked by their Wald statistic and their
   log2 fold-change values and used as input for gene set enrichment
   analysis (GSEA) on the molecular signatures database
   (MSigDB).[305]^64^,[306]48, [307]49, [308]50 Any signature with
   adjusted p value < 0.01 was taken as significant. List of
   differentially expressed genes and significantly enriched pathways are
   reported in [309]Table S2.

Pairwise correlations of cell types by conditions

   Correlation matrix visualizes the Pearson correlation coefficient by
   cell types within each disease condition. Statistically insignificant
   correlations (p-value > 0.05) are filtered and identified clusters of
   positive and negative correlation is marked. The correlations from
   SARS-CoV-2 high and low viral load samples are compared with normal,
   using R package cocor (v1.1-3).[310]^60 Briefly, the correlation
   coefficients are tested using Fisher’s r-to-Z transformation to
   quantify the differences between the two correlations. To quantify
   correlations, each data point (or correlation coefficient) corresponds
   to a fisher-tested correlation (z statistics and –log(P-value) for x
   and y axes, respectively). The entropy calculations were done with the
   synRNASeqNet R package (v1.0, entropyML function,
   [311]https://github.com/cran/synRNASeqNet). The deconvoluted cell
   counts were used as an input to run maximum likelihood entropy
   calculations.

Similarity analysis

   The consensus gene profiles for alveolar, large airway, and vascular
   healthy samples were built by taking the average gene profiles from
   healthy ROIs of respective tissue origin. To validate gene profiles,
   healthy ROIs were randomly sampled (1, 5, and 10 ROIs) and compared
   with the consensus profiles. With these gene profiles, we assessed the
   similarity of the profile from each ROI with the reference profile by
   taking cosine similarity (1 being closer to the reference, 0 being
   orthogonal, [312]Figure S5A). To identify genes specific to tissue- and
   disease- states, we performed logistic regression with L1 norm to model
   the gene expression profiles. The logistic regression was done with
   glmnet (v4.1-1).[313]^66 The genes with highest coefficients were
   filtered to identify 35 genes that may distinguish diseased tissue
   types ([314]Table S3).

GeoMx transcriptomic data normalization and quantification

Spatial transcriptomics analysis

   Discerning viral load from bulk nCounter screening—Bulk expression
   profiling was performed to identify COVID-19 patients with high vs low
   viral load. To do this, RNA from fresh TRIzol extracted and fixed lung
   tissue from 29 COVID-19 autopsies, 4 non-COVID-19 lung injury and 3
   controls were evaluated by bulk expression analysis using NanoString’s
   nCounter PanCancer IO360 Panel plus custom probes for eight SARS-CoV-2
   viral genes (encoding ORF7a, surface glycoprotein, nucleocapsid
   phosphoprotein, ORF8, envelope protein, ORF3a, membrane glycoprotein,
   and ORF1ab) to assess viral content. At least 100 ng of RNA was loaded
   for hybridization and quantified by the nCounter MAX Analysis System
   (NanoString Technologies, Seattle, WA, USA).

   Transcript counts were normalized to ERCC positive controls and
   housekeeper reference gene expression prior to analysis. Hierarchical
   clustering of nCounter results revealed two clusters of high and low
   severity and four “SARS-CoV-2 high” and four “SARS-CoV-2 low” patients
   were selected ([315]Figure S1D). These eight samples were analyzed with
   two influenza-infected patients, three non-viral ARDS patients, and
   three normal lung control patients using the GeoMx platform.

RNA/NGS slide preparation for GeoMx DSP

   For GeoMx DSP slide preparation, we followed the GeoMx DSP slide prep
   user manual (MAN-10087-04). Briefly, tissue slides were baked in a
   drying oven at 60 °C for 1 hour and then loaded to Leica Biosystems
   BOND RX FFPE for deparaffinization and rehydration. After the target
   retrieval step, tissues were treated with Proteinase K solution to
   expose RNA targets followed by fixation with 10% NBF. After all tissue
   pre-treatments were done, tissue slides were unloaded from the Leica
   Biosystems BOND RX and incubated with RNA probe mix (COVID-19 Immune
   Response Atlas panel) overnight. The next day, tissues were washed and
   stained with tissue visualization markers; CD68-647 at 1:400 (Novus
   Bio, NBP2-34736AF647), CD45-594 at 1:10 (NanoString Technologies),
   PanCK-532 at 1:20 (NanoString Technologies) and/or SYTO 13 at 1:10
   (Thermo Scientific S7575).

GeoMx DSP sample collections

   For GeoMx DSP sample collections, we followed the GeoMx DSP instrument
   user manual (MAN-10088-03). Briefly, tissue slides were loaded on the
   GeoMx DSP instrument and then scanned to visualize whole tissue images.
   For each tissue sample, we collected 4 types of functional tissue
   regions: vascular zone, large airway, alveoli zone, and macrophages.
   Each tissue region was carefully selected by a board-certified
   pathologist. Regions of interest (ROIs) were then segmented with
   corresponding fluorescent tissue markers, when available within the
   region. Twenty-four to twenty-three GeoMx DSP regions were selected per
   tissue and collected following UV illumination within the defined
   segment as described in Merritt et al.[316]^62 Compartments that were
   segmented within a region of interest were collected separately as
   unique areas of illumination (AOIs).

GeoMx DSP NGS library preparation and sequencing

   Each GeoMx DSP sample was uniquely indexed using Illumina’s i5 x i7
   dual-indexing system. 4 μL of a GeoMx DSP sample was used in a PCR
   reaction with 1 μM of i5 primer, 1 μM i7 primer, and 1× NSTG PCR Master
   Mix. Thermocycler conditions were 37 °C for 30 min, 50 °C for 10 min,
   95 °C for 3 min, 18 cycles of 95 °C for 15 sec, 65 °C for 60 sec, 68 °C
   for 30 sec, and final extension of 68 °C for 5 min. PCR reactions were
   purified with two rounds of AMPure XP beads (Beckman Coulter) at 1.2×
   bead-to-sample ratio. Libraries were paired end sequenced (2 × 75) on a
   NextSeq550 up to 400 million total aligned reads.

Processing and filtering raw NGS data

   Three hundred seventy-nine AOIs plus non-template controls (NTCs) were
   sequenced, producing about 1.3B reads (with about ∼11% unique).
   NextSeq-derived FASTQ files for each sample were compiled for each AOI
   using Illumina’s bcl2fastq program and then demultiplexed and converted
   to Digital Count Conversion (DCC) files using Nanostring’s GeoMx DnD
   pipeline (v1). These DCC files were then converted to an expression
   count matrix using a custom python script. A minimum of 10,000 reads
   were required for each non-NTC sample (2 AOIs removed). Probes were
   checked for outlier status by implementing a global Grubb’s outlier
   test with alpha set to 0.01. The counts for all remaining probes for a
   given target were then collapsed into a single metric by taking the
   geometric mean of probe counts. A count of 1 was added to any probe
   that yielded 0 counts before the geometric mean was taken. For each
   sample, RNA probe pool specific negative probe normalization factor was
   generated based on the geometric mean of negative probes in each pool.

Quality control and AOI filtering

   Following initial screening above, there were 373 AOIs interrogated
   using DSP that span 16 patients and three compartments (288 alveolar,
   48 large airway, and 37 vascular regions). Of these, 370 AOIs yielded
   greater than 50 nuclei. The 75th percentile of the gene counts (i.e.,
   geometric mean across all non-outlier probes for a given gene) for each
   AOI were calculated and normalized to the geometric mean of the 75th
   percentile across all AOIs to give the upper quartile or Q3
   normalization factors for each AOI. The distribution of these Q3
   normalization factors were then checked for outliers defined as any AOI
   greater than two standard deviations from the mean log2 Q3
   normalization factor. This criterion removed 15 AOIs that fell below
   the range and 1 AOI that fell above the range. Following AOI filtering,
   358 (of 373, ∼96%) AOIs were used for downstream analyses.

Removal of gene outliers and normalization

   Gene outliers were detected by a limit of quantification (LOQ)
   approach. The LOQ for each AOI was defined as the sample’s negative
   geometric mean multiplied by its negative standard deviation raised to
   the power of two. Any target (1,837 total) that was not above LOQ in at
   least 2% of AOIs were deemed prohibitively low expressors and were
   removed from analysis. This feature-based filtering approach discarded
   171 genes (9.3%) leaving 1,666 genes. Genes were normalized by the Q3
   approach as above.

Deconvolution of cell proportions using GeoMx

   Cell deconvolution methods followed that of Desai et al.[317]^15 Cell
   mixing proportions were performed using the R package
   SpatialDecon[318]^63 using the cell profile matrix based upon the Human
   Cell Atlas adult lung 10× dataset and appended with a neutrophil
   profile derived from snRNA-seq of lung tumors.[319]^67 ROIs were
   selected to be representative across the FFPE slides, based on
   morphology and immunofluorescence of each tissue.

Differential expression analysis

   Two different sets of differential expression (DE) analyses were
   performed. Common in both sets of models were five groups (COVID-19
   SARS-CoV-2 high, SARS-CoV-2 low, influenza, Non-viral ARDS, and Normal)
   and three compartments (alveolar, large airway, vascular). In DE set
   one, differences between all 10 pairwise groupings were performed to
   look for differences between pairwise comparisons and to serve as the
   basis for downstream gene set enrichment analysis (GSEA). In the second
   DE set, SARS-CoV-2 high and low viral load groups were compared against
   one of three non-COVID groupings to identify genes that are up- or
   downregulated in COVID-19 (sensu lato) relative to non-COVID-19 groups
   and to identify genes that are consistently differentially expressed in
   SARS-CoV-2 high and in low separately.

   In the first model, each 10 pairwise groupings were considered
   separately (e.g., SARS-CoV-2 low vs SARS-CoV-2 high). DE analysis was
   performed by fitting each gene’s normalized log2 expression level using
   a Linear Mixed Effect model to account for interpatient variation with
   the R package lmerTest.[320]^6 Patient ID was used as a random effect
   (random intercept) and grouping, compartment, and
   grouping-by-compartment interactions were used as fixed effects.
   Satterthwaite's approximation for degrees of freedom for P-value
   calculation was used.[321]^68

   In the second set of models, SARS-CoV-2 high and SARS-CoV-2 low AOIs
   were always included, and DE was used to determine how genes were up-
   or downregulated compared to each of three non-COVID groups. As such,
   there were three different “sets” (influenza vs SARS-CoV-2 high vs
   SARS-CoV-2 low; Non-viral ARDS vs SARS-CoV-2 high vs SARS-CoV-2 low;
   and Normal vs SARS-CoV-2 high vs SARS-CoV-2 low). For a given gene and
   a given set, AOIs were first filtered to exclude non-members (i.e.,
   exclude normal and Non-viral ARDS in the set “influenza vs SARS-CoV-2
   high vs SARS-CoV-2 low”). The log2 expression of sample for a given
   gene was fit to a mixed effect model with group (three levels) and
   compartment (three levels) and their interaction as fixed effects and
   Patient ID as a random effect. The Least Squares (LS) means or
   “marginal means” were estimated for each comparison (i.e., log2 means
   for levels of “group” which are averaged over the levels of other
   factors in the model[322]^68). In addition to the LS means, the
   pairwise P-values for all 3 comparisons were computed.

   To visualize the marginal means for each gene relative to SARS-CoV-2
   high, SARS-CoV-2 low, and a given normal group, the three-dimensional
   data were collapsed into two-dimensional ternary plots. Specifically,
   for a given gene g in set S, the three-element vector of marginal means
   can be expressed as g[s]. By convention, the order of the elements of
   g[s] were normal, SARS-CoV-2 high, and SARS-CoV-2 low. Elements of g[s]
   were then rescaled while preserving their relative relationship by
   multiplying by a scaling factor and converting from log2 space to
   linear space. Then g[s] was scaled further by dividing each element by
   the element with the minimum value. To convert g[s] from a vector of
   three to a vector of two (representing points along a simplex plane),
   the new x coordinates were calculated by:
   [MATH: <mrow><mi>x</mi><mo
   linebreak="badbreak">=</mo><mfrac><mrow><msub><mi>g</mi><mrow><msub><mi
   >S</mi><mn>2</mn></msub></mrow></msub><mo
   linebreak="badbreak">+</mo><mfrac><mn>1</mn><mn>2</mn></mfrac><msub><mi
   >g</mi><mrow><msub><mi>S</mi><mn>3</mn></msub></mrow></msub></mrow><mro
   w><msubsup><mo>∑</mo><mrow><mi>i</mi><mo
   linebreak="badbreak">=</mo><mn>1</mn></mrow><mrow><mi>i</mi><mo
   linebreak="badbreak">=</mo><mn>3</mn></mrow></msubsup><msub><mi>g</mi><
   mrow><msub><mi>S</mi><mi>i</mi></msub></mrow></msub></mrow></mfrac></mr
   ow> :MATH]

   where represents the ith element in vector g[S]. Similarly, the y
   coordinates were calculated by:
   [MATH: <mrow><mi>y</mi><mo
   linebreak="badbreak">=</mo><mfrac><mrow><msub><mi>g</mi><mrow><msub><mi
   >S</mi><mn>3</mn></msub></mrow></msub><msqrt><mfrac><mn>3</mn><mn>4</mn
   ></mfrac></msqrt></mrow><mrow><msubsup><mo>∑</mo><mrow><mi>i</mi><mo
   linebreak="badbreak">=</mo><mn>1</mn></mrow><mrow><mi>i</mi><mo
   linebreak="badbreak">=</mo><mn>3</mn></mrow></msubsup><msub><mi>g</mi><
   mrow><msub><mi>S</mi><mi>i</mi></msub></mrow></msub></mrow></mfrac></mr
   ow> :MATH]

   A given gene is then assigned a “corner” by how close it is from each
   of the three simplex’s corners.

   Each gene has three p-values associated with it (from the pairwise
   contrasts above). The contrast with the lowest p-value was selected to
   represent a given gene (i.e., this corresponded well with the corner
   that the gene was assigned to). P-values were then adjusted to account
   for multiple comparisons by using the Benjamini-Hochberg
   procedure.[323]^65 Differentially expressed genes from this analysis is
   included in [324]Table S3.

Gene set enrichment

   MA plots[325]^69 from the 10 pairwise DE analyses (DE model 1) were
   used to ensure that low expressors were not accounting for the majority
   of the large fold changes. Gene Set Enrichment Analysis (GSEA) was
   conducted using the R package fGSEA[326]^64 with MSigDB Hallmark and
   Reactome[327]^70 databases. Gene ranks were based on the log2 fold
   change from the individual DE analyses and gene sets were bound between
   15 and 500 genes. P-values for enrichment were estimated by 1,000
   permutations of the data. The pathways were then sorted based on
   adjusted p-value first and then by their Normalized Expression Score
   (NES).

   For the 10 most extreme pathways in each direction as well as the
   COVID-19 spike-in genes, single sample GSEA (ssGSEA) was performed
   using the R package gsva[328]^71 with a min and max size of 15 and 500,
   respectively. Enrichment scores for a given pathway were rescaled by
   dividing AOI’s enrichment score by the mean across samples and then
   rescaled between 0 and 1. These rescaled enrichment scores were then
   visualized for each AOIs' x and y coordinates from their respective
   FFPE slide.

Histology and imaging analysis

   Sections from SARS-CoV-2 high (4), SARS-CoV-2 low (4) and normal lungs
   (3) cases used for the GeoMx analysis were stained using hematoxylin
   and eosin and Masson’|'s Trichrome according to standard protocol. Four
   20× regions were randomly selected from each slide the Color
   deconvolution2 algorithm for ImageJ[329]^72 and cellular trichrome rich
   areas were manually selected and measured for pixel area (cellular
   areas on red deconvolution, trichome on blue deconvolution). Total
   image pixel area was used to determine percent fibroblast-rich
   trichrome positive zones.

Correlation plot comparing different COVID-19 samples and tissues

   To generate the correlation plot comparing the global changes for
   COVID-19 infection between different tissues, we utilized several
   different RNA-sequence data. The collection of the RNA sequencing data
   for the NP swab samples were described previously.[330]^4 The NP swab
   samples were analyzed with different methodologies with one comparing
   COVID-19 viral infection to the negative patients and the other a
   regression on continuous variables as a function of SARS-CoV-2 sequence
   amount. The viral comparison was previously described in Butler
   et al.[331]^4 and the DESeq2[332]^57 was utilized to generate the
   differential expression data.

   The monocyte COVID-19 RNA-Seq data, published under the accession
   [333]GSE159678,[334]^44 was downloaded from SRA and gene expression was
   quantified using Salmon’s selective alignment approach.[335]^54 The
   RNA-Seq processing pipeline was implemented using pyrpipe
   ([336]https://github.com/urmi-21/pyrpipe/tree/master/case_studies/Covid
   _RNA-Seq).[337]^73 Exploratory data analysis and differential
   expression analysis were performed using MetaOmGraph.[338]^74 From the
   differential expression analysis for each group the fold-change values
   for the genes were filtered with an adjusted p-value < 0.05. A
   correlation plot between the fold-change values for the significantly
   regulated genes for each comparison of the COVID-19 samples were
   plotted using R program corrplot v0.84.

Viral genome analysis

   Total RNA-seq reads were classified against a custom, pan-kingdom
   reference using kraken2.[339]^75 Reads that mapped uniquely to
   SARS-CoV-2 were aligned to the Wuhan reference genome using
   bwa-mem.[340]^56 Alignments were deduplicated, assembled, and called
   for major variants (Variant Allele Frequency > 0.6) using IVAR[341]^76
   with a minimum coverage of 5 reads per site.

Acknowledgments