Abstract

   During the development of COVID-19 caused by SARS-CoV-2 infection from
   mild disease to severe disease, it can trigger a series of
   complications and stimulate a strong cellular and humoral immune
   response. However, the precise identification of blood immune cell
   response dynamics and the relevance to disease progression in COVID-19
   patients remains unclear. We propose for the first time to use changes
   in cell numbers to establish new subgroups, which were divided into
   four groups: first from high to low cell number (H_L_Group), first from
   low to high (L_H_Group), continuously high (H_Group), and continuously
   low (L_Group). It was found that in the course of disease development.
   In the T cell subgroup, the immune response is mainly concentrated in
   the H_L_Group cell type, and the complications are mainly in the
   L_H_Group cell type. In the NK cell subgroup, the moderate patients are
   mainly related to cellular immunity, and the severe patients are mainly
   caused by the disease, while severe patients are mainly related to
   complications caused by diseases. Our study provides a dynamic response
   of immune cells in human blood during SARS-CoV-2 infection and the
   first subgroup analysis using dynamic changes in cell numbers,
   providing a new reference for clinical treatment of COVID-19.

   Keywords: SARS-CoV-2, COVID-19, T Cells, NK Cells, Single Cell RNA-seq

Introduction

   Coronavirus disease 2019 (COVID-19) was induced by the severe acute
   respiratory syndrome coronavirus-2 (SARS-COV-2). COVID-19 is extremely
   contagious and spreads rapidly at home and abroad[55]^1. It is a
   disease mainly characterized by fever, fatigue, cough, and
   pneumonia[56]^2. SARS-CoV-2 belongs to the genus β-coronavirus, which
   consists of a genome of approximately 30 thousand bases and is an
   enveloped single-stranded RNA virus[57]^3^, [58]^4. This disease
   currently poses a great challenge to humans. So far, there is no highly
   effective treatment method for COVID-19, and the main pathogenesis of
   COVID-19 in humans has not been understood[59]^5. It has become an
   urgent need to understand the pathogenesis of COVID-19 for clinical
   treatment.

   SARS-CoV-2 is highly pathogenic and poses a great threat to public
   health. SARS-CoV-2 infection activates both innately and adaptive
   immune responses[60]^6. The transition between innate and adaptive
   immune responses is critical to the clinical progression of SARS-CoV-2
   infection[61]^7. The early immune response mainly plays a protective
   role, while the dysregulation and aggravation of the inflammatory
   response cannot clear the virus and cause the disease to worsen. The
   accumulation of pro-inflammatory cytokines, lymphopenia, and abnormal T
   cell response provide evidence that COVID-19 may be an immune-related
   disease[62]^8.

   We use single-cell sequencing analysis to study how the dynamic changes
   in the number of immune cells in the blood of COVID-19 patients and
   healthy individuals affect the progression of COVID-19 patients. In
   previous studies, single-cell analysis of COVID-19 was based on the
   overall level of immune cells, which led to the average number of
   cells. Our study found that the changes in the number of immune cells
   in the course of disease were caused by the changes in the number of
   cell subgroups. It is worth noting that the immune response in the T
   cell subgroup is mainly concentrated in the H_L_Group cell type, and
   the complications caused by the virus are mainly in the L_H_Group cell
   type. In the NK cell subgroup, the immune response is mainly
   concentrated in the front end of the disease progression, and the
   complications caused by viruses were mainly in the back end of the
   disease progression. This study provides us with a better understanding
   of the specific cell subgroups that contribute to the progression of
   the disease, which will bring us closer to identifying possible
   interventions for the treatment of COVID-19, especially those with
   severe complications.

Results

   Identification of cell types in the blood of healthy individuals and
   patients with COVID-19. To assess the progress of immune cells in
   COVID-19 patient blood, we first analyzed published single-cell
   sequencing data sets from COVID-19 patient blood
   samples[63]^9^-[64]^11. After data filtering, 136,014 cells passed the
   standard quality control and were used for subsequent analysis. Next,
   we show the number of genes, unique molecular identifiers (UMIs), and
   the percentage of mitochondrial genes in the single-cell data
   ([65]Figure S1). First, we performed an unsupervised cluster analysis
   on the entire data set to eliminate batch effects across different
   individuals (Figure [66]1A and [67]1B). We use ElbowPlot to determine
   the dimension of the data set. From [68]Supplementary Figure 2A, we can
   see that the change tends to be flat after PC=30. We use PC1 to 30 for
   clustering. Through unified manifold approximation and projection
   (UMAP) and marker gene analysis, a total of 136,014 blood cells were
   identified. Based on the results of UMAP, we identified 28 main cell
   clusters, and none of them came from a single person (Figure [69]1C).
   According to the expression of canonical lineage markers and other
   genes up-regulated in each cluster ([70]Figure S2B-S2D). We have
   identified cell types, including NK cells (PRF1, GNLY, NKG7, CCL5,
   KLRD1), T cells (IL7R, CD3E, CD69, CD8B, CD3D), Macrophages (S100A8,
   S100A9, FCN1, CD14, FCN1), B cells (CD79A, MS4A1, CD19, CD79A, CD79B),
   Monocytes (CDKNC1, NEAT1, FCGR3A, MS4A7 ), Platelets (PPBP, PF4), DCs
   (HLA-DQA1, FCER1A, IRF8, CLEC9A), pDCs (TCF4, ITM2C, LILRA4, CLEC4C),
   Basophils (HMGB2, CD38), Plasma cell (CD38, IGKC), RBC (HBB, HBA2,
   HBA2), HSC (CYTL1, GATA2, CDK6) (Figure [71]1D and [72]1E), For marker
   genes, the expression value in each cell located in UMAP (Figure
   [73]1E).

Figure 1.

   [74]Figure 1
   [75]Open in a new tab

   Single-Cell Map of COVID-19 patients and controls. (A) UMAP shows all
   sample types. Healthy (H), moderate (M), and severe (S) patients were
   all samples before the removal of batch effects. (B) UMAP shows all
   sample types. Healthy (H), moderate (M), and severe (S) patients after
   removal of batch effects for all samples. (C) Clustering of single
   cells in UMAP space, showing 29 clusters (clusters 0-28). (D) Cell
   populations were identified. The UMAP projections of 136014 single
   cells from healthy (n = 11), moderate (n = 8), and severe (n = 7)
   samples show 12 clusters with their respective markers. Each point
   corresponds to a cell and is colored according to the cell type. (E)
   Heat map showing the expression levels of specific marker genes
   according to each cluster.

   Comparison of basic characteristics between healthy people and COVID-19
   patients. In order to determine the basic characteristics of immune
   cells in the blood of healthy people and COVID-19 patients, we analyzed
   the changes in different cell types in healthy people and COVID-19
   patients. In healthy people (H) and COVID-19 patients (C), we found a
   decrease in NK cells, T cells, macrophages, monocytes, and DCs. B
   cells, Platelets, Basophils, Plasma, RBC, and HSC cells increased. The
   number of pDC cells does not change (Figure [76]2A and [77]2B).
   Notably, COVID-19 patients have the highest proportion of NK cells and
   T cells (Figure [78]2B). In summary, our results show that NK cells and
   T cells make up the largest fraction and that the number of cells has
   been decreasing in COVID-19 patients. In order to have a more precise
   understanding of the role of immune cells in disease progression in
   COVID-19 patients, we next divided COVID-19 patients into mild and
   severe patients for further study.

Figure 2.

   [79]Figure 2
   [80]Open in a new tab

   Comparison of basic characteristics between healthy and covid-19
   patient groups. (A) UMAP projection of healthy (H), moderate and severe
   (C) patient samples in different cells. Each point corresponds to an
   individual cell, colored according to cell type. (B) Frequency of
   sample types in each cell type. The bar chart is colored according to
   the sample type. The average proportion of each cell type in healthy
   (H), moderate and severe (C) patient samples.

   Overview of immune cells in the blood of patients with severe and mild
   COVID-19 compared with healthy subjects. In order to assess the blood
   immune system status of COVID-19 patients, we analyzed the changes of
   immune cells in the blood of COVID-19 patients in different clusters,
   and the changes in the number of cells in healthy people, moderate and
   severe patients. We found that the number of cells changed in different
   clusters as the disease progressed (Figure [81]3A). We further analyzed
   the changes in the progression of COVID-19 patients in different cell
   types. The UMAP chart shows that NK cells and T cells have obvious
   changes during the progression of the disease (Figure [82]3B). Next, we
   found that T cells and B cells increased first and then decreased in
   healthy individuals, moderate and severe COVID-19 patients.
   Macrophages, monocytes, pDC, RBC, and HSC cells are in a state of
   decreasing first and then increasing. Platelets, Basophils, and Plasma
   cells have been increasing. NK cells and DCs have been decreasing
   (Figure [83]3C and [84]3D). It is worth noting that COVID-19 patients
   had the highest proportion of NK and T cells (Figure [85]3C and
   [86]3D). In general, our results show that the number of NK cells in
   healthy individuals, moderate and severe COVID-19 patients have been
   decreasing, while T cells are in a state of first increasing and then
   decreasing, and the immune cell composition varies between patients.

Figure 3.

   [87]Figure 3
   [88]Open in a new tab

   Dynamic composition of immune cells during COVID-19 infection. (A)
   Display of healthy (n = 11), moderate (n = 8), and severe (n = 7)
   patient samples in different clusters of UMAP projection. Each point
   corresponds to an individual cell, colored according to cell type. (B)
   UMAP projection of healthy (n = 11), moderate (n = 8), and severe (n =
   7) patient samples in different cells. Each point corresponds to an
   individual cell, colored according to cell type. (C) Frequency of
   sample types in each cell type and different clusters. The bar chart is
   colored according to the sample type. (D) The average proportion of
   each cell type in healthy (n = 11), moderate (n = 8), and severe (n =
   7) patient samples.

   Characteristics of T cell subgroups and their responses in the blood of
   COVID-19 patients. We use UMAP to divide T cell subgroups into 19
   different clusters (Figure [89]4A). To further understand the changes
   of T cells in the onset of COVID-19 patients, we found that the
   proportion of cells in the 19 clusters at different stages is different
   (Figure [90]4B and [91]4C). We divided them into four different
   subgroups according to the type of response to the virus. The number of
   cells first from high to low cell number (H_L_Group), first from low to
   high (L_H_Group), continuously high (H_Group), and continuously low
   (L_Group) (Figure [92]4D and [93]4E). In order to find out whether the
   pathways are heterogeneous in different T cell subgroups, we
   intersected the different genes of each cell subgroup in healthy
   individuals, moderate and severe patients to find out the unique genes
   ([94]Table S1). We obtained that when healthy individuals developed the
   moderate disease, H_L_Group had 984 genes and L_H_Group had 106 genes.
   H_Group has 14 genes, L_Group has 8 genes and Common_Group has 780
   genes (Figure [95]5A). We performed Gene Ontology (GO) enrichment
   analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment
   analysis for these genes. From healthy development to moderate disease,
   GO analysis showed that H_L_Group was mainly enriched in regulation of
   mRNA metabolic process, RNA splicing; L_H_Group is mainly enriched in
   regulation of mRNA metabolic process, RNA splicing; Common_Group was
   mainly enriched in protein targeting to ER, establishment of protein
   localization to endoplasmic reticulum ([96]Figure S4A-S4C; [97]Table
   S2).

Figure 4.

   [98]Figure 4
   [99]Open in a new tab

   Subgroup composition and dynamic changes of T cells. (A) UMAP
   projection in T cell subgroups shows healthy (n = 11), moderate (n =
   8), and severe (n = 7) patient samples in different clusters. (B) The
   bar chart shows the composition of T cell clusters at the single sample
   level, colored according to different cluster types. (C) The proportion
   of each T cell cluster obtained from healthy (n = 11), moderate (n =
   8), and severe (n = 7) patient samples. (D) The line graph shows the
   changes in the number of healthy (n = 11), moderate (n = 8), and severe
   (n = 7) patient cells. (E) UMAP projection shows the renaming of T cell
   subgroups.

Figure 5.

   [100]Figure 5
   [101]Open in a new tab

   T cell response pattern during COVID-19 infection. (A) Moderate
   COVID-19 patients have a gene overlap in T cell subgroups (H_L_Group,
   L_H_Group, H_Group, and L_Group). (B) Severe COVID-19 patients have a
   gene overlap in T cell subgroups (H_L_Group, L_H_Group, H_Group, and
   L_Group). (C) Summarize the impact on the H_L_Group and L_H_Group
   subgroups in the process of COVID-19 infection. (D) In T cell
   subgroups, moderate patients (M) were compared with healthy controls
   (H). H_L_Group specific genes, L_H_Group specific genes, and genes
   common to all T cell subgroups (Common_Group) KEGG pathway enrichment
   analysis, node size represents the number of genes, and color intensity
   represents -log10 q-value. (E) In T cell subgroups, severe patients (S)
   were compared with moderate patients (M). H_L_Group specific genes,
   L_H_Group specific genes, and genes common to all T cell subgroups
   (Common_Group) KEGG pathway enrichment analysis, node size represents
   the number of genes, and color intensity represents -log10 q-value.

   In addition, KEGG enrichment analysis revealed that during the moderate
   infection stage, we found that SARS-CoV-2 infection shared with other
   viral infection pathways, for example, Human immunodeficiency virus 1
   infection, Human cytomegalovirus infection, Human T-cell leukemia virus
   1 infection, Epstein-Barr virus infection, Influenza A, Viral
   myocarditis, Measles, Kaposi sarcoma-associated herpesvirus infection,
   Prion disease. In addition, in our analysis, SARS-CoV-2 infection was
   found to be the same as bacterial, parasitic, and protozoan disease
   pathways. Related to Shigellosis, Yersinia infection, Vibrio cholerae
   infection, Salmonella infection, Bacterial invasion of epithelial
   cells. In addition, we found that SARS-CoV-2 is closely associated with
   Pathways of neurodegeneration-multiple diseases, and shares pathways
   with other diseases. The pathways associated with SARS-CoV-2 infection
   also include the T cell receptor signaling pathway, Regulation of actin
   cytoskeleton, Ubiquitin mediated proteolysis, Leukocyte
   transendothelial migration, Parathyroid hormone synthesis, secretion
   and action, Long-term potentiation, FOXO signaling pathway, Thyroid
   hormone signaling pathway, Apoptosis, Gap junction, Endocytosis, GnRH
   signaling pathway, Oxytocin signaling pathway, Th17 cell
   differentiation, Natural killer cell mediated cytotoxicity, cGMP-PKG
   signaling pathway, Insulin signaling pathway, Endocrine and other
   factor-regulated calcium reabsorption, Inflammatory mediator regulation
   of TRP channels, Circadian rhythm, VEGF signaling pathway, B cell
   receptor signaling pathway, Chemokine signaling pathway, Endocytosis,
   NOD-like receptor signaling pathway, Autophagy-animal, Necroptosis,
   Proteasome, Pyruvate metabolism, Citrate cycle (TCA cycle), Propanoate
   metabolism, Ribosome, Oxidative phosphorylation, Thermogenesis,
   Spliceosome, Focal adhesion, Protein export, RNA transport, Protein
   processing in endoplasmic reticulum, Cellular senescence, RNA
   degradation, Antigen processing and presentation (Figure [102]5C and
   [103]5D; [104]Table S3).

   In order to further investigate the role of specific genes in each
   subgroup of COVID-19 patients as they progressed from moderate to
   severe, we applied the same method above to obtain 1861 genes in
   H_L_Group, 354 genes in L_H_Group, 1 gene in H_Group, 22 genes in
   L_Group, and 75 genes in Common_Group (Figure [105]5B). These genes
   were analyzed by GO and KEGG enrichment. GO analysis shows that
   H_L_Group is mainly enriched in ribonucleoprotein complex biogenesis,
   regulation of chromosome organization; L_H_Group was mainly enriched in
   oxidative phosphorylation, mitochondrial ATP synthesis coupled electron
   transport; Common_Group is mainly enriched in SRP-dependent
   cotranslational protein targeting to membrane, cotranslational protein
   targeting to membrane ([106]Figure S4D-S4F; [107]Table S4). In
   addition, KEGG enrichment analysis found that Common_Group was mainly
   enriched in Ribosome, Coronavirus disease-COVID-19; H_L_Group was
   mainly enriched in Pathways of neurodegeneration-multiple diseases,
   Alzheimer disease; L_H_Group was mainly enriched in Pathways of
   neurodegeneration-multiple diseases, Alzheimer disease. Enriched in
   Huntington disease, Pathways of neurodegeneration-multiple diseases
   (Figure [108]5C and [109]5E; [110]Table S5).

   In order to get the metabolic activity of cells, we use single-cell
   expression data as input data, the data matrix we downloaded from GSEA
   metabolic gene sets ([111]https://www.gsea-msigdb.org/gsea/index.jsp).
   In addition, we used the scimpute package to assign data missing values
   and used the Deconvolution method to standardize single-cell data
   (Figure [112]6A). H_Group has high activity in Porphyrin and
   chlorophyll metabolism, Alanine, aspartate and glutamate metabolism;
   H_L_Group has high activity in Glycosphingolipid biosynthesis-lacto and
   neolacto series, Phenylalanine metabolism; in L_Group, Histidine
   metabolism, Phosphonate and phosphinate metabolism higher activity
   (Figure [113]6B). On the whole, H_L_Group showed high activity in the
   whole metabolic activity analysis (Figure [114]6B and [115]6C). GSEA
   results showed that T cell subgroups were significantly up-regulated in
   oxidative phosphorylation (Figure [116]6D).

Figure 6.

   [117]Figure 6
   [118]Open in a new tab

   T cell subgroups were significantly up-regulated in oxidative
   phosphorylation. (A) The Deconvolution method was used to standardize
   the single-cell data, and the results showed that the expression ratio
   after the correction was more consistent. (B) Metabolic pathway
   activity in T cell subgroups. (C) Distribution of metabolic pathway
   activity in T cell subgroups. (D) Major metabolic pathways are enriched
   in all subgroups. The node size is -log10 p-value. Color intensity
   indicates the enrichment score (NES) after correction.

   NK cell subgroups and status in the blood of COVID-19 patients. To
   further understand the changes in NK cells in healthy individuals and
   moderate and severe COVID-19 patients, we used UMAP to divide NK cell
   subgroups into 19 different clusters (Figure [119]7A). We counted the
   changes in the number of cells of these different clusters in healthy
   individuals, moderate and severe patients, and found that the
   proportion of cells in the 19 clusters at different stages was
   different (Figure [120]7B and [121]7C). According to the type of
   response to the virus, we divided them into three different types. In
   subgroups, the number of cells first from high to low (H_L_Group),
   first from low to high (L_H_Group), and continuously low (L_Group)
   (Figure [122]7D and [123]7E).

Figure 7.

   [124]Figure 7
   [125]Open in a new tab

   Subgroup composition and dynamic changes of NK cells. (A) UMAP
   projection in NK cell subgroups shows healthy (n = 11), moderate (n =
   8), and severe (n = 7) patient samples in different clusters. (B) The
   bar chart shows the composition of NK cell clusters at the single
   sample level, colored according to different cluster types. (C) The
   proportion of each NK cell cluster obtained from healthy (n = 11),
   moderate (n = 8), and severe (n = 7) patient samples. (D) The line
   graph shows the changes in the number of healthy (n = 11), moderate (n
   = 8), and severe (n = 7) patient cells. (E) UMAP projection shows the
   renaming of NK cell subgroups.

   To explore whether the pathways exist heterogeneously in different NK
   cell subgroups, we intersect the differential genes of each cell
   subgroup in healthy individuals, moderate and severe patients to find
   unique genes ([126]Table S6). In healthy individuals progressing to
   moderate disease, H_L_Group has 690 genes, L_H_Group has 277 genes,
   L_Group has 133 genes, and there are 1341 genes in Common_Group (Figure
   [127]8A). We enriched these genes with GO and KEGG respectively. GO
   analysis shows that H_L_Group is mainly enriched in regulation of mRNA
   metabolic process, RNA splicing; L_H_Group is mainly enriched in
   regulation of DNA-binding transcription factor activity, regulation of
   response to biotic stimulus; Common_Group is mainly enriched in RNA
   catabolic process and mRNA catabolic process ([128]Figure S5A-S5C;
   [129]Table S7).

Figure 8.

   [130]Figure 8
   [131]Open in a new tab

   NK cell response mode during COVID-19 infection. (A) Moderate COVID-19
   patients have a gene overlap in NK cell subgroups (H_L_Group,
   L_H_Group, and L_Group). (B) Severe COVID-19 patients have a gene
   overlap in NK cell subgroups (H_L_Group, L_H_Group, and L_Group). (C)
   Summarize the impact on the H_L_Group and L_H_Group subgroups in the
   process of COVID-19 infection. (D) In NK cell subgroups, moderate
   patients (M) were compared with healthy controls (H). H_L_Group
   specific genes, L_H_Group specific genes, and genes common to all NK
   cell subgroups (Common_Group) KEGG pathway enrichment analysis, node
   size represents the number of genes, and color intensity represents
   -log10 q-value. (E) In NK cell subgroups, severe patients (S) were
   compared with moderate patients (M). H_L_Group specific genes,
   L_H_Group specific genes, and genes common to all NK cell subgroups
   (Common_Group) KEGG pathway enrichment analysis, node size represents
   the number of genes, and color intensity represents -log10 q-value.

   In addition, KEGG enrichment analysis revealed a moderate stage of
   SARS-CoV-2 infection, and we found that SARS-CoV-2 infection shares
   other viral infection pathways, such as Epstein-Barr virus infection,
   Human cytomegalovirus infection, Hepatitis B, Human papillomavirus
   infection, Human T-cell leukemia virus 1 infection, Hepatitis C,
   Measles, Kaposi sarcoma-associated herpesvirus infection, Viral
   myocarditis, Prion disease. In addition, in our analysis, SARS-CoV-2
   infection was found to be the same as bacterial, parasitic, and
   protozoan disease pathways. Related to Shigellosis, Salmonella
   infection, Pathogenic Escherichia coli infection, Toxoplasmosis, Vibrio
   cholerae infection, Leishmaniasis, Legionellosis.

   We also found that SARS-CoV-2 has carcinogenic effects, is also closely
   related to Pathways of neurodegeneration-multiple diseases, and shares
   pathways with other diseases. Pathways related to SARS-CoV-2 infection
   also include MAPK signaling pathway, RNA transport, Apoptosis, NOD-like
   receptor signaling pathway, Thyroid hormone signaling pathway, FOXO
   signaling pathway, Ubiquitin mediated proteolysis, Spliceosome,
   Cellular senescence, NF-kappa B signaling pathway, Neurotrophin
   signaling pathway, Cell cycle, Endocrine resistance, C-type lectin
   receptor signaling pathway, Th17 cell differentiation, RIG-I-like
   receptor signaling pathway, Prolactin signaling pathway, Antigen
   processing and presentation, Natural killer cell mediated cytotoxicity,
   Glycosaminoglycan biosynthesis - keratan sulfate, Graft-versus-host
   disease, Ribosome, Thermogenesis, Oxidative phosphorylation,
   Spliceosome, Endocytosis, Protein processing in endoplasmic reticulum,
   RNA transport, Phagosome, RNA degradation, mRNA surveillance pathway,
   Proteasome, Protein export (Figure [132]8C and [133]8D; [134]Table S8).

   In order to further investigate the role of specific genes in each
   subgroup of COVID-19 patients as they progressed from moderate to
   severe, we applied the same method above to obtain 406 genes in
   H_L_Group and 759 genes in L_H_Group. There were 10 genes in L_Group
   and 102 genes in Common_Group (Figure [135]8B). We performed GO and
   KEGG enrichment analyses on these genes. GO analysis shows that
   H_L_Group is mainly enriched in oxidative phosphorylation,
   mitochondrial ATP synthesis coupled electron transport; L_H_Group is
   mainly enriched in neutrophil mediated immunity, neutrophil activation;
   Common_Group is mainly enriched in RNA SRP-dependent cotranslational
   protein targeting to membrane, cotranslational protein targeting to
   membrane ([136]Figure S5D-S5F; [137]Table S9).

   In addition, KEGG enrichment analysis found that H_L_Group was mainly
   enriched in Pathways of neurodegeneration-multiple diseases,
   Amyotrophic lateral sclerosis; L_H_Group Mainly enriched in Pathways of
   neurodegeneration-multiple diseases, Amyotrophic lateral sclerosis;
   Common_Group was mainly enriched in Coronavirus disease-COVID-19,
   Ribosome (Figure [138]8C and [139]8E; [140]Table S10).

   In order to obtain the metabolic activity of cell subgroups, we used
   the single-cell expression data matrix as input data, and we downloaded
   the metabolic gene set from GSEA
   ([141]https://www.gsea-msigdb.org/gsea/index.jsp). In addition, we use
   the scimpute package to assign missing values to the data and used the
   Deconvolution method to standardize single-cell data (Figure [142]9A).
   H_L_Group has high activity in Arginine biosynthesis, Taurine and
   hypotaurine metabolism; in L_H_Group, Drug metabolism-cytochrome P450,
   Glycosaminoglycan biosynthesis-chondroitin sulfate / dermatan sulfate
   has high activity; in L_Group, Ascorbate and aldarate metabolism,
   Butanoate metabolism had high activity (Figure [143]9B). On the whole,
   H_L_Group has higher activity in the entire metabolic activity analysis
   (Figure [144]9B and [145]9C). The results of GSEA showed that NK cell
   subgroups were significantly up-regulated in oxidative phosphorylation
   (Figure [146]9D).

Figure 9.

   [147]Figure 9
   [148]Open in a new tab

   NK cell subgroups were significantly up-regulated in oxidative
   phosphorylation. (A) The Deconvolution method was used to standardize
   the single-cell data, and the results showed that the expression ratio
   after the correction was more consistent. (B) Metabolic pathway
   activity in NK cell subgroups. (C) Distribution of metabolic pathway
   activity in NK cell subgroups. (D) Major metabolic pathways are
   enriched in all subgroups. The node size is -log10 p-value. Color
   intensity indicates the enrichment score (NES) after correction.

   Experimental detection of T cell and NK cell related pathway genes. In
   scRNA-seq analysis, we found that some genes involved in the T cell
   receptor signaling pathway, Coronavirus disease-COVID-19, and Natural
   killer cell-mediated cytotoxicity pathway were stably up-regulated
   during the pathogenesis of COVID-19 patients. To further validate this
   finding, we performed a quantitative real-time RT-PCR test to detect 11
   COVID-19 patients (9 mild patients, 2 severe patients) And the
   expression of related genes in normal controls (Figure [149]10). ISG15
   is an interferon-inducible protein that is a central player in the host
   antiviral response, directly inhibiting viral replication and
   modulating host immunity[150]^12. During the COVID-19 infection
   process, its expression is increasing (Figure [151]10). LCP2 (also
   known as SLP-76) regulates T cell receptor signaling, and it is also a
   key determinant of NK cell development and NK cell-mediated elimination
   of missing self-target cells[152]^13^, [153]^14. Studies have shown
   that SLP-76 plays a key role in promoting the spread of viruses between
   cells[154]^15. And our experiments have proved that there is a
   significant increase in the pathogenesis of COVID-19 (Figure [155]10).
   Targeting SLP-76 may be a new target for future therapeutic
   interventions against COVID-19. PTPN6 (also known as SHP-1) is a
   negative regulator of antiviral immunity and suggests that SHP-1 may be
   a target for intervention in acute viral infections[156]^16. It was
   found in the experimental results that the expression of this gene
   increased during COVID-19 infection (Figure [157]10). In the future,
   PTPN6 can be used as a molecular target for the impact of COVID-19
   infection. EIF2AK2 also increased significantly during the disease
   (Figure [158]10). Studies have reported that EIF2AK2 has a broad
   antiviral spectrum[159]^17. IL6R expression was increased in the
   pathogenic process of COVID-19 (Figure [160]10). This may be related to
   the synergistic induction of a strong inflammatory response by elevated
   IL-6, and patients may benefit from treatment with IL-6 or IL-6R
   antagonists[161]^18. These results indicate that ISG15, LCP2, PTPN6,
   EIF2AK2, and IL-6R are candidate marker genes for SARS-CoV-2 infection.

Figure 10.

   [162]Figure 10
   [163]Open in a new tab

   Validate genes associated with antiviral, immune response, and cell
   storm during COVID-19 infection. The real-time quantitative polymerase
   chain reaction was used to detect COVID-19 in blood samples of moderate
   patients, severe patients, and healthy people. The expression levels of
   ISG15 (the core participant involved in the host's antiviral response),
   LCP2 and PTPN6 (promoting the cell-to-cell spread of viruses, the
   regulator of antiviral immunity), EIF2AK2 and IL6R (genes related to a
   strong inflammatory response).

Discussion

   Patients with COVID-19 showed different clinical characteristics during
   the disease[164]^19. SARS-CoV-2 infection can activate innate and
   adaptive immunity[165]^20. Coordinated responses that work together in
   innate and adaptive immune cells may lead to rapid control of the
   virus, while a failed immune response may lead to virus spread,
   cytokine storms, and high mortality[166]^21. Lymphocytes are the core
   component of the immune system to resist virus invasion, especially T
   cells and NK cells, which play a crucial role in removing
   viruses[167]^22. COVID-19 patients can be treated with T cells and
   memory NK cells[168]^8^, [169]^23. With the development of new
   sequencing technologies, the ability to use next-generation sequencing
   to analyze single cells provides a better method for studying the
   diversity and heterogeneity of cells[170]^24. Cell subgroups are
   classified based on their effector function and characteristic
   transcription factor expression[171]^25^, [172]^26. So far, many
   different T cell subgroups have been defined in humans, but there will
   likely be many more in the future. This single-cell transcriptomic
   classification is far more than the traditional cell classification.
   What is the difference?

   Because the current evidence supporting the role of T cells in the
   protection and pathogenesis of COVID-19 is incomplete, and few studies
   are analyzing the role of T cells in the progression of COVID-19
   disease. Understanding the mechanisms of different T cell subgroups is
   important for the prevention and treatment of SARS-CoV-2. Continuous
   viral antigen stimulation and immune disorders may lead to T cell
   failure, a state of T cell dysfunction that occurs during many chronic
   infections and cancers[173]^27. We divided the cells into 19 clusters
   through single-cell analysis (Figure [174]1C). Compared with
   traditional methods, this method can separate more cell types, which
   helps us to have a more detailed understanding of the pathogenesis of T
   cells in COVID-19. Next, we reclassified four groups according to the
   type of response to the virus: H_L_Group, L_H_Group, L_Group, and
   H_Group (Figure [175]4D and [176]4E), to explore the function of this
   new grouping in the infection process of COVID-19 patients. We further
   expressed through the CD4^+ marker (CD4, CTLA4, FOXP3, IL2RA) and CD8^+
   marker (CD8A, CD8B, GZMK) and found that CD4^+ is expressed in
   H_L_Group and L_H_Group ([177]Figure S3A), but the expression level in
   H_L_Group is the highest, and CD8^+ is expressed in H_L_group and
   L_H_group ([178]Figure S3B). In the process of COVID-19 infection, a
   large number of CD4^+ T cells, CD8^+ T cells, and neutralizing antibody
   response all contribute to clearing the acute infection[179]^28.
   Through KEGG enrichment analysis, we found that H_L_Group mainly
   activates the immune pathway in the early stage, and is involved in
   bacterial virus infection. With the increase of viral load, the immune
   activity decreased. In the later stages of disease development,
   disease-inducing pathways become activated (Figure [180]5C). L_H_Group
   activates pyruvate metabolism, citrate cycle (TCA cycle) in the early
   stage, and activates oxidative phosphorylation in the later stage of
   disease development, which is related to metabolic pathways in the
   process of COVID-19 infection (Figure [181]5C). It has been reported
   that the TCA cycle is affected by SARS-CoV-2 infection. One of the most
   important metabolic pathways[182]^29.

   NK cells play an important role in controlling the early immune
   response of viral infections[183]^30. We found that NK cells in the
   blood of severe COVID-19 patients were exhausted, and the NK cell count
   was significantly lower than that of mild patients or healthy controls,
   which is consistent with previous studies[184]^11^, [185]^31^-[186]^34.
   NK cell depletion may be caused by immunosuppression caused by a viral
   infection, triggering cell apoptosis, or down-regulation of viral
   cytotoxicity[187]^35. We further studied the role of NK cell subgroups
   in the different stages of SARS-CoV-2 infection. According to the type
   of virus response, we divided NK cells into three types: H_L_Group,
   L_H_Group, L_Group (Figure [188]7D and [189]7E). Through KEGG pathway
   enrichment, we found that the early stage of viral infection mainly
   involves viral and bacterial infection and the activation of immune
   pathways (Figure [190]8C). In the late stage of viral infection, many
   disease-inducing pathways are activated (Figure [191]8C). NK cells are
   crucial in the early immune response against viral infections,
   especially by eliminating virus-infected cells[192]^35. If the drug is
   actively used in the early stage of SARS-CoV-2 infection, it will
   reduce the occurrence of complications.

   We further analyzed the metabolic activity of immune cell subgroups and
   found that oxidative phosphorylation dominates the metabolic
   heterogeneity of T cell and NK cell subgroups (Figure [193]6 and Figure
   [194]9). Oxidative phosphorylation is an important metabolic pathway of
   SARS-CoV-2 infection, which is consistent with previous
   studies[195]^36. Previous studies showed that the virus was changed by
   the passivation mPTP mitochondrial metabolism, to facilitate its
   production, while OXPHOS inhibition of viral replication is
   attenuated[196]^37, the virus replication and mitochondrial interaction
   may be necessary for viral load increases[197]^38. Studies have
   reported that the virus often changes the function of OXPHOS human main
   metabolic pathways to provide energy, biosynthetic resources, and
   immune escape to promote its production. It has now been used as an
   important target for the treatment of viral infections[198]^39^,
   [199]^40, and immune metabolism can be used as a cell-specific therapy
   targeting COVID-19[200]^41. At present, there is limited research on
   metabolism in COVID-19, and our research provides a new target for the
   treatment of COVID-19.

   In summary, our research is the first to use the type of virus response
   to analyze blood immune cell subgroups in the course of COVID-19
   patients. Compared with the traditional cell classification method, our
   method can more accurately study the antiviral immune regulation
   mechanism of COVID-19 patients. Our research provides a new target
   pathway for the further research of COVID-19, and immune metabolism can
   be used as SARS-CoV-2 is a new therapeutic target. Understanding these
   regulatory mechanisms will help develop appropriate diagnostic methods
   and new therapeutic targets.

Materials and Methods

   Source data. Download COVID-19 patient single-cell sequencing data
   [201]GSE154567, [202]GSE155673, and [203]GSE150728 from Gene Expression
   Omnibus (GEO) ([204]http://www.ncbi.nlm.nih.gov/geo/). The data set
   consists of blood single-cell sequencing data of eight moderate
   patients, seven severe patients, and eleven healthy people.

   scRNA-Seq data processing. We used the Seurat R package (version 3.2.0)
   to convert it into a Seurat object for quality control. The following
   criteria were then applied to each cell of all fifteen patients and
   eleven healthy controls: more than 15% of UMI cells that had less than
   300 expressed genes or derived from the mitochondrial genome were
   discarded. After filtration, a total of 136,014 cells remained for the
   following analysis. The genetic barcode matrix of all filtered samples
   is integrated with Seurat to eliminate batch effects across different
   individuals. Each Seurat object in the list performs lognormalize
   standardization and searches for hypervariable genes. The hypervariable
   genes of all Seurat objects are ranked together, and the top 2000 genes
   are selected as anchor genes. Each Seurat object uses anchor genes.
   Perform principal component analysis (PCA) dimensionality reduction,
   calculate anchor data sets in the PCA space and integrate the data for
   downstream analysis.

   Single-cell transcriptome recognizes cell types. Next, cluster analysis
   is performed on the first 30 principal components, and the clusters are
   visualized using UMAP to visualize the cells. At the same time, the
   data based on PCA dimension reduction was clustered to perform
   clustering analysis using Seurat, and the resolution is set to 0.7 to
   obtain better results. Determine cell type characteristics based on the
   expression of known marker genes in the cell marker database[205]^42.
   NK cells and T cells were re-clustered using Seurat v.3. In the
   clustering step, the NK cells and T cells of all samples were
   re-clustered using the same parameters as above, and the parameter
   resolution was set to 0.9.

   Identification of differentially expressed genes. Seurat FindAllMarkers
   function is used for differential analysis between the control group
   and disease group of the same cell type. The functional parameters we
   used in Seurat v.3 are default values. For each cluster, a
   differentially expressed gene (DEG) relative to all other cells is
   generated.

   Gene function analysis. The ClusterProfiler R software package[206]^43.
   GO and KEGG enrichment analysis of DEGs can be realized. DEGs with
   adjusted P value<0.05 were enriched in a large amount. We use bubble
   charts to show the GO and KEGG pathways enriched in each cell subgroup.

   Cell subgroup metabolism pathway activity and GSEA analysis. In order
   to analyze the metabolic pathway activity of cell subgroups, we use the
   latest algorithm to score the pathway activity of the
   subgroups[207]^44. The data we use was the single-cell data expression
   matrix of cell subgroups, and the count expression data of cell
   subgroups were converted into TPM, and log conversion was performed as
   input data. In addition, we entered the GSEA analysis, the software
   used was javaGSEA
   ([208]https://www.gsea-msigdb.org/gsea/downloads.jsp), which adopts
   KEGG_metabolism of data sets.

   Validate genes associated with antiviral, immune response, and cell
   storm during COVID-19 infection. We obtained nucleic acid samples from
   COVID-19 patients (9 mild patients and 2 severe patients) and 2 healthy
   individuals from the Luzhou Center for Disease Control and Prevention.
   Next, the FastKing RT Kit (TIANGEN, Chengdu, China) was used for
   reverse transcription to prepare cDNA. 2 x TSINGKE Master qPCR Mix
   (SYBR Green I) kit (TSINGKE, Beijing, China) was used for 10 uL
   quantitative polymerase chain reaction (qPCR). 18sRNA was used as an
   internal reference. All primers were obtained from TSINGKE (Beijing,
   China) and are listed in Table [209]1. mRNA expression levels were
   determined using Equation 2^-∆∆Ct.

Table 1.

   List of primers
   Primer name Forward Sequence (5'-3') Reverse Sequence (5'-3')
   PTPN6       GCCTGGACTGTGACATTGAC     ATGTTCCCGTACTCCGACTC
   LCP2        GGAAGAAGCCACCTGTGCCAAA   GCTCATAGGAAGTAGTGCTGGC
   ISG15       CTCTGAGCATCCTGGTGAGGAA   AAGGTCAGCCAGAACAGGTCGT
   EIF2AK2     GAAGTGGACCTCTACGCTTTGG   TGATGCCATCCCGTAGGTCTGT
   IL6R        GACTGTGCACTTGCTGGTGGAT   ACTTCCTCACCAAGAGCACAGC
   18sRNA      AAGTCCCTGCCCTTTGTACACA   GATCCGAGGGCCTCACTAAAC
   [210]Open in a new tab

   Statistical analysis. Statistical analysis was performed using GraphPad
   Prism 7.0 software. The student's t-test was used to compare the
   differences between the two groups, and p<0.05 was considered
   statistically significant.

Supplementary Material

   Supplementary figures.
   [211]Click here for additional data file.^ (6.5MB, pdf)

   Supplementary tables.
   [212]Click here for additional data file.^ (986.1KB, zip)

Acknowledgments