Abstract Genome-wide association study analysis has revealed the significant influence of genetic factors in the progression of COVID-19. However, the impact of these genetic variants on gene expression in various cell types remains largely unexplored. Here, we profiled immune cells from 42 COVID-19 cases (involving five stages) and 11 healthy individuals, and performed a single-cell RNA-seq assay with >200 000 cells to investigate cell type-specific and stage-specific genetic effects of genetic variants. Single-cell expression quantitative trait loci analysis of eight distinct cell types showed that the expression of 2593 genes is significantly associated with common genetic polymorphisms, and that the majority of these genes show their most prominent effects in specific cell types. Furthermore, we also discovered new gene associations for COVID-19-risk variants identified from genome-wide association studies and highlighted the monocytes in which their effects are most prominent. We found that 57 genes were regulated by variants associated with COVID-19, with significant enrichment for dynamic effects. In summary, our results highlight the importance of studying context-specific genetic regulation of gene expression and provide insights into the mechanisms underlying genetic susceptibility to COVID-19. Introduction Coronavirus disease 2019 (COVID-19), caused by SARS-CoV-2 infection, consists of a wide range of pathologies, including no symptoms, mild, and severe respiratory failure, and has caused serious consequences around the world [[44]1–3]. Although the aberrant activity of inflammatory cytokines in patients with COVID-19 indicates the functional changes of immune cells, the host immune genetic factors that determine susceptibility to COVID-19 are not fully revealed. Genome-wide association studies (GWAS) have identified a number of genomic variants associated with COVID-19 traits [[45]4–6]. The majority of these variants are located in gene regulatory elements such as promoter and enhancer regions [[46]7]. In addition, several COVID-19-related causal genes are identified by conducting colocalization and transcriptome-wide association study (TWAS) analysis, integrating datasets of GWAS and expression quantitative trait loci (eQTL) [[47]8, [48]9]. However, the immune-specific genetic regulation of causal genes in COVID-19 has not been fully revealed. Recent studies have shown that eQTLs play crucial roles in progression of diseases [[49]10, [50]11]. Notably, traditional eQTL studies with bulk tissues have largely masked the cell type-specific mediation of genetic variants on gene expression, which has been revealed to play an important role in COVID-19 pathogenesis [[51]12]. Single-cell RNA sequencing (scRNA-seq) is a comprehensive technology to profile the composition and cell type-specific transcriptional states of tissues or cell lines. And existing scRNA-seq analysis has revealed that monocytes are critical in cause of COVID-19 [[52]13, [53]14]. By integrating scRNA-seq and population genetic information, it is possible to evaluate cell type-specific effects of genetic variants on gene expression. Nonetheless, a comprehensive census of circulating immune cells in COVID-19 with different severity remains incomplete, and identifying cell type-specific and context-mediating genetic regulation remains challenging. Given the dynamic characteristics and the heterogeneity of circulating immune cells, in this study, we provide a comprehensive census to reveal cellular changes across four severity levels of COVID-19, and mapped genetic regulation of gene expression on eight immune cell types. In addition, we reconstructed cell differentiation trajectories for monocytes and identified dynamic eQTL effects across subtypes of monocytes at different COVID-19 severity. Furthermore, we identified 57 genes with colocalizing eQTL and GWAS loci for COVID-19. Notably, our results showed that part of the colocalizing genes involved in protein interaction network were enriched in interferon-mediated signaling pathways. In summary, our data suggest that dysregulation of gene expression of immune cells could mediate COVID-19 and emphasize the importance of cell type-specific gene expression regulation by genetic variants. Materials and methods Data collection scRNA-Seq data of 53 raw SRR files (42 COVID-19 cases, 11 healthy controls) from peripheral blood mononuclear cells (PBMCs) were collected from the Gene Expression Omnibus (GEO) database (accession number: [54]GSE165080) [[55]15]. The COVID-19 cases were divided into four stages, including asymptomatic (5 cases), mild (13 cases), severe (12 cases), and convalescent (12 cases). GWAS summary data of COVID-19 patients were obtained from the COVID-19 Host Genetics Initiative database ([56]https://www.covid19hg.org/), which comprises 18 152 COVID-19 samples and 1145 546 controls. The bulk data of three coronaviruses were collected from the GEO database (accession numbers: [57]GSE155986, [58]GSE193169, and [59]GSE198398) [[60]16–18]. scRNA-seq data analysis Quality controls First, raw scRNA-seq data were processed using the Cell Ranger software (v5.0.0). The human genome (GRCh38) was used as a reference for alignment. The Ensembl (v43) was used as a reference for gene annotation (website). Then, results from RNA quantification in Cell Ranger were analyzed using Seurat (v4.0.5) package [[61]19], implementing the following rigorous quality control measures: (i) cell filtering—removal of cells with <200 detected genes, >25 000 unique molecular identifiers (UMIs; upper threshold determined as mean +3SD of UMI distribution), or >10% mitochondrial gene content and (ii) gene filtering—exclusion of genes detected in fewer than three cells. As a result, a total of 222 456 cells and 22 167 genes passed quality filters. Gene enrichment analysis We performed gene ontology and pathway enrichment analysis using the clusterProfiler R package (version 4.0.5). For each cell type, the top 20 marker genes were selected based on absolute log[2] fold-change values from differential expression analysis, with statistical significance determined by Student’s t-test. These gene sets were analyzed among GO terms. Cell–cell communication analysis To systematically investigate cell–cell communication dynamics in COVID-19, we employed the CellChat package (version 1.6.1) to analyze intercellular signaling networks using single-cell transcriptomic data. Our analysis incorporated the curated ligand–receptor interactions from CellChatDB as reference, through which we first identified overexpressed genes and interacting pairs within specific cell types. We then computed communication probabilities for all potential ligand–receptor interactions and performed comparative analysis at both the pathway level (evaluating 36 distinct signaling pathways) and the system level between COVID-19 patients and healthy controls. Our analysis revealed that 32 of the 36 examined pathways were active in both cohorts, though with quantitative differences in signaling strength. In addition, we discovered four pathway-specific signatures: MIF and PARs pathways emerged as COVID-19-exclusive features, whereas APRIL and EPHB signaling were uniquely maintained in healthy controls. Ordering of cells in a pseudotime trajectory To perform trajectory inference, we imported normalized gene expression profiles for a total of 24 238 monocytes, including 15 000 classical monocytes (cMs) and 9283 nonclassical monocytes (ncMs), into R (v3.6.1). The profiles were analyzed using the monocle (v2.22.0) package [[62]20]. In brief, the monocle object was constructed using the newCellDataSet function. Then, the size factor and dispersion between cells were estimated using the estimateSizeFactors and estimateDispersions functions. Next, the highly variable genes related to the pseudotime were calculated using differentialGeneTest function. Finally, the differential genes were embedded into the monocle object using setOrderingFilter function, and the project was mapped into the low-dimensional space by reduceDimension function. Genetic variant detection and quality control Genetic variants were identified through the following analytical pipeline: First, raw sequencing data (SRR files) underwent quality assessment and adapter trimming using fastp (version 0.23.2), followed by alignment to the human reference genome (GRCh38) using STAR (version 2.7.10). Variant calling was then performed using the GATK Best Practices workflow for RNA-seq short variant discovery (GATK version 4.2.6.1), with parameters optimized for single-cell data. To ensure variant quality, we applied stringent filtering criteria: (i) Quality by Depth (QD) > 2; (ii) Mapping Quality (MQ) > 40; (iii) Fisher Strand (FS) < 60; (iv) Strand Odds Ratio (SOR) < 3; (v) MQRankSum > −12.5; (vi) ReadPosRankSum > −8; (vii) Minor Allele Frequency (MAF) > 5%; and (viii) Hardy–Weinberg Equilibrium P-value >5 × 10^−5. Finally, variants were annotated using snpEff (version 5.1d) against the dbSNP reference database. This rigorous pipeline yielded 153 527 high-quality autosomal SNPs for subsequent eQTL analysis. Cis-eQTL mapping analysis We performed cis-eQTL analysis using a rigorous linear modeling following the rigorous pipeline: First, we calculated mean normalized expression values for each gene across cell types and samples. These expression profiles were then analyzed using the matrixEQTL package (v2.3.0) with a linear model that incorporated age and sex as covariates to control for potential confounding effects. The analysis tested for associations between genotype and gene expression within a 1 Mb window centered on each gene’s transcription start site [[63]21]. We established statistical significance at a nominal P-value threshold of 0.05, with all analyses adjusted for multiple testing using the Benjamini–Hochberg procedure (FDR < 0.05). This covariate-adjusted approach ensured that identified eQTLs represented genuine genetic regulatory relationships independent of demographic variables. Modeling of context-specific eQTL effects We conducted mapping of stage-specific and pseudotime-dependent cis-eQTLs for monocytes. Initially, we categorized monocytes into five distinct stages, comprising four COVID-19 stages (asymptomatic, mild, severe, and convalescent) as well as healthy controls. Subsequently, we calculated the average expression of each gene per individual within each monocyte stage and employed the previously established method to map stage-specific eQTLs. Similarly, for the identification of dynamic eQTL effects, we divided the pseudotime trajectory of monocytes into nine bins, calculated the average gene expression per individual within each bin, and subsequently mapped eQTLs for each bin. Integration of eQTLs with GWAS signals To identify COVID-19 causal genes, first, we performed the coloc (v5.1.0) analysis [[64]22], which is based on Bayesian genetic colocalization. We intersected candidate loci by using the GWAS data of COVID-19 and eQTLs for each cell type. In brief, we defined a 500-kb window centered on the lead eQTL variant, and performed colocalization testing using all variants within the window that were present in both the eQTL and the GWAS summary statistics. This approach allowed us to assess the likelihood of colocalization between the genetic variants influencing gene expression and those associated with COVID-19 susceptibility. Then, we conducted the TWAS to impute cell type-specific genetic expression levels from genotypes, and identify potentially novel associated genes. The FUSION software was used to perform TWAS analysis for each transcriptome reference panel [[65]23]. The 1000 Genomes project was used as an LD reference file. Results Single-cell transcriptomes of immune cells in COVID-19 First, scRNA-seq analysis was performed to profile >220 000 PBMCs from 53 unique samples (Fig. [66]1A). The 53 samples corresponded to 11 healthy controls and 42 COVID-19 cases, including 5 asymptomatic cases, 13 mild cases, 12 severe cases, and 12 convalescent cases. After quality control, a total of 222 456 high-quality cells detecting 22 167 genes were ultimately obtained. Subsequently, all the cells were derived into eight major cell types, including CD4^+ T cells, CD8^+ T cells, B cells, monocytes, conventional and plasmacytoid dendritic cells (cDC and pDC), and platelets, according to classical markers (Fig. [67]1B). In addition, processing batches had no observable effects on the distribution of cells based on the uniform manifold approximation and projection (UMAP) visualization ([68]Supplementary Fig. S1A). Moreover, significant differences were observed in immune cell distribution across COVID-19 severity groups (Fig. [69]1C). Monocytes and platelets were markedly enriched in COVID-19 patients compared to healthy controls. Notably, monocytes exhibited the highest abundance in severe cases during disease progression. Furthermore, STARTRAC-dist analysis revealed distinct clustering patterns of cellular subsets among severity groups ([70]Supplementary Fig. S1B), with monocytes demonstrating progressive enrichment correlating with disease severity. Even for asymptomatic and convalescent patients, the proportion of monocytes remained higher than normal controls, indicating the abnormal activation of monocytes during SARS-CoV-2 infection. This finding was also consistent with the previous studies [[71]13, [72]24]. Next, functional enrichment analysis was conducted to describe the active pathways of upregulated genes in various cell types for COVID-19. For COVID-19-enriched cells, monocytes exhibited elevated expression of genes, such as IL1B, CCL3, and CXCL8 (Fig. [73]1D). These upregulated genes were enriched in pathways such as “response to inflammatory” and “regulation of viral transcription” ([74]Supplementary Fig. S1C), suggesting that monocytes played a critical role in virus defense and lymphocyte activation. Figure 1. [75]Figure 1. [76]Open in a new tab A single-cell transcriptional map of immune cells in COVID-19. (A) The schematic for single-cell analysis of the study design. (B) The UMAP visualization of scRNA-seq data for immune cells of 42 COVID-19 cases and 11 healthy controls. (C) Violin plot showing the expression levels of selected marker genes in each of the indicated cell types. (D) The proportion of different cluster groups present in COVID-19 patients and healthy controls (left) and different stages of COVID-19 patients (right). (E) The differential gene expression analysis showing upregulated and downregulated genes across all eight cell types. Next, to explore the potential intercellular relationships and transcriptomic changes, the cellular interaction differences among cells between COVID-19-positive and -negative samples were investigated by CellChat [[77]25]. The results showed that monocytes exhibited distinct interaction potentials with both CD4^+ and CD8^+ T cells ([78]Supplementary Fig. S2A), emphasizing the effect of monocytes in T lymphocyte activation and migration for COVID-19 patients. In addition, biologically specific signal pathways were conducted to explain the immune mechanisms of SARS-CoV-2 infection. The results revealed that MIF and PARs signal pathways were enriched in COVID-19 patients, while APRIL and EPHB signal pathways were enriched in healthy controls (Fig. [79]2A). Moreover, the MIF signaling network showed that T cells (both CD4 and CD8) contribute to monocyte-dominated MIF signal production in COVID-19 (Fig. [80]2B). Notably, MIF interactions among monocytes and T cells were more likely autocrine, while others were paracrine signaling (Fig. [81]2C). Furthermore, MIF signaling is dominated by MIF ligand and its multimeric CD74/CD44 receptor (Fig. [82]2D), which had been shown to be associated with the progression of COVID-19 [[83]26, [84]27]. Finally, the interaction strength analysis confirmed that monocytes had more incoming and outgoing strength in COVID-19 patients compared to healthy controls ([85]Supplementary Fig. S2B). Collectively, these findings highlighted that monocytes were increased in COVID-19 patients, and closely associated with the COVID-19 progression stage. Figure 2. [86]Figure 2. [87]Open in a new tab Recognition of immunomodulatory signaling pathways in COVID-19. (A) Overall information flow of communication network between COVID-19 patients and healthy controls. (B) The hierarchical diagram showing the autocrine (left) and paracrine (right) signals of MIF signaling transduction. Solid circles and hollow circles represent the source and the target points, respectively. The size of the circle is proportional to the number of cells in each cell type, and the edge width represents the communication probability. The edge color is consistent with the signal source. (C) The heatmap showing the relative importance of each cell type calculated based on the four network center metrics of the MIF signal network. (D) The relative contribution of each ligand receptor to the overall communication network of the MIF signaling pathway. Identification of immune cell type-specific cis-eQTLs of COVID-19 To investigate genetic regulation of gene expression in COVID-19 immune cells, we first identified genetic variants by calling SNPs from raw single-cell RNA-seq reads, followed by re-annotation and rigorous quality control (detailed in the “Genetic variant detection and quality control” of “Materials and methods” section). Using these high-confidence variants, we then performed comprehensive eQTL analysis across eight major immune cell types, testing associations between genotype and gene expression levels (Fig. [88]3A). In total, we identified 4549 eQTLs for 2593 autosomal genes in all cell types ([89]Supplementary Table S1). For each individual cell type, we detected 356–1509 eQTLs with statistical significance, of which 281–1026 eQTLs were only detected in individual cell types (Fig. [90]3B). In our study, we identified a total of 4284 cell type-specific eQTLs. To systematically evaluate our findings, we performed multi-level replication analyses across different eQTL resources. Compared to bulk tissue data from the Genotype-Tissue Expression (GTEx) database [[91]28]. We observed moderate eQTL replication (6.6%–43.2%) and eGene overlap (2.4%–23.3%) rates across eight cell types, with 94.1%–100% consistency in allelic effect direction for replicated variants ([92]Supplementary Table S2). One plausible explanation for this observation is that the eQTLs identified in GTEx are primarily based on bulk tissue analysis, which tends to mask cell type-specific effects due to the averaging of gene expression across multiple cell types within the tissue [[93]29]. Our analysis demonstrated that while bulk analyses obscure cell type-specific regulatory variants, the core genetic regulation of target genes remains detectable. In addition, our findings showed strong concordance with established single-cell eQTL resources. The DICE database comparison revealed replication rates of 16.1%–28.1% across major immune cell types, including monocytes, B cells, T cells, and NK cells, with 96.3%–99.1% consistency in effect directions ([94]Supplementary Table S3 and [95]Supplementary Fig. S3A). Similar robust agreement emerged when analyzing [[96]11] PBMC sc-eQTL data. The CD4^+T cells demonstrated the highest replication rate (20.8%) with perfect effect direction consistency (100%) ([97]Supplementary Fig. S3B). Although pDCs showed the minimum number of overlapping eQTLs (n = 9), this limited set exhibited remarkable effect size correlation (Pearson’s r = 0.89, P< .001), demonstrating exceptionally precise replication when present. While our comparative analysis with the [[98]10] dataset revealed a limited overlap in identified eQTLs, we observed high directional consistency in shared regulatory variants across matched cell types ([99]Supplementary Fig. S3C). The reduced overlap likely stems from both biological and methodological factors. Biologically, differences between our COVID-19 patient cohort and their healthy donor samples may reflect disease-specific alterations in regulatory landscapes. Methodologically, the use of Spearman rank correlation in their analysis versus the linear models employed in our study represents a key technical distinction, as these approaches capture different aspects of genotype-expression relationships. Together, these factors help explain the observed variation in eQTL detection while maintaining the consistency of shared regulatory effects. The reduced overlap likely stems from both biological and methodological factors. Biologically, differences between our COVID-19 patient cohort and their healthy donor samples may reflect disease-specific alterations in regulatory landscapes. Methodologically, the use of Spearman rank correlation in their analysis versus the linear models employed in our study and most other sceQTL analyses represents a key technical distinction, as these approaches capture different aspects of genotype-expression relationships. Together, these factors help explain the observed variation in eQTL detection while maintaining the consistency of shared regulatory effects. Importantly, the high consistency in both regulatory direction and effect magnitude for overlapping variants provides robust validation of our methodological approach and confirms the reliability of our findings despite the distinct sample characteristics. These results suggest that while disease state influences the overall eQTL profile, core regulatory mechanisms remain detectable across different physiological conditions. Figure 3. [100]Figure 3. [101]Open in a new tab Identification of sc-eQTLs in immune cells of COVID-19 patients. (A) The flowchart for identification of cis-eQTL from COVID-19 scRNA-seq data. (B) The upset plot showing the number and composition of significant cis eQTLs for eight cell types. (C) Scatter plot showing the correlation between number of cells per donor and number of detected eGenes in each cell type. (D) Example of monocyte-specific eQTL. The SNP rs11516630 has a significant association with ADK expression just in the monocytes. adj. association P-value *P < .05, **P < .001, and ***P < .0001. We next investigated whether the complexity of genetic regulation varies across immune cell populations. Interestingly, we observed variations in the number of these cell type-specific eQTLs across different cell types, suggesting a potential relationship between the number of eQTLs and cellular heterogeneity [[102]10]. In addition, the proportion of these cell type-specific eQTLs to significant eQTLs also exhibited cellular heterogeneity among different cell types. For instance, we observed that the proportions of independent eQTLs in NK cells and pDCs were remarkably high, reaching up to 82.2% and 85.9%, respectively. However, in the case of CD8 T cells, the proportion of independent eQTLs was comparatively lower, at only 47.7%. The variation in the proportion of independent eQTLs emphasizes the importance of considering cell-specific regulatory mechanisms when studying gene expression in different immune cell populations. Based on this characteristic, we defined these independent eQTLs as cell type-specific eQTLs. Additionally, we also found that monocytes had a higher number of total cis-eQTLs, significant cis-eQTLs, and cell type-specific eQTLs. Notably, monocytes also displayed elevated numbers of cell counts (Fig. [103]3C), suggesting that the statistical power to detect eQTLs in cell types is associated with the number of cells, consistent with previous research [[104]30]. These results suggest that while disease state influences the overall eQTL profile, core regulatory mechanisms remain detectable across different physiological conditions. Notably, this persistence was particularly evident in monocytes. This monocyte-specific pattern suggests that certain genetic regulatory programs may operate in a cell-autonomous manner during infection. For example, the eGene ADK, despite having higher baseline expression in B cells, exhibited monocyte-specific regulation. Further supporting this cell type-dependent mechanism, we found that the homozygous C allele of rs11516630 significantly reduced ADK expression exclusively in monocytes (Fig. [105]3D), with no comparable effect observed in other cell types. In summary, the single-cell resolution of our analysis revealed an important dimension of cellular heterogeneity. Context-dependent eQTL effects on monocytes in COVID-19 Given the importance of monocytes in immunomodulatory functions, we further mapped cis-eQTL of monocytes in 5 stages of disease severity to explore the regulation of gene expression of genetic variants during COVID-19 progression. We observed notable variations in the quantity of genes with significant cis-eQTL effects (eGenes) across different stages of COVID-19 severity (Fig. [106]4A). Moreover, the highest number of eGenes was observed in the severe stage of COVID-19, suggesting that gene expression in severe disease states is more susceptible to genetic variation. For example, the eGenes S100A8 and S100A12 showed the most significant eQTLs in severe stage (Fig. [107]4B). S100A8 encodes a calcium-binding protein that belongs to the S100 family of proteins. Clinical studies have provided evidence that elevated levels of S100A8 are strongly associated with critical cases of COVID-19, and can effectively differentiate between mild and severe disease states [[108]29, [109]31, [110]32]. Moreover, S100A8 is believed to play a crucial role in regulating cytokine storms, which have emerged as a defining characteristic of severe and potentially fatal COVID-19 cases [[111]33]. Whereas S100A12, a member of the S100 family of proteins, plays a crucial role in immune responses by interacting with receptors such as TLR4 and RAGE, and its close association with COVID-19 has been reported [[112]34]. Another identified severity-induced eGene was STAT3, which encodes for a transcription factor that belongs to the STAT protein family. Although STAT3 was expressed in most stages of COVID-19, the most significant genotype-dependent effect was observed in the severe stage ([113]Supplementary Fig. S4). This suggests that for many monocyte-related genes, genetic regulation is only evident during specific stages of COVID-19 progression, particularly during the severe stage. Figure 4. [114]Figure 4. [115]Open in a new tab Context-specific eQTLs of monocytes in COVID-19 patients. (A) Upset plot showing the number and composition of significant cis eQTLs for eight cell types. (B) Example of monocyte-specific eQTL. The SNP rs11516630 has a significant association with ADK expression just in the monocytes. adj. association P-value *P < .05, **P < .001, and ***P < .0001. (C) The branched point plot showing the pseudotime development trajectories of cMs and ncMs (left) and nine bins (right). (D) Example genes that significantly change as a function of monocyte development pseudotime. Each dot corresponds to a cell, and colors represent different stages. (E) Example of dynamic eQTLs. The average expression of the gene within each pseudotime window was stratified by genotype. Building upon established evidence of context-specific eQTLs, we systematically examined the dynamic effects of genetic variation on gene expression across monocyte pseudotime [[116]35, [117]36]. Using an inferred trajectory model, we stratified all monocytes into nine consecutive pseudotime bins (Fig. [118]4C). When overlaying the expression of classical markers on the trajectory, we observed a continuous transition across the trajectory from cMs (B1) to ncMs (B9) (Fig. [119]4D). For example, CD14 is highly expressed in cMs and was found to be downregulated at late stages of the transition to cMs. Conversely, the expression of FCGR3A increased as the cells transitioned to cMs. In total, we identified 273 genes that exhibited changes in expression levels as a function of pseudotime. We found that dynamically regulated genes in quantiles were enriched in various pathways, such as response to zinc or copper ion (B1) and positive regulation of phagocytosis or lymphocyte migration (B4). By identifying dynamic gene expression, we gain deeper insights into regulatory processes and molecular changes underlying cellular transitions from cMs to ncMs. Next, we characterized dynamic changes in eQTL effects along the transition trajectory from cMs to ncMs, identifying patterns of variable regulatory activity across pseudotime. Of the 3204 cis-eQTLs identified in cMs to ncMs, 47 were expressed in at least five pseudotime bins and detected for dynamic effects. Many of the genes with dynamic eQTL effects play an important role in the positive regulation of phagocytosis for monocytes. For example, ERAP2 plays a crucial role in antigen processing and presentation, which is essential for phagocyte-mediated immune responses. As a key aminopeptidase in the endoplasmic reticulum, ERAP2 trims antigenic peptides for MHC class I loading, thereby influencing phagocytic cell functions [[120]37, [121]38]. The genetic variant rs2548524-A exhibited significant modulatory effects on ERAP2 expression throughout monocyte differentiation, with pseudotime-dependent variation in effect sizes. Notably, the strongest suppressive effect occurred in B7 (β = −0.25, FDR = 0.005), coinciding with the transitional stage between cMs and ncMs (Fig. [122]4E). This stage-specific repression suggests rs2548524 may regulate ERAP2-mediated antigen processing during critical phases of phagocyte functional maturation. The similar trend is identified for genetic variants that influence the expression of the FCER1G, a subunit gamma of IgE receptor involved in immunoglobulin-mediated immune response [[123]39]. Our results found that seven genetic variants showed significant allelic effects in at least a single bin. Among these variants, the T allele at rs2070901 was associated with increased expression of FCER1G, suggesting that the T allele is likely to promote the immune response process. In addition, rs2070901 has shown high interaction patterns with FCER1G [[124]40, [125]41], indicating that the interaction strength between rs2070901^C/C and FCER1G may be enhanced at the end of transition from cMs to ncMs. In summary, our analysis of pseudotime eQTLs suggested that the significant genetic regulation of immune function-related genes is only evident during specific stages of cellular transitions from cMs to ncMs. Identification of cell type-specific COVID-associated risk genes To identify potential disease-risk genes, we performed Bayesian genetic colocalization analyses using immune cell type-specific eQTLs to map loci associated with COVID-19 (Fig. [126]5A). As a result, we identified 19 unique colocalizations (PP4 > 0.95), corresponding to 19 GWAS loci for 21 SNP-gene pairs in seven immune cell types. Among these genes, 16 (84.2%) were protein-coding genes ([127]Supplementary Table S4). Further, we performed the TWAS analysis and identified 43 genes, 38 of which are protein-coding genes, associated with COVID-19 ([128]Supplementary Table S5). In total, we identified a set of 57 risk genes associated with COVID-19, with 5 genes overlapping between the two approaches (Fig. [129]5B). Importantly, despite having the same lead genetic variants, we discovered that the overlapping risk genes exhibited specific genetic effects across different cell types. For example, IL10RB, which has been identified as a prime candidate gene target for COVID-19 host susceptibility [[130]42]. IL10RB is highly expressed only in monocytes, but the genetic effects of IL10RB expression exhibited considerable variations across different cell types ([131]Supplementary Fig. S5A). Our analyses highlighted the advantage of using cell type-specific cis-eQTLs to annotate to identify novel associated genes with COVID-19. Figure 5. [132]Figure 5. [133]Open in a new tab Colocalization of immune-specific eQTLs with GWAS associations for COVID-19. (A) Identification of COVID-19 pathogenic genes using two methods. (B) The Venn diagram of pathogenic genes for Coloc and TWAS (right). (C) Sangi bubble plot of COVID-19 pathogenic gene enrichment analysis. (D) Interaction network diagram of genetic genes/loci determined by Coloc and TWAS. The dashed lines represent gene–gene interactions based on the STRING database. (E) Venn diagram of COVID-19 with HCoV-229E, HCoV-OC43, and MERS-CoV risk genes. Next, we investigated whether the identified risk loci and genes associated with COVID-19 had an impact on specific cellular functions. Our findings revealed that the risk genes were enriched in interferon-mediated signaling pathways, which play crucial roles in immune response and the development of COVID-19, particularly in response to viral infection (Fig. [134]5C). Further, for 57 risk genes and their corresponding genetic variants, we conducted an interaction network to identify crucial risk gene modules. We found that part of the genes clustered into connected modules based on the interaction pairs in STRING database (Fig. [135]5D) [[136]43]. In addition, we observed that neighboring genes within the modules were regulated by shared risk variants of COVID-19. For example, we found a set of interconnected genes, six of which were regulated by rs3176767, a potential risk variant for regulating ICAM1 and ICAM5 in rheumatoid arthritis [[137]44]. Notably, the expression of these COVID-risk genes was downregulated by rs3176767, suggesting that this variant may decrease COVID-19 risk ([138]Supplementary Fig. S5B). Our results indicated that immune disease loci colocalize with genes that may be involved in the COVID-19 process and that genes with similar functions tend to be regulated by disease risk variants. To provide insights into COVID-19 risk genes, we further explored the role of these genes in other coronaviruses, including HCoV-229E, HCoV-OC43, and MERS-CoV (Fig. [139]5E). We found that 57 COVID-19 risk genes partially overlapped with significant differential genes associated with other coronaviruses. Notably, several of the overlapping genes, such as ICAM1, ICAM5, NGFR, RPS6KL1, and SAMD14, are involved in synapses between neurons and CXCR chemokine receptor binding, suggesting that these risk genes might collectively influence the progression of coronaviruses by modulating specific biological pathways. Considering the nonuniformity of data types available for COVID-19 and other coronaviruses, we produced the pseudobulk expression profiles for COVID-19 and obtained a total of 715 significant differential genes (log[2]FC > 2, P-value <.05). Moreover, it was observed that out of the 48 differential genes identified in COVID-19, significant differences were also found in other three coronaviruses. Significantly, CXCL1 showed notable differences between case and control in all four coronaviruses, indicating its potential significance and involvement in the pathogenesis of these viral infections. In summary, our analysis expands the list of candidate risk genes for COVID-19 and provides a new reference for targeted therapies for COVID-19. Discussion The immune dysregulation associated with COVID-19 can lead to severe adverse reactions and infections, causing multiorgan dysfunction and even death [[140]45]. Through systematic single-cell profiling of 214 526 immune cells across five distinct disease stages, our study delineates the cellular and molecular landscape of COVID-19 pathogenesis, identifying eight immune cell populations. Our high-resolution analysis not only characterizes COVID-19-associated transcriptional alterations but also reveals cell type-specific enrichment of critical disease pathways. By conducting cell communication analysis, we observed a significant increase in both the quantity and the intensity of communication among monocytes. Monocytes emerged as the primary cellular source for several pathways, including the COVID-19-specific MIF and PARs pathways, as well as the healthy-specific APRIL and EPHB pathways. This observation highlights the crucial role of monocytes in the development of COVID-19. Therefore, our results provide valuable resources for investigating the immune dysregulation mechanisms associated with COVID-19. Clinical studies have indicated that susceptibility to COVID-19 can exhibit significant interindividual and population-specific variation [[141]46]. GWAS conducted in COVID-19 have identified multiple genetic risk variants that are associated with disease susceptibility. However, it has posed a challenge to identify causal genes and susceptible immune cell types of COVID-19, partly due to the lack of studies that systematically investigate the cell type-specific effects of genetic variants on gene expression. Based on the aforementioned background, we set out to map genetic effects of eight cell types on gene expression and identified >6000 significant cell type-specific cis-eQTLs of COVID-19 patients. Our results indicated that there is a high degree of cell type-specific genetic regulation on genes that is not captured with bulk RNA-seq, such as GETx. In addition, we identify examples of how genetic variants contribute to key immune-related eGenes. Given the essentiality of exploring the effects of genetic variants in biological contexts, it is crucial to fully uncover the context-specific genes and pathways regulated by disease-risk variants. Therefore, we further captured stage-specific and dynamic eQTLs for monocytes of COVID-19 patients. The findings revealed that a number of context-specific eQTLs were not identified through the initial cis-eQTL analysis. Moreover, to gain further insights into the relationship between eQTLs and COVID-19 risk loci, we performed TWAS and colocalization analysis, providing complementary information regarding the association between eQTLs and COVID-19 risk loci. Through the integration of cell type-specific eQTLs with COVID-19 risk loci identified via GWAS, we successfully identified a total of 57 causal genes at these loci. Additionally, we also determined the specific cell types in which these genes exert their pathogenetic effects. While this study provides valuable insights into the genetic regulation of immune responses in COVID-19, several limitations should be acknowledged. First, our variant annotation approach utilized genetic variants called from the RNA reads rather than dedicated genotyping arrays or whole-genome sequencing (WGS), which may have limited the comprehensive detection of genetic variants. While this RNA-based approach provides a practical method for leveraging existing transcriptomic data, it has important technical considerations compared to genotype data or WGS. The expression-dependent nature of RNA-seq data results in nonuniform genomic coverage, potentially missing variants in intergenic regions or lowly expressed genes that may harbor regulatory elements. Compared to standard SNP arrays typically used in eQTL studies, our approach had reduced power to detect variants in nontranscribed regions. Second, our analyses incorporated key covariates such as age and sex, but other potential confounders were not included due to data availability constraints. While age and sex are major demographic factors, unaccounted variables may introduce residual confounding, affecting the interpretation of genetic associations. Future studies with more extensive phenotypic data could further refine these analyses. Finally, sample size limitations in subgroup analyses may have reduced statistical power, increasing the risk of effect size biases. Larger cohorts would improve the robustness of cell type- and context-specific eQTL detection, particularly for rare cell states or modest genetic effects. Additionally, multi-omics integration future work could elucidate mechanistic links between genetic variants and immune dysregulation in COVID-19. Despite these limitations, our findings establish a foundation for understanding context-dependent genetic regulation in infectious diseases. Future studies with expanded genotyping, deeper phenotyping, and larger sample sizes will further advance precision medicine approaches for COVID-19 and related pathologies. Conclusion Collectively, our work combines genetic information with scRNA-seq profiles to elucidate the association underlying interindividual variants and gene expression in COVID-19. By integrating these approaches, we are able to identify key drivers responsible for the observed differences in immune responses among individuals. Our results showed how genetic variants influence the expression of genes involved in critical immune signaling pathways in a cell type-specific manner. Gaining a comprehensive understanding of the genetic regulation of the immune system holds significant implications for the treatment of both infections and cancer. Such knowledge can contribute to the development of targeted therapies and interventions that aim to modulate immune responses effectively and improve patient outcomes in disease contexts. Supplementary Material lqaf130_Supplemental_File [142]lqaf130_supplemental_file.xlsx^ (549.2KB, xlsx) Acknowledgements