Abstract

The ongoing COVID-19 pandemic, caused by SARS-CoV-2 infection, has quickly developed into a global public health threat. COVID-19 patients show distinct clinical features, and in some cases severe disease leads to an acute respiratory disorder. Despite considerable research in this area, the molecular mechanisms behind the development of disease severity are still not clearly understood. Recent studies demonstrated that SARS-CoV-2 alters host cell splicing and the transcriptional response to overcome the host immune response, which provides the virus with favorable conditions to replicate efficiently within host cells. In several disease conditions, aberrant splicing can lead to the development of novel chimeric transcripts that could promote functional alterations of the cell. As severe SARS-CoV-2 infection was reported to cause abnormal splicing in infected cells, we could expect the generation and expression of novel chimeric transcripts. However, no study so far has checked whether novel chimeric transcripts are expressed in severe SARS-CoV-2 infections. In this study, we analyzed several publicly available blood transcriptome datasets of severe COVID-19, mild COVID-19, other severe respiratory viral infected patients, and healthy individuals. We identified 424 severe COVID-19-specific chimeric transcripts, 42 of which were recurrent. Further, we detected 189 chimeric transcripts common to severe COVID-19 and multiple severe respiratory viral infections. Pathway and gene enrichment analysis of the parental genes of these two subsets of chimeric transcripts reveals that they are potentially involved in immune-related processes, interferon signaling, and inflammatory responses, which signifies their potential association with immune dysfunction leading to the development of disease severity.
Our study provides the first detailed expression landscape of chimeric transcripts in severe COVID-19 and other severe respiratory viral infections.

Keywords: chimeric transcripts, severe COVID-19, aberrant splicing, cis-SAGe, respiratory viral infections

1. Introduction

The ongoing COVID-19 pandemic is caused by SARS-CoV-2, a recently identified member of the Coronaviridae family [1,2]. The clinical presentation of COVID-19 is variable, and it can sometimes progress to severe disease resulting in immune dysfunction, acute respiratory distress syndrome (ARDS), and multi-organ failure [3,4,5]. Despite several studies, the molecular mechanisms behind the development of severe COVID-19 are still unclear. During severe COVID-19 infection, SARS-CoV-2 antagonizes interferon (IFN) induction and IFN signaling, which generates a dysfunctional immune response [6,7,8,9,10,11,12]. Recent studies demonstrated that SARS-CoV-2 suppresses the IFN response by disrupting host cell splicing [13]. Aberrant splicing and dysregulation of transcripts were found to be strongly correlated with the clinical severity of COVID-19 [14,15]. Aberrant splicing increases non-canonical splicing events, resulting in the generation of novel chimeric transcripts [16,17,18,19]. In several cancers and complex diseases, non-canonical splicing was observed and was reported to generate novel chimeric transcripts [18,20,21]. Chimeric transcripts are formed by the fusion of exons/introns from two separate genes [22]. The generation of chimeric transcripts was found to be functionally involved in the development of several diseases [23,24]. Further, the appearance of chimeric transcripts in the cell is thought to be associated with the generation of phenotypic plasticity that helps the cell’s survival in response to specific stress [25,26].
Chimeric transcripts could modify cell functionality by altering the regulatory network and protein interaction network, and they can also switch pathways [27,28,29]. SARS-CoV-2 induces widespread alteration of the host cell splicing machinery and transcriptional dynamics in severe COVID-19 infection, which raises the possibility of the generation and expression of novel chimeric transcripts in severe COVID-19-infected cells. However, no study so far has examined whether SARS-CoV-2-induced aberrant splicing in severe COVID-19 infections generates novel chimeric transcripts, or what functional impact such transcripts might have. In this study, we integrated publicly available RNA-seq datasets of blood from 85 severe COVID-19 patients, 10 mild COVID-19 patients, 84 other severe respiratory viral infected patients, 54 healthy persons, and 199 samples from 32 different tissues and identified the chimeric transcripts unique to severe COVID-19 infections.

2. Materials and Methods

2.1. Acquisition of the RNA-Seq Datasets

RNA-seq data from several sources were collected for this study. RNA-seq data for 44 blood samples from severe COVID-19 patients and 10 samples from healthy donors were obtained from the Gene Expression Omnibus (GEO) database (GSE171110) [30]. RNA-seq data for 24 severe COVID-19 and 34 healthy blood samples were obtained from the GSE152418 dataset [31]. We also collected single-cell RNA sequencing data of peripheral blood mononuclear cells (PBMCs) from 8 severe COVID-19 patient samples and 6 healthy controls (PRJNA633393) [32]. We downloaded the RNA-seq data of blood from EBI ArrayExpress (accession E-MTAB-10926) for 10 mild COVID-19 patients and 10 severe COVID-19 patients [33].
From the GSE157240 dataset, we also downloaded RNA-seq data for 84 blood samples from patients with severe respiratory viral infections caused by influenza, enterovirus/rhinovirus, human metapneumovirus, dengue virus, cytomegalovirus, Epstein–Barr virus, or adenovirus [34]. Further, we downloaded RNA-seq data for 199 samples from 122 individuals representing 32 different healthy human tissues from EBI ArrayExpress (accession E-MTAB-2836). The downloaded raw sequence reads were converted to FASTQ using the SRA toolkit, version 2.10.7 [35].

2.2. Identification of Chimeric Transcripts from the RNA-Seq Data

All the above samples were analyzed with our in-house reference-based method ChiTaH [36] to identify chimeric transcripts. ChiTaH was recently demonstrated to be the most efficient reference-based approach for chimeric transcript detection from RNA-seq data compared to other popular tools such as EricScript [37], STAR-Fusion [38], JAFFA [39], and FusionCatcher [40]. Furthermore, ChiTaH can also detect chimeric transcripts from single-cell RNA (scRNA)-seq data. ChiTaH maps RNA-seq datasets and predicts potential chimeric transcripts in each sample using 43,466 non-redundant high-quality human chimeras from the ChiTaRS 5.0 database [41]. The three rigorous criteria listed below were used to determine which transcripts were chimeric: (i) reads should map only to human chimeric transcripts found in the ChiTaRS database and not to the human genome or transcriptome, (ii) at least five reads should cover the chimeric transcript junction, and (iii) the mapping quality of the reads should be MAPQ ≥ 10.

2.3. Differential Gene Expression Analysis

The STAR aligner [42] was used to map the RNA sequencing reads from all samples; the human reference genome (hg38) was used for alignment. Next, the featureCounts tool [43] was used to determine the total number of reads mapped to each human gene.
Finally, DESeq2 [44] was used to perform differential gene expression analysis. When log2foldchange ≥ 2 and Padj ≤ 0.05, a gene was deemed significantly upregulated, and when log2foldchange was negative and Padj ≤ 0.05, it was deemed significantly downregulated. The Benjamini–Hochberg method was used to adjust the p-values and control the false discovery rate (FDR) [45].

2.4. Gene Ontology and Pathway Enrichment Analysis of the Parental Genes of Chimeric Transcripts

Metascape [46] was used to perform the Gene Ontology (GO) biological process and pathway enrichment analyses. A p-value cutoff of < 0.05 was set to select the enriched processes and pathways. The KEGG [47] and REACTOME [48] pathway databases were used during the pathway enrichment analysis.

3. Results

3.1. Identification of Chimeric Transcripts in the RNA-Seq Data of Blood Samples from Severe COVID-19, Mild COVID-19, and Other Severe Respiratory Virus-Infected Patients

We identified a total of 957 chimeric transcripts from 78 bulk RNA-seq and 7 single-cell RNA (scRNA)-seq samples from severe COVID-19 patients. In total, 1175 chimeric transcripts were identified in mild COVID-19 patients from 10 bulk RNA-seq samples, and 1281 chimeric transcripts were identified in various severe respiratory viral infections from 84 bulk RNA-seq samples. Moreover, we identified 1061 chimeric transcripts from 54 healthy samples, while 2066 chimeric transcripts were identified in 199 samples from 32 normal tissues. Chimeric transcripts were observed to be abundant in all samples analyzed (Table S1 (Supplementary Materials)). However, from a pathological perspective, it would be interesting to identify the chimeric transcripts that are specific to severe COVID-19 infections, as well as those that are commonly expressed across multiple severe viral infections.
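The significance thresholds stated in Section 2.3 can be illustrated with a minimal Python sketch. This is not the authors' pipeline (the study used DESeq2); the function names `benjamini_hochberg` and `classify_gene` and the parameter names are hypothetical, introduced only for illustration. The downregulation rule follows the text literally (any negative log2 fold change with Padj ≤ 0.05); the `down_lfc` parameter is an assumption that lets a stricter cutoff be substituted.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify_gene(log2fc: float, padj: float,
                  up_lfc: float = 2.0, down_lfc: float = 0.0,
                  alpha: float = 0.05) -> str:
    """Label a gene 'up', 'down', or 'ns' (not significant).

    Assumption: thresholds mirror the text of Section 2.3, i.e.
    up if log2fc >= 2, down if log2fc is negative, both with padj <= 0.05.
    """
    if padj > alpha:
        return "ns"
    if log2fc >= up_lfc:
        return "up"
    if log2fc < down_lfc:
        return "down"
    return "ns"
```

A gene with a log2 fold change between 0 and 2 is left unclassified even when its adjusted p-value passes the cutoff, which is what the stated thresholds imply.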
We found that 44.3% (424 out of 957) of the chimeric transcripts identified in the severe COVID-19 samples are unique to severe COVID-19 infections (Figure 1). Furthermore, we identified 189 chimeric transcripts that are common to severe COVID-19 infection and other severe respiratory viral infections. These results indicate that a significant number of the chimeric transcripts detected are common to severe COVID-19 and other severe respiratory viral infections.

Figure 1. The table represents the number of chimeric transcripts detected in the different samples. The Venn diagram represents the common and unique chimeric transcripts detected in each sample.

3.2. Identification of Severe COVID-19-Specific Recurrent Chimeric Transcripts and Functional Analysis of Their Parental Genes

Chimeric transcripts expressed in multiple severe COVID-19 patient blood samples could be associated with the development of COVID-19 severity. We identified 42 chimeric transcripts as recurrent, meaning that they are expressed in at least 3 severe COVID-19 patient samples (Table S2). The generation of chimeric transcripts could impact the functionality of their parental genes [27,49]. For example, chimeric transcripts could be translated into chimeric proteins that could replace the interactions of their parental proteins in the protein–protein interaction (PPI) network and alter the functionality of the cell [27,29,50]. Furthermore, chimeric transcripts could act as long non-coding RNAs (lncRNAs), which could regulate and alter the functionality of their parental genes [49,51,52]. We found 75 parental genes generating recurrent chimeric transcripts in severe COVID-19 patients. It can therefore be assumed that severe COVID-19-specific recurrent chimeric transcripts could impact the functionality of their parental genes, which may lead to the development of disease severity.
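The selection of severity-specific, shared, and recurrent chimeric transcripts described above amounts to set operations and a per-sample count. The following Python sketch illustrates that logic under stated assumptions: the function names and the transcript identifiers used in any example are hypothetical, and treating "specific" as "absent from every comparison group" is our reading of the text rather than a documented step of the authors' pipeline.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple


def partition_chimeras(severe: Set[str], other_viral: Set[str],
                       background: Iterable[Set[str]]
                       ) -> Tuple[Set[str], Set[str]]:
    """Split severe COVID-19 chimeras into 'specific' and 'shared' sets.

    background: chimera sets from groups that should not contain the
    transcript (assumed here: mild COVID-19, healthy blood, normal tissues).
    """
    seen_elsewhere: Set[str] = set().union(*background)
    # Specific: found in severe COVID-19 and in no other group.
    specific = severe - other_viral - seen_elsewhere
    # Shared: found in severe COVID-19 and other severe viral infections only.
    shared = (severe & other_viral) - seen_elsewhere
    return specific, shared


def recurrent_chimeras(per_sample: Dict[str, Set[str]],
                       min_samples: int = 3) -> Set[str]:
    """Return chimeras detected in at least `min_samples` samples."""
    counts = Counter(ch for chimeras in per_sample.values() for ch in chimeras)
    return {ch for ch, n in counts.items() if n >= min_samples}
```

With the reported counts, the specific fraction is 424 / 957, which rounds to the 44.3% quoted in the text.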
We performed GO functional and pathway enrichment analysis to understand the potential of chimeric transcripts to alter cellular functionality. A Metascape analysis of the most enriched biological processes revealed that the parental genes of chimeric transcripts are involved in a broad range of important processes such as localization, cellular processes, immune system processes, metabolic processes, regulation of biological processes, etc. (Figure 2A). This observation indicates that chimeric transcripts generated from these genes could play a potential role in alterations of immune-related processes and alter the regulation of the diverse biological and metabolic processes that lead to the development of disease severity. Further, the top-most enriched pathway and process analysis revealed that cellular response to stress, phagocytosis, and interferon alpha/beta signaling are the most important processes that could be altered by these chimeric transcripts (Figure 2B).

Figure 2. (A) GO biological process enrichment analysis of the parental genes of severe COVID-19-specific recurrent chimeric transcripts; (B) Top-most enriched processes and pathways of the parental genes of severe COVID-19-specific recurrent chimeric transcripts detected by Metascape (p-value cutoff < 0.05).

3.3. Genomic Neighborhood Analysis and Differential Gene Expression Analysis of the Parental Genes of Severe COVID-19-Specific Recurrent Chimeric Transcripts

Several chimeric transcripts could be generated by non-canonical splicing-based mechanisms, such as trans-splicing and cis-splicing of adjacent genes (cis-SAGe) [25]. Trans-splicing is the mechanism by which two transcripts from two different genes fuse and generate a novel chimeric transcript [53,54,55,56].
However, the mechanism of trans-splicing events is not well understood in humans, and no computational method exists that can predict whether a chimeric transcript is generated by a trans-splicing mechanism. cis-SAGe is a frequent mechanism in human cancers and other complex diseases, in which it generates a significant number of chimeric transcripts [17,18,57,58,59]. However, recent work demonstrated that cis-SAGe is also a common mechanism for generating chimeric transcripts in healthy human cells [17,22]. cis-SAGe is the intergenic splicing of directly adjacent genes with the same transcriptional orientation, which can generate chimeric transcripts [60]. We performed a genomic neighborhood analysis of the parental genes of severe COVID-19-specific recurrent chimeric transcripts, and we found seven confirmed cases in which those chimeric transcripts were generated by cis-SAGe (Table S2). Integrative Genomics Viewer (IGV) [61] was used to visualize the adjacent genes. Figure 3 and Figure S1 show the close proximity of the two parental genes of different recurrent severe COVID-19-specific chimeric transcripts and highlight the potential for a cis-SAGe mechanism.

Figure 3. Genomic neighborhood analysis of the parental genes of severe COVID-19-specific recurrent chimeric transcripts. The two parental genes of chimeric transcripts adjacent to each other indicate the potential for cis-SAGe.

If the emergence of chimeric transcripts can regulate the functionality of their parental genes, then we might expect differential expression of at least one of their parental genes. Interestingly, we observed in several instances that at least one of the parental genes of recurrent severe COVID-19-specific chimeric transcripts from different datasets was differentially expressed (Figure 4).
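The definition of cis-SAGe given above (directly adjacent genes with the same transcriptional orientation) suggests a simple screen for candidate parental gene pairs. The Python sketch below is one possible reading, not the procedure used in the study, which relied on genomic neighborhood analysis and IGV inspection. The `Gene` record, the function name, and the rule that "directly adjacent" means consecutive in start-coordinate order on the same chromosome are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def is_cis_sage_candidate(a: str, b: str, genes: List[Gene]) -> bool:
    """True if genes a and b are same-strand neighbors on one chromosome.

    Assumption: 'directly adjacent' means no other annotated gene starts
    between the two genes on that chromosome.
    """
    by_name = {g.name: g for g in genes}
    ga, gb = by_name[a], by_name[b]
    # cis-SAGe requires the same chromosome and transcriptional orientation.
    if ga.chrom != gb.chrom or ga.strand != gb.strand:
        return False
    same_chrom = sorted((g for g in genes if g.chrom == ga.chrom),
                        key=lambda g: g.start)
    # Neighbors occupy consecutive positions in coordinate order.
    return abs(same_chrom.index(ga) - same_chrom.index(gb)) == 1
```

A pair that passes this screen is only a candidate; read-through evidence at the junction would still be needed to support a cis-SAGe origin.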
This finding could suggest that chimeric transcripts generated in severe COVID-19 could significantly regulate their parental genes and alter the pathways and biological processes in which their parental genes are involved.

Figure 4. Differential gene expression analysis of the parental genes of severe COVID-19-specific recurrent chimeric transcripts.

3.4. Identification of Common Chimeric Transcripts Expressed in Severe COVID-19 and Other Severe Respiratory Viral Infections

Chimeric transcripts common to multiple severe respiratory viral infections and severe COVID-19 could be associated with the immune response underlying the development of disease severity in most respiratory viral infections. With this in mind, we identified 189 chimeric transcripts found only in severe COVID-19 and multiple other severe respiratory viral infections (Table S3). We then performed functional and pathway enrichment analysis of the parental genes of these common chimeric transcripts. In line with the GO biological process enrichment of the parental genes of recurrent severe COVID-19-specific chimeric transcripts, we observed that the parental genes of chimeric transcripts common to severe COVID-19 and other severe respiratory viral infections are involved in important processes related to immune system processes, localization, biological processes involved in interspecies interaction, cellular processes, metabolic processes, regulation of biological processes, etc. (Figure 5A). This observation suggests that the chimeric transcripts generated in severe viral infections are crucial for immune dysfunction and alterations of various important cellular, metabolic, and regulatory processes and pathways leading to the development of disease severity.
From the top-most enriched biological process and pathway analysis, we found that the adaptive immune response is the most important process (Figure 5B), which indicates that the chimeric transcripts generated from these genes could be associated with the adaptive immune response. Furthermore, the inflammatory response was found to be another enriched process, which is an important factor in viral infection-mediated disease severity (Figure 5B). Next, we observed significant enrichment of the vesicle-mediated transport pathway (Figure 5B), which suggests that the chimeric transcripts could be associated with the transport of viral particles into membrane-bounded vesicles, an efficient mechanism for viruses to enter host cells and evade the host immune response [62,63,64]. These observations support our hypothesis that the chimeric transcripts common to multiple severe respiratory viral infections, including COVID-19, could impact various important biological processes and pathways that drive the stress responses leading to the development of disease severity.

Figure 5. (A) GO biological process enrichment analysis of the parental genes of common chimeric transcripts associated with multiple severe respiratory viral infections and severe COVID-19; (B) Top-most enriched processes and pathways of the parental genes of common chimeric transcripts associated with multiple severe respiratory viral infections and severe COVID-19 detected by Metascape (p-value cutoff < 0.05).

4. Discussion

In this study, we analyzed several publicly available RNA-seq datasets containing 85 severe COVID-19 patients’ blood samples, 10 mild COVID-19 patients’ blood samples, 84 other respiratory viral infected patients’ blood samples, and 253 healthy human samples from 32 different tissues, including blood.
We identified 424 severe COVID-19-specific chimeric transcripts; among these, 42 were recurrent, i.e., mapped in at least 3 samples. The enriched biological processes and pathways of the parental genes of these recurrent severe COVID-19-specific chimeric transcripts highlighted that they are involved in cellular stress response, phagocytosis, and interferon signaling. The genes associated with the cellular stress response pathway could be involved in the activation of innate and adaptive immune responses and play a pivotal role in host defenses and inflammation [65]. The second most enriched process/pathway was phagocytosis. Recent studies demonstrated that alterations of the phagocytosis response are strongly associated with COVID-19 severity [66,67,68]. However, the molecular mechanisms linking phagocytosis alterations and the development of COVID-19 severity are not clearly understood. Our findings suggest that the recurrent severe COVID-19-specific chimeric transcripts could alter the functionality of their parental genes that are involved in the phagocytosis process. Further, we observed that parental genes of the recurrent severe COVID-19-specific chimeric transcripts were significantly involved in interferon signaling. Recent studies showed that the dysregulated interferon response is crucial for the development of severe COVID-19 [69,70]. Interferon signaling induces IFN-stimulated gene (ISG) expression by phosphorylating STAT1, and STAT1 phosphorylation was found to increase in severe COVID-19 cases, indicating imbalanced JAK/STAT signaling and a lack of ISG-induced transcription [69]. In this study, we observed that the STAT1 gene generated a recurrent severe COVID-19-specific chimeric transcript through the cis-SAGe mechanism.
Altogether, these findings suggest the potential involvement of recurrent severe COVID-19-specific chimeric transcripts in the generation of a stress response associated with dysregulated interferon signaling during severe COVID-19 infections. If the chimeric transcripts alter the functionality of their parental genes, then we should expect at least one of their parental genes to be differentially expressed. Interestingly, we observed in several instances that one of the parental genes of recurrent severe COVID-19-specific chimeric transcripts was differentially expressed. We observed consistent differential gene expression patterns for the parental genes of some chimeric transcripts in all datasets; however, the chimeric transcripts themselves were mapped in some datasets but not all (Figure 4). We chose strict criteria (Materials and Methods, Section 2.2) for chimeric transcript detection to reduce the chance of obtaining false positives. As the quality of the RNA-seq data differs between datasets, we could have missed some chimeric transcripts because fewer reads were mapped or mapping qualities were low. Similarly, the number of differentially expressed genes in each dataset depends on the quality of the sequencing data and the p-value criteria selected to screen the differentially expressed genes. We chose the strict p-value, p < 0.05, as the criterion for detection of differential gene expression, so there is a chance that we could have missed some differentially expressed genes in some datasets because of the quality of the sequencing data and read count bias. Interestingly, despite the differences in sequence quality between datasets, all the recurrent severe COVID-19-specific chimeric transcripts detected in the E-MTAB-10926 samples were also mapped in the high-quality single-cell RNA (scRNA)-seq dataset from the PRJNA633393 samples.
Therefore, the production of chimeric transcripts could be an important signature behind the generation of stress responses leading to the development of COVID-19 severity. Furthermore, we found 189 chimeric transcripts found only in severe COVID-19 and other severe respiratory viral infections, whose parental genes were enriched in adaptive immune response and inflammatory response. This could signify that these chimeric transcripts are associated with stress responses linked to dysfunctional immune responses common to multiple severe respiratory viral infections.

Our study is the first to identify the chimeric transcripts unique to severe COVID-19 infections and to demonstrate their potential functional importance (Figure 6). During a severe COVID-19 infection, SARS-CoV-2 targets host cell splicing to alter the transcriptional dynamics of the host cell so that the virus can efficiently translate viral proteins and evade the host cell’s defense mechanisms. The aberrant splicing during severe COVID-19 infections could promote the generation of chimeric transcripts that can alter the function of their parental genes, which are mainly associated with immune-related processes, and this could lead to immune dysfunction and the development of disease severity. Therefore, further exploring the functional role of individual recurrent severe COVID-19-specific chimeric transcripts could aid in understanding an individual’s immune responses to COVID-19 infection, which might help to design personalized treatment strategies.

Figure 6. Schematic representation of the generation of chimeric transcripts in severe COVID-19 and their potential functional impact.

5. Conclusions

The findings of the present study suggest a potential role of recurrent severe COVID-19-specific chimeric transcripts in alterations of immune-related processes, in activating the processes that favor viral transcription and gene expression, and in alterations of various biological and metabolic processes that could lead to the development of severe disease. Furthermore, we identified common chimeric transcripts found in severe COVID-19 and other severe respiratory viral infections. The parental genes of these common chimeric transcripts are functionally enriched for adaptive immune response and inflammatory response. These observations suggest that these chimeric transcripts could be associated with different stresses linked to the dysfunctional immune response generated during multiple severe respiratory viral infections, including COVID-19. A limitation of our study is the lack of experimental validation, as we relied on publicly available RNA-seq data from patients. Further, the publicly available datasets used in this study had varying experimental designs, and the patient samples came from diverse populations and were collected at different time points during the development of COVID-19 severity. Despite the limitations inherent to computational analysis, this study uncovered hidden layers of the blood transcriptome of patients with various respiratory viral infections and identified several potentially functionally relevant chimeric transcripts unique to severe COVID-19 and common to multiple severe respiratory viral infections. Further experimental studies are necessary to elucidate the specific functions of these chimeric transcripts in COVID-19 pathogenesis.

Acknowledgments