Abstract

Background
COVID-19 is a highly infectious respiratory disease with a wide range of clinical presentations. Although many studies have reported the lipidomic signature of COVID-19, the molecular changes in asymptomatic severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)-infected individuals remain elusive.

Methods
We performed a comprehensive lipidomic analysis of 220 plasma samples from 166 subjects: 62 healthy controls, 16 asymptomatic infections, and 88 COVID-19 patients. We quantified 732 lipids in this cohort. We performed a differential analysis, validated the results with machine learning models, and performed GO and KEGG pathway enrichment analysis on the differential lipids from each group comparison.

Results
We found 175 differentially expressed lipids associated with SARS-CoV-2 infection, disease severity, and viral persistence in patients with COVID-19. PC (O-20:1/20:1), PC (O-20:1/20:0), and PC (O-18:0/18:1) could distinguish asymptomatic infected individuals from healthy individuals. Furthermore, some patients remained positive for SARS-CoV-2 nucleic acid by RT-PCR for a long period (≥60 days, designated here as long-term nucleic acid test positive, LTNP), whereas other patients became negative for viral nucleic acid within a shorter period (≤45 days, designated as short-term nucleic acid test positive, STNP). TG (14:1/14:1/18:2) and FFA (4:0) were differentially expressed between LTNP and STNP patients.

Conclusion
In summary, integrating lipid information can help discover novel biomarkers for identifying asymptomatic individuals and deepen our understanding of the molecular pathogenesis of COVID-19.
Keywords: SARS-CoV-2, COVID-19, asymptomatic infection, lipids, long-term nucleic acid test positive

Introduction
At the end of 2019, an acute respiratory disease caused by SARS-CoV-2 emerged, spread rapidly all over the world, and attracted extensive attention (1). Coronavirus disease 2019 (COVID-19) is a highly contagious disease that targets the respiratory tract. Asymptomatic infections (AS) are individuals who carry SARS-CoV-2 but do not have any symptoms (2). The number of asymptomatic subjects has raised concerns worldwide because they are difficult to identify. In addition, some patients remained positive for SARS-CoV-2 nucleic acid by RT-PCR for a long period (≥60 days, herein designated as long-term nucleic acid test positive, LTNP), whereas others tested negative for the viral nucleic acid within a shorter period (≤45 days, designated as short-term nucleic acid test positive, STNP). Asymptomatic and LTNP COVID-19 patients both threaten global public health. However, little is known about the mechanisms that distinguish LTNP from STNP, or AS from healthy controls. At present, research on virus–host interaction generally focuses on proteomic changes. However, metabolites, as the final products of cellular processes, provide different insights into the mechanism of viral infection (3). Previous studies have shown that the pathogenesis of viral infection is closely related to the lipid metabolism of infected cells. 25-Hydroxycholesterol (25-HC) inhibits viral invasion by inducing inflammatory and immune responses (4). It has been found that the replication of SARS-CoV and SARS-CoV-2 is inhibited by 25-HC (5). Omega-3 PUFA-derived lipid mediator protectins have been shown to inhibit influenza virus replication by blocking viral mRNA export (3). Lipids play an important role in the life cycle of viruses.
Their involvement in virus infection includes fusion of the viral membrane with the host cell, virus replication, endocytosis, and exocytosis (6). SARS-CoV-2 is an enveloped virus surrounded by a lipid bilayer. Viruses manipulate host cells by targeting lipid synthesis and signal transduction pathways. They reprogram host cells to produce the lipids needed for their envelopes (7). Therefore, understanding the alterations in host cell lipids can help identify potential biomarkers that distinguish between healthy individuals, those with asymptomatic infections, and individuals with different types of symptoms caused by SARS-CoV-2. This knowledge can further aid in the development of targeted treatments. In this study, we aimed to investigate the potential pathogenesis of COVID-19 by analyzing the host response to SARS-CoV-2 infection in humans. We conducted lipidomics analysis on 220 plasma samples obtained from 166 individuals, including 104 SARS-CoV-2-infected individuals and 62 healthy controls. We identified 175 differentially expressed lipids that were associated with SARS-CoV-2 infection, disease severity, and viral persistence in the COVID-19 patients. For example, we found that 81 lipids showed significant differences between AS and healthy controls, and 15 lipids could distinguish between LTNP and STNP. Moreover, we used machine learning to further identify and verify potential biomarkers for different levels of disease severity. These results provide potential biomarkers for studying the mechanisms of COVID-19, particularly in relation to AS, LTNP, and STNP. Furthermore, these findings may help identify diagnostic and treatment targets for SARS-CoV-2 infection.

Methods

Ethics statement
This study was approved by the Ethics Committee of the First Affiliated Hospital of Xi’an Jiaotong University (XJTU1AF2020LSK-015) and the Renmin Hospital of Wuhan University (WDRY2020-K130).
All participants enrolled in this study provided written informed consent by themselves or through their surrogates. The high throughput sequencing of plasma samples was performed on existing samples collected during standard diagnostic tests, posing no extra burden to patients.

Case definition and study cohort
The definition and classification of all COVID-19 patients in this study follow the Guidelines of the World Health Organization and the “Guidelines on the Diagnosis and Treatment of the Novel Coronavirus Infected Pneumonia” developed by the National Health Commission of People’s Republic of China (8–10). This study cohort included 220 plasma samples derived from 166 individuals, consisting of healthy controls (HC, n=62), asymptomatic infections (AS, n=16), and symptomatic patients (SM, n=88). Symptomatic patients consisted of moderate disease (MD, n=42) and severe disease (SD, n=46) cases. Within the SM group, according to the duration of nucleic acid test positivity, 17 individuals were classified as long-term nucleic acid test positive (LTNP, ≥60 days) and 34 as short-term nucleic acid test positive (STNP, ≤45 days). Because most of the COVID-19 patients hospitalized in the Renmin Hospital in Wuhan tested negative for viral nucleic acid within 45 days, we defined STNP as ≤45 days and LTNP as ≥60 days. The demographic features, clinical laboratory testing results, and other relevant information are provided in Supplementary Table 1.

Blood sample collection and plasma preparation
Peripheral blood was collected into standard EDTA-K2 Vacuette Blood Collection Tubes (Jiangsu Yuli Medical Equipment Co., Ltd, China; Cat.Y09012282) and stored at room temperature or 4°C until processed; the sample storage time did not exceed 12 hours.
The plasma was prepared by centrifuging the whole blood sample at 2500 rpm for 20 minutes and was stored at -80°C until used. All experimental procedures were completed inside a biosafety level 2 (BSL-2) laboratory at the Department of Clinical Diagnostic Laboratories, Renmin Hospital of Wuhan University. A total of 220 plasma samples were collected from the 62 HC and 104 SARS-CoV-2-infected individuals (AS and SM), with some individuals sampled at multiple time points. For lipid compounds, samples were thawed on ice, vortexed for around 10 s, and then centrifuged at 3000 rpm at 4°C for 5 min. We took 50 μL of each sample and homogenized it with 1 mL of a mixture containing methanol, MTBE, and the internal standards. We vortexed the mixture for 2 min, added 500 μL of water, vortexed for 1 min, and centrifuged at 12,000 rpm at 4°C for 10 min. We extracted 500 μL of supernatant and concentrated it. We dissolved the resulting powder in 100 μL of mobile phase B and stored it at -80°C. Finally, we transferred the solution into a sample vial for LC-MS/MS analysis.

HPLC conditions
For lipid compounds, the sample extracts were analyzed using an LC-ESI-MS/MS system (UPLC, Shim-pack UFLC SHIMADZU CBM 30A system, https://www.shimadzu.com/; MS, QTRAP® System, https://sciex.com/). The analytical conditions were as follows. UPLC: column, Waters ACQUITY UPLC HSS T3 C18 (1.8 µm, 2.1 mm × 100 mm); column temperature, 40°C; flow rate, 0.4 mL/min; injection volume, 5 μL; solvent system, water (0.04% acetic acid): acetonitrile (0.04% acetic acid); gradient program, 95:5 V/V at 0 min, 5:95 V/V at 11.0 min, 5:95 V/V at 12.0 min, 95:5 V/V at 12.1 min, 95:5 V/V at 14.0 min. The effluent was alternatively connected to an ESI-triple quadrupole-linear ion trap (QTRAP)-MS.
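The gradient program above lists solvent ratios only at five time points. As a minimal sketch, and assuming the composition changes linearly between consecutive breakpoints (the text does not state the ramp shape), the water:acetonitrile ratio at any time in the 14 min run could be estimated as follows. The function name and data layout are illustrative and are not part of the instrument software.

```python
from bisect import bisect_right

# Breakpoints copied from the gradient program in the text:
# (time in min, % water phase, % acetonitrile phase), both with 0.04% acetic acid.
GRADIENT = [
    (0.0, 95.0, 5.0),
    (11.0, 5.0, 95.0),
    (12.0, 5.0, 95.0),
    (12.1, 95.0, 5.0),
    (14.0, 95.0, 5.0),
]


def solvent_composition(t_min):
    """Return (% water, % acetonitrile) at t_min.

    Assumption: linear interpolation between listed breakpoints.
    """
    times = [point[0] for point in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the 0-14 min gradient program")
    # Index of the last breakpoint at or before t_min.
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:
        return GRADIENT[-1][1], GRADIENT[-1][2]
    t0, w0, a0 = GRADIENT[i]
    t1, w1, a1 = GRADIENT[i + 1]
    frac = (t_min - t0) / (t1 - t0)
    return w0 + frac * (w1 - w0), a0 + frac * (a1 - a0)
```

Under this assumption, the composition is an even mixture halfway through the 11 min ramp, and the 12.0 to 12.1 min step returns the column to the starting conditions for re-equilibration.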
ESI-Q TRAP-MS/MS
LIT and triple quadrupole (QQQ) scans were acquired on a triple quadrupole-linear ion trap mass spectrometer (QTRAP), QTRAP® LC-MS/MS System, equipped with an ESI Turbo Ion-Spray interface, operating in positive and negative ion mode, and controlled by Analyst 1.6.3 software (Sciex). For lipid compounds, the following settings applied: source temperature, 550°C; ion spray voltage (IS), 5500 V (positive) and -4500 V (negative); ion source gas 1 (GS1), gas 2 (GS2), and curtain gas (CUR) were set at 55, 60, and 25 psi, respectively. The collision gas (CAD) was set to high. Instrument tuning and mass calibration were performed with 10 and 100 μmol/L polypropylene glycol solutions in QQQ and LIT modes, respectively. The declustering potential (DP) and collision energy (CE) were further optimized for individual MRM transitions. A specific set of MRM transitions was monitored in each period according to the metabolites eluting within that period, for both hydrophilic and hydrophobic compounds.

Statistical analysis and machine learning
The mass spectrum data were processed in Analyst 1.6.3 software. The characteristic ions of each substance were screened by QQQ. The MultiQuant software was used to open the mass spectrum files of the samples, and the chromatographic peaks detected for each metabolite in different samples were integrated and corrected according to the retention time and peak shape information of the metabolites. The statistical analysis was performed using the first blood sample collected from each participant. Orthogonal Partial Least Squares-Discriminant Analysis (OPLS-DA) was used to evaluate the differences between groups. Metabolites were selected using the following criteria: a variable importance in projection (VIP) value ≥ 1 and a fold change (FC) ≥ 1.5 or ≤ 0.67. The normalized quantitation of metabolites was used as modeling data for each pair of compared groups. Then, 75% of the modeling data was selected as the training cohort, and the rest was used as the testing cohort.
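The selection criteria and the 75%/25% split described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the thresholds come from the text, but the function names, the random seed, the unstratified shuffle, and the example values are all assumptions made for demonstration.

```python
import random

# Thresholds stated in the text.
VIP_MIN = 1.0
FC_UP = 1.5
FC_DOWN = 0.67


def classify_lipid(fold_change, vip):
    """Label a lipid 'up', 'down', or 'ns' (not significant)."""
    if vip >= VIP_MIN and fold_change >= FC_UP:
        return "up"
    if vip >= VIP_MIN and fold_change <= FC_DOWN:
        return "down"
    return "ns"


def split_train_test(sample_ids, train_fraction=0.75, seed=0):
    """Shuffle sample IDs and return (train, test) lists.

    Assumption: a simple unstratified random split; the text does not
    say whether the split was stratified by group.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# Hypothetical (fold change, VIP) pairs for illustration only;
# these are not values from the study.
example = {
    "lipid_a": (2.1, 1.4),
    "lipid_b": (0.5, 1.2),
    "lipid_c": (1.8, 0.7),
}
labels = {name: classify_lipid(fc, vip) for name, (fc, vip) in example.items()}

# Hypothetical cohort of 20 sample IDs.
train_ids, test_ids = split_train_test(range(20))
```

Note that a lipid with a large fold change but VIP below 1 (such as the hypothetical `lipid_c`) is not selected, because both criteria must hold.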
From the training set, we selected important lipids via machine learning using the xgboost method with the R package “xgboost” (version 1.4.1.1) (11). To compare the five classes, 28 subclasses, carbon chain length, and degree of saturation of lipids in the five compared groups, we used independent t-tests when the data were normally distributed and homoscedastic; otherwise, the Mann-Whitney U test was used. The statistical analyses were performed in SPSS version 18.0 software. These analyses used the mean raw intensities of lipid molecules grouped by the five classes, the 28 subclasses, and the carbon chain length and degree of unsaturation within each class.

Results

Lipid profiling of COVID-19 plasma
In this study, lipidomics analysis was conducted on 220 plasma samples obtained from a total of 166 individuals. The cohort included 62 healthy controls, 16 asymptomatic infections, 42 individuals with moderate disease, and 46 individuals with severe disease. Additionally, repeated samples (collected twice or more) were available from 5 healthy controls, 12 individuals with moderate disease, and 23 individuals with severe disease. Detailed descriptions of the 166 individuals are shown in Table S1. We used OPLS-DA to analyze the 220 plasma samples.
In total, we identified five major lipid classes [fatty acyls (FA), glycerophospholipids (GP), glycerolipids (GL), sphingolipids (SP), and sterol lipids (ST)] containing 28 lipid subclasses [carbocyclic fatty acids (CAR), free fatty acid (FFA), diglyceride (DG), monoglyceride (MG), lyso-phosphatidic acid (LPA), lyso-phosphatidylcholine (LPC), triglyceride (TG), ether-linked lyso-phosphatidyl-cholines (LPC-O), lyso-phosphatidylethanolamine (LPE), lyso-phosphatidylglycerol (LPG), lyso-phosphatidylinositol (LPI), lyso-phosphatidylserine (LPS), phosphatidic acid (PA), phosphatidylcholine (PC), cholesterylesters (CE), ether-linked phosphatidyl-cholines (PC-O), phosphatidylethanolamine (PE), phosphatidylglycerol (PG), phosphatidylinositol (PI), phosphatidylserine (PS), ceramides (Cer/Cert/Cerm), ceramides phosphate (CerP), and sphingomyelin], totaling 732 lipid molecules; detailed information on the lipids is listed in Table S2. We analyzed differentially expressed lipids (DELs) in five group comparisons: COV vs HC, AS vs HC, SM vs AS, SD vs MD, and LTNP vs STNP (Table S3).

Lipids associated with SARS-CoV-2 infection
Fifty lipids were present at significantly different levels in COVID-19 patients compared to healthy controls (VIP ≥ 1 and FC ≥ 1.5 or FC ≤ 0.67) (Figures 1C, D). The DELs were then used for KEGG pathway analysis. The results suggested that the DELs were mainly involved in glycerophospholipid metabolism (hsa00564) and glycerolipid metabolism (hsa00561) (Supplementary Figure 5A).

Figure 1. Lipids associated with SARS-CoV-2 infections. (A) The raw intensity of 28 subclasses of lipids in COV vs HC. (B) The raw intensity of different degrees of unsaturation for FA, GL, GP, SP, and ST classes in COV vs HC. (C) The OPLS-DA scores plot of COV vs HC. (D) The volcano plot of COV vs HC.
Red dots represent the upregulated lipids (FC ≥ 1.5, VIP ≥ 1); blue dots represent the downregulated lipids (FC ≤ 0.67, VIP ≥ 1); gray dots represent lipids without significant changes (0.67 [...] 14); GL: short chain (C<32), medium chain (C=32 or C=34), long chain (C>34); GP: short chain (C<32), medium chain (C=32 or C=34), long chain (C>34); SP: short chain (C<32), medium chain (C=32 or C=34), long chain (C>34); ST: short chain (C<18), medium chain (C=18 or C=20), long chain (C>20).

Supplementary Figure 4. Receiver operating characteristic (ROC) and performance of the xgboost model in the test set for 5 compared groups.
Supplementary Figure 5. The KEGG analysis of the differentially expressed lipids in 5 compared groups.
Supplementary Table 1. The detailed characteristic information of samples used in this study.
Supplementary Table 2. Summary of lipids in this study.
Supplementary Table 3. The differentially expressed lipids in 5 compared groups.
Supplementary Table 4. The KEGG results for differentially expressed lipids in 5 compared groups.

References