Abstract

The COVID-19 pandemic boosted the development of diagnostic tests to meet patient needs and provide accurate, sensitive, and fast disease detection. Despite rapid advancements, limitations remain: turnaround time, performance metrics that vary with sampling site, illness duration, and co-infections, and the need for particular reagents. As an alternative diagnostic test, we present urine analysis through flow-injection–tandem mass spectrometry (FIA-MS/MS) as a powerful approach for COVID-19 diagnosis, targeting the detection of amino acids and acylcarnitines. We adapted a method that is widely used for newborn screening tests on dried blood to urine samples in order to detect metabolites related to COVID-19 infection. We analyzed samples from 246 volunteers with diagnostic confirmation via PCR. Urine samples were self-collected, diluted, and analyzed with a run time of 4 min. A Lasso statistical classifier was built using a 75/25% split of the data into training/validation sets and achieved high diagnostic performance: 97/90% sensitivity, 95/100% specificity, and 95/97.2% accuracy. Additionally, we predicted on two withheld sets, composed of suspected hospitalized/symptomatic COVID-19 PCR-negative patients and of patients outside the optimal collection time frame for PCR diagnosis, with promising results. Altogether, we show that the benchmarked FIA-MS/MS method is promising for COVID-19 screening and diagnosis, and is also potentially useful after the peak viral load has passed.

Keywords: amino acids, COVID-19, diagnostic, metabolomics, urine

1. Introduction

SARS-CoV-2 caused the worst pandemic in the last 100 years. Modern-day laboratory medicine was highly impacted by the need to implement new technologies, shortages of workforce and supplies, equipment overload, and regulatory changes, in addition to the emergence of new mutations [1,2,3].
Considering the incessant demand for fast and accurate diagnosis, the critical role of clinical laboratory tests in human health has become apparent [4,5]. The increasing need for patient testing motivated many clinical laboratories to explore different methods for the collection [6,7,8,9,10], handling [11,12,13], and analysis [14,15,16,17,18,19] of samples, along with different specimen types [20,21]. The most commonly implemented methods for COVID-19 diagnosis rely on molecular-based tests for viral RNA detection [22,23], quantitative antigen tests based on enzyme immunoassays for saliva or nasopharyngeal swabs [24], or serological tests for anti-SARS-CoV-2 immunoglobulin detection [25].

Although the COVID-19 pandemic has accelerated the microbial diagnostics field, accurate and fast diagnosis of SARS-CoV-2 still has several limitations. Some drawbacks are the slow turnaround time, the varying performance of the tests according to sample site, illness duration, and the presence of co-infections [26], and the need for particular reagents. With the expected endemic circulation of the virus, finding new tools for COVID-19 diagnosis remains necessary, as virus surveillance may require tests in routine situations such as going to work, school, or large gatherings. The constant threat of new pathogens is an additional motivation for new diagnostic strategies [27].

Moreover, while most tests rely on analyzing nasopharyngeal swabs (NPS), using other biological samples, particularly non-invasive ones, is appealing because it facilitates sample collection. Although NPS collection is relatively well tolerated, the level of discomfort may vary according to the patient’s age and sex [28,29]. Training and anatomical knowledge are also necessary for NPS collection, as collection quality strongly impacts test sensitivity [30].
Moreover, the implementation of more comfortable and less invasive sampling may lead to higher adherence of individuals to routine testing [31,32]. Urine samples can be self-collected and are acquired noninvasively, representing an attractive alternative for non-stressful disease monitoring.

Previous studies have explored using urine analysis for COVID-19 diagnosis. For example, in a study by Li et al. using LC-MS, 25 lipids representing important molecular signatures were comparatively evaluated across urine and plasma samples, along the course of infection, for the prediction of severity in 30 patients with COVID-19, resulting in a prediction power of 0.904 and 0.988, based on the area under the curve (AUC), for urine and plasma, respectively [20,33]. In a separate study by Bi et al., also using LC-MS, the urinary metabolome was used to confirm altered cytokines and their receptors that are correlated with SARS-CoV-2 replication [34]. Altered urinary metabolomes were also found in COVID-19-infected patients suffering from acute kidney injury (AKI) when compared to healthy controls in 46 subjects. Dewulf et al. compared urine from patients hospitalized with COVID-19 with different degrees of severity against healthy controls using LC-MS and found a significant increase in the levels of tryptophan metabolites [35].

Based on these promising studies, we explored the detection of urinary amino acids and acylcarnitines by flow-injection analysis–tandem mass spectrometry (FIA-MS/MS) as a method for COVID-19 diagnosis [21,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50]. FIA-MS/MS is widely used to analyze dried blood spots (DBS) and to detect inborn errors of metabolism in newborn screening [51].
The metabolites targeted in this study, amino acids and acylcarnitines, and their relation with COVID-19 infection have also been investigated with MS in different matrices such as plasma [21,36,37,38,39,40,41,42], serum [39,40,43,44,45,47,48,49], and feces [50]. Although FIA-MS/MS is one of the most popular and successful clinical applications of MS, it has not been used for COVID-19 testing, as all of the mentioned studies employed chromatographic separation prior to MS detection. Here, we demonstrate that urinary amino acids and acylcarnitines are helpful in diagnosing COVID-19 patients based on a cohort of 246 subjects using FIA-MS/MS. We also show that statistical classifiers generated from the metabolic information allow for the diagnosis of COVID-19 with an agreement with PCR of 95%, indicating that this widespread method could be considered as a new screening tool for COVID-19.

2. Materials and Methods

2.1. Chemicals

Unlabeled amino acid standards and labeled isovaleryl-DL-carnitine-(N,N,N-trimethyl-d9) hydrochloride were purchased from Merck (Merck KGaA, Darmstadt, Germany). Acetonitrile and methanol HPLC–MS grade solvents were from J.T. Baker.

2.2. Subjects

Self-collected urine samples from 246 volunteers were prospectively obtained from July to October 2020 at three medical centers in Bragança Paulista (SP, Brazil): Santa Casa and Bragantino Hospitals, and the Integrated Unit of Pharmacology and Gastroenterology (UNIFAG). No fasting guidelines were given to the volunteers prior to sample collection. In a convenience sampling, we also recruited healthy volunteers and patients hospitalized with moderate or severe [52] symptoms after being admitted to the medical center.
Patients older than 18 years with suspicion of COVID-19 were recruited according to the following eligibility criteria: hospitalized in the medical center, non-pregnant, and without mechanical ventilation or indwelling catheter; patients who were facing imminent death were excluded. Healthy non-pregnant volunteers older than 18 were selected if they declared no previous COVID-19 infection or close contact with infected people. Institutional Review Board (IRB) approval was received for the study (protocol number 31573020.9.0000.5514, approved on 29 May 2020). Samples were collected from healthy volunteers (n = 104) and from hospitalized volunteers presenting symptoms similar to those found in COVID-19 infections (n = 142).

All the healthy and symptomatic volunteers had their diagnoses confirmed by RT-PCR analysis of nasopharyngeal swab samples, performed either for recruitment into the study or as part of their clinical care, using Brazilian-certified analysis services. RT-PCR was performed using a TaqPath COVID-19 RT-PCR IVD Kit (Thermo Fisher), and the results were interpreted using the COVID-19 Interpretative Software, according to the manufacturer’s instructions, with a cycle threshold (Ct) value of < 37. SARS-CoV-2 infection was confirmed for 99 hospitalized volunteers and ruled out for 43. Table 1 provides the patient demographic and clinical information. Patients or volunteers with inconclusive RT-PCR results were resampled or excluded.

Table 1. Clinical and demographic information of individuals recruited for the study, including SARS-CoV-2-negative non-hospitalized subjects (Neg-NH) and SARS-CoV-2-positive hospitalized subjects (Pos-H), used for model building and evaluation. Withheld Sets 1 and 2 containing symptomatic SARS-CoV-2-negative hospitalized subjects (Neg-H) are also shown.
| | Classifier ^a (Training + Validation): Neg-NH | Classifier ^a (Training + Validation): Pos-H | Withheld Set 1: Neg-H | Withheld Set 2: Pos-H | Withheld Set 2: Neg-H |
|---|---|---|---|---|---|
| Total = 246, n (%) | 104 | 42 | 24 | 57 | 19 |
| Age, mean (min–max) ^b | 38.2 (20–89) | 56.2 (21–86) | 58.8 (26–81) | 56.3 (26–77) | 59.8 (30–83) |
| Female ^c | 61 (58.7) | 13 (31.0) | 11 (45.8) | 23 (40.4) | 10 (52.6) |
| Male ^c | 43 (41.3) | 29 (69.0) | 13 (54.2) | 34 (59.6) | 9 (47.4) |
| Symptoms | | | | | |
| Fever | 0 (0.0) | 23 (54.8) | 11 (45.8) | 36 (63.2) | 10 (52.6) |
| Cough | 0 (0.0) | 30 (71.4) | 14 (58.3) | 35 (61.4) | 11 (57.9) |
| Myalgia | 0 (0.0) | 8 (19.0) | 2 (8.3) | 13 (22.8) | 3 (15.8) |
| Sore throat | 1 (1.0) | 7 (16.7) | 8 (33.3) | 8 (14.0) | 2 (10.5) |
| Headache | 3 (2.9) | 12 (28.6) | 2 (8.3) | 7 (12.3) | 5 (26.3) |
| Coryza | 0 (0.0) | 5 (11.9) | 3 (12.5) | 6 (10.5) | 2 (10.5) |
| Dyspnea | 0 (0.0) | 29 (69.0) | 16 (66.7) | 29 (50.9) | 12 (63.2) |
| Oxygen saturation < 95% | 0 (0.0) | 17 (40.5) | 10 (41.7) | 19 (33.3) | 3 (15.8) |
| Tiredness/fatigue | 0 (0.0) | 3 (7.1) | 2 (8.3) | 7 (12.3) | 3 (15.8) |
| Loss of smell or taste | 0 (0.0) | 8 (19.0) | 9 (37.5) | 12 (21.1) | 5 (26.3) |
| Vomiting or nausea | 0 (0.0) | 2 (4.8) | 3 (12.5) | 9 (15.8) | 1 (5.3) |
| Diarrhea | 0 (0.0) | 11 (26.2) | 2 (8.3) | 12 (21.1) | 3 (15.8) |
| Comorbidity | | | | | |
| SAH ^d | 16 (15.4) | 21 (50.0) | 10 (41.7) | 29 (50.9) | 8 (42.1) |
| Cardiovascular disease | 2 (1.9) | 7 (16.7) | 4 (16.7) | 10 (17.5) | 4 (21.1) |
| Obesity | 12 (11.5) | 9 (21.4) | 1 (4.2) | 13 (22.8) | 4 (21.1) |
| Diabetes mellitus | 3 (2.9) | 17 (40.5) | 2 (8.3) | 18 (31.6) | 4 (21.1) |
| Neoplasia | 0 (0.0) | 3 (7.1) | 0 (0.0) | 1 (1.8) | 1 (5.3) |
| Lung disease | 8 (7.7) | 3 (7.1) | 6 (25.0) | 5 (8.8) | 4 (21.1) |
| COPD ^e | 1 (1.0) | 1 (2.4) | 2 (8.3) | 3 (5.3) | 2 (10.5) |
| Smoker or ex-smoker | 6 (5.8) | 3 (7.1) | 4 (16.7) | 3 (5.3) | 3 (15.8) |
| Asthma | 2 (1.9) | 2 (4.8) | 2 (8.3) | 2 (3.5) | 1 (5.3) |
| Kidney disease | 0 (0.0) | 2 (4.8) | 0 (0.0) | 1 (1.8) | 1 (5.3) |
| Tomography Findings | | | | | |
| Ground glass opacity | 0 (0.0) | 40 (95.2) | 19 (79.2) | 54 (94.7) | 16 (84.2) |
| Consolidations | 0 (0.0) | 20 (47.6) | 13 (54.2) | 32 (56.1) | 8 (42.1) |
| Crazy-paving appearance | 0 (0.0) | 19 (45.2) | 10 (41.7) | 22 (38.6) | 8 (42.1) |
| Reticular pattern | 0 (0.0) | 6 (14.3) | 6 (25.0) | 16 (28.1) | 2 (10.5) |
| Pulmonary commitment degree | 0 (0.0) | 35 (83.3) | 16 (66.7) | 49 (86.0) | 12 (63.2) |
| Suggestive of viral infection | 0 (0.0) | 40 (95.2) | 19 (79.2) | 54 (94.7) | 16 (84.2) |

^a: Estimated statistical power of 99.2% (alpha = 0.05, Cohen, 1988); ^b: p-value for age is 2.7 × 10^−6 for the classifier and 0.31 for Withheld Set 2 (Mann–Whitney–Wilcoxon test); ^c: p-value for sex is 4.4 × 10^−3 and 0.35 (chi-square test) for these groups, respectively; ^d: SAH: systemic arterial hypertension; ^e: COPD: chronic obstructive pulmonary disease.

2.3. Sample Preparation

Urine samples were heat-inactivated after collection (65 °C, 30 min) [53] in a Class II biological safety cabinet before being aliquoted and frozen until extraction. All the samples were thawed at room temperature. A pooled sample was prepared from equal parts (10 μL) of each sample and then aliquoted into different quality control (QC) samples, which were extracted and distributed every ten injections for instrumental monitoring. This resulted in 10 QC samples for system suitability and 28 QC samples for intra-batch monitoring. Samples (300 μL) were randomized and centrifuged (12,000 rpm, 4 °C, 10 min). Next, the supernatant (150 μL) was collected, followed by the addition of water (120 μL), acetonitrile (15 μL), and internal standard (IS) solution (15 μL of isovaleryl-DL-carnitine-(N,N,N-trimethyl-d9) hydrochloride solution at 11.1 ng mL^−1 in methanol). Blank samples were prepared using ultrapure water instead of urine.

2.4. Flow Injection–Tandem MS Analysis

Data acquisition was performed on a Waters® Xevo TQD triple quadrupole mass spectrometer equipped with a Shimadzu® SCL-10A controller, a Shimadzu® LC-20AD pump controller, and a Shimadzu® SIL-20A automatic sampler injector. The methodology employed flow injection analysis (FIA) without chromatographic separation, with an injection volume of 10 µL. The mobile phase was composed of water:acetonitrile:formic acid (80:20:0.1 v/v/v).
A flow gradient was used, starting with zero flow until 0.5 min. The flow rate was initially set to zero so that the entire peak could be integrated without being cut off by its proximity to the y-axis. Afterward, the flow was ramped from 0 to 0.5 mL min^−1 between 0.5 and 0.51 min, maintained until 3.50 min, and then decreased to 0.1 mL min^−1, with a total run time of 4 min. Multiple reaction monitoring (MRM) transitions were optimized for each compound by analyzing labeled and unlabeled standards, as described in Supplementary Table S1 (ST1). The acquisition was controlled by the TargetLynx software (Waters).

2.5. Data Analysis and Statistical Classifiers

The ratio of the peak areas of the analytes and the IS was considered and processed using Metaboanalyst 5.0 (http://www.metaboanalyst.ca) [54]. Calculations were made based on the relative peak area ratios of each analyte/IS across the different groups. Missing values were replaced by 1/5 of the minimum positive value of the corresponding variable. Relative standard deviation (RSD) was calculated for the intra-batch QC samples, and analytes with RSD > 25% were not considered for statistical modeling. Interquartile range filtering was applied in order to remove variables with near-constant values. Data were normalized by sum, followed by generalized logarithm transformation [55] and Pareto scaling. The resulting dataset was used for statistical analysis using the least absolute shrinkage and selection operator (Lasso).

Because hospitalized patients had their urine samples collected between 0 and 95 days after the swab collection for RT-PCR diagnosis, we used a time frame to select patients in order to build the statistical classifier.
For this purpose, we considered time-qualified samples, i.e., those from volunteers with a time interval of up to two days between urine and swab collections and with symptom onset 14 days or less before urine collection. The classifier was built using 75% of the data from healthy non-hospitalized COVID-19 PCR-negative (n = 78, Neg-NH) and hospitalized COVID-19 PCR-positive (n = 32, Pos-H) patients. We validated the model with the remaining 25% of the data, composed of Pos-H (n = 10) and Neg-NH (n = 26) volunteers. Additionally, we tested the ability of this model to predict on a withheld sample set (Withheld Set 1) composed of suspected hospitalized/symptomatic COVID-19 PCR-negative (n = 24, Neg-H) patients. We also tested this classifier’s prediction on samples that were excluded because they did not meet the selected time interval criteria for swab collection/symptom onset. This sample set (Withheld Set 2) was composed of Pos-H (n = 57) and suspected hospitalized/symptomatic Neg-H (n = 19) patients.

Cutoff values for positivity definition were selected based on the receiver operating characteristic (ROC) curve for the training and validation sets. We evaluated the model’s performance for the validation and test sets by measuring the predictive accuracy, sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV), which were all calculated based on the agreement with PCR diagnosis.

Univariate analysis was performed after data normalization using the Kruskal–Wallis test for the three groups (Pos-H, Neg-NH, and Neg-H), followed by Dunn’s post hoc test, using the Benjamini–Hochberg (BH) correction for the p-value. Afterward, the Mann–Whitney test was used to examine differences between Pos-H vs. Neg-NH, Pos-H vs. Neg-H, and Neg-NH vs. Neg-H (25), followed by the BH correction of the p-value. The stability of the analytes to the heat-inactivation process was evaluated using RSD (Supplementary Table S2).
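The preprocessing steps described in this section (missing-value replacement, QC-based RSD filtering, normalization by sum, generalized logarithm transformation, and Pareto scaling) can be illustrated with a minimal Python sketch. It is not the authors' workflow, which relied on Metaboanalyst and R. The function names, the toy data, and the glog constant `lam` are assumptions made for illustration, the glog form shown is one common definition rather than necessarily the one used in the study, and the interquartile range filter is omitted for brevity.

```python
import math
import statistics


def impute_missing(column):
    """Replace missing values (None) with 1/5 of the smallest positive value."""
    positives = [v for v in column if v is not None and v > 0]
    fill = min(positives) / 5.0
    return [fill if v is None else v for v in column]


def rsd_percent(values):
    """Relative standard deviation: sample SD divided by the mean, in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def sum_normalize(sample):
    """Scale one sample so that its analyte values add up to 1."""
    total = sum(sample)
    return [v / total for v in sample]


def glog(x, lam=1.0):
    """Generalized logarithm; lam is an assumed constant, not from the paper."""
    return math.log2((x + math.sqrt(x * x + lam * lam)) / 2.0)


def pareto_scale(column):
    """Mean-center one analyte and divide by the square root of its SD."""
    mean = statistics.mean(column)
    root_sd = math.sqrt(statistics.stdev(column))
    return [(v - mean) / root_sd for v in column]


def preprocess(samples, qc_samples, rsd_limit=25.0):
    """Rows are samples; each row holds analyte/IS peak area ratios."""
    n_analytes = len(samples[0])
    # Impute each analyte column, then keep analytes whose QC RSD is within the limit.
    columns = [impute_missing([row[j] for row in samples]) for j in range(n_analytes)]
    keep = [j for j in range(n_analytes)
            if rsd_percent([row[j] for row in qc_samples]) <= rsd_limit]
    rows = [[columns[j][i] for j in keep] for i in range(len(samples))]
    # Sum normalization and glog act per sample; Pareto scaling acts per analyte.
    rows = [[glog(v) for v in sum_normalize(row)] for row in rows]
    scaled = [pareto_scale([row[k] for row in rows]) for k in range(len(keep))]
    return keep, [[scaled[k][i] for k in range(len(keep))] for i in range(len(rows))]


# Hypothetical toy data: 4 samples x 3 analytes, with one missing value.
toy_samples = [[1.0, 0.50, 2.0], [2.0, None, 1.0], [1.5, 0.25, 3.0], [3.0, 0.75, 2.5]]
toy_qc = [[2.0, 0.5, 2.0], [2.1, 0.9, 2.1], [1.9, 0.2, 1.9]]
kept, matrix = preprocess(toy_samples, toy_qc)
print(kept)
```

In this toy example the second analyte is dropped because its QC RSD exceeds 25%, mirroring the exclusion rule stated above. The resulting matrix would then be passed to a Lasso classifier.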
Calculations were performed in R version 3.6.3 (R Foundation for Statistical Computing). Discriminant metabolic markers found by Lasso analysis were submitted to pathway enrichment analysis using metabolite set enrichment analysis (MSEA) via over-representation analysis from the Metaboanalyst web platform [56]. Two metabolomics databases were interrogated, i.e., KEGG and the MSEA’s disease-associated metabolite sets using urine as a reference (Supplementary Figure S1) [56,57].

3. Results

Detection of 19 amino acids (such as alanine, leucine, glutamine, and tryptophan) and 15 acylcarnitines (such as free carnitine, malonyl-carnitine, and octadecanoyl-carnitine) was achieved from urine analysis, as presented in Supplementary Table S1, along with the relative standard deviation (RSD) measured for the QC samples (Supplementary Table S2). Although asparagine and aspartate were detected by our method, they were excluded from statistical analyses due to the higher variability measured in their peak area ratios (RSD > 25%, n = 28, Supplementary Table S2). Monitoring the labeled internal standard signal along the QC samples resulted in an RSD of 3.3% (n = 28 QC samples, Supplementary Table S2), showcasing the analytical stability of the method. Note that the heat inactivation process did not appear to alter the peak area of the analytes, as the RSD measured between heat-inactivated and non-inactivated samples was lower than 15% for the entire set of analytes (Supplementary Table S2). Twenty-nine metabolites were detected with metrics meeting the thresholds established for RSD and thermal stability (Supplementary Table S3); these were then used for statistical analysis.

Figure 1 shows that high diagnostic performances were achieved using statistical analysis for the training and validation sample sets.
Only 1 out of 32 Pos-H samples in the training set and 1 out of 10 Pos-H samples in the validation set were erroneously classified as negative, resulting in high sensitivity (97% and 90%) and negative predictive values (NPV) of 99.0% and 96.3% for the training and validation sets, respectively. Amongst negative samples, 4 out of 78 were misclassified in the training set. In contrast, none of the 26 samples were misclassified as positive in the validation set, resulting in positive predictive values (PPV) of 89.0% and 100.0% and specificities of 95% and 100% for the training and validation sets, respectively. The overall agreement with PCR was 95.0% and 97.2% for the training and validation sets, respectively. The cutoff value for classification was 0.181.

The influence of age on the classifier’s predictive performance was evaluated, and including it improved the classification metrics only minimally (Supplementary Table S4). We opted not to take this variable into account, with the goal of building a model that is independent of age, because we expect to adapt the model to different populations in the future. We observed that other studies also reported age and sex disparities in their sample sets, which is one of the disadvantages of using convenience sampling approaches. Dewulf et al. investigated a targeted urinary metabolic panel in 56 patients who were hospitalized with COVID-19 (26 non-critical and 30 critical), together with 16 healthy controls and 3 controls with proximal tubule dysfunction unrelated to SARS-CoV-2 [35]. Their control set comprised 31% men, while their positive set comprised 69–83% men. Thomas et al. also reported a divergence in age and sex when evaluating serum metabolites of patients with COVID-19 (n = 33, diagnosed by nucleic acid testing), compared with COVID-19–negative controls (n = 16).
They reported 76% of subjects in the disease group as male, aged 56.5 ± 18.1 years (mean ± standard deviation), and a control group comprising 38% men, aged 37.8 ± 11.6 years [46]. Ling Yan et al. used the serum peptidome as the diagnostic matrix for COVID-19 [58]. The group infected by COVID-19 had an average age of 46.6 ± 14.9 and 47.2 ± 15.4 (training and validation sets), whereas the control group had an average age of 32.4 ± 11.4 and 29.6 ± 10.2 for the training and validation sets.

Figure 1. The statistical classifier’s experimental design and performance when identifying healthy non-hospitalized COVID-19 PCR-negative (Neg-NH) volunteers or hospitalized COVID-19 PCR-positive (Pos-H) patients. Withheld Set 1 was composed of suspected hospitalized/symptomatic COVID-19 PCR-negative (Neg-H) patients. In contrast, Withheld Set 2 was composed of Pos-H and Neg-H patients who did not meet the time frame criteria.

When the statistical classifier was used to predict Withheld Set 1 (Supplementary Table S6A), which was composed of time-framed hospitalized/symptomatic COVID-19 PCR-negative patients (Neg-H), 22 of the 24 Neg-H samples were classified as positive. Interestingly, when inspecting the clinical data from these patients, 17 (77%) of them presented chest computed tomography (CT) scans with signs suggestive of viral infection, such as ground-glass opacity (GGO), consolidation, and pulmonary commitment [59], despite the negative PCR result. The two remaining patients, whom our model classified as negative, also presented chest CT scans suggestive of viral infection.

When inspecting the results for Withheld Set 2, the time lapse between urine sample collection and the RT-PCR test, or the number of days from symptom onset, did not strongly impact the model’s performance, as 50 (87.7%) out of 57 Pos-H samples were correctly classified as positive.
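As a worked check, the performance figures reported above can be recomputed from the classification counts given in the text, taking PCR as the reference: 1 of 32 Pos-H and 4 of 78 Neg-NH samples misclassified in the training set, 1 of 10 Pos-H and 0 of 26 Neg-NH in the validation set, and 50 of 57 Pos-H samples correctly classified in Withheld Set 2. The following Python sketch is illustrative only; the function name is hypothetical and it is not part of the original analysis.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return performance metrics (as fractions) from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Counts taken from the text, with PCR as the reference standard.
training = diagnostic_metrics(tp=31, fn=1, tn=74, fp=4)
validation = diagnostic_metrics(tp=9, fn=1, tn=26, fp=0)
withheld2_sensitivity = 50 / 57  # only Pos-H samples were scored in Withheld Set 2

for name, metrics in (("training", training), ("validation", validation)):
    summary = ", ".join(f"{key} {100 * value:.1f}%" for key, value in metrics.items())
    print(f"{name}: {summary}")
print(f"Withheld Set 2 sensitivity: {100 * withheld2_sensitivity:.1f}%")
```

Rounded to whole percentages, the training values (31/32, 74/78, 31/35, 74/75, and 105/110) give 97%, 95%, 89%, 99%, and 95%, and the validation values give 90%, 100%, 100%, 96.3%, and 97.2%, in agreement with the reported metrics.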
Neg-H samples were not considered when calculating the classifier’s performance, since the diagnoses for these patients were not conclusive based on their clinical attributes. Nonetheless, 14 of 19 Neg-H samples in Withheld Set 2 were classified in agreement with their chest CT scan findings. Detailed clinical information for the patients from Withheld Set 2 is given in Supplementary Table S6B in the Data Supplement.

The statistical classifier, built using the Lasso algorithm, was based on 14 predictive metabolites, which were given associated mathematical weights according to their relevance to each classifier class, as described in Figure 2. Some of the Lasso-selected variables were also significant in the univariate statistical analysis, in terms of fold change and adjusted p-value, as presented in Supplementary Table S5A. To visualize the changes in metabolite abundance, including in the Neg-H group, which was not accounted for in the binary Lasso model, we additionally performed a univariate analysis based on the Kruskal–Wallis test (Figure 3 and Supplementary Table S5B). We could not find any significant metabolic alteration when comparing the Neg-H and Pos-H groups, which is in agreement with their similar clinical states. On the other hand, 13 of the 14 metabolites indicated by the Lasso analysis were also altered between Neg-NH and Neg-H, evidencing how the metabolites are affected by hospitalization and clinical symptoms.

Figure 2. Metabolites selected by Lasso analysis, the multiple reaction monitoring (MRM) transitions of detection (precursor > fragment), relative standard deviation (RSD) for quality control (QC) samples, and their weights for the Lasso model. * Acylcarnitines are expressed by the number of carbons on the chain.

Figure 3.
Box plots for the analytes found as discriminatory by the Lasso model and their abundance between the three classes of volunteers: healthy non-hospitalized COVID-19 PCR-negative (Neg-NH) volunteers, hospitalized COVID-19 PCR-positive (Pos-H) patients, and suspected hospitalized/symptomatic COVID-19 PCR-negative (Neg-H) patients. Univariate analysis was performed using the Kruskal–Wallis test. Significance flags: * p < 0.05; ** p < 0.01; *** p < 0.001; **** p < 0.0001.

To investigate the biological significance of the metabolites selected by our model and evaluate whether the changes observed in the chemical patterns were correlated with biological processes involved in infections, we performed a metabolite enrichment analysis of the discriminatory analytes. This analysis resulted in seven significantly altered pathways (FDR < 0.05), as shown in Figure 4.

Figure 4. Metabolite enrichment analysis via over-representation analysis for the discriminatory analytes found by Lasso analysis. A set of human metabolites from the KEGG library was used, and the adjusted p-values for the pathways are represented by the color bar, with the significant ones (FDR < 0.05) displayed numerically.

4. Discussion

The method we developed for COVID-19 diagnosis is an adaptation of the well-known and worldwide established newborn screening methodology based on selective MS/MS detection. Utilizing a cohort of 246 RT-PCR validated samples, we opted to build a classifier using samples selected by rigorous criteria intended to ensure maximum viral load, based on the proximity between the onset of symptoms and the RT-PCR collection date.
Using this approach, we showed that a panel of amino acids and acylcarnitines could be used to develop classification models that are highly sensitive (>90%), specific (>95%), and accurate (>95%) for COVID-19 screening (Figure 1). The reported performances of serological or antigen tests for diagnosis or confirmation of SARS-CoV-2 infection present sensitivities ranging from 21.8 to 97.9% (serological) and 34.1 to 96% (antigen), as recently reviewed by Bastos et al. [25] and Dinnes et al. [60]. These authors reviewed 104 studies, including 38 serological tests and 16 antigen tests applied to symptomatic volunteers, finding specificities ranging from 80.6 to 100% for serological tests and 34.1 to 96% for antigen tests. Böger et al. reviewed the performance of RT-PCR of nasopharyngeal specimens in four different studies and found 73.3% sensitivity and 100% specificity [61].

The method we introduced here requires only a simple sample workup consisting of dilution and centrifugation. We developed the method to provide a short processing time, with a run time of 4 min and no chromatographic separation. Good sensitivity and specificity rates were found, as well as the ability to detect COVID-19 infection outside the “optimal detection window”, as presented in Figure 1. Altogether, these results showcase the potential of FIA-MS/MS to be used as a screening technique or for time course follow-up. However, further studies for clinical validation should evaluate co-infection with other viruses, such as influenza, and include positive asymptomatic people and other virus variants.

The classification results obtained for Withheld Set 1, composed of samples from suspected Neg-H patients, suggest that our classifier mainly reflects patient infection status, given the agreement with chest CT scan results (Figure 1). The chest CT scan is a fundamental tool for COVID-19 diagnosis and monitoring.
However, it cannot differentiate between an active and a previous viral infection or, indeed, indicate the viral pathogen, resulting in lower specificity than RT-PCR for COVID-19 diagnosis [62,63,64,65,66]. For patients from Withheld Set 1, the negative result from the RT-PCR test was in disagreement with their clinical profile and chest CT scan findings in most cases (19 out of 24). Of the 19 Neg-H patients with a chest CT suggestive of viral infection, our model classified 17 as positive for COVID-19. For example, patient #34 (see Supplementary Table S6A in the Data Supplement), a 76-year-old male, received a negative RT-PCR result but was classified as positive by our classifier. The patient presented a chest CT scan that was suggestive of viral infection, with ground-glass opacity, consolidations, and pulmonary commitment (50%). The patient was in the intensive care unit (ICU) for 13 days, 11 of which required the use of mechanical ventilation, until death.

As recognized by many studies [62,63,64,65], repeated PCR tests should be used for patients with an inconclusive diagnosis in order to diagnose COVID-19 more accurately. Repeated PCR tests were not performed for the patients in our study, which could have resulted in false-negative diagnoses. The disagreement between RT-PCR and chest CT scan results for the Neg-H volunteers, together with the absence of a second-tier or confirmatory test for these individuals, motivated their exclusion from the training/validation sets and the decision to predict them separately as Withheld Set 1.

The effect of the time lapse between symptom onset and sample collection day was interrogated by analyzing Withheld Set 2, which showed similar results to those acquired for the training and validation sets.
The results suggest that the detected metabolic alterations enabled sample classification for patients who were assessed more than 14 days from symptom onset and more than two days after RT-PCR detection.

Based on the altered pathways identified, the selected molecular panel appears to correlate with systemic molecular changes that are associated with COVID-19 infection (as shown in Figure 4). The altered amino acids were related to alterations in pathways involved in processes such as cellular bioenergetics [67,68,69], immune regulation [70,71,72], metabolic changes [73,74,75], oxidative stress [76,77,78], and protein regulation [79,80,81]. Many metabolic alterations were also found to be correlated with amino acid alterations, specifically during COVID-19 infection [67,68,69,70,71,72,73,74,75,76,77,78,79,80,81]. To better correlate our findings with known alterations in the related pathways, we organized Table 2. This table summarizes the altered amino acids, the MSEA-impacted pathway, other related biological processes, and how these pathways and processes might be impacted during COVID-19 infection, according to the literature.

For example, glycine urinary levels were decreased in infected volunteers (p < 2.2 × 10^−16). According to MSEA, this alteration significantly impacted pathways such as aminoacyl-tRNA biosynthesis; glyoxylate and dicarboxylate metabolism; and glutathione metabolism. Glycine acts on the regulation of pro-inflammatory cytokines that control immune response [82,83], and an increased level of this amino acid may be related to a decrease in oxidative stress and inflammatory processes [83,84], both of which were reported during COVID-19 infection [70,71,72,76,77,78].
Another altered amino acid, valine, is a precursor of the cofactor CoA and acts on mitochondrial protein transporters [[225]85], which will ultimately impact the release of acylcarnitines.

Table 2. Amino acids and pathways significantly altered according to the metabolic set enrichment analysis (MSEA), as well as other related biological processes and their reported biological function when considering COVID-19 infection.

Amino Acids | Impacted Pathway (MSEA) | Related Pathway | Impact in COVID-19
Glycine (Gly) | Aminoacyl-tRNA biosynthesis; Glyoxylate and dicarboxylate metabolism; Glutathione metabolism | Immune regulation [[226]82,[227]83]; Oxidative stress [[228]83,[229]84] | [[230]70,[231]71,[232]72,[233]76,[234]77,[235]78]
Valine (Val) | Aminoacyl-tRNA biosynthesis; Pantothenate and CoA biosynthesis | Immune regulation [[236]86] | [[237]70,[238]71,[239]72]
Cysteine (Cys) | Aminoacyl-tRNA biosynthesis; Pantothenate and CoA biosynthesis | Oxidative stress [[240]82,[241]85,[242]87]; Protein regulation [[243]85,[244]88] | [[245]76,[246]77,[247]78,[248]79,[249]80,[250]81]
Tryptophan (Try) | Aminoacyl-tRNA biosynthesis | Immune regulation [[251]35,[252]41,[253]46,[254]74,[255]89,[256]90] | [[257]70,[258]71,[259]72]
Phenylalanine (Phe) | Aminoacyl-tRNA biosynthesis | Bioenergetics [[260]41,[261]46]; Immune regulation [[262]35,[263]74,[264]89,[265]90] | [[266]67,[267]68,[268]69,[269]70,[270]71,[271]72]
Glutamine (Gln) | Aminoacyl-tRNA biosynthesis; D-Glutamine and D-glutamate metabolism; Nitrogen metabolism; Glyoxylate and dicarboxylate metabolism; Arginine biosynthesis | Immune regulation [[272]74,[273]91,[274]92]; Metabolic changes [[275]46,[276]93]; Oxidative stress [[277]94,[278]95] | [[279]70,[280]71,[281]72,[282]73,[283]74,[284]75,[285]76,[286]77,[287]78]
Glutamate (Glu) (glutamic acid) | Aminoacyl-tRNA biosynthesis; D-Glutamine and D-glutamate metabolism; Nitrogen metabolism; Glyoxylate and dicarboxylate metabolism; Arginine biosynthesis | Metabolic changes [[288]39,[289]49,[290]93]; Oxidative stress [[291]39,[292]46] | [[293]73,[294]74,[295]75,[296]76,[297]77,[298]78]

The Lasso analysis also ranked acylcarnitines as important markers for COVID-19 infection ([300]Figure 2). These molecules act mainly inside the mitochondria during the beta-oxidation process and serve as an active acyl-group buffer [[301]96,[302]97], which plays an essential role in cellular bioenergetics [[303]98]. Outside the mitochondrial membrane, acyl-CoA is formed and enzymatically converted to acylcarnitine, which passes through the outer mitochondrial membrane and then crosses the inner mitochondrial membrane via the Carnitine:Acylcarnitine Carrier (CAC) antiport protein [[304]96,[305]99,[306]100]. Changes in plasma acylcarnitine profiles are directly related to cardiovascular and metabolic syndrome [[307]94,[308]101,[309]102,[310]103], the main comorbidities associated with our sampling set ([311]Table 1, section Comorbidity), and to COVID-19 itself [[312]104,[313]105,[314]106], where they reflect worse outcomes [[315]107]. Long-chain acylcarnitines (C12-C20) accumulate at the air–fluid interface of the lungs in response to stress, such as influenza infection [[316]108]. The fine control and clearance of these metabolites are carried out by the kidneys [[317]96], which could explain their detection in urine.

5. Conclusions In conclusion, we showed that urine analysis, using an adaptation of the known method for newborn screening by FIA-MS/MS, is a promising methodology for COVID-19 screening and diagnosis, with the potential to be used even after the peak viral load has passed. The non-invasive sample collection, the lack of need for specific primers, and the possibility of using existing laboratory resources to implement the methodology support the feasibility of fully validating the technique. Such validation should include multi-center trials, as occurs for newborn screening programs. 
Our method also revealed substantial changes in the metabolome of infected patients and pointed out the relation of COVID-19 to other diseases, providing insights into the pathophysiology of the disease. Importantly, our method uses urine, a non-invasive and self-collectible sample that would ease the collection procedure without overburdening medical staff. Urine has also been shown to contain dense and consistent biological information regarding COVID-19 infection. Further work should focus on measuring the specificity of the method for samples obtained from patients presenting multiple pathogens, as well as its ability to detect COVID-19 in asymptomatic infected individuals and to distinguish COVID-19 infection from other critical diseases. Longitudinal experiments following the time course of the infection would also be valuable to better understand the metabolic changes in urine during different phases of the infection. The challenges faced in developing new alternatives for COVID-19 screening underscore the need to provide new methodological insights ahead of the next health security crisis. Acknowledgments