Abstract

Influenza represents a major and ongoing public health hazard. Current collaborative efforts are aimed toward creating a universal flu vaccine with the goals of both improving responses to vaccination and increasing the breadth of protection against multiple strains and clades from a single vaccine. As an intermediate step toward these goals, the current work is focused on evaluating the systemic host response to vaccination in both normal and high-risk populations, such as the obese and geriatric populations, which have been linked to poor responses to vaccination. We therefore employed a metabolomics approach using a time-course (n = 5 time points) of the response to human vaccination against influenza from the time before vaccination (pre) to 90 days following vaccination. We analyzed the urinary profiles of a cohort of subjects (n = 179) designed to evenly sample across age, sex, BMI, and other demographic factors, stratifying their responses to vaccination as “High”, “Low”, or “None” based on the seroconversion measured by hemagglutination inhibition assay (HAI) from plasma samples at day 28 post-vaccination. Overall, we putatively identified 15,903 distinct, named, small-molecule structures (4473 at 10% FDR) among the 895 samples analyzed, with the aim of identifying metabolite correlates of the vaccine response, as well as prognostic and diagnostic markers from the periods before and after vaccination, respectively. Notably, we found that the metabolic profiles could unbiasedly separate the high-risk High-responders from the high-risk None-responders (obese/geriatric) within 3 days post-vaccination. The purine metabolites Guanine and Hypoxanthine were negatively associated with high seroconversion (p = 0.0032, p < 0.0001, respectively), while Acetyl-Leucine and 5-Aminovaleric acid were positively associated.
Further changes in Cystine, Glutamic acid, Kynurenine, and other metabolites implicated early oxidative stress (3 days after vaccination) as a hallmark of the High-responders. Ongoing efforts are aimed toward validating these putative markers using a ferret model of influenza infection, as well as an independent cohort of human seasonal vaccination and human challenge studies with live virus.

Keywords: metabolomics, influenza, vaccine, LCMS

1. Introduction

Influenza (flu) is a viral infection affecting the respiratory system, with two major virus types, influenza A virus (IAV) and influenza B virus (IBV), contributing to human disease. The virus is highly contagious and airborne, with symptoms ranging from mild to deadly [1]. Both IAV and IBV contribute to seasonal infections, while the pandemic strains typically arise from the IAV clade. In recent years, the threat of a pandemic has become more widely acknowledged, but seasonal (epidemic) influenza is still associated with significant morbidity and mortality worldwide, with estimates of an average of 389,000 annual deaths [2] between 2002 and 2011. Therefore, IAV represents a significant public-health issue, and further work is needed to improve the prevention, surveillance, diagnosis, and treatment strategies and to better understand the molecular underpinnings of the immune response to IAV following infection and vaccination.

Infection by IAV or IBV begins with targeting of the epithelial cells of the respiratory system by viral hemagglutinin (HA), which mediates cell entry, trafficking to the endosomes, and, ultimately, import to the cell nucleus, where the transcription of cRNA and vRNA takes place. The expression of the viral protein and RNA activates the innate and adaptive immune responses, leading to overt symptoms of infection and changes in cellular metabolism [3,4,5]. IAV strains consist of different combinations of the two surface proteins found on the virus, hemagglutinin (HA) and neuraminidase (NA).
There are 18 subtypes of hemagglutinin (H1–H18) and 11 subtypes of neuraminidase (N1–N11) [1]. NAs make up approximately 10–20% of influenza surface proteins, while HAs make up approximately 80–90% of surface proteins, partly explaining why most vaccine designs target the HA protein. IBV is similar, and strains are similarly grouped by HA, but divided into two different lineages (B/Victoria or B/Yamagata) instead of subtypes. Currently, H1N1, H3N2, B/Victoria, and B/Yamagata are all co-circulating seasonally in humans.

One driver of the need for an annual vaccination is antigenic drift, in which mutations [6,7,8,9] accumulate in the genes that code for the antibody binding site, reducing binding recognition by the existing host antibodies. Antigenic shift during co-infection further generates the potential for novel immune-evading combinations of viral glycoproteins. Because the IAV genome comprises eight segments, this reassortment during co-infection has the potential to generate hundreds (2^8 = 256) of unique genetic combinations of the two parental strains. Because of these challenges, influenza vaccination can lack specificity and efficacy [10].

Currently, the seasonal vaccine strains are selected using statistical modelling [11] of the observed configurations and other pre-season metrics. Most seasonal vaccines are trivalent or quadrivalent [12], and the vaccination efficacy can vary dramatically from year to year, while the responses can be population-dependent. This changing landscape of strain selection and seasonal vaccination highlights the high-risk populations who exhibit poor responses to vaccination. The 2009 H1N1 (IAV) pandemic revealed obesity (BMI > 30) as a leading risk factor for more severe infections and higher mortality [13].
The geriatric population (>65 years old) has also long been recognized [14] as having less robust responses to influenza vaccination, leading to the use of high-dose or adjuvanted vaccine designs for this population. For each of these high-risk populations, the mechanism of reduced vaccine efficacy is multi-factorial and incompletely understood. While research suggests that there is a link between high-risk populations and seroconversion following influenza vaccination, little is known about the underlying metabolic mechanisms. Further characterization at the molecular level is also needed to understand what constitutes a robust immune response to vaccination and to link these changes to the vaccine design and function of the host immune system.

The two high-risk populations of interest in the current study both share systemic changes in their overall metabolism. Therefore, we hypothesized that the metabolic profiles of subjects undergoing influenza vaccination may reveal seroconversion-dependent changes in their systemic metabolism, which could aid in characterizing, identifying, and predicting the biochemical processes mediating a robust immune response to vaccination. Because metabolites are the end products of cellular regulatory processes, metabolomics is generally considered to be the most sensitive of the omics disciplines at detecting differences associated with the phenotype [15,16,17] and is playing an increasingly impactful role in the investigation of novel mechanisms of pathophysiology.

To begin addressing these questions, we examined a cohort of healthy adults undergoing annual influenza vaccination, using metabolomic analysis to identify potential metabolite markers that may be linked to an effective response to vaccination among high-risk groups. Our cohort was sampled cross-sectionally with respect to age, sex, BMI, and other demographic factors so that we could associate metabolic changes with the vaccine response.
We aimed to use seroconversion to the vaccine strains as a proxy for a protective immune response to influenza vaccination, with the goal of then correlating the seroconversion score with various time-dependent metabolic changes in the general cohort, as well as in the obese and geriatric subsets.

2. Methods

2.1. Vaccine Cohort

The current metabolomics study utilizes a 2019–2020 cohort of urine samples from the University of Georgia (UGA4), which were acquired from subjects receiving split, inactivated Fluzone™, as previously described [18,19]. The study procedures, informed consent, and data collection documents were reviewed and approved by the Western Institutional Review Board and the Institutional Review Boards of the University of Pittsburgh and the University of Georgia. All subjects were recruited from the Athens, Georgia geographic region, including the University of Georgia. Background demographic data on the population were acquired from the Centers for Disease Control (CDC), Athens-Clarke County Unified Government, and United States Census Bureau. Subjects were excluded from the batch assignment and sample processing if one or more of the 5 urine sample time points were missing or unavailable. No other exclusion criteria were applied. The final study consisted of 179 unique subjects, each with five time points, for a total of n = 895 urine samples analyzed. HAI assays were carried out and seroconversion scores were obtained as previously described [18,19].

2.2. Batch Design and Quality Control

To minimize the batch effects, the subjects were randomized into 31 technical batches, and all the sample time points for each subject were analyzed together. Each batch therefore contained 6 subjects, each with 5 time points, for a total of 30 samples per batch, excluding the final batch. Urine samples were extracted as described below, and the order of acquisition was randomized to minimize the sequence (within-batch) effects.
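The batch design described above (subjects randomized to batches, all time points of a subject kept in the same batch, acquisition order shuffled within each batch) can be illustrated with a minimal Python sketch. The function name, the subject labels, the seed, and the simple chunking rule are all assumptions made for illustration; the study's actual allocation procedure is not specified here, and the toy cohort of 20 subjects is not intended to reproduce the 31 batches reported.

```python
import random


def assign_batches(subject_ids, subjects_per_batch=6, n_timepoints=5, seed=0):
    """Randomize subjects into batches and shuffle injection order within each batch.

    Hypothetical helper: every time point of a subject stays in that subject's batch.
    """
    rng = random.Random(seed)

    # Randomize the order of subjects before cutting them into batches.
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)

    batches = []
    for start in range(0, len(shuffled), subjects_per_batch):
        members = shuffled[start:start + subjects_per_batch]
        # One sample per (subject, time point) pair.
        samples = [(subj, tp) for subj in members for tp in range(1, n_timepoints + 1)]
        # Randomize acquisition order to limit within-batch sequence effects.
        rng.shuffle(samples)
        batches.append(samples)
    return batches


# Toy cohort of 20 hypothetical subjects: three full batches and one partial batch.
subjects = [f"S{i:03d}" for i in range(1, 21)]
batches = assign_batches(subjects)
for index, batch in enumerate(batches, start=1):
    print(f"batch {index}: {len(batch)} samples")
```

Fixing the seed keeps the allocation reproducible, which is useful when the same batch sheet has to be regenerated for sample tracking.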
Each LC-MS sequence contained several control blocks of a standard cocktail (blank extraction buffer, with no sample but containing internal standards) that were extracted alongside each batch. The control block consisted of a blank control followed by a standard and another blank. The control blocks were injected at the start of the run, between every 6 samples, and at the end of the run, so that the instrument performance could be monitored throughout. These control injections were used to assess the data quality for each batch and measure the instrument variance, carryover, and column stability. Analytical blank injections were further used to define the blank threshold for peak detection in each batch. A large volume (2 L) of extraction buffer and of 80% methanol for the internal blank controls was prepared before the study and aliquoted for use in all subsequent technical batches.

2.3. Extraction of Metabolites from Urine

For each batch, the appropriate samples were removed from −80 °C storage, transferred to wet ice, and thawed. An aliquot of extraction buffer, consisting of 100% LCMS-grade methanol (Fisher Scientific, Waltham, MA) containing 625 nM metabolomics amino acid mix standard (Cambridge Isotope Laboratories, Inc., Tewksbury, MA), was taken from 4 °C storage and equilibrated on dry ice for >15 min prior to sample processing. Urine samples were extracted by combining 200 µL of the sample with 800 µL of extraction buffer in 2.0 mL screw cap vials containing ~100 µL disruption beads. The tubes were homogenized for 10 cycles in a Benchmark Scientific Bead Blaster™. Each cycle consisted of 20 s of homogenization at 6 m/s, followed by a 30 s pause. The homogenized samples were centrifuged at 21,000× g for 3 min at 4 °C, and then a fixed volume of the supernatant (450 µL) was dried down using speed vacuum concentration (Thermo Fisher, Waltham, MA). Once dry, the samples were stored at −80 °C until processing.
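The volumes given for the extraction imply a few derived quantities that are easy to check: the methanol fraction of the extract, the internal standard concentration after mixing, and the amount of urine represented in the dried aliquot. The short Python sketch below works through this arithmetic. It assumes that volumes are additive and ignores the bead volume and any pellet; the variable names are illustrative only.

```python
# Volumes and concentrations as stated in the extraction protocol.
URINE_UL = 200.0                 # urine sample volume
BUFFER_UL = 800.0                # extraction buffer volume
BUFFER_METHANOL_FRACTION = 1.0   # buffer is 100% methanol
STANDARD_NM_IN_BUFFER = 625.0    # internal standard concentration in the buffer
SUPERNATANT_DRIED_UL = 450.0     # supernatant volume taken to dryness

# Assumption: volumes are additive (methanol/water contraction is ignored).
total_ul = URINE_UL + BUFFER_UL

# Methanol fraction of the extract after mixing.
methanol_fraction = BUFFER_UL * BUFFER_METHANOL_FRACTION / total_ul

# Internal standard concentration after dilution by the urine.
standard_nm_in_extract = STANDARD_NM_IN_BUFFER * BUFFER_UL / total_ul

# Volume of original urine represented in the dried supernatant aliquot.
urine_equivalent_ul = URINE_UL * SUPERNATANT_DRIED_UL / total_ul

print(f"methanol in extract: {methanol_fraction:.0%}")
print(f"internal standard in extract: {standard_nm_in_extract:.0f} nM")
print(f"urine equivalent dried down: {urine_equivalent_ul:.0f} µL")
```

Under these assumptions the extract is 80% methanol, which matches the 80% methanol used for the internal blank controls.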
On the day of the LCMS data acquisition, the samples were reconstituted in 50 µL of LCMS-grade water, sonicated for 2 min, and centrifuged at 21,000× g for 3 min at 4 °C to remove any insoluble particulates. The extracted samples were then transferred to 2 mL glass vials (Agilent, Santa Clara, CA, USA) with glass LC inserts for analysis. All the samples were stored at −80 °C after data acquisition and QC evaluation. This process was repeated for each of the 31 randomized batches.

2.4. LC-MS/MS with the Polar Global Metabolomics Method

The samples were subjected to an LCMS analysis to detect and quantify the putatively identified metabolites. The LC column was a Millipore™ ZIC-pHILIC (2.1 × 150 mm, 5 μm) coupled with a Dionex Ultimate 3000™ system, and the column oven temperature was set to 25 °C for the gradient elution. A flow rate of 100 μL/min was used with the following buffers: (A) 10 mM ammonium carbonate in water, pH 9.0, and (B) neat acetonitrile. The gradient profile was as follows: 80–20% B (0–30 min), 20–80% B (30–31 min), and held at 80% B (31–42 min). The injection volume was set to 2 μL for all the analyses (42 min total run time per injection). Each Millipore™ ZIC-pHILIC column was tracked and used only for the urine samples associated with the current study.

MS analyses were carried out by coupling the LC system with a Thermo Q Exactive HF™ mass spectrometer operating in heated electrospray ionization (HESI) mode. The method duration was 30 min, using a polarity-switching, data-dependent top 5 method for both the positive and negative modes. The spray voltage for both the positive and negative modes was 3.5 kV, and the capillary temperature was set to 320 °C, with a sheath gas rate of 35, aux gas of 10, and max spray current of 100 μA. The full MS scan for both polarities utilized a resolution of 120,000 with an AGC target of 3e6 and a maximum IT of 100 ms, and the scan range was from 67 to 1000 m/z.
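The LC gradient profile given in Section 2.4 can be written as a piecewise function of time, which is convenient for checking the mobile-phase composition at a given retention time. The Python sketch below assumes linear ramps between the stated endpoints, which the text does not state explicitly; the function name and the segment table layout are illustrative.

```python
# Gradient segments as (start_min, end_min, start_percent_B, end_percent_B),
# taken from the stated profile: 80-20% B, 20-80% B, then a hold at 80% B.
GRADIENT_SEGMENTS = [
    (0.0, 30.0, 80.0, 20.0),
    (30.0, 31.0, 20.0, 80.0),
    (31.0, 42.0, 80.0, 80.0),
]


def percent_b(t_min):
    """Return %B (acetonitrile) at time t_min, assuming linear ramps within segments."""
    if not 0.0 <= t_min <= 42.0:
        raise ValueError("time must lie within the 42 min run")
    for start, end, b_start, b_end in GRADIENT_SEGMENTS:
        if start <= t_min <= end:
            # Linear interpolation between the segment endpoints.
            return b_start + (b_end - b_start) * (t_min - start) / (end - start)
    raise AssertionError("segments should cover the full run")


def percent_a(t_min):
    """Return %A (aqueous ammonium carbonate buffer) as the complement of %B."""
    return 100.0 - percent_b(t_min)


for t in (0, 15, 30, 30.5, 42):
    print(f"t = {t:>4} min: {percent_b(t):.1f}% B, {percent_a(t):.1f}% A")
```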
The tandem MS spectra for both the positive and negative modes used a resolution of 15,000, AGC target of 1e5, maximum IT of 50 ms, isolation window of 0.4 m/z, isolation offset of 0.1 m/z, fixed first mass of 50 m/z, and 3-way multiplexed normalized collision energies (nCE) of 10, 35, and 80. The minimum AGC target was 1e4, with an intensity threshold of 2e5. All data were acquired in the profile mode.

The quality control for each batch was assessed using the cocktail of isotopic amino acid standards present in both the samples and control block injections. The sample and standard chromatograms from each batch were visually compared against historic references for the expected