Abstract Foot-and-mouth disease (FMD) is the most devastating disease of cloven-hoofed livestock, with a crippling economic burden in endemic areas and immense costs associated with outbreaks in free countries. Foot-and-mouth disease virus (FMDV), a picornavirus, will spread rapidly in naïve populations, reaching morbidity rates of up to 100% in cattle. Even after recovery, over 50% of cattle remain subclinically infected and infectious virus can be recovered from the nasopharynx. The pathogen and host factors that contribute to FMDV persistence are currently not understood. Using for the first time primary bovine soft palate multilayers in combination with proteogenomics, we analyzed the transcriptional responses during acute and persistent FMDV infection. During the acute phase viral RNA and protein was detectable in large quantities and in response hundreds of interferon-stimulated genes (ISG) were overexpressed, mediating antiviral activity and apoptosis. Although the number of pro-apoptotic ISGs and the extent of their regulation decreased during persistence, some ISGs with antiviral activity were still highly expressed at that stage. This indicates a long-lasting but ultimately ineffective stimulation of ISGs during FMDV persistence. Furthermore, downregulation of relevant genes suggests an interference with the extracellular matrix that may contribute to the skewed virus-host equilibrium in soft palate epithelial cells. Keywords: foot-and-mouth disease virus (FMDV), bovine soft palate, nasopharynx, transcriptomics, proteomics, bioinformatics, virus-host interaction, innate immune system, interferon-stimulated genes (ISG) 1. Introduction Foot-and-mouth disease (FMD) is an acute and severe systemic vesicular disease of cloven-hoofed animals (Artiodactyla) with tremendous economic impact. The last FMD epizootic in the European Union in the United Kingdom, Ireland, France and the Netherlands in 2001 culminated in the slaughter of more than 6.5 million animals and an economic toll of over €5 billion [[50]1]. The etiological agent is foot-and-mouth disease virus (FMDV), the type species of the genus Aphthovirus in the family Picornaviridae [[51]2]. FMDV particles comprise a non-enveloped icosahedral capsid that surrounds a single-stranded positive-sense RNA genome with an approximate length of 8.4 kilobases [[52]3]. FMD mainly affects livestock such as cattle, buffalo, pigs, goats, and sheep, but can also be transmitted to deer and wild boar. It is endemic in wild buffalo in Southern Africa [[53]4,[54]5]. Although more than 70 species are known to be susceptible to FMDV, its primary host seem to be buffalo and cattle, in which the disease causes very high morbidity, but only low mortality in adults [[55]6]. During the onset of acute infection, cattle are highly febrile and small vesicles develop on the mucosal membranes of the muzzle, lips, and oral cavity, as well as on the coronary band and interdigital space, and the teats of the udder. Usually, cattle clinically recover within 2–3 weeks if no secondary infection occurs. At this time, many have completely cleared the virus, however, about 50% of animals may remain subclinically infected for up to three years, depending on the species [[56]7]. The World Organisation for Animal Health (OIE) defines an animal from which infectious FMDV can be recovered by probang sampling later than 28 days post infection (dpi) as persistently infected or a so-called “carrier” [[57]7]. The fear of contagion from carrier animals has severe consequences for trade in live animals and animal products [[58]8]. In vivo studies have shown that in cattle more than 50% of animals will become persistently infected [[59]9], even if they had been vaccinated against FMDV and did not develop clinical disease [[60]10,[61]11,[62]12]. The exact anatomical sites of persistence are still debated, but different tissues of the upper respiratory tract including the nasopharynx have been suggested. A recent study of the tissue-specific localization of FMDV in persistently infected steers identified the contiguous epithelia of the dorsal soft palate (SP) and the dorsal nasopharynx as the most likely sites of persistence [[63]10]. Evidence for FMDV persistence in lymph nodes and germinal centers was also put forward although no viral replication could be detected [[64]13]. The cellular factors that promote establishment of FMDV persistence and the viral strategies of immune avoidance in the bovine nasopharynx remain currently poorly understood, making it impossible to predict which animals will develop into carriers and which will clear the virus. Previous in vitro work that aimed to decipher cellular responses during persistent FMDV infection hardly reflected persistence in vivo as the used models were either based on immortalized non-bovine cell lines, such as hamster kidney (BHK) cells [[65]14], or bovine cell lines, e.g., bovine kidney (MDBK, EBK) cells [[66]15,[67]16], that are not from the primary site of persistence. O’Donnell et al. [[68]17] used a persistently infected bovine pharynx cell line to examine the gene expression of selected cellular cytokines by RT-qPCR, showing differences in the expression of key antiviral cytokines between acute and persistent infection. The rationale of this study was therefore to use a novel air–liquid interphase cell culture model based on primary SP cells from cattle [[69]18] that closely resembles the situation in vivo, together with an proteogenomics approach that combined transcriptomics by high-throughput sequencing, RT-qPCR, proteomics and bioinformatics. Our results provide detailed insights into the transcriptional responses of the SP in reaction to acute FMDV infection and reveal long-lasting changes of gene activation throughout persistence. The identified pathways and genes may give rise to further investigations leading to early detection of persistence in cattle and novel vaccines that prevent the carrier state. 2. Materials and Methods A schematic visualization of the sample processing and analysis workflow can be found in the [70]Supplementary Figure S1. 2.1. Ethics Statement The tissues used for the study were collected from animals slaughtered for food production. The animals were being processed as part of the normal work of the abattoir, therefore no ethics approval was required. 2.2. Bovine Epithelial Cultures from Soft Palate Multilayers of bovine dorsal soft palate cells were propagated at the air–liquid interface for 5 weeks before FMDV infection, as described previously [[71]18]. Briefly, bovine dorsal soft palate tissue, collected immediately after slaughter, was dissected and digested at 4 °C overnight in incubation medium supplemented with protease XIV (Sigma-Aldrich, St. Louis, MO, USA). Epithelial cells were thereafter scraped off the underlying tissue, filtered, and incubated in cell culture flasks for 4 h at 37 °C and 5% CO[2]. Cells that did not adhere to the plastic were centrifuged at 200× g for 10 min at room temperature, frozen, thawed, and propagated for three to five passages in cell culture flasks before being seeded in 12 mm diameter Corning® Transwell-COL collagen-coated PTFE membrane inserts with 3.0 μm pores (Sigma-Aldrich). The cell culture medium was removed from the upper compartment after five days of culture and changed in the lower compartment every two or three days. The average number of cells that constituted the upper layer of the multilayer was estimated at 750,000. 2.3. Experimental Design and FMDV Infection After 5 weeks of culture on inserts without passage, cells were infected with a twice-plaque-purified viral clone (FMDV O Clone 2.2, “Cl 2.2”) derived from the O/FRA/1/2001 strain that was further propagated on BHK-21 cells (four passages) [[72]16], or negative cell lysate, as described previously [[73]18]. Two experiments (experiment 1 and 2) were performed with SP cells that originated from two different animals (a male and female, respectively) and that were infected at a multiplicity of infection (MOI) of 0.01 (compare [74]Supplementary Table S1). Briefly, the inserts were incubated for one hour with 500 µL of clarified cell lysate from infected or uninfected cell cultures. Following infection and thereafter at a maximum interval of 3 days, the upper compartments were washed with 500 µL cell culture medium containing 10% FCS and, similarly, but only from 2 dpi, the medium in the lower compartments was changed. For each experiment, at days 0, 1 and 28, SP cells from 2 inserts were lysed with 750 μL of TRIzol Reagent (Life Technologies, Carlsbad, CA, USA) and frozen separately at −80 °C for transcriptomic and proteomic analyses. 2.4. RNA and Protein Isolation For isolation of high-quality total RNA and proteins, 150 µL of trichlormethane (Carl Roth, Karlsruhe, Germany) was added to the lysed cells in 750 μL TRIzol Reagent and the mixture was centrifuged in order to separate the RNA-containing aqueous phase from the DNA- and protein-containing organic phase. The aqueous phase was then mixed with an equal amount of 100% ethanol (Carl Roth) and total RNA was extracted using the RNeasy Mini Kit (Qiagen, Hilden, Germany) with on-column DNase digestion with the RNase-Free DNase Set (Qiagen), following the manufacturer’s instructions. The quantity and quality of total RNA was subsequently analyzed using a NanoDrop 1000 spectrophotometer (Peqlab, Erlangen, Germany) and RNA 6000 Pico chips on an Agilent 2100 Bioanalyzer (Agilent Technologies, Böblingen, Germany). All samples were checked for contamination using the 260/280 and 260/230 nm ratios, as well as for RNA degradation using the RNA Integrity Number (RIN). Polyadenylated mRNA was subsequently isolated from 1–3 µg of high-quality total RNA using the Dynabeads mRNA DIRECT Micro kit (Invitrogen, Carlsbad, CA, USA) following the manufacturer’s instructions. Prior to isolation, the ERCC ExFold RNA Spike-In mix 1 (Invitrogen) was supplemented and used as an internal control for all following steps. The quality of the mRNA and the extent of ribosomal RNA contamination was assessed using RNA 6000 Pico chips on the Agilent 2100 Bioanalyzer (Agilent Technologies). Proteins were extracted from the organic TRIzol phase following the manufacturer’s instructions. Briefly, 0.3 mL of 100% ethanol (Carl Roth) were added and DNA was pelleted by centrifugation. To the supernatant, 1.5 mL of isopropanol (Carl Roth) were added and proteins were pelleted by centrifugation. Protein pellets were washed three times with 0.3 M guanidine hydrochloride (Carl Roth) in 95% ethanol (Carl Roth). After a final washing step using 95% ethanol, proteins were air-dried and the pellet was resuspended in freshly prepared 1% SDS (Carl Roth) by ultrasonication. Quality and quantity of the isolated proteins were checked using a 12% polyacrylamide gel (SDS-PAGE) and a colorimetric bicinchoninic acid (BCA) assay. 2.5. Library Preparation and Sequencing For preparation of whole-transcriptome libraries the Ion Total RNA-Seq Kit v2 (Life Technologies) was used, following the manufacturer’s instructions. Briefly, between 1 and 100 ng of the mRNA containing the ERCC spike-in control was treated with RNase III at 37 °C for 10 min. The fragmented mRNA was subsequently purified using the Magnetic Bead Cleanup Module (Life Technologies) and the resulting size distribution was assessed with the Agilent 2100 Bioanalyzer as described above. After hybridization and ligation of appropriate adapters, the fragmented mRNA was reverse transcribed into cDNA using SuperScript III enzyme. The cDNA was purified as described above and amplified for 14 cycles using Platinum PCR SuperMix High Fidelity along with appropriate primers for the generation of barcoded libraries. The resulting libraries were again purified using the method described above and the size distribution was assessed using the Agilent 2100 Bioanalyzer together with the DNA 7500 kit and chip (Agilent Technologies). All libraries were subsequently quantified using the KAPA Library Quantification Kit Ion Torrent (Kapa Biosystems, Wilmington, MA, USA) on a CFX96 Real-Time PCR Detection System (Bio-Rad Laboratories, München, Germany) and pooled at an equimolar ratio. For sequencing an Ion S5XL sequencing system (Life Technologies) along with the Ion 540 OT2 and Chip kit (Life Technologies) for the generation of up to 200 bp reads was used. Each library was sequenced in at least two independent sequencing runs. 2.6. Statistical Analysis of Differential Gene Expression In order to detect problems and biases during mRNA isolation and library preparation, we used the ERCC_Analysis Plugin (version 5.8.0.1) provided in the Torrent Suite software (version 5.8.0). Only forward strand reads were selected for mapping and the minimum transcript count was set to 50. The raw reads from each sequencing library were quality checked using FastQC (version 0.11.7; Babraham Institute) with a focus on the read length distribution and adapter contamination. In order to quantify the expression of known bovine transcripts in these datasets we used Salmon (version 0.9.1) that uses a lightweight alignment method (quasi-mapping) for rapid transcript abundance estimation [[75]19]. Briefly, we obtained the transcript reference of cattle (GCF000003055.6 Bos Taurus UMD 3.1.1) from NCBI and selected only transcripts that were featured as mRNAs, non-coding RNAs and miscellaneous RNAs (for details see [76]Supplementary Table S2). A Salmon index was created using the option for a perfect hash, rather than a dense hash. Each sequencing library was then used as input for the major Salmon function “quant” using appropriate options for the library type (stranded single-end protocol with reads coming from the forward strand) and the mean read length. The number of bootstraps was set to 100 replicates. Subsequently a “tx2gene” table was prepared using the accessions of each RNA transcript from the aforementioned reference in the first column and the corresponding gene symbol in the second column (see also [77]Supplementary Table S3). Using this table, the transcript abundancy datasets from Salmon “quant” were imported into R workspace (version 3.4.1; [[78]20]) using the “tximport” package (version 1.6.0; [[79]21]) as implemented in the Bioconductor library. For handling and manipulating R scripts, the software RStudio (version 1.0.153) was used. Technical replicates of samples (multiple sequencing of the same library from a single sample) were combined and the datasets were pre-filtered using only genes were more than four samples had raw gene counts greater than or equal to 100. In order to check the dataset for influence of treatment (infection, time and animal) and repeatability (replicates of same treatment) a principal components analysis (PCA) as well as a heatmap clustering was conducted using the regularized log transformed (rlog) read counts. DESeq2 (version 1.18.1; [[80]22]) was then used to identify differentially expressed genes between the treatments based on the negative binomial distribution. The resulting p-values were adjusted with the Benjamini–Hochberg procedure and only genes with an adjusted p-value below 0.001 and an absolute fold change of >1 were considered significant. The logarithmic fold changes were further shrunken as recommended and described by Love et al. 2014 [[81]22] in order to account for genes with low read counts. Significant differently expressed genes were annotated using the AnnotationDbi package (version 1.40.0) and used for further pathway enrichment analysis. Briefly, gene sets were analyzed using the “enrichPathway” function of the ReactomePA package (version 1.22.0; [[82]23]) and the “enrichKEGG” function of the clusterProfiler package (version 3.6.0; [[83]24]). 2.7. Quantitative Reverse Transcription PCR (RT-qPCR) In order to confirm the results from the RNA sequencing experiment and to include samples from additional time points, a subset of six target genes (ANKRD1, CASP7, IDO1, IFIH1, NCAM1, OAS2) and two reference genes (ACTB, GAPDH) was selected for quantitative reverse transcription PCR (RT-qPCR) analysis. For each gene, two intron-spanning primer sets were designed using Primer3 (version 0.4.0; [[84]25]) (for primer sequences see [85]Supplementary Table S4). For RT-qPCR the QuantiTect Probe RT-PCR Kit (Qiagen) was used together with LightCycler 480 ResoLight Dye (Roche, Mannheim, Germany) according to the manufacturers’ instructions. The following temperature profile was used on a Bio-Rad CFX96: 50 °C for 30 min, 95 °C for 15 min and 45 cycles of 94 °C for 15 s and 60 °C for 1 min. After each cycle, the fluorescence in the SYBR channel was detected and threshold cycle (Ct) values were deduced after each run. For each newly designed primer set the PCR efficiency was determined using an appropriate dilution series from an independent bovine RNA control. Using the ∆∆Ct method, Ct values of target genes were first normalized by subtracting the Ct value of references genes