Abstract

Enterohemorrhagic Escherichia coli (EHEC) O157:H7 is a human pathogen responsible for diarrhea, hemorrhagic colitis and hemolytic uremic syndrome (HUS). To gain comprehensive insight into the molecular basis of EHEC O157:H7 physiology and pathogenesis, we analyzed the combined proteome of three EHEC O157:H7 strains: a clade 8 and a clade 6 strain isolated from cattle in Argentina, and the standard strain EDL933 (clade 3). Shotgun proteomic analysis identified a total of 2,644 non-redundant EHEC O157:H7 proteins, which correspond to approximately 47% of the predicted proteome of this pathogen. Normalized spectrum abundance factor analysis was performed to estimate protein abundance. According to this analysis, 50 proteins were identified as the most abundant in the EHEC O157:H7 proteome. COG analysis showed that the majority of the most abundant proteins are associated with translation processes. A KEGG enrichment analysis revealed that Glycolysis / Gluconeogenesis was the most significant pathway. In contrast, the less abundant proteins detected are those related to DNA processes, cell respiration and prophages. Among the proteins that compose the Type III Secretion System, the most abundant was EspA. Altogether, the results reveal a subset of important proteins that contribute to the physiology and pathogenicity of EHEC O157:H7.

Introduction

Enterohemorrhagic Escherichia coli (EHEC) O157:H7 is a zoonotic pathogen that belongs to the Shiga toxin-producing E. coli (STEC) and is responsible for diseases such as diarrhea, hemorrhagic colitis and hemolytic uremic syndrome (HUS). HUS is distributed worldwide and is considered a public health problem in several countries [1,2]. Argentina is the country with the highest incidence of HUS in the world, with approximately 14 cases per 100,000 children under 5 years of age and about 500 reported cases per year [3,4]. Cattle are the main reservoir of EHEC.
Several studies have shown that most human infections may be attributed to the high consumption of foods of bovine origin, with ground beef being the main source of contamination [5]. Great efforts have been made to characterize strains of E. coli O157:H7 isolated from Argentinian cattle [6]. Using single nucleotide polymorphism analysis, we classified 16 strains of STEC O157:H7 into clades 6 and 8, which are the most virulent clades [6]. In vitro and in vivo experimental results showed that the strains Rafaela II (clade 8) and 7.1 Anguil (clade 6) have a high virulence potential when compared with other strains and with the standard strain EHEC O157:H7 EDL933 [7]. These results enabled us to characterize the high prevalence of clade 6 and clade 8 strains in Argentinian cattle. Importantly, these two clades might contribute to the high incidence of HUS in Argentina.

The availability of whole genome sequences of different EHEC strains has enabled genome-wide comparisons to identify factors that might be correlated with the physiology and virulence of this pathogen [8]. In addition, the implementation of systems biology approaches, such as the prediction of protein-protein networks, has contributed substantially to the understanding of the pathogen and its interactions with the host [9]. Information about the functions and activities of the individual proteins and pathways that control these systems is essential to understand the complex processes occurring in living cells. Large-scale quantitative proteomics is a powerful approach for understanding global proteomic dynamics in a cell, tissue or organism, and has been widely used to study protein profiles in the field of microbiology [10]. Furthermore, studying the abundance of proteins under different conditions or during different stages of growth or disease can provide important information about the activities of individual protein components or of protein networks and pathways.
The rapid growth of proteomic and genomic methods and tools has revealed the basic protein inventory of a few hundred different organisms. Quantitative proteomic approaches have been applied to determine the absolute or relative abundance of proteins. This information gives insights into the biological function and properties of the cell, as well as into how cells respond to environmental or metabolic changes or stresses [11,12]. Quantitative proteomic analysis can contribute to the generation of datasets that are critical for our understanding of the global protein expression and modifications underlying the molecular mechanisms of biological processes and disease states.

In a previous study, we reported the use of isobaric tandem mass tags (TMT) for comparative quantitation to identify the differentially expressed proteins among three EHEC O157:H7 isolates: Rafaela II (clade 8), Anguil 7.1 (clade 6) and EDL933 (clade 3) [7]. The proteome differences observed among these strains are related mainly to proteins involved in both virulence and cellular metabolism, which might reflect the virulence potential of each strain [7]. The aim of the present study was to gain a more comprehensive insight into the molecular basis of EHEC O157:H7 physiology. For this purpose, we applied high-throughput proteomics to the combined proteome of three EHEC O157:H7 isolates (Rafaela II, Anguil 7.1 and EDL933) and used the normalized spectrum abundance factor (NSAF) approach [13] to quantify the EHEC O157:H7 proteome.

Material and methods

Bacterial strain and growth conditions

The EHEC O157:H7 strains Rafaela II (clade 8) and 7.1 Anguil (clade 6), isolated from cattle in Argentina, and the EDL933 (clade 3) strain, recovered from a patient in the USA, were routinely maintained in Luria-Bertani broth (LB, Difco Laboratories, USA) or on LB plates containing 1.5% bacteriological agar, at 37°C. For the proteomic studies, bacterial strains were cultured as previously described by Amigo et al. [7].
Overnight cultures of the different EHEC O157:H7 strains grown in LB were inoculated (1:50) into Dulbecco’s modified Eagle’s medium (DMEM)-F12 nutrient mixture and grown until they reached the mid-exponential growth phase (OD₆₀₀ = 0.6) under a 5% CO₂ atmosphere at 37°C.

Protein extraction and preparation of whole bacterial lysates for LC-MS/MS

After bacterial growth, protein extractions were performed according to Amigo et al. [7]. Three biological replicates of each culture were centrifuged at 5,000 × g for 20 min at 4°C. The cell pellets were resuspended in ice-cold lysis buffer (50 mM Tris-HCl, pH 7.5, 25 mM NaCl, 5 mM DTT and 1 mM PMSF) and disrupted by three cycles of immersion in liquid N₂ followed by boiling water. The resulting lysates were centrifuged at 30,000 × g for 10 min and precipitated with 5 volumes of ice-cold acetone at -20°C overnight. Next, the protein pellets were resuspended in buffer containing 8 M urea, 2 M thiocarbamide and 200 mM tetraethylammonium bromide at pH 8.5. The protein concentration was determined by the Bradford assay using a BSA standard curve. Subsequently, the samples were reduced with tris-(2-carboxyethyl)-phosphine (200 mM), alkylated with iodoacetamide (375 mM) and enzymatically digested with sequencing-grade trypsin. Finally, the samples were labeled with the TMT Reagents 6-plex Kit according to the manufacturer's instructions.

Liquid chromatography and mass spectrometry

The proteomic analyses were performed using high pH reverse phase (hpRP) fractionation followed by nano LC-MS/MS analysis on an Orbitrap Fusion. First, the labeled peptides were pooled and desalted using Sep-Pak SPE (Waters) to remove salt ions. The hpRP chromatography was performed with a Dionex UltiMate 3000 system on an Xterra MS C18 column (3.5 μm, 2.1 × 150 mm, Waters).
The samples were dissolved in buffer A (20 mM ammonium formate, pH 9.5) and then eluted with a gradient of 10 to 45% buffer B (80% acetonitrile (ACN)/20% 20 mM NH₄HCO₂) for 30 min, followed by 45% to 90% buffer B for 10 min, and a 5-min hold at 90% buffer B. Forty-eight fractions collected at 1 min intervals were merged into 12 fractions. The nano LC-MS/MS analysis was carried out using an Orbitrap Fusion Tribrid mass spectrometer (Thermo-Fisher Scientific, San Jose, CA) coupled to an UltiMate 3000 RSLC nano system (Thermo-Dionex, Sunnyvale, CA). Each fraction was injected onto a PepMap C18 trapping column (5 μm, 200 μm × 1 cm, Dionex) and separated on a PepMap C18 RP nano column (3 μm, 75 μm × 15 cm, Dionex). For all analyses, the mass spectrometer was operated in positive ion mode. MS spectra were acquired across a scan range of 350–1550 m/z at a resolution of 120,000 in the Orbitrap, with a maximum injection time of 50 ms. Tandem mass spectra were recorded in high sensitivity mode (resolution >30,000) and generated by HCD at a normalized collision energy of 40. Each cycle of data-dependent acquisition (DDA) selected the top 10 most intense peaks for fragmentation. The data were acquired with Xcalibur 2.1 software (Thermo-Fisher Scientific).

Database searching, protein identification and abundance estimation

Tandem mass spectra were extracted; charge state deconvolution and deisotoping were not performed. All MS/MS samples were analyzed using Mascot (Matrix Science, London, UK; version 2.4.1). Mascot was set up to search the EDL933_NCBI_20141031.fasta; TW14539_exclusive_20150310 database (unknown version, 6341 entries) assuming the digestion enzyme trypsin. Mascot was searched with a fragment ion mass tolerance of 0.020 Da and a parent ion tolerance of 8.0 ppm. Carbamidomethylation of cysteine and TMT-6plex labeling of lysine and the N-terminus were specified in Mascot as fixed modifications.
Deamidation of asparagine and glutamine and oxidation of methionine were specified in Mascot as variable modifications. Scaffold (version Scaffold_4.8.8, Proteome Software Inc., Portland, OR) was used to validate MS/MS-based peptide and protein identifications. Peptide identifications were accepted if they achieved an FDR below 1.0% by the Scaffold Local FDR algorithm and contained at least one identified peptide. The label-free quantification value was calculated with the normalized spectrum abundance factor (NSAF) algorithm [14].

Bioinformatics analysis

Functional annotations were assigned using the COG database [15]. Metabolic pathways were determined by analyzing the proteins with the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways [16].

Results and discussion

Global proteomic analysis and functional classification of Escherichia coli (EHEC) O157:H7 proteome

In this study, we examined the EHEC O157:H7 proteome using a dataset generated from the E. coli O157:H7 strains Rafaela II, Anguil 7.1 and EDL933. The strains were grown in D-MEM medium, and proteins from total bacterial lysates were then extracted and digested in solution. The resulting peptides were analyzed by 2D-LC MS/MS. From this proteomic analysis, we detected 2,644 non-redundant EHEC O157:H7 proteins (S1 Table). Comparison of this result with in silico data from the EHEC O157:H7 genome shows that approximately 47% of the predicted proteome of this pathogen was identified (Fig 1A). To determine the abundance of the identified proteins, the NSAF approach [13] was used. This approach determines the significance of the expression changes based on individual protein intensities (Zybailov et al., 2006) (S1 Table).

Fig 1. Characterization of EHEC O157:H7 proteome and correlation with in silico data. (A) Correlation of the proteomic results with in silico data of EHEC O157:H7 genome. (B) Dynamic range based on the emPAI value of the proteins identified by LC-MS analysis; pink, most abundant proteins; green, less abundant proteins; red, proteins related to EHEC O157:H7 that are present in the LEE pathogenicity island.

According to this analysis, a dynamic range of protein abundance was generated (Fig 1B). Fifty proteins were identified as the most abundant in the EHEC O157:H7 proteome (Table 1). Of all the proteins identified, 25 are encoded by genes present in the pO157 plasmid; however, these proteins did not show a high abundance level (S1 Table).

Table 1. List of the most abundant proteins of EHEC O157:H7 proteome.

Accession Number | Description | COG | NSAF Value
AIG70661.1 | 30S ribosomal protein S4 | J | 2.87E-03
AIG70668.1 | 30S ribosomal protein S5 | J | 3.79E-03
AIG71453.1 | 50S ribosomal protein L1 | J | 3.33E-03
AIG70674.1 | 50S ribosomal protein L24 | J | 4.79E-03
AIG70685.1 | 50S ribosomal protein L3 | J | 2.94E-03
AIG71455.1 | 50S ribosomal protein L7/L12 | J | 8.47E-03
AIG67859.1 | Acyl carrier protein | I | 1.17E-02
AIG66879.1 | Alkyl hydroperoxide reductase protein C | V | 3.59E-03
AIG71201.1 | ATP synthase beta chain | C | 2.81E-03
AIG70015.1 | Carbon storage regulator | T | 3.09E-03
AIG70905.1 | Chaperone HdeA | O | 6.78E-03
AIG66221.1 | Chaperone protein DnaK | O | 3.95E-03
AIG68966.1 | Cold shock protein CspA | K | 8.38E-03
AIG66898.1 | Cold shock protein CspA | K | 2.98E-03
AIG69737.1 | Cysteine synthase | E | 3.27E-03
AIG69094.1 | Cystine ABC transporter, periplasmic cystine-binding protein FliY | ET | 2.87E-03
AIG66325.1 | Dihydrolipoamide acetyltransferase component of pyruvate dehydrogenase complex | C | 2.77E-03
AIG70946.1 | Dipeptide-binding ABC transporter, | E | 6.27E-03
AIG68128.1 | DNA-binding protein H-NS | L | 6.10E-03
AIG71470.1 | DNA-binding protein HU-alpha | L | 5.03E-03
AIG66716.1 | DNA-binding protein HU-beta | L | 3.63E-03
AIG69372.1 | DNA-damage-inducible protein I | L | 3.46E-03
AIG71732.1 | Endoribonuclease L-PSP | V | 3.64E-03
AIG70112.1 | Enolase | G | 8.50E-03
AIG71085.1 | EspA protein | J | 4.18E-03
AIG69098.1 | Flagellar biosynthesis protein FliC | N | 5.31E-03
AIG68335.1 | Glutamate decarboxylase | E | 4.16E-03
AIG71015.1 | Glutaredoxin 3 (Grx3) | O | 3.56E-03
AIG71630.1 | Chaperone GroEL | O | 7.71E-03
AIG71629.1 | Chaperone GroES | O | 6.56E-03
AIG71712.1 | Inorganic pyrophosphatase | CP | 3.12E-03
AIG67993.1 | Isocitrate dehydrogenase [NADP] | C | 3.89E-03
AIG67038.1 | Molybdenum ABC transporter, ModA | P | 2.76E-03
AIG68915.1 | NAD-dependent glyceraldehyde-3-phosphate dehydrogenase | G | 1.14E-02
AIG68347.1 | Osmotically inducible protein C | V | 3.53E-03
AIG70283.1 | Phosphoglycerate kinase | G | 6.15E-03
AIG67029.1 | Phosphoglycerate mutase | G | 6.11E-03
AIG69739.1 | Phosphotransferase system, phosphocarrier protein HPr | TG | 9.84E-03
AIG69741.1 | PTS system, glucose-specific IIA component | G | 3.25E-03
AIG70329.1 | putative Fe(2+)-trafficking protein YggX | PO | 3.43E-03
AIG69875.1 | Serine hydroxymethyltransferase | E | 3.74E-03
AIG67276.1 | Tellurium resistance protein TerD | T | 3.27E-03
AIG67277.1 | Tellurium resistance protein TerE | T | 3.09E-03
AIG68542.1 | Thiol peroxidase, Tpx-type | O | 4.93E-03
AIG71240.1 | Thioredoxin | O | 5.00E-03
AIG66382.1 | Translation elongation factor Ts | J | 3.81E-03
AIG70690.1 | Translation elongation factor Tu | J | 8.57E-03
AIG71387.1 | Triosephosphate isomerase | G | 3.71E-03
AIG68383.1 | Unknown Function | S | 4.62E-03
AIG71521.1 | UPF0337 protein yjbJ | S | 2.93E-03

We subsequently performed functional annotation of the identified proteins using gene ontology [15]. Cluster of orthologous group (COG) analysis grouped the identified proteins into four major functional groups: (i) metabolism, (ii) information storage and processing, (iii) cellular processes and signaling, and (iv) poorly characterized (Fig 2A).
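As background for the abundance values listed in Table 1, the following is a minimal sketch of how an NSAF value is computed in its usual formulation: the spectral count of each protein is divided by its length to give a spectral abundance factor (SAF), and each SAF is then divided by the sum of all SAFs in the run. The function name `nsaf`, the protein labels, and all spectral counts and lengths below are hypothetical and serve only as illustration; they are not taken from S1 Table, and the exact settings used by Scaffold in this study are not reproduced here.

```python
from typing import Dict, Tuple


def nsaf(proteins: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Compute NSAF per protein.

    `proteins` maps a protein name to (spectral_count, length_in_residues).
    NSAF_k = (SpC_k / L_k) / sum_i (SpC_i / L_i)
    """
    # Spectral abundance factor: spectral count normalized by protein length.
    saf = {name: count / length for name, (count, length) in proteins.items()}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra assigned to any protein")
    # Normalize so that all NSAF values in the run sum to 1.
    return {name: value / total for name, value in saf.items()}


# Hypothetical spectral counts and lengths, for illustration only.
example = {
    "protein_A": (120, 100),
    "protein_B": (60, 300),
    "protein_C": (30, 150),
}
values = nsaf(example)

# Rank proteins from most to least abundant.
for name, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}\t{value:.3e}")
```

Because the values are normalized within a run, NSAF expresses relative rather than absolute abundance, and the length normalization keeps long proteins from ranking high merely because they yield more peptides.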
Although most of the identified proteins are related to cellular metabolism, the most abundant proteins are involved in translation, followed by energy metabolism and by posttranslational modification, protein turnover and chaperones, which indicates intense metabolic activity, mainly in protein synthesis (Fig 2B). On the other hand, most of the less abundant proteins are involved in replication, recombination and repair (Fig 2B).

Fig 2. Functional analysis of the EHEC O157:H7 proteome. (A) Proteins classified by COG functional categories. (B) Categorization of the identified proteins into biological processes. [C] Energy production and conversion; [E] Amino acid transport and metabolism; [D] Cell cycle control, cell division, chromosome partitioning; [G] Carbohydrate transport and metabolism; [I] Lipid transport and metabolism; [J] Translation, ribosomal structure and biogenesis; [K] Transcription; [L] Replication, recombination and repair; [M] Cell wall/membrane/envelope biogenesis; [N] Cell motility; [O] Posttranslational modification, protein turnover, chaperones; [P] Inorganic ion transport and metabolism; [R] General function prediction only; [S] Function unknown; [T] Signal transduction mechanisms; [U] Intracellular trafficking, secretion, and vesicular transport; [V] Defense mechanisms; [X] Mobilome: prophages, transposons. (C) KEGG pathway enrichment analysis; the colors are based on protein abundance: blue, most abundant; green, less abundant.

Ishihama et al. [18] and Pieper et al. [17] also conducted proteomic studies, on E. coli K-12 and EHEC O157:H7 strain 86–24, respectively, to determine the absolute abundance of proteins. Thirteen of the most abundant proteins in our study were also among the most abundant proteins in E. coli K-12 (Table 2).
Those proteins are related to carbohydrate metabolism, transcription, translation, posttranslational modification and signal transduction mechanisms [18]. On the other hand, only 11 proteins of the most abundant group (Table 2) were among the most abundant in the quantitative proteome of EHEC O157:H7 strain 86–24 [17]. Some of the most abundant proteins in our study (e.g. TerD, TerE, EspA and DNA-damage-inducible protein I) are absent from E. coli K-12. Notably, the E. coli proteomes in the studies of Pieper et al. [17] and Ishihama et al. [18] were evaluated under growth conditions different from ours. Despite the different growth conditions, glyceraldehyde-3-phosphate dehydrogenase, translation elongation factor Tu, DNA-binding protein H-NS, alkyl hydroperoxide reductase protein C, GroEL chaperone and 50S ribosomal protein L7/L12 were also detected among the most abundant proteins (Table 2). These results point to a set of proteins that may play an important role in the biology of E. coli.

Table 2. List of the most abundant proteins detected in E. coli K-12 and EHEC 86–24.

Accession Number | Gene name | Description | Detection (E. coli K-12, EHEC 86–24)
gi|667692306 | gapA | Glyceraldehyde-3-phosphate dehydrogenase | M M
gi|667694081 | tuf | Translation elongation factor Tu | M M
gi|667691519 | hns | DNA-binding protein H-NS | M M
gi|667690270 | ahpC | Alkyl hydroperoxide reductase protein C | M M
gi|667694846 | rplL | 50S ribosomal protein L7/L12 (P1/P2) | M M
gi|667695021 | groEL | Heat shock protein 60 family chaperone GroEL | M M
gi|667694059 | rpsE | 30S ribosomal protein S5 | M
gi|667691933 | tpx | Thiol peroxidase, Tpx-type | M
gi|667691384 | icdA | Isocitrate dehydrogenase [NADP] | M
gi|667695020 | groES | Heat shock protein 60 family co-chaperone GroES | M
gi|667694592 | atpD | ATP synthase beta chain | M
gi|667693130 | ptsH | Phosphotransferase system, phosphocarrier protein HPr | M
gi|667693674 | pgk | Phosphoglycerate kinase | M
gi|667689773 | tsf | Translation elongation factor Ts | M
gi|667694844 | rplA | 50S ribosomal protein L1 | M
gi|667693132 | crr | PTS system, glucose-specific IIA component | M
gi|667694065 | rplX | LSU ribosomal protein L24p (L26e) | M
gi|667694076 | rplC | LSU ribosomal protein L3p (L3e) | M

All proteins were detected in the proteomic studies of E. coli K-12 (Ishihama et al. [18]) and EHEC 86–24 (Pieper et al. [17]). M = proteins detected at high levels.

We also detected Shiga toxin subunits such as StxA, StxB, Stx2a and Stx2cb; these proteins, however, were not among the most abundant proteins (Fig 1). Pieper et al. [17] obtained similar results for the EHEC 86–24 proteome. This low abundance may be associated with the environmental or nutritional conditions that contribute to bacterial lysis and consequently to the production of the toxin [17,19,20].

Metabolic network analysis

To identify the most relevant biological pathways among the identified proteins, we performed a KEGG enrichment analysis. This analysis provides a comprehensive understanding of the pathways that might contribute to cellular physiology [16].
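The text does not state which statistical test underlies the enrichment p-values. A common choice for pathway enrichment is a one-sided hypergeometric test, sketched below under that assumption. The function name `hypergeom_enrichment_p` is hypothetical, and so is the overlap of 6 pathway members in the example call; the other three numbers echo counts mentioned in the text (2,644 identified proteins used here as the background, 50 most abundant proteins, 23 identified Glycolysis / Gluconeogenesis proteins), but using them as the test inputs is itself an assumption.

```python
from math import comb


def hypergeom_enrichment_p(population: int, pathway_in_population: int,
                           sample: int, pathway_in_sample: int) -> float:
    """One-sided hypergeometric p-value, P(X >= pathway_in_sample).

    population            -- number of background proteins
    pathway_in_population -- background proteins annotated to the pathway
    sample                -- size of the protein subset under test
    pathway_in_sample     -- subset proteins annotated to the pathway
    """
    upper = min(sample, pathway_in_population)
    denominator = comb(population, sample)
    # Sum the probabilities of observing k pathway members for every k at
    # least as large as the observed overlap.
    tail = sum(
        comb(pathway_in_population, k)
        * comb(population - pathway_in_population, sample - k)
        for k in range(pathway_in_sample, upper + 1)
    )
    return tail / denominator


# Illustrative call: the overlap of 6 is a hypothetical value.
p_value = hypergeom_enrichment_p(
    population=2644, pathway_in_population=23, sample=50, pathway_in_sample=6
)
print(f"p = {p_value:.3e}")
```

When many pathways are tested at once, the raw p-values are usually adjusted for multiple testing (for example with the Benjamini-Hochberg procedure) before a threshold such as 0.05 is applied.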
When we evaluated the most abundant proteins, we identified 10 pathways that were considered significant (p < 0.05); among them, Glycolysis / Gluconeogenesis was the most significant (Fig 2C). On the other hand, the less abundant proteins included proteins related to ABC transport. Different studies have reported that the glycolysis / gluconeogenesis pathway might influence the colonization of the gastrointestinal tract of both mice and cattle by EHEC [21,22]. Although glycolysis substrates inhibit the expression of genes located in the locus of enterocyte effacement (LEE), this pathway plays an important role in the initial colonization and maintenance of EHEC in the mouse intestine. In addition, gluconeogenesis not only induces LEE gene expression, but also contributes to the later stages of EHEC colonization in the mouse [21,23]. In our proteomic analysis, 23 proteins that compose the Glycolysis / Gluconeogenesis pathway of E. coli were identified (Fig 3). NAD-dependent glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was the second most abundant protein of the EHEC O157:H7 proteome (Table 1). This important cytoplasmic protein of the glycolysis pathway is also described as a moonlighting protein, owing to the distinct functions that this enzyme performs in different cellular locations [24]. Some studies showed that GAPDH secreted by EHEC and enteropathogenic E. coli (EPEC) strains can bind to fibrinogen and epithelial cells, which could contribute to the pathogenesis of this bacterium, mainly through cell adhesion [25,26]. Enolase, another protein described as a moonlighting protein, was also detected among the most abundant proteins of the EHEC proteome (Table 1) [27]. This glycolytic enzyme, which plays an important role in carbon metabolism, also acts in the RNA degradosome, mainly in RNA processing and gene regulation. In E. coli, the enolase-RNase E/degradosome complex regulates bacterial morphology under anaerobic conditions by inducing a filamentous form, which is observed in some pathogenic E. coli strains under oxygen-limiting conditions [27].

Fig 3. Overview of the glycolysis / gluconeogenesis pathway of EHEC O157:H7. Enzymes of the Glycolysis / Gluconeogenesis metabolism that were identified at the proteome level. Blue, proteins detected in our proteomic analysis; green, proteins not identified in our study; red, proteins detected as most abundant.

Although lipid metabolism was not detected among the KEGG pathways with a significant p-value, the acyl carrier protein (ACP) was detected as the most abundant protein of the EHEC O157:H7 proteome. Studies showed that the E. coli genome contains only one copy of the acpP gene, which encodes an ACP; this protein plays a key role in fatty acid biosynthesis and is also required for the growth of E. coli (Rawlings and Cronan, 1992; De Lay and Cronan, 1996). During fatty acid biosynthesis, ACP is post-translationally modified with a 4’-phosphopantetheinyl (4’-PP) group. The acyl intermediates generated are bound to the 4’-PP thiol through a thioester linkage, which allows ACP to transport intermediates among the fatty acid synthetic enzymes [28,29].

Information storage and processing

Most of the proteins described as the most abundant are involved in translation. Similar results have been observed in E. coli K-12 [17]. In addition, according to the KEGG enrichment analysis, the ribosome was strongly enriched (Fig 2C). We identified proteins that form structural elements of the ribosome, as well as proteins related to the initiation, elongation and termination steps required for translation [30]. These results indicate intense metabolic activity in EHEC, mainly in protein synthesis.
Among these proteins, translation elongation factor Tu (EF-Tu) was identified (Table 1). EF-Tu could play a role in the resistance of this bacterium in the gastrointestinal tract [31], as well as against the cellular damage generated by the bile salt sodium deoxycholate [32]. Unlike in E. coli K-12 [17], proteins involved in transcription were among the most abundant in EHEC. CspA was also among the most abundant proteins. This RNA chaperone is described as the major cold shock protein of E. coli. CspA binds to RNA molecules and destabilizes stem-loop structures to prevent and resolve RNA misfolding [33].

Cellular processes and signaling

Flagella are filamentous structures that contribute to the pathogenesis of pathogenic E. coli, mainly through motility, adhesion and biofilm production [34]. Generally, this organelle consists of a basal body, a hook and a filament composed of flagellin, or flagellar antigen FliC, which belongs to the H-antigen group [35,36]. FliC was detected as highly abundant (Table 1). In addition, a study performed in EPEC showed that FliC might be involved in the inflammatory response during EPEC infection, owing to the capacity of flagellin to induce interleukin-8 (IL-8) release in T84 cells [36].

During infection, E. coli is subject to different environmental conditions, for example, temperature changes that occur both in the external environment and within the host. In our proteomic analysis, DnaK, GroEL and GroES were detected among the most abundant proteins (Table 1). Studies have shown that these proteins contribute to the resistance of EHEC to elevated temperature [37,38]. In addition, Kudva et al. [39] demonstrated that DnaK and GroEL were induced when EHEC was grown in bovine rumen fluid, thus showing the contribution of these proteins to the adaptation of EHEC to the bovine rumen.
Another type of stress commonly encountered by EHEC during infection is oxidative stress, which is generated by reactive oxygen species (ROS) such as the superoxide anion (O₂⁻), hydrogen peroxide (H₂O₂) and the hydroxyl radical (OH·), produced mainly by the host immune response [40]. To adapt and survive under this stress condition, the bacterium possesses different antioxidant systems. We detected two members of the peroxiredoxin (Prx) family: periplasmic thiol peroxidase (Tpx) and the alkyl hydroperoxide reductase C (AhpC) system (Table 1). These two antioxidant systems play an important role in scavenging H₂O₂ and organic hydroperoxides [41,42]. Glutaredoxin 3 (Grx3) was also among the most abundant proteins (Table 1). Grx3 is part of the glutaredoxin (Grx) system, whose function is to reduce disulfide bonds in target proteins to control the intracellular redox environment [43]. In addition, Smirnova et al. [44] showed that glutaredoxin proteins might be involved in the resistance of E. coli to antibiotics such as ampicillin. Altogether, these different systems provide an efficient antioxidant defense in EHEC that contributes to the pathogenesis of this bacterium.

The ter operon, related to tellurite resistance, is widespread among Gram-positive and Gram-negative pathogenic species [45,46]. In EDL933, this operon is composed of six genes (terZABCDE). Among the proteins expressed from this operon, only TerC was absent from our proteomic analysis. Interestingly, the TerD and TerE proteins were among the most abundant proteins of the EHEC O157:H7 proteome (Table 1). A study performed with uropathogenic E. coli (UPEC) isolates showed that the introduction of the ter gene cluster improves bacterial fitness inside macrophages [47]. On the other hand, Yin et al. [48] demonstrated that ter genes contribute to the adherence of EHEC O157:H7 to epithelial cells.
However, the true role of these genes in EHEC pathogenesis remains unclear. Although tellurium is absent from the EHEC niche, proteomic studies have detected tellurium resistance proteins in the EHEC O157 proteome in different media and growth conditions, such as D-MEM [49], minimal medium [50], CHROMagar STEC [51], bovine rumen fluid [39] and conditions that stimulate the quorum sensing pathway [52]. Despite the several studies in this area, more effort is necessary to unveil the role of the tellurium resistance proteins in EHEC pathogenesis.

Locus of Enterocyte Effacement (LEE)

The LEE is a pathogenicity island of 35.6 kb that is organized into five polycistronic operons (LEE1 to LEE5) and an additional bicistronic operon encoding the Grl regulatory proteins [53]. The LEE is related to the intimate adherence of EHEC to host cells and is required for attaching and effacing (A/E) lesions, followed by the translocation of effector proteins that contribute mainly to the modulation of the host immune system [54]. In addition, the LEE contains the genes that encode the Type III secretion system (T3SS), as well as some effector molecules that are exported by this system. The T3SS is responsible for the translocation of effectors into the host cell; these effectors are directly involved in EHEC pathogenesis, mainly in the modulation of the host immune system [54]. In this study, the EHEC strains were grown in D-MEM, a medium known to induce the expression of genes encoding the T3SS [55]. We identified 24 LEE-encoded proteins (Fig 1C, S1 Table). Among these proteins, the most abundant were EspA (filamentous structure of the T3SS), Tir (translocated intimin receptor), EspB (pore formation and effector activity) and EspD (outer membrane adhesin) (Fig 1C). Interestingly, these proteins play an important role in E. coli O157 adhesion [56,57].
In addition, EspA, EspB, Tir and Intimin are potential vaccine candidates against EHEC infection [58,59]. EspA, which was detected as the most abundant LEE protein, forms a channel that connects the bacterial cytoplasm with the host cell; this export conduit allows the translocation of effectors into the host cell [60]. EspB, together with EspD, is responsible for the formation of the translocation pore and for the translocation of the effector Tir. In addition, EspB can inhibit the interaction between myosin and actin, which promotes the loss of microvilli and consequently contributes to the induction of diarrhea [61]. The interaction between Tir and Intimin contributes directly to EHEC O157:H7 persistence during infection [62,63]. Furthermore, Tir and Intimin are involved in the modulation of host immunity. Tir might inhibit tumor necrosis factor receptor-associated factor 6 (TRAF-6)-mediated NF-κB activation [64]. Intimin, in turn, can induce a T-helper cell type 1 response as well as stimulate the proliferation of spleen CD4+ T lymphocytes and of cells from lymphoid tissues [65,66].

Conclusion

In this work, we applied TMT-based quantitative proteomics and emPAI analyses to quantify the combined proteome of two EHEC O157:H7 isolates from Argentinian cattle and of the standard strain EDL933. These comprehensive proteomic analyses generated a quantitative dataset of the EHEC proteome, composed of a subset of proteins involved in different biological processes. Together, these proteins might form a network of factors that play an important role in the pathogenesis and physiology of this pathogen. Altogether, the results presented in this study provide insights into the functional genome of EHEC O157:H7 at the protein level and could contribute to the understanding of the factors associated with the biology of this pathogen.
Supporting information

S1 Table. Total list of proteins identified and quantified by the NSAF approach. (XLSX)

Acknowledgments