Abstract

   Salmonella enterica serovar Typhimurium (S. Typhimurium) is a highly
   adaptive pathogenic bacteria with a serious public health concern due
   to its increasing resistance to antibiotics. Therefore, identification
   of novel drug targets for S. Typhimurium is crucial. Here, we first
   created a pathogen-host integrated genome-scale metabolic network by
   combining the metabolic models of human and S. Typhimurium, which we
   further tailored to the pathogenic state by the integration of dual
   transcriptome data. The integrated metabolic model enabled simultaneous
   investigation of metabolic alterations in human cells and S.
   Typhimurium during infection. Then, we used the tailored pathogen-host
   integrated genome-scale metabolic network to predict essential genes in
   the pathogen, which are candidate novel drug targets to inhibit
   infection. Drug target prioritization procedure was applied to these
   targets, and pabB was chosen as a putative drug target. It has an
   essential role in 4-aminobenzoic acid (PABA) synthesis, which is an
   essential biomolecule for many pathogens. A structure based virtual
   screening was applied through docking simulations to predict candidate
   compounds that eliminate S. Typhimurium infection by inhibiting pabB.
   To our knowledge, this is the first comprehensive study for predicting
   drug targets and drug like molecules by using pathogen-host integrated
   genome-scale models, dual RNA-seq data and structure-based virtual
   screening protocols. This framework will be useful in proposing novel
   drug targets and drugs for antibiotic-resistant pathogens.

Introduction

   S. Typhimurium, is a gram-negative invasive and facultative pathogen
   that can infect various animal species [[30]1]. Upon infection, it
   mostly causes food poisoning and leads to gastroenteritis in humans
   [[31]2]. Salmonella infection causes 130,000 deaths every year, and it
   mostly affects people in low income countries [[32]3]. S. Typhimurium
   is an intracellular pathogen, residing inside a membrane-bound
   compartment within host cells during infection [[33]4]. This
   compartment is called Salmonella-containing vacuole (SCV), and it
   enables proliferation of S. Typhimurium inside the host cell by
   escaping from the host defense mechanism. Although SCV is considered a
   nutrition poor environment, this does not pose a problem for S.
   Typhimurium due to its highly adaptive lifestyle [[34]1]. Besides being
   a highly adaptive pathogen, the increasing rate of antibiotic
   resistance by S. Typhimurium has the potential to become a serious
   concern for public health. Therefore, it is important to identify novel
   drug targets to eliminate S. Typhimurium infections [[35]5].
   Understanding the metabolic activities of a pathogen is an important
   part of drug development process. Transcriptome data is a key data type
   that can elucidate metabolic activities of an organism [[36]6]. It
   reflects enzymatic activities of the organism via mRNA levels that
   carry genetic information of those enzymes. One of the novel approaches
   in transcriptomics is dual RNA-sequencing (dual RNA-seq), which can be
   used to elucidate pathogen-host relationship since it measures mRNA
   levels of pathogen and host simultaneously during infection [[37]7,
   [38]8].

   Genome-scale metabolic network (GMN) models have shown utility in the
   analysis of metabolic activities of pathogen and host during infections
   [[39]7, [40]9] Analysis of GMN models with constraint-based techniques
   to predict novel drug targets has the advantages of (i) being cost
   effective, (ii) being time efficient, (iii) providing a wide range of
   analyses of metabolic pathways at the same time. There are several
   studies about the prediction of novel drug targets to eliminate
   pathogen induced infections by analyzing GMN models [[41]10]. Most
   widely used constraint-based computational approach for the analysis of
   GMNs is Flux Balance Analysis (FBA). FBA is a mathematical optimization
   technique that uses linear optimization to predict distribution of
   metabolic fluxes at steady state conditions. It uses an objective
   function besides constraints to select an optimum point from the flux
   solution space [[42]11]. In silico gene deletion analysis is another
   widely used constraint-based analysis technique, which is used to
   determine potential drug targets [[43]12–[44]18]. Several GMN models
   were reconstructed so far for different Salmonella strains
   [[45]19–[46]21]. But these models are generic models, and they do not
   represent the metabolism of Salmonella inside a host cell. There are
   several techniques to create condition specific GMN models by mapping
   transcriptome data on to the generic GMNs. One of the commonly used
   transcriptome data mapping methods to generate condition specific GMNs
   is Gene Inactivity Moderated by Metabolism and Expression (GIMME)
   algorithm [[47]22], which predicts active and inactive reactions in a
   GMN based on mRNA levels belonging to a particular condition.

   Infection leads to a set of intricate interactions between pathogen and
   host cells, and these interactions should be taken into account in the
   process of identification of novel drug targets [[48]9]. Pathogen-host
   integrated GMNs have potential to shed light on pathogen-host
   interactions (PHI) when integrated with dual RNA-seq data [[49]7].
   Pathogen-host metabolic modeling is a multi-cellular interaction
   modelling approach, where GMNs of both pathogen and host organisms are
   integrated in the simulations of metabolic phenotypes. Even if not
   commonly used yet, some pathogen-host GMNs are available in the
   literature [[50]23, [51]24]. Here, we aim to provide a better insight
   into S. Typhimurium and host interactions by taking advantage of
   pathogen-host integrated GMNs and dual RNA-seq data in order to
   determine novel drug targets that can eliminate S. Typhimurium induced
   infections. We further report potential drugs for the identified drug
   target candidates by using a drug repositioning based approach. Through
   a prioritization criteria, pabB was selected as a high-ranked putative
   target, and a structure-based screening of novel drugs for pabB was
   performed using molecular docking simulations. To our knowledge, this
   is the first study that reconstructs a condition-specific pathogen-host
   GMN model by mapping dual RNA-seq data to predict novel drug targets.

Results

Pathogen-host integrated genome scale metabolic network analysis

   A pathogen-host integrated GMN model was reconstructed in this study
   for the first time in literature for S. Typhimurium, with a total of
   3586 genes from both organisms and 11,029 reactions. This model was
   used to generate condition-specific pathogen-host GMN models by mapping
   the dual RNA-seq data from 0^th, 8^th and 16^th hours of infection
   [[52]25]. GIMME was used to integrate the model and the transcriptome
   data, and the number of reactions for the corresponding
   condition-specific pathogen-host GMNs are given in [53]Table 1.

Table 1. The reaction and metabolite numbers of condition specific
pathogen-host GMN models.

   Condition-specific GMN at the beginning of infection Post-infection
   condition specific GMN at 8th hour Post-infection condition specific
   GMN at 16th hour
   Number of Reactions 8773 8933 9089
   Number of Metabolites 6941 6982 6979
   Number of HeLa reactions 6595 6681 6536
   Number of S. Typhimurium Reactions 2178 2252 2553
   [54]Open in a new tab

   Pathogen-host GMN was created by combining generic human and S.
   Typhimurium GMN models, and the number of reactions decreased in the
   condition-specific pathogen-host GMN models since they represent
   infected HeLa cell by S. Typhimurium in specific conditions. Nearly
   2500 reactions were discarded from the generic Pathogen-Host GMN model
   since they were not active at the beginning of infection based on mRNA
   levels. On the other hand, the reaction profile of condition-specific
   pathogen-host GMN is different between the beginning and the late
   stages of infection. As infection progresses, the number of HeLa
   reactions remains almost the same while S. Typhimurium reactions
   dramatically increase ([55]Table 1). The reaction profiles at the
   beginning of infection (0^th hour) and at late stage of infection
   (16^th hour) were compared. Even if there is not a considerable change
   in the number of HeLa reactions, the reaction profiles are different
   between the two conditions ([56]Fig 1). During the infection, S.
   Typhimurium must adapt to the nutrition environment and physical
   conditions to survive inside the host [[57]26]. Therefore, the increase
   in the number of S. Typhimurium reactions during the infection can be
   attributed to the activation of genes that might be necessary for the
   adaptation and proliferation of the pathogen. The reaction profiles of
   condition-specific pathogen-host GMNs were compared to each other in
   order to identify alterations in metabolic pathways based on the
   progress of infection. For host and pathogen separately, reactions only
   active in the condition-specific GMN at the beginning of infection
   (GMN^0th) and only active in the post-infection condition-specific GMN
   at the 16^th hour (GMN^16th) were identified and grouped by their
   pathways. The identified metabolic pathways and corresponding number of
   infection-time specific reactions are given in [58]Fig 1.

Fig 1. The differences in reaction profiles of condition-specific
pathogen-host GMNs grouped by pathways.

   [59]Fig 1
   [60]Open in a new tab

   Pathways with at least 2 differential reactions are given. Red and
   green bars represent the number of HeLa cell reactions that are only
   active in 0^th hour and 16^th hour respectively. Purple and blue bars
   represent the number of S. Typhimurium reactions that are only active
   in 0^th and 16^th hour respectively. The pathway names are listed on
   the left side of figure.

   [61]Fig 1. shows that there is severe modulation in the lipid related
   pathways of host in infection. Lipid related pathways of the host are
   known to be subjected to modulation during the invasion of bacteria
   [[62]27]. On the other hand, multiple metabolic pathways of S.
   Typhimurium are altered during infection based on [63]Fig 1. Most
   dramatic change is in glycerophospholipid metabolism, which is one of
   the most important pathways for dual-membrane envelope of gram-negative
   bacteria [[64]28]. There is also a dramatic change in alternate carbon
   metabolism, implying the utilization of different carbon sources other
   than glucose in the late stage of infection. Even if glucose is major
   carbon source for S. Typhimurium, it utilizes different carbon sources
   during infection [[65]29]. Cofactor and prosthetic group biosynthesis,
   which is tightly related to enzymatic activities, is also altered as
   expected since enzymatic activities become varied as infection
   progress.

   The first step in the validation of GMN models is comparing predicted
   flux rates with experimental data from literature. FBA was performed to
   predict flux rates of GMN models by maximizing TB (see [66]Material and
   Methods). Ethanol and succinate production rates were set to zero based
   on literature information [[67]30], and the other constraints were set
   as detailed in the section Materials and Methods. Predicted secretion
   rates of major by-products for S. typhimurium are given in [68]Table 2
   together with the literature-reported secretion rates at infection. The
   relative rates of by-products predicted by the pathogen-host integrated
   condition-specific model at 16^th hour are in perfect agreement with
   the experimental data from HeLa-infecting S. Typhimurium cells
   [[69]30]. Repeating simulations by using the non-reduced model leads to
   rates about 20% higher than the reduced model predictions. The use of
   only reduced Salmonella model without the host network, on the other
   hand, led to 25% higher acetate secretion rates. Predicted acetate,
   formate, and lactate secretion rates at 8^th hour of infection, on the
   other hand, are very close to the rates predicted for the beginning of
   infection ([70]Fig 1). Therefore, for the rest of the study, we used
   the model reconstructed for the 16^th hour of infection as the model
   representative of the infectious state of the organism.

Table 2. Predicted flux rates obtained by FBA analysis of condition-specific
pathogen-host integrated GMNs are compared with the experimental results.

   Infection (0^th hr) Flux Values (mmol/gDW/h) Infection (8^th hour) Flux
   Values (mmol/gDW/h) Infection (16^th hour) Flux Values (mmol/gDW/h)
   Experimental (nM/cell/h) [[71]30]
   D-Lactate Secretion 0 0 11.29 10 ± 3
   Acetate Secretion 1.94 2.17 6.80 4 ± 2
   Formate Secretion 10.86 9.14 3.55 2 ± 1
   Succinate Secretion 0 0 0 0
   Ethanol Secretion 0 0 0 0
   [72]Open in a new tab

Identification of potential drug targets

   Prediction of essential genes (EG) for the survival of pathogen inside
   host organism is the primary step for most of the drug discovery
   processes. Enzymes produced from EGs are potential drug targets that
   can be targeted with chemical molecules to eliminate the pathogen. EGs
   for the infection were predicted by using GMN^16th, which represents
   the infectious state. Here, 140 EGs were predicted for the infection.
   Predicted 140 EGs were compared with literature by using Database of
   Essential Genes (DEG) [[73]31], which reports data from three
   experimental gene deletion studies from rich medium experiments
   [[74]32–[75]34] Data were available for 137 of 140 predicted EGs, 93 of
   which were reported as essential genes in at least one study (68%).
   ([76]S1 Table). The genes falsely predicted as essential can be
   attributed to the fact that the experiments were performed in rich
   medium conditions with no host cells involved while the simulations
   were performed by the pathogen-host integrated genome-scale metabolic
   network.

   Drug targets should not show high amino acid sequence similarity with
   human proteins in order to prevent side effects. Therefore, homology
   analysis was performed to identify drug targets that are similar to
   human proteins. The similarity was determined based on the predefined
   cutoff value detailed in Materials and Methods. Out of 140 potential
   drug targets, 52 proteins were discarded since they have high
   similarity with human proteins ([77]S2 Table). Pathway enrichment
   analysis was performed with non-homologous 89 proteins in order to
   characterize potential drug targets ([78]Fig 2). The most enriched
   pathway is the lipopolysaccharide biosynthesis pathway, which is
   critical for the survival of S. Typhimurium since it maintains the
   functionality of the outer membrane of the pathogen [[79]35]. Another
   enriched pathway is the biosynthesis of amino acids, and it is a
   reasonable metabolic pathway that can be targeted since it is
   indispensable for pathogens [[80]36]. Like lipopolysaccharides,
   peptidoglycans are also indispensable molecules for the functionality
   of bacterial cell wall [[81]37], and their biosynthesis pathway was
   also captured ([82]Fig 2). Consequently, the general composition of
   enriched pathways indicates that targeted proteins serve in the
   pathways that are crucial for amino acid and cell membrane metabolisms.

Fig 2. Pathway enrichment analysis result.

   [83]Fig 2
   [84]Open in a new tab

   The values on the x axes indicates number of drug targets in the
   related pathway with no homology to human proteins.

   Druggability analysis was performed to identify potential drug targets
   that can be targeted with chemical molecules. The druggability analysis
   aims to identify proteins that have a high affinity to bind to
   drug-like molecules since some proteins do not have this property
   [[85]18]. Here, 43 potential drug targets that have high affinity to
   bind drug-like molecules were determined among 89 non-homologous
   pathogen proteins ([86]S3 Table). Determining potential drug targets
   that are broadly distributed among other harmful bacteria is one of the
   important steps of the prioritization process. [[87]18]. To identify
   such potential drug targets, broad-spectrum analysis was performed.
   Finally, 28 potential drug targets that are non-homologous to human
   proteins, druggable and broadly distributed among other bacteria were
   determined. The list of final potential drug targets is given in
   [88]Table 3 and [89]S4 Table. Of 28 predicted potential drug targets,
   20 were reported as essential in the DEG database. When we ranked the
   drug target list in terms of broad spectrum score, i.e. number of
   pathogenic bacteria with significantly similar sequence of the gene, 16
   of the top 20 genes were essential based on DEG (80%).

Table 3. Model-derived potential drug targets that obey three criteria: No
homology to human proteins, druggable, and broad-spectrum behaviour.

   Locus Names Gene Symbol Protein Name Pathway Reported at DEG
   STM1824 pabB Aminodeoxychorismate synthase component 1 Folate
   biosynthesis Yes
   STM0087 folA  Dihydrofolate reductase Folate biosynthesis Yes
   STM0183 folK 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine
   pyrophosphokinase Folate biosynthesis No
   STM3295 folP  Dihydropteroate synthase Folate biosynthesis No
   STM0064 dapB 4-hydroxy-tetrahydrodipicolinate reductase Biosynthesis of
   amino acids, L-lysine biosynthesis via DAP pathway Yes
   STM0213 dapD 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate
   N-succinyltransferase Biosynthesis of amino acids, L-lysine
   biosynthesis via DAP pathway Yes
   STM0207 mtnN 5’-methylthioadenosine/S-adenosylhomocysteine nucleosidase
   Biosynthesis of amino acids, Cysteine and methionine metabolism. No
   STM3486 aroB 3-dehydroquinate synthase Biosynthesis of amino acids,
   Phenylalanine, tyrosine and tryptophan biosynthesis No
   STM2384 aroC Chorismate synthase Biosynthesis of amino acids,
   Phenylalanine, tyrosine and tryptophan biosynthesis No
   STM3862 glmU Bifunctional protein GlmU UDP-N-acetyl-alpha-D-glucosamine
   biosynthesis Yes
   STM2094 rmlC dTDP-4-dehydrorhamnose 3,5-epimerase Polyketide sugar unit
   biosynthesis, Streptomycin biosynthesis No
   STM1772 kdsA 2-dehydro-3-deoxyphosphooctonate aldolase
   Lipopolysaccharide biosynthesis Yes
   STM3316 kdsC 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase KdsC
   Lipopolysaccharide biosynthesis No
   STM0988 kdsB 3-deoxy-manno-octulosonate cytidylyltransferase
   Lipopolysaccharide biosynthesis Yes
   STM0310 gmhA Phosphoheptose isomerase Lipopolysaccharide biosynthesis
   No
   STM0228 lpxA Acyl-[acyl-carrier-protein]—UDP-N-acetylglucosamine
   O-acyltransferase Lipopolysaccharide biosynthesis Yes
   STM0134 LpxC UDP-3-O-acyl-N-acetylglucosamine deacetylase
   Lipopolysaccharide biosynthesis Yes
   STM1200 tmk Thymidylate kinase Pyrimidine metabolism Yes
   STM1707 pyrF Orotidine 5’-phosphate decarboxylase Pyrimidine metabolism
   Yes
   STM1426 ribE Riboflavin synthase, alpha chain Riboflavin metabolism Yes
   STM0417 ribH 6,7-dimethyl-8-ribityllumazine synthase Riboflavin
   metabolism Yes
   STM0045 ribF  Riboflavin biosynthesis protein Riboflavin metabolism Yes
   STM3307 murA UDP-N-acetylglucosamine 1-carboxyvinyl transferase
   Peptidoglycan biosynthesis. Amino sugar and nucleotide sugar metabolism
   Yes
   STM0129 murC UDP-N-acetylmuramate—L-alanine ligase Peptidoglycan
   biosynthesis. D-Glutamine and D-glutamate metabolism Yes
   STM0123 murE
   UDP-N-acetylmuramoyl-L-alanyl-D-glutamate—2,6-diaminopimelate ligase
   Peptidoglycan biosynthesis.Lysine biosynthesis Yes
   STM0128 murG UDP-N-acetylglucosamine—N-acetylmuramyl-(pentapeptide)
   pyrophosphoryl-undecaprenol N-acetylglucosamine transferase
   Peptidoglycan biosynthesis Yes
   STM0124 murF UDP-N-acetylmuramoyl-tripeptide—D-alanyl-D-alanine ligase
   Peptidoglycan biosynthesis.Lysine biosynthesis Yes
   STM3725 coaD Phosphopantetheine adenylyltransferase Pantothenate and
   CoA biosynthesis Yes
   [90]Open in a new tab

Analysis of the prioritized drug targets

   The prioritized drug targets ([91]Table 3) were clustered into pathways
   that are crucial for the survival of the pathogen. There are four
   proteins, pabB, folA, folK, folP, that take part in folate biosynthesis
   pathways. For nearly all organisms, the folate biosynthesis pathway is
   one of the indispensable pathways in order to maintain life. Folates
   are necessary for the production of essential biomolecules such as
   nucleic acids and amino acids. Most bacteria, fungi and plants can
   synthesize folate, while animal cells take it up from external sources
   [[92]38]. This pathway is a potentially promising drug target since
   human cells do not have a folate synthesis mechanism that might be
   manipulated by a pathogen. Five of the prioritized drug targets have
   functionality in the biosynthesis of amino acids (dapB, dapD, mtnN,
   aroB, aroC). dapB and dapD take place in L-lysine biosynthesis via
   diaminopimelic acid (DAP) pathway, and the side product of this pathway
   is m-DAP, which is an essential biomolecule for peptidoglycan cell wall
   for gram-negative bacteria [[93]39]. glmU has a role in the production
   of UDP-N-acetyl-alpha-D-glucosamine, which is essential for bacterial
   cell wall [[94]40]. rmlC has an important role in the synthesis of
   L-rhamnose, which is an important saccharide for the virulence of some
   pathogens including S. Typhimurium. The absence of L-rhamnose
   biosynthesis pathway in human cells makes this drug target more
   appealing [[95]41]. The survival of bacterium depends on the integrity
   of cell envelope. kdsA, kdsC, kdsB, gmhA, lpxA and LpxC have roles in
   the production of lipopolysaccharides, which is critical for the
   formation of cell envelope [[96]42]. tmk and pyrF are involved in
   pyrimidine metabolism, which is crucial for all living organisms. tmk
   catalyzes the phosphorylation of thymidine 5’-monophosphate, which is
   an essential reaction for pyrimidine synthesis [[97]43]. ribE, ribF and
   ribH are required for the production of riboflavin, which is a
   precursor of flavin mononucleotide (FMN) and flavin adenin dinucleotide
   (FAD). Riboflavin synthesis is a pathway required for the survival of
   gram-negative bacteria in the absence of external riboflavin synthesis
   [[98]18]. murA, murC, murE, murG and murF are involved in the synthesis
   of peptidoglycans, which is an essential ingredient for bacterial cell
   wall biogenesis [[99]44]. murA catalyzes the first step of
   peptidoglycan biosynthesis, and deletion of murA leads to death of
   Escherichia coli and Streptococcus pneumoniae. Fosfomycin is an
   antibiotic that targets murA in order to kill bacteria [[100]45]. Mur
   ligases are known to be attractive drug targets because of their role
   in bacterial cell wall formation [[101]46]. Pantothenate is a main
   precursor of coenzyme A, and its absence leads to deficiency in
   bacterial growth. coaD is involved in the fourth step in the coenzyme A
   biosynthesis pathway, which was investigated before as a suitable
   antibiotic target [[102]47].

Identification of potential drugs for pabB

   4-aminobenzoic acid (PABA) synthesis is an attractive antibiotic target
   since it is an essential biomolecule for many pathogens and it does not
   have a human counterpart. PABA has two main functionalities in the
   bacteria; (i) it is a substrate for folic acid pathway, which is
   critical for survival of pathogen, (ii) it is a precursor in coenzyme Q
   biosynthesis, which is essential for virulence [[103]48, [104]49]. PABA
   is synthesized in two steps, and the first step is catalyzed by pabA
   and pabB enzymes by converting chorismate to 4-amino-4-deoxychorismate
   [[105]50]. And, the second step is the production of PABA from
   4-amino-4-deoxychorismate. pabB was detected as one of the putative
   drug targets in this study by our drug target prioritization pipeline.
   We specifically focused on pabB in the rest of our study as a drug
   target to eliminate Salmonella infections since it has critical
   functionality in PABA synthesis pathway, which is not represented in
   human cells. pabB was reported to be essential experimentally in the
   DEG database, and, among the identified drug targets with experimental
   validation ([106]Table 3), it ranks 7^th in terms of the number of
   pathogenic bacteria strains that carry a gene with high sequence
   similarity based on our broad-spectrum analysis. Additionally, we
   investigated the importance of pabB in the pathogen-host integrated GMN
   model. Interactions of the metabolites of the pabB reaction
   (chorismate, 4-amino-4-deoxychorismate, L-glutamate, L-glutamine) with
   the metabolites of other reactions were visualized by creating a
   metabolite-metabolite interaction network ([107]S1 Fig). L-glutamate is
   directly related to numerous amino acids such as alanine, leucine and
   asparagine. On the other hand, 4-amino-4-deoxychorismate is indirectly
   related with the production of thymidine through tetrahydrofolate. It
   is also linked with the production of adenine through R-Pantoate.
   Therefore, DNA synthesis is dependent on the production of
   4-amino-4-deoxychorismate through adenine and thymine synthesis which
   cannot be synthesized when pabB is inhibited.

   There is a very similar protein to this putative target in Escherichia
   coli, which is also called pabB. Formic acid, which is widely used as
   an antibacterial agent in fodders, was reported in DrugBank as a
   compound that targets pabB in Escherichia coli (strain K12) [[108]51].
   To our knowledge, pabB was not offered or examined as a drug target for
   Salmonella species before. Hereby, protein docking and molecular
   dynamic analysis were performed to determine novel molecules that can
   inhibit S. Typhimurium growth by binding pabB. The protein
   Aminodeoxychorismate synthase component 1, which belonged to S.
   Typhimurium (strain LT2 / SGSC1412 / ATCC 700720) with a UniProt ID:
   [109]P12680 (PABB_SALTY), was taken for elaborative structural and
   functional studies. The 3D structure of the protein is the core
   requirement to carry out protein molecular docking studies and its
   unavailability necessitated the modeling of the three-dimensional
   structure. Herein, first step was the comparative 3D modeling of the
   query protein with the application of MODELLER Software. The crystal
   structure of 4-amino4-deoxychorismate (ADC) synthase (PDB ID: 1K0E) was
   chosen as a template as its percent identity was 76.38% and query
   coverage was 99% against [110]P12680. The MODELLER yielded five 3D
   structures of [111]P12680 protein, out of which the best model with
   higher accuracy was selected ([112]S2 Fig) after thorough quality
   assessment using PROCHECK server. The Verify3D passed the modelled
   structure with 87% indicating 80% amino acids of the built model having
   score > = 0.2 in the 3D/1D profile. Further, the Ramachandran plot
   ([113]S3 Fig) showed 92.7% residues in the most favoured region whereas
   only 0.3% residues were reported in disallowed region. Once the model
   was finalized, Mg^2+ ion was incorporated into the built 3D structure
   as magnesium ion is reported as the cofactor within the target protein
   in the UniProt indicating its role in the catalysis activity.
   Therefore, Mg^2+ ion was complexed near the active site residues
   reported in UniProt and in literature [[114]48].

   The DoGSiteScorer produced 10 probable binding pockets, out of which
   pocket number one was selected as the binding site to utilize for
   molecular docking. The selected pocket has a druggability score of
   0.83, suggesting the prime region for drug binding. The binding pocket
   was analyzed in Chimera software [[115]52], and validated as it covers
   the active site amino acid residues reported in the UniProt along with
   the amino acid residues coordinating with the Mg^2+ ion. The predicted
   binding site can be seen in ([116]S4 Fig).

   Once all the 54,000 drug-like compounds were screened against the
   target protein ([117]P12680), the lowest binding energy conformation of
   each single 54,000 compound(s) was obtained. The overall binding
   energies ranked against the [118]P12680 are represented through a
   histogram in [119]Fig 3.

Fig 3. Histogram illustration of the overall binding energy retrieved through
virtual screening.

   [120]Fig 3
   [121]Open in a new tab

   The thorough analysis of the results suggested that 1659 compounds
   showed promising binding free energy ranging from -12.42 kcal/mol to
   -9.04 kcal/mol, while 22,231 compounds having binding free energies
   within the range of -5.66 kcal/mol to -2.28 kcal/mol. Moreover, top ten
   compounds that were docked near the active site of the target protein
   were (having binding free energy of -12.42, -12.02, -11.92, -11.90,
   -11.91, -11.98, -11.77, -11.73, -11.42, -11.72) retrieved to further
   investigate their chemical interactions with amino residues of target
   protein. The aforementioned top ten compounds with their ZINC IDs are
   reported in ([122]S5 Table).

   The protein-ligand complex of each top ten compound was explored to
   inspect the chemical bonds and interactions occurring within the
   protein-ligand complex. LigPlot+ generates the atomic interactions
   taking place among the target protein and ligand (drug) through
   hydrogen bonds and hydrophobic contacts. Each of the ten protein-ligand
   complex analyzed in LigPlot+ has shown ligand interactions with the
   Mg^2+. The reported active site residues from UniProt (i.e. Lys275 and
   Glu259) were found to be interacting with the top ten ligands, out of
   which Lys275 can be seen making hydrogen bond with five ligands and
   hydrophobic contacts with two ligands. ([123]Table 4). On the other
   hand, Glu259 makes hydrophobic contacts with six of the ligands. The
   most common amino acid residues interacting with top ten ligands
   through hydrogen bonding were identified as Thr277, Gly427, Glu440,
   Lys444, Lys275, Arg411 and Glu259. Finally, each of the amino acid
   residues making hydrogen bonds and hydrophobic contacts interaction
   with the top ranked 10 compounds is presented in [124]Table 4 and
   visualized in [125]S5–[126]S7 Figs. There is not any report in
   literature about the use of these compounds against pathogenic
   bacteria. Therefore, they remain open to experimental validation.

Table 4. Protein residues involved in the hydrogen and hydrophobic
interactions with top ten best ranking compounds (ligands), analyzed through
LigPlot+.

   S.No ZINC IDs Residues making hydrogen bond interaction Residues making
   Hydrophobic contacts
   1 ZINC7879733 Thr277(A), Gly427(A) Lys444(A), Arg411(A) Lys275(A)
   Ile410(A), Ala424(A), Gly427(A) Asn214(A) Trp391(A), Ser423(A),
   Gly426(A), Glu259(A) Ile368(A), Cys422(A), Gly425(A), Gly276(A),
   Ile274(A)
   2 ZINC15179659 Gly427(A), Arg411(A), Lys444(A), Thr277(A), Glu440(A)
   Lys275(A), Trp391(A), Val445(A), OThr277(A) Ala424(A), Ser367(A),
   Ile410(A), Gly425(A) Ser423(A), Ile368(A), Ile448(A), Asn214(A),
   Gly426(A) Thr412(A), Cys422(A), GIle274(A), ly276(A)
   3 ZINC14880941 Thr277(A), Ser423(A) Gly427(A), Gly276(A) Glu440(A),
   Trp391(A) Asn214(A) Lys444(A), Arg411(A), Val445(A), Ile410(A)
   Lys275(A), Ile274(A), Gly425(A), Ala424(A), Ile368(A) Gly426(A),
   Ile448(A), Glu259(A), Cys422(A)
   4 ZINC58542694 Arg411(A), Thr277(A), Glu440(A), Lys275(A), Gly427(A),
   Lys444(A), Gly276(A), Trp391(A), Ile410(A), Ser423(A), Ala424(A),
   Ile274(A), Asn214(A), Ile368(A), Val445(A), Gly425(A), Gly426(A),
   His340(A), Ser367(A),
   5 ZINC1201089024 Trp391(A), Lys275(A), Asn214(A), Arg411(A), Glu440(A),
   Thr277(A), Gly427(A), Lys444(A), Gly276(A), Ile410(A), Ile274(A),
   Gly425(A), Ile368(A), Val445(A), Glu259(A), Ser423(A), Gly426(A)
   Cys422(A), Ile448(A),
   6 ZINC27071723 Lys444(A), Trp391(A), Lys275(A), Glu440(A), Thr277(A),
   Gly427(A), Arg411(A), Gly276(A), Ile274(A), Val445(A), Ser423(A),
   Gly425(A), Ile410(A), Cys422(A), Gly426(A), Thr412(A), Ala424(A),
   Ile448(A), Asn214(A), Glu259(A), Ile368(A),
   7 ZINC7133393 Arg411(A), Gly425(A), Glu440(A), Thr277(A), Gly427(A)
   Lys444(A), Gly276(A), Trp391(A), Ile410(A), Ala424(A), Val445(A),
   Ser423(A), Glu259(A), Asn214(A), Gly426(A)Ser367(A)
   8 ZINC7879735 Gly427(A), Lys444(A), Arg411(A), Glu440(A), Thr277(A),
   Lys275(A), Gly425(A), Gly276(A), Asn214(A), Ile274(A), Ala424(A),
   Ile410(A), Ser367(A), Trp391(A), Ser423(A), Gly426(A), Cys422(A),
   Val445(A),
   9 ZINC58542238 Gly427(A), Gly425(A), Glu440(A), Lys275(A), Thr277(A)
   Arg411(A), Lys444(A), Gly276(A), Trp391(A), Ser423(A), Val445(A),
   Ile274(A), Ile410(A), Asn214(A), Ala424(A), Gly426(A), His340(A),
   Ile368(A), Thr412(A), Cys422(A), Ile448(A),
   10 ZINC7538530 Arg411(A), Glu440(A), Thr277(A), Gly427(A), Lys444(A),
   Gly276(A), Lys275(A), Ile274(A), Gly425(A), Trp391(A), Ile410(A),
   Ser423(A), Ala424(A), Gly426(A), Asn214(A), Glu259(A), Cys422(A),
   Val445(A)
   [127]Open in a new tab

Conclusions

   Analysis of pathogen-host integrated GMN models is a well-suited
   approach in terms of identifying novel drug targets by considering
   pathogen-host interactions. It allows tracking the response of pathogen
   and host simultaneously during infection by mapping infection-induced
   dual-transcriptome data. There are some pathogen-host GMN models
   published [[128]23, [129]24], but to our knowledge, this is the first
   study in the literature that determines drug targets by analysing
   condition-specific pathogen-host integrated GMNs created by mapping
   dual-transcriptome data. We here reconstructed and analyzed
   condition-specific integrated GMN models to identify novel drug targets
   for S. Typhimurium induced infections. We used prioritization steps to
   identify best suitable novel drug targets. After prioritization
   processes, we identified 28 putative drug targets, and pabB was chosen
   as a high ranked drug target based on our prioritization pipeline,
   literature information and novelty. Subsequently, homology and
   molecular docking analyses were performed to identify candidate
   compounds that inhibit pabB. The top ten compounds in terms of binding
   free energy were identified and reported. Consequently, analysing S.
   enterica metabolism inside the host cell has enabled us to comprehend
   metabolic alterations in both HeLa cells and S. enterica along with
   determining novel drug targets. Future studies may provide more
   arguments for the proposed drug targets and inhibitors in this study.
   This study can be used as a guideline for creating and analysing
   condition specific pathogen-host GMN models.

Materials and methods

   The flowchart of the pipeline followed in this study is given in
   [130]Fig 4. Each step is detailed in the sections below.

Fig 4. The flowchart of the pipeline followed in this study.

   [131]Fig 4
   [132]Open in a new tab

Transcriptome data

   The dual RNA-seq data of infected HeLa cells and S. Typhimurium strain
   SL1344 [[133]25] was downloaded from NCBI Gene Expression Omnibus (GEO)
   Database [[134]53]. The dataset ID in the GEO database is
   [135]GSE117236. The samples collected at the beginning of infection and
   the post-infection data at 8^th and 16^th hours were used in this
   study. Each time point included duplicate samples. Principal component
   analysis (PCA) was used to identify any possible outliers in the data,
   and no outliers were detected ([136]S8 Fig).

Pathogen-host integrated genome scale metabolic network

   Two different genome-scale metabolic models from the literature were
   used to reconstruct an integrated pathogen-host genome scale metabolic
   network (GMN). A genome-scale metabolic network of S. Typhimurium,
   called stm_v1.0, consisting of 2,545 reactions controlled by 1,271
   genes was used as the pathogen metabolic network [[137]21]. As the host
   GMN, a recent reconstruction of human metabolism with a substantial
   amount of curation, called iHsa, was used, which covered 8,336
   reactions and 2,315 genes [[138]54]. S. Typhimurium is an intracellular
   pathogen. Therefore, the GMN of the pathogen was placed inside the
   cytoplasm of the host GMN as a separate compartment to create the
   pathogen-host integrated GMN. Here, an extensive literature research
   was performed to identify cytosolic host metabolites that can be
   consumed by the pathogen during infection. An exchange reaction with
   the extracellular environment must be available for these metabolites
   in the S. Typhimurium metabolic network. 38 such metabolites were
   identified [[139]26, [140]29, [141]55–[142]57], and they were allowed
   to be taken up by the S. Typhimurium GMN from the cytoplasm of the host
   ([143]S6 Table). In addition, the pathogen was allowed to secrete all
   its exchange metabolites to the host cytoplasm as defined by the model
   secretion reactions ([144]S7 Table). As a result, the pathogen-host
   integrated GMN consisting of 11,029 reactions controlled by 3,586 genes
   was created by using COBRA Toolbox v.3.0 on MATLAB programming platform
   [[145]58].

Flux balance analysis

   Pathogen-host integrated GMN was analyzed using flux balance analysis
   (FBA) method to predict fluxes associated with the infection times
   studied. FBA searches the solution space defined by mass balance and
   reaction reversibility constraints to find an optimal solution with the
   help of an objective function. FBA assumes that the system is at steady
   state, i.e. the concentrations of intracellular metabolites do not
   change over sufficiently long time, leading to linear mass balance
   constraints [[146]59]. The objective function of pathogen-host
   integrated GMN was defined based on a weighted relationship between
   host and pathogen biomass, and the biomass composition formulas were
   taken from iHsa and stm_v1.0 models ([147]Eq 1) [[148]21, [149]54,
   [150]60]. HB, PB and TB are host biomass, pathogen biomass and total
   biomass respectively in Eq ([151]1), where α and β are maximum host and
   pathogen biomass production rates in order. Using the constructed
   pathogen-host GMN, α and β were calculated first with FBA via
   maximization of HB and PB reactions separately. Later, maximum host and
   pathogen biomass production rates (α and β), which were calculated as
   0.39 and 0.28 respectively, were added to the equation as weights of HB
   and PB to get TB, which represents the balanced effect of HB and PB.
   [152]Eq 1 was added as a reaction to the pathogen-host integrated GMN
   and later set as the objective function.
   [MATH:
   <mo>∝</mo><mo>×</mo><mi>H</mi><mi>B</mi><mo>+</mo><mi>β</mi><mo>×</mo><
   mi>P</mi><mi>B</mi><mo>=</mo><mi>T</mi><mi>B</mi> :MATH]
   (Eq 1)

   The upper bound of glucose uptake rate of S. Typhimurium was set to 5
   mmol/gDW/h in simulations based on the studies of Thiele and her
   coworkers [[153]21]. The maximum uptake rate of other available carbon
   sources for S. Typhimurium inside the host cell was limited to 20% of
   its glucose uptake rate. Oxygen uptake rate of S. Typhimurium was set
   to 1 mmol/gDW/h to mimic the hypoxic environment during infection
   [[154]61]. The host cell was allowed to utilize only metabolites that
   were found in the Dulbecco’s Modified Eagle’s Medium (DMEM) since the
   infection experiment for the dual RNA-seq data, which was used in
   creating condition-specific integrated GMN models, was carried out in
   this medium ([155]S8 Table). In GMN models, alternate optima can be an
   issue in interpreting the results of FBA since there might be multiple
   flux distributions that result in the same value for the objective
   function. Minimization of the sum of squares of all flux values was
   applied to prevent alternate optima [[156]62]. The principle of this
   method was proposed based on the accomplishment of the cellular goals
   with minimal resource expenditure since the flux values are an
   indication of the amount of depletion of resources [[157]63].

Integrating transcriptome data with pathogen-host integrated GMN

   Condition specific GMNs were generated by mapping dual RNA-seq data on
   the pathogen-host GMN to simulate infection states at different time
   points. Gene Inactivity Moderated by Metabolism and Expression (GIMME)
   algorithm was used as the mapping algorithm to generate condition
   specific GMNs [[158]22]. GIMME determines active reactions based on a
   threshold put on the mRNA levels in the data, where reactions below the
   thresholds are set as inactive. GIMME generates a GMN with the desired
   functionality using the objective fraction parameter, and it adds the
   reactions in the inactive set back if their removal affects the desired
   functionality [[159]22]. The threshold value was determined as the
   quarter of the mean of the transcriptome data. Since the average gene
   expression values were much higher in HeLa cell compared to S.
   Typhimurium in the utilized dual RNA-seq data ([160]S9 Fig), the
   threshold value was separately determined for both organisms. Then,
   since GIMME algorithm accepts a single threshold, the difference
   between the organism-specific threshold values were added to the
   reaction scores of S. Typhimurium, and the threshold value obtained
   from the human transcriptome was used as the threshold value in GIMME
   simulations. The objective fraction parameter was set to 0.2 in order
   to ensure that the condition specific integrated GMN produces at least
   20% of the maximum TB. Three different condition specific GMNs were
   produced as a result to represent the start of infection (0^th hour)
   and infections at 8^th and 16^th hours.

Identification of drug targets

   Gene deletion analysis is a widely used approach in constraint-based
   metabolic modeling to predict potential drug targets, and it is
   performed by in silico deletion of genes in the GMN [[161]10]. The
   analysis aims to obtain essential genes for the desired functionality
   of a GMN, such as preventing growth of pathogen. FBA can be used to
   predict essential genes in an organism by setting the rate of the
   associated reaction(s) to zero for each gene. If inactivation of the
   reaction(s) lead to zero growth rate, the gene is essential for the
   pathogenic organism and it can be used as a drug target. In this study,
   gene deletion analysis was performed to identify potential drug targets
   that can restore the metabolic changes in the host cell caused by S.
   Typhimurium induced infection. GIMME-based condition-specific GMNs were
   used in the analysis. A nonzero rate for the production of HB together
   with zero rate for PB was used as the desired functionality of the
   condition specific GMNs in gene deletion analysis. Using the FBA
   technique to maximize HB and PB separately, essential genes were
   obtained, and the enzymes that catalyze these reactions were chosen as
   potential drug targets.

   The identified drug targets were further filtered based on the approach
   applied elsewhere [[162]18]. Briefly, the steps in the approach
   followed are as follows: (i) Homology analysis is a critical part of
   drug target selection process, and the aim of the analysis is
   determining drug targets that are not similar to host proteins in order
   to avoid side effects of drugs. Homology analysis was performed using
   BLASTp algorithm, and drug targets that are not similar to human
   proteins were identified [[163]64]. As BLASTp parameters, cut off for
   the expected value (E-value) was chosen as 1x10^-4, and the maximum
   sequence identity for the determination of human-non-homolog drug
   targets were chosen as 30% [[164]18]. Pathway enrichment analysis was
   performed with the identified human-non-homolog drug targets by using
   KOBAS V3.0 [[165]65] to elucidate pathways that are expected to be
   crucial for the survival of S. Typhimurium inside the host. (ii)
   Another important step in the model-based drug target selection process
   is the determination of druggable proteins, and the goal of this
   selection is identifying drug targets that can be targeted with
   drug-like chemical compounds. Druggable targets were identified using
   BLAST algorithm in Drugbank database [[166]51], and the E-value was
   chosen as 10^−25 [[167]18]. (iii) Broad spectrum analysis was performed
   for the identification of drug targets that are broadly distributed
   among other bacteria. Broad spectrum analysis is very beneficial in
   order to determine drug targets that can be effective against
   co-infections or multiple infections. In addition, the proteins that
   are broadly distributed among other bacteria may indicate low mutation
   rate, so developing antibacterial resistance can be harder for the
   targeted bacteria. Broad spectrum analysis was performed using PBIT web
   browser [[168]66], which contains protein sequences of 181 pathogenic
   organisms. E-value cut-off of 1x10^-5, bit score of 100 and sequence
   identity of 35% were chosen as parameter values [[169]18, [170]66]. For
   the broad spectrum target determination criteria, targets available in
   at least 40 pathogenic strains were chosen [[171]18].

Homology modeling of target protein and active / binding pocket prediction

   As the 3D structure of target protein was not available in the Protein
   Data Bank (PDB) [[172]67], the homology modeling was performed to model
   the target protein’s 3D structure. In the first step, the appropriate
   template protein was searched by BLASTp against the PDB database. The
   template that matches the criteria of query coverage ≥ 90% and percent
   identity ≥ 70% was selected. The Modeller software v9.20 was used to
   perform comparative protein structure modelling [[173]68–[174]71]. The
   Modeller works by satisfying the spatial restraints along with
   employing specific geometrical calculations generating possible
   coordinate(s) for the location of each single atom of target protein
   [[175]70]. The homology modeling was performed through Modeller by
   implementing a series of python script(s), which resulted in generating
   five models with their Discrete Optimized Potential Energy (DOPE)
   values. The 3D model with the lowest DOPE value was selected as the
   predicted 3D structure of the target protein. The predicted 3D model
   was then validated through verify3D and Ramachandran Plot via PROCHECK
   server to ensure its reliability and quality. The Verify3D program was
   used to interpret the quality of the built model, as Verify3D computes
   the compatibility of 3D model of target protein against its amino acid
   sequence [[176]72]. Further, Ramachandran Plot was analyzed to assess
   the stereo-chemical properties of the predicted 3D model [[177]73].

   The built target protein model was subjected to DoGSiteScorer to
   identify potential binding pockets. The DoGSiteScorer is an automated
   grid-based program which utilizes difference of Gaussian filter to
   identify the potential binding pocket present within the protein
   [[178]74, [179]75]. The program provides ten potential binding pockets
   with druggability scores. Out of the ten predicted pockets, the one
   with the highest druggability score is supposed to be a rich binding
   pocket and therefore, selected for study. The program also evaluates
   the depth, surface area and volume for each predicted binding pocket.

Docking-based virtual screening

   To carry out the rigorous virtual screening of drug-like molecules
   against the target protein, a library of chemical compounds was
   curated. The ZINC15 database was used to retrieve the drug-like
   molecules following certain criteria of compound’s molecular weight and
   logP values [[180]76]. The drug-like compounds having logP value ≤ 5
   and Molecular Weight ≤ 375 Daltons were obtained from the database in
   SDF format. A total of 54,000 drug-like compounds were compiled and
   prepared into the required PDBQT format by using Open Babel. All the
   54,000 compounds were minimized with force field MMFF94 via Open Babel
   [[181]77].

   The AutoDock-GPU [[182]78] was chosen to execute molecular docking and
   virtual screening of curated drug-like molecules (ligands) library
   against the target protein ([183]P12680). The AutoDock GPU was chosen
   because of the prolonged execution times whilst using AutoDock4.
   AutoDock-GPU, which is an OpenCL and cuda based implementation of
   Autodock4, was created to utilize large number of GPU cores and speed
   up docking by using parallel processing [[184]79, [185]80]. So, to
   execute molecular docking, the target protein’s modelled 3D structure
   was prepared by adding hydrogens and its conversion into the required
   format of PDBQT. Afterwards, the grid search box dimensions were set
   carefully to cover the predicted binding site retrieved from the
   DoGSiteScorer. Further docking steps were carried out to create docking
   parameter files, and, eventually, the grid maps FLD file, docking
   parameters files, GPF and DPF were made. Once the necessary files were
   created, the molecular docking was performed to screen all the 54,000
   compounds with 20 Genetic Algorithm (GA) runs against target receptor’s
   binding site. The virtual screening yielded binding free energy for 20
   runs of all the 54,000 compounds. Ultimately, the lowest binding energy
   of each compound was extracted for structural and functional
   evaluation. Moreover, re-docking step was performed by execution of all
   the Autodock4 steps utilizing only one ligand against the target
   protein to validate predicted binding site.

   LigPlot+ program was used to generate 2D diagrams of the target protein
   and ligand complex. For the LigPlot+ analysis, protein-ligand complex
   in PDB format was taken. The atomic interaction(s) within the diagram
   is shown, where ligand and protein’s interactive residues are
   represented in ball-stick format. From the diagram, the amino acid
   residues of the target protein making chemical interactions with the
   ligand can be identified. The LigPlot+ highlights the hydrogen bonding
   within the atoms of protein and ligand with green dotted lines.

Supporting information

   S1 Fig. Metabolic network of pabB related metabolites.

   The red arrows indicate the reaction controlled by pabB (L-Glutamine +
   chorismate -> L-Glutamate + 4-amino-4-deoxychorismate).

   (TIF)
   [186]Click here for additional data file.^ (1.4MB, tif)
   S2 Fig. The predicted 3D structure of [187]P12680.

   (TIF)
   [188]Click here for additional data file.^ (342.7KB, tif)
   S3 Fig. Ramachandran plot of the predicted 3D structure of target
   protein ([189]P12680).

   (TIF)
   [190]Click here for additional data file.^ (223.6KB, tif)
   S4 Fig. The predicted binding pocket is shown as surface in sand brown
   color while the protein chain as ribbon in blue color.

   From the figure, it can be seen that Mg2+ ion is submerged within the
   predicted binding pocket.

   (TIF)
   [191]Click here for additional data file.^ (258.3KB, tif)
   S5 Fig

   A) is representing the LigPlot+ of [192]P12680 showing atomic
   interaction between protein (residues/ Mg2+ ion) and ligand
   (ZINC7879733). The atomic linkages due to hydrogen bonding can be
   identified from the diagram. Similarly, B) represents the LigPlot+
   analysis of [193]P12680 and ligand (ZINC15179659), C) represents the
   LigPlot+ for [194]P12680 and ligand (ZINC14880941), and D) represents
   the LigPlot+ for [195]P12680 and ligand (ZINC58542694).

   (TIF)
   [196]Click here for additional data file.^ (359.4KB, tif)
   S6 Fig

   A) is the LigPlot+ for [197]P12680 and ligand (ZINC1201089024), all the
   atomic linkages occurring between the protein-ligand complex can be
   analyzed. Likewise, B) represents the LigPlot+ for [198]P12660 and
   ligand (ZINC27071723), C) is LigPlot+ for [199]P12680 and ligand
   (ZINC7133393) and D) is LigPlot+ for [200]P12680 and ligand
   (ZINC7879735).

   (TIF)
   [201]Click here for additional data file.^ (387.3KB, tif)
   S7 Fig

   A) is the LigPlot+ for [202]P12680 and ligand (ZINC58542238) complex
   and B) is the LigPlot+ for [203]P12680 and ligand (ZINC7538530)
   complex. The overall amino acid residues of [204]P12680 which interact
   with each of the top ten compounds (ligands) making hydrogen bonds and
   hydrophobic contacts.

   (TIF)
   [205]Click here for additional data file.^ (286.8KB, tif)
   S8 Fig. Principal component analysis (PCA) of [206]GSE117236.

   The red, blue and green dots represent beginning of infection, 8^th
   hour of infection and 16^th hour of infection respectively.

   (TIF)
   [207]Click here for additional data file.^ (48.4KB, tif)
   S9 Fig. Box plots of gene expression values of HeLa cell and S.
   Typhimurium.

   (TIFF)
   [208]Click here for additional data file.^ (11.7KB, tiff)
   S1 Table. 140 essential genes based on DEG database.

   (DOCX)
   [209]Click here for additional data file.^ (15.7KB, docx)
   S2 Table. 89 potential drug targets that are not similar to human
   proteins.

   (DOCX)
   [210]Click here for additional data file.^ (15.8KB, docx)
   S3 Table. Potential drug targets that have high affinity to bind
   drug-like molecules.

   (DOCX)
   [211]Click here for additional data file.^ (14.2KB, docx)
   S4 Table. The list of final potential drug targets.

   The last column indicates broad spectrum analysis results.

   (DOCX)
   [212]Click here for additional data file.^ (16KB, docx)
   S5 Table. Binding energy, Zinc IDs, 1D and 2D structure of each of top
   10 compounds.

   (DOCX)
   [213]Click here for additional data file.^ (566.2KB, docx)
   S6 Table. Metabolites that can be consumed by S. Typhimurium inside
   host cytoplasm in pathogen-host GMN model.

   (DOCX)
   [214]Click here for additional data file.^ (15.2KB, docx)
   S7 Table. The exchange metabolites of S. Typhimurium that match with
   the cytoplasmic metabolites of human cell.

   (DOCX)
   [215]Click here for additional data file.^ (20.8KB, docx)
   S8 Table. DMEM medium constraints for the host GMN used in metabolic
   model simulations.

   (DOCX)
   [216]Click here for additional data file.^ (15.3KB, docx)
   S9 Table. Long names of metabolites given in [217]S1 Fig, the figüre
   that reports metabolite-metabolite interactions around pabB catalyzed
   reaction.

   (DOCX)
   [218]Click here for additional data file.^ (19.9KB, docx)
   S1 File. The MATLAB codes to generate and analyze pathogen-host
   integrated genome scale metabolic models of S. Typhimurium and human.

   (ZIP)
   [219]Click here for additional data file.^ (3.1MB, zip)

Data Availability

   The codes are provided as a [220]supplementary file. The transcriptome
   data was downloaded from Gene Expression Omnibus (GSE117236).

Funding Statement

   This work was supported by TUBITAK, The Scientific and Technological
   Research Council of Turkey (Project Code: 316S005) and by PSF, The
   Pakistan Science Foundation [Project Code: PSF-TUBITAK/S-HEJ (04)]. The
   funders had no role in study design, data collection and analysis,
   decision to publish, or preparation of the manuscript.

References