Abstract
Salmonella enterica serovar Typhimurium (S. Typhimurium) is a highly
adaptive pathogenic bacteria with a serious public health concern due
to its increasing resistance to antibiotics. Therefore, identification
of novel drug targets for S. Typhimurium is crucial. Here, we first
created a pathogen-host integrated genome-scale metabolic network by
combining the metabolic models of human and S. Typhimurium, which we
further tailored to the pathogenic state by the integration of dual
transcriptome data. The integrated metabolic model enabled simultaneous
investigation of metabolic alterations in human cells and S.
Typhimurium during infection. Then, we used the tailored pathogen-host
integrated genome-scale metabolic network to predict essential genes in
the pathogen, which are candidate novel drug targets to inhibit
infection. Drug target prioritization procedure was applied to these
targets, and pabB was chosen as a putative drug target. It has an
essential role in 4-aminobenzoic acid (PABA) synthesis, which is an
essential biomolecule for many pathogens. A structure based virtual
screening was applied through docking simulations to predict candidate
compounds that eliminate S. Typhimurium infection by inhibiting pabB.
To our knowledge, this is the first comprehensive study for predicting
drug targets and drug like molecules by using pathogen-host integrated
genome-scale models, dual RNA-seq data and structure-based virtual
screening protocols. This framework will be useful in proposing novel
drug targets and drugs for antibiotic-resistant pathogens.
Introduction
S. Typhimurium, is a gram-negative invasive and facultative pathogen
that can infect various animal species [[30]1]. Upon infection, it
mostly causes food poisoning and leads to gastroenteritis in humans
[[31]2]. Salmonella infection causes 130,000 deaths every year, and it
mostly affects people in low income countries [[32]3]. S. Typhimurium
is an intracellular pathogen, residing inside a membrane-bound
compartment within host cells during infection [[33]4]. This
compartment is called Salmonella-containing vacuole (SCV), and it
enables proliferation of S. Typhimurium inside the host cell by
escaping from the host defense mechanism. Although SCV is considered a
nutrition poor environment, this does not pose a problem for S.
Typhimurium due to its highly adaptive lifestyle [[34]1]. Besides being
a highly adaptive pathogen, the increasing rate of antibiotic
resistance by S. Typhimurium has the potential to become a serious
concern for public health. Therefore, it is important to identify novel
drug targets to eliminate S. Typhimurium infections [[35]5].
Understanding the metabolic activities of a pathogen is an important
part of drug development process. Transcriptome data is a key data type
that can elucidate metabolic activities of an organism [[36]6]. It
reflects enzymatic activities of the organism via mRNA levels that
carry genetic information of those enzymes. One of the novel approaches
in transcriptomics is dual RNA-sequencing (dual RNA-seq), which can be
used to elucidate pathogen-host relationship since it measures mRNA
levels of pathogen and host simultaneously during infection [[37]7,
[38]8].
Genome-scale metabolic network (GMN) models have shown utility in the
analysis of metabolic activities of pathogen and host during infections
[[39]7, [40]9] Analysis of GMN models with constraint-based techniques
to predict novel drug targets has the advantages of (i) being cost
effective, (ii) being time efficient, (iii) providing a wide range of
analyses of metabolic pathways at the same time. There are several
studies about the prediction of novel drug targets to eliminate
pathogen induced infections by analyzing GMN models [[41]10]. Most
widely used constraint-based computational approach for the analysis of
GMNs is Flux Balance Analysis (FBA). FBA is a mathematical optimization
technique that uses linear optimization to predict distribution of
metabolic fluxes at steady state conditions. It uses an objective
function besides constraints to select an optimum point from the flux
solution space [[42]11]. In silico gene deletion analysis is another
widely used constraint-based analysis technique, which is used to
determine potential drug targets [[43]12–[44]18]. Several GMN models
were reconstructed so far for different Salmonella strains
[[45]19–[46]21]. But these models are generic models, and they do not
represent the metabolism of Salmonella inside a host cell. There are
several techniques to create condition specific GMN models by mapping
transcriptome data on to the generic GMNs. One of the commonly used
transcriptome data mapping methods to generate condition specific GMNs
is Gene Inactivity Moderated by Metabolism and Expression (GIMME)
algorithm [[47]22], which predicts active and inactive reactions in a
GMN based on mRNA levels belonging to a particular condition.
Infection leads to a set of intricate interactions between pathogen and
host cells, and these interactions should be taken into account in the
process of identification of novel drug targets [[48]9]. Pathogen-host
integrated GMNs have potential to shed light on pathogen-host
interactions (PHI) when integrated with dual RNA-seq data [[49]7].
Pathogen-host metabolic modeling is a multi-cellular interaction
modelling approach, where GMNs of both pathogen and host organisms are
integrated in the simulations of metabolic phenotypes. Even if not
commonly used yet, some pathogen-host GMNs are available in the
literature [[50]23, [51]24]. Here, we aim to provide a better insight
into S. Typhimurium and host interactions by taking advantage of
pathogen-host integrated GMNs and dual RNA-seq data in order to
determine novel drug targets that can eliminate S. Typhimurium induced
infections. We further report potential drugs for the identified drug
target candidates by using a drug repositioning based approach. Through
a prioritization criteria, pabB was selected as a high-ranked putative
target, and a structure-based screening of novel drugs for pabB was
performed using molecular docking simulations. To our knowledge, this
is the first study that reconstructs a condition-specific pathogen-host
GMN model by mapping dual RNA-seq data to predict novel drug targets.
Results
Pathogen-host integrated genome scale metabolic network analysis
A pathogen-host integrated GMN model was reconstructed in this study
for the first time in literature for S. Typhimurium, with a total of
3586 genes from both organisms and 11,029 reactions. This model was
used to generate condition-specific pathogen-host GMN models by mapping
the dual RNA-seq data from 0^th, 8^th and 16^th hours of infection
[[52]25]. GIMME was used to integrate the model and the transcriptome
data, and the number of reactions for the corresponding
condition-specific pathogen-host GMNs are given in [53]Table 1.
Table 1. The reaction and metabolite numbers of condition specific
pathogen-host GMN models.
Condition-specific GMN at the beginning of infection Post-infection
condition specific GMN at 8th hour Post-infection condition specific
GMN at 16th hour
Number of Reactions 8773 8933 9089
Number of Metabolites 6941 6982 6979
Number of HeLa reactions 6595 6681 6536
Number of S. Typhimurium Reactions 2178 2252 2553
[54]Open in a new tab
Pathogen-host GMN was created by combining generic human and S.
Typhimurium GMN models, and the number of reactions decreased in the
condition-specific pathogen-host GMN models since they represent
infected HeLa cell by S. Typhimurium in specific conditions. Nearly
2500 reactions were discarded from the generic Pathogen-Host GMN model
since they were not active at the beginning of infection based on mRNA
levels. On the other hand, the reaction profile of condition-specific
pathogen-host GMN is different between the beginning and the late
stages of infection. As infection progresses, the number of HeLa
reactions remains almost the same while S. Typhimurium reactions
dramatically increase ([55]Table 1). The reaction profiles at the
beginning of infection (0^th hour) and at late stage of infection
(16^th hour) were compared. Even if there is not a considerable change
in the number of HeLa reactions, the reaction profiles are different
between the two conditions ([56]Fig 1). During the infection, S.
Typhimurium must adapt to the nutrition environment and physical
conditions to survive inside the host [[57]26]. Therefore, the increase
in the number of S. Typhimurium reactions during the infection can be
attributed to the activation of genes that might be necessary for the
adaptation and proliferation of the pathogen. The reaction profiles of
condition-specific pathogen-host GMNs were compared to each other in
order to identify alterations in metabolic pathways based on the
progress of infection. For host and pathogen separately, reactions only
active in the condition-specific GMN at the beginning of infection
(GMN^0th) and only active in the post-infection condition-specific GMN
at the 16^th hour (GMN^16th) were identified and grouped by their
pathways. The identified metabolic pathways and corresponding number of
infection-time specific reactions are given in [58]Fig 1.
Fig 1. The differences in reaction profiles of condition-specific
pathogen-host GMNs grouped by pathways.
[59]Fig 1
[60]Open in a new tab
Pathways with at least 2 differential reactions are given. Red and
green bars represent the number of HeLa cell reactions that are only
active in 0^th hour and 16^th hour respectively. Purple and blue bars
represent the number of S. Typhimurium reactions that are only active
in 0^th and 16^th hour respectively. The pathway names are listed on
the left side of figure.
[61]Fig 1. shows that there is severe modulation in the lipid related
pathways of host in infection. Lipid related pathways of the host are
known to be subjected to modulation during the invasion of bacteria
[[62]27]. On the other hand, multiple metabolic pathways of S.
Typhimurium are altered during infection based on [63]Fig 1. Most
dramatic change is in glycerophospholipid metabolism, which is one of
the most important pathways for dual-membrane envelope of gram-negative
bacteria [[64]28]. There is also a dramatic change in alternate carbon
metabolism, implying the utilization of different carbon sources other
than glucose in the late stage of infection. Even if glucose is major
carbon source for S. Typhimurium, it utilizes different carbon sources
during infection [[65]29]. Cofactor and prosthetic group biosynthesis,
which is tightly related to enzymatic activities, is also altered as
expected since enzymatic activities become varied as infection
progress.
The first step in the validation of GMN models is comparing predicted
flux rates with experimental data from literature. FBA was performed to
predict flux rates of GMN models by maximizing TB (see [66]Material and
Methods). Ethanol and succinate production rates were set to zero based
on literature information [[67]30], and the other constraints were set
as detailed in the section Materials and Methods. Predicted secretion
rates of major by-products for S. typhimurium are given in [68]Table 2
together with the literature-reported secretion rates at infection. The
relative rates of by-products predicted by the pathogen-host integrated
condition-specific model at 16^th hour are in perfect agreement with
the experimental data from HeLa-infecting S. Typhimurium cells
[[69]30]. Repeating simulations by using the non-reduced model leads to
rates about 20% higher than the reduced model predictions. The use of
only reduced Salmonella model without the host network, on the other
hand, led to 25% higher acetate secretion rates. Predicted acetate,
formate, and lactate secretion rates at 8^th hour of infection, on the
other hand, are very close to the rates predicted for the beginning of
infection ([70]Fig 1). Therefore, for the rest of the study, we used
the model reconstructed for the 16^th hour of infection as the model
representative of the infectious state of the organism.
Table 2. Predicted flux rates obtained by FBA analysis of condition-specific
pathogen-host integrated GMNs are compared with the experimental results.
Infection (0^th hr) Flux Values (mmol/gDW/h) Infection (8^th hour) Flux
Values (mmol/gDW/h) Infection (16^th hour) Flux Values (mmol/gDW/h)
Experimental (nM/cell/h) [[71]30]
D-Lactate Secretion 0 0 11.29 10 ± 3
Acetate Secretion 1.94 2.17 6.80 4 ± 2
Formate Secretion 10.86 9.14 3.55 2 ± 1
Succinate Secretion 0 0 0 0
Ethanol Secretion 0 0 0 0
[72]Open in a new tab
Identification of potential drug targets
Prediction of essential genes (EG) for the survival of pathogen inside
host organism is the primary step for most of the drug discovery
processes. Enzymes produced from EGs are potential drug targets that
can be targeted with chemical molecules to eliminate the pathogen. EGs
for the infection were predicted by using GMN^16th, which represents
the infectious state. Here, 140 EGs were predicted for the infection.
Predicted 140 EGs were compared with literature by using Database of
Essential Genes (DEG) [[73]31], which reports data from three
experimental gene deletion studies from rich medium experiments
[[74]32–[75]34] Data were available for 137 of 140 predicted EGs, 93 of
which were reported as essential genes in at least one study (68%).
([76]S1 Table). The genes falsely predicted as essential can be
attributed to the fact that the experiments were performed in rich
medium conditions with no host cells involved while the simulations
were performed by the pathogen-host integrated genome-scale metabolic
network.
Drug targets should not show high amino acid sequence similarity with
human proteins in order to prevent side effects. Therefore, homology
analysis was performed to identify drug targets that are similar to
human proteins. The similarity was determined based on the predefined
cutoff value detailed in Materials and Methods. Out of 140 potential
drug targets, 52 proteins were discarded since they have high
similarity with human proteins ([77]S2 Table). Pathway enrichment
analysis was performed with non-homologous 89 proteins in order to
characterize potential drug targets ([78]Fig 2). The most enriched
pathway is the lipopolysaccharide biosynthesis pathway, which is
critical for the survival of S. Typhimurium since it maintains the
functionality of the outer membrane of the pathogen [[79]35]. Another
enriched pathway is the biosynthesis of amino acids, and it is a
reasonable metabolic pathway that can be targeted since it is
indispensable for pathogens [[80]36]. Like lipopolysaccharides,
peptidoglycans are also indispensable molecules for the functionality
of bacterial cell wall [[81]37], and their biosynthesis pathway was
also captured ([82]Fig 2). Consequently, the general composition of
enriched pathways indicates that targeted proteins serve in the
pathways that are crucial for amino acid and cell membrane metabolisms.
Fig 2. Pathway enrichment analysis result.
[83]Fig 2
[84]Open in a new tab
The values on the x axes indicates number of drug targets in the
related pathway with no homology to human proteins.
Druggability analysis was performed to identify potential drug targets
that can be targeted with chemical molecules. The druggability analysis
aims to identify proteins that have a high affinity to bind to
drug-like molecules since some proteins do not have this property
[[85]18]. Here, 43 potential drug targets that have high affinity to
bind drug-like molecules were determined among 89 non-homologous
pathogen proteins ([86]S3 Table). Determining potential drug targets
that are broadly distributed among other harmful bacteria is one of the
important steps of the prioritization process. [[87]18]. To identify
such potential drug targets, broad-spectrum analysis was performed.
Finally, 28 potential drug targets that are non-homologous to human
proteins, druggable and broadly distributed among other bacteria were
determined. The list of final potential drug targets is given in
[88]Table 3 and [89]S4 Table. Of 28 predicted potential drug targets,
20 were reported as essential in the DEG database. When we ranked the
drug target list in terms of broad spectrum score, i.e. number of
pathogenic bacteria with significantly similar sequence of the gene, 16
of the top 20 genes were essential based on DEG (80%).
Table 3. Model-derived potential drug targets that obey three criteria: No
homology to human proteins, druggable, and broad-spectrum behaviour.
Locus Names Gene Symbol Protein Name Pathway Reported at DEG
STM1824 pabB Aminodeoxychorismate synthase component 1 Folate
biosynthesis Yes
STM0087 folA Dihydrofolate reductase Folate biosynthesis Yes
STM0183 folK 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine
pyrophosphokinase Folate biosynthesis No
STM3295 folP Dihydropteroate synthase Folate biosynthesis No
STM0064 dapB 4-hydroxy-tetrahydrodipicolinate reductase Biosynthesis of
amino acids, L-lysine biosynthesis via DAP pathway Yes
STM0213 dapD 2,3,4,5-tetrahydropyridine-2,6-dicarboxylate
N-succinyltransferase Biosynthesis of amino acids, L-lysine
biosynthesis via DAP pathway Yes
STM0207 mtnN 5’-methylthioadenosine/S-adenosylhomocysteine nucleosidase
Biosynthesis of amino acids, Cysteine and methionine metabolism. No
STM3486 aroB 3-dehydroquinate synthase Biosynthesis of amino acids,
Phenylalanine, tyrosine and tryptophan biosynthesis No
STM2384 aroC Chorismate synthase Biosynthesis of amino acids,
Phenylalanine, tyrosine and tryptophan biosynthesis No
STM3862 glmU Bifunctional protein GlmU UDP-N-acetyl-alpha-D-glucosamine
biosynthesis Yes
STM2094 rmlC dTDP-4-dehydrorhamnose 3,5-epimerase Polyketide sugar unit
biosynthesis, Streptomycin biosynthesis No
STM1772 kdsA 2-dehydro-3-deoxyphosphooctonate aldolase
Lipopolysaccharide biosynthesis Yes
STM3316 kdsC 3-deoxy-D-manno-octulosonate 8-phosphate phosphatase KdsC
Lipopolysaccharide biosynthesis No
STM0988 kdsB 3-deoxy-manno-octulosonate cytidylyltransferase
Lipopolysaccharide biosynthesis Yes
STM0310 gmhA Phosphoheptose isomerase Lipopolysaccharide biosynthesis
No
STM0228 lpxA Acyl-[acyl-carrier-protein]—UDP-N-acetylglucosamine
O-acyltransferase Lipopolysaccharide biosynthesis Yes
STM0134 LpxC UDP-3-O-acyl-N-acetylglucosamine deacetylase
Lipopolysaccharide biosynthesis Yes
STM1200 tmk Thymidylate kinase Pyrimidine metabolism Yes
STM1707 pyrF Orotidine 5’-phosphate decarboxylase Pyrimidine metabolism
Yes
STM1426 ribE Riboflavin synthase, alpha chain Riboflavin metabolism Yes
STM0417 ribH 6,7-dimethyl-8-ribityllumazine synthase Riboflavin
metabolism Yes
STM0045 ribF Riboflavin biosynthesis protein Riboflavin metabolism Yes
STM3307 murA UDP-N-acetylglucosamine 1-carboxyvinyl transferase
Peptidoglycan biosynthesis. Amino sugar and nucleotide sugar metabolism
Yes
STM0129 murC UDP-N-acetylmuramate—L-alanine ligase Peptidoglycan
biosynthesis. D-Glutamine and D-glutamate metabolism Yes
STM0123 murE
UDP-N-acetylmuramoyl-L-alanyl-D-glutamate—2,6-diaminopimelate ligase
Peptidoglycan biosynthesis.Lysine biosynthesis Yes
STM0128 murG UDP-N-acetylglucosamine—N-acetylmuramyl-(pentapeptide)
pyrophosphoryl-undecaprenol N-acetylglucosamine transferase
Peptidoglycan biosynthesis Yes
STM0124 murF UDP-N-acetylmuramoyl-tripeptide—D-alanyl-D-alanine ligase
Peptidoglycan biosynthesis.Lysine biosynthesis Yes
STM3725 coaD Phosphopantetheine adenylyltransferase Pantothenate and
CoA biosynthesis Yes
[90]Open in a new tab
Analysis of the prioritized drug targets
The prioritized drug targets ([91]Table 3) were clustered into pathways
that are crucial for the survival of the pathogen. There are four
proteins, pabB, folA, folK, folP, that take part in folate biosynthesis
pathways. For nearly all organisms, the folate biosynthesis pathway is
one of the indispensable pathways in order to maintain life. Folates
are necessary for the production of essential biomolecules such as
nucleic acids and amino acids. Most bacteria, fungi and plants can
synthesize folate, while animal cells take it up from external sources
[[92]38]. This pathway is a potentially promising drug target since
human cells do not have a folate synthesis mechanism that might be
manipulated by a pathogen. Five of the prioritized drug targets have
functionality in the biosynthesis of amino acids (dapB, dapD, mtnN,
aroB, aroC). dapB and dapD take place in L-lysine biosynthesis via
diaminopimelic acid (DAP) pathway, and the side product of this pathway
is m-DAP, which is an essential biomolecule for peptidoglycan cell wall
for gram-negative bacteria [[93]39]. glmU has a role in the production
of UDP-N-acetyl-alpha-D-glucosamine, which is essential for bacterial
cell wall [[94]40]. rmlC has an important role in the synthesis of
L-rhamnose, which is an important saccharide for the virulence of some
pathogens including S. Typhimurium. The absence of L-rhamnose
biosynthesis pathway in human cells makes this drug target more
appealing [[95]41]. The survival of bacterium depends on the integrity
of cell envelope. kdsA, kdsC, kdsB, gmhA, lpxA and LpxC have roles in
the production of lipopolysaccharides, which is critical for the
formation of cell envelope [[96]42]. tmk and pyrF are involved in
pyrimidine metabolism, which is crucial for all living organisms. tmk
catalyzes the phosphorylation of thymidine 5’-monophosphate, which is
an essential reaction for pyrimidine synthesis [[97]43]. ribE, ribF and
ribH are required for the production of riboflavin, which is a
precursor of flavin mononucleotide (FMN) and flavin adenin dinucleotide
(FAD). Riboflavin synthesis is a pathway required for the survival of
gram-negative bacteria in the absence of external riboflavin synthesis
[[98]18]. murA, murC, murE, murG and murF are involved in the synthesis
of peptidoglycans, which is an essential ingredient for bacterial cell
wall biogenesis [[99]44]. murA catalyzes the first step of
peptidoglycan biosynthesis, and deletion of murA leads to death of
Escherichia coli and Streptococcus pneumoniae. Fosfomycin is an
antibiotic that targets murA in order to kill bacteria [[100]45]. Mur
ligases are known to be attractive drug targets because of their role
in bacterial cell wall formation [[101]46]. Pantothenate is a main
precursor of coenzyme A, and its absence leads to deficiency in
bacterial growth. coaD is involved in the fourth step in the coenzyme A
biosynthesis pathway, which was investigated before as a suitable
antibiotic target [[102]47].
Identification of potential drugs for pabB
4-aminobenzoic acid (PABA) synthesis is an attractive antibiotic target
since it is an essential biomolecule for many pathogens and it does not
have a human counterpart. PABA has two main functionalities in the
bacteria; (i) it is a substrate for folic acid pathway, which is
critical for survival of pathogen, (ii) it is a precursor in coenzyme Q
biosynthesis, which is essential for virulence [[103]48, [104]49]. PABA
is synthesized in two steps, and the first step is catalyzed by pabA
and pabB enzymes by converting chorismate to 4-amino-4-deoxychorismate
[[105]50]. And, the second step is the production of PABA from
4-amino-4-deoxychorismate. pabB was detected as one of the putative
drug targets in this study by our drug target prioritization pipeline.
We specifically focused on pabB in the rest of our study as a drug
target to eliminate Salmonella infections since it has critical
functionality in PABA synthesis pathway, which is not represented in
human cells. pabB was reported to be essential experimentally in the
DEG database, and, among the identified drug targets with experimental
validation ([106]Table 3), it ranks 7^th in terms of the number of
pathogenic bacteria strains that carry a gene with high sequence
similarity based on our broad-spectrum analysis. Additionally, we
investigated the importance of pabB in the pathogen-host integrated GMN
model. Interactions of the metabolites of the pabB reaction
(chorismate, 4-amino-4-deoxychorismate, L-glutamate, L-glutamine) with
the metabolites of other reactions were visualized by creating a
metabolite-metabolite interaction network ([107]S1 Fig). L-glutamate is
directly related to numerous amino acids such as alanine, leucine and
asparagine. On the other hand, 4-amino-4-deoxychorismate is indirectly
related with the production of thymidine through tetrahydrofolate. It
is also linked with the production of adenine through R-Pantoate.
Therefore, DNA synthesis is dependent on the production of
4-amino-4-deoxychorismate through adenine and thymine synthesis which
cannot be synthesized when pabB is inhibited.
There is a very similar protein to this putative target in Escherichia
coli, which is also called pabB. Formic acid, which is widely used as
an antibacterial agent in fodders, was reported in DrugBank as a
compound that targets pabB in Escherichia coli (strain K12) [[108]51].
To our knowledge, pabB was not offered or examined as a drug target for
Salmonella species before. Hereby, protein docking and molecular
dynamic analysis were performed to determine novel molecules that can
inhibit S. Typhimurium growth by binding pabB. The protein
Aminodeoxychorismate synthase component 1, which belonged to S.
Typhimurium (strain LT2 / SGSC1412 / ATCC 700720) with a UniProt ID:
[109]P12680 (PABB_SALTY), was taken for elaborative structural and
functional studies. The 3D structure of the protein is the core
requirement to carry out protein molecular docking studies and its
unavailability necessitated the modeling of the three-dimensional
structure. Herein, first step was the comparative 3D modeling of the
query protein with the application of MODELLER Software. The crystal
structure of 4-amino4-deoxychorismate (ADC) synthase (PDB ID: 1K0E) was
chosen as a template as its percent identity was 76.38% and query
coverage was 99% against [110]P12680. The MODELLER yielded five 3D
structures of [111]P12680 protein, out of which the best model with
higher accuracy was selected ([112]S2 Fig) after thorough quality
assessment using PROCHECK server. The Verify3D passed the modelled
structure with 87% indicating 80% amino acids of the built model having
score > = 0.2 in the 3D/1D profile. Further, the Ramachandran plot
([113]S3 Fig) showed 92.7% residues in the most favoured region whereas
only 0.3% residues were reported in disallowed region. Once the model
was finalized, Mg^2+ ion was incorporated into the built 3D structure
as magnesium ion is reported as the cofactor within the target protein
in the UniProt indicating its role in the catalysis activity.
Therefore, Mg^2+ ion was complexed near the active site residues
reported in UniProt and in literature [[114]48].
The DoGSiteScorer produced 10 probable binding pockets, out of which
pocket number one was selected as the binding site to utilize for
molecular docking. The selected pocket has a druggability score of
0.83, suggesting the prime region for drug binding. The binding pocket
was analyzed in Chimera software [[115]52], and validated as it covers
the active site amino acid residues reported in the UniProt along with
the amino acid residues coordinating with the Mg^2+ ion. The predicted
binding site can be seen in ([116]S4 Fig).
Once all the 54,000 drug-like compounds were screened against the
target protein ([117]P12680), the lowest binding energy conformation of
each single 54,000 compound(s) was obtained. The overall binding
energies ranked against the [118]P12680 are represented through a
histogram in [119]Fig 3.
Fig 3. Histogram illustration of the overall binding energy retrieved through
virtual screening.
[120]Fig 3
[121]Open in a new tab
The thorough analysis of the results suggested that 1659 compounds
showed promising binding free energy ranging from -12.42 kcal/mol to
-9.04 kcal/mol, while 22,231 compounds having binding free energies
within the range of -5.66 kcal/mol to -2.28 kcal/mol. Moreover, top ten
compounds that were docked near the active site of the target protein
were (having binding free energy of -12.42, -12.02, -11.92, -11.90,
-11.91, -11.98, -11.77, -11.73, -11.42, -11.72) retrieved to further
investigate their chemical interactions with amino residues of target
protein. The aforementioned top ten compounds with their ZINC IDs are
reported in ([122]S5 Table).
The protein-ligand complex of each top ten compound was explored to
inspect the chemical bonds and interactions occurring within the
protein-ligand complex. LigPlot+ generates the atomic interactions
taking place among the target protein and ligand (drug) through
hydrogen bonds and hydrophobic contacts. Each of the ten protein-ligand
complex analyzed in LigPlot+ has shown ligand interactions with the
Mg^2+. The reported active site residues from UniProt (i.e. Lys275 and
Glu259) were found to be interacting with the top ten ligands, out of
which Lys275 can be seen making hydrogen bond with five ligands and
hydrophobic contacts with two ligands. ([123]Table 4). On the other
hand, Glu259 makes hydrophobic contacts with six of the ligands. The
most common amino acid residues interacting with top ten ligands
through hydrogen bonding were identified as Thr277, Gly427, Glu440,
Lys444, Lys275, Arg411 and Glu259. Finally, each of the amino acid
residues making hydrogen bonds and hydrophobic contacts interaction
with the top ranked 10 compounds is presented in [124]Table 4 and
visualized in [125]S5–[126]S7 Figs. There is not any report in
literature about the use of these compounds against pathogenic
bacteria. Therefore, they remain open to experimental validation.
Table 4. Protein residues involved in the hydrogen and hydrophobic
interactions with top ten best ranking compounds (ligands), analyzed through
LigPlot+.
S.No ZINC IDs Residues making hydrogen bond interaction Residues making
Hydrophobic contacts
1 ZINC7879733 Thr277(A), Gly427(A) Lys444(A), Arg411(A) Lys275(A)
Ile410(A), Ala424(A), Gly427(A) Asn214(A) Trp391(A), Ser423(A),
Gly426(A), Glu259(A) Ile368(A), Cys422(A), Gly425(A), Gly276(A),
Ile274(A)
2 ZINC15179659 Gly427(A), Arg411(A), Lys444(A), Thr277(A), Glu440(A)
Lys275(A), Trp391(A), Val445(A), OThr277(A) Ala424(A), Ser367(A),
Ile410(A), Gly425(A) Ser423(A), Ile368(A), Ile448(A), Asn214(A),
Gly426(A) Thr412(A), Cys422(A), GIle274(A), ly276(A)
3 ZINC14880941 Thr277(A), Ser423(A) Gly427(A), Gly276(A) Glu440(A),
Trp391(A) Asn214(A) Lys444(A), Arg411(A), Val445(A), Ile410(A)
Lys275(A), Ile274(A), Gly425(A), Ala424(A), Ile368(A) Gly426(A),
Ile448(A), Glu259(A), Cys422(A)
4 ZINC58542694 Arg411(A), Thr277(A), Glu440(A), Lys275(A), Gly427(A),
Lys444(A), Gly276(A), Trp391(A), Ile410(A), Ser423(A), Ala424(A),
Ile274(A), Asn214(A), Ile368(A), Val445(A), Gly425(A), Gly426(A),
His340(A), Ser367(A),
5 ZINC1201089024 Trp391(A), Lys275(A), Asn214(A), Arg411(A), Glu440(A),
Thr277(A), Gly427(A), Lys444(A), Gly276(A), Ile410(A), Ile274(A),
Gly425(A), Ile368(A), Val445(A), Glu259(A), Ser423(A), Gly426(A)
Cys422(A), Ile448(A),
6 ZINC27071723 Lys444(A), Trp391(A), Lys275(A), Glu440(A), Thr277(A),
Gly427(A), Arg411(A), Gly276(A), Ile274(A), Val445(A), Ser423(A),
Gly425(A), Ile410(A), Cys422(A), Gly426(A), Thr412(A), Ala424(A),
Ile448(A), Asn214(A), Glu259(A), Ile368(A),
7 ZINC7133393 Arg411(A), Gly425(A), Glu440(A), Thr277(A), Gly427(A)
Lys444(A), Gly276(A), Trp391(A), Ile410(A), Ala424(A), Val445(A),
Ser423(A), Glu259(A), Asn214(A), Gly426(A)Ser367(A)
8 ZINC7879735 Gly427(A), Lys444(A), Arg411(A), Glu440(A), Thr277(A),
Lys275(A), Gly425(A), Gly276(A), Asn214(A), Ile274(A), Ala424(A),
Ile410(A), Ser367(A), Trp391(A), Ser423(A), Gly426(A), Cys422(A),
Val445(A),
9 ZINC58542238 Gly427(A), Gly425(A), Glu440(A), Lys275(A), Thr277(A)
Arg411(A), Lys444(A), Gly276(A), Trp391(A), Ser423(A), Val445(A),
Ile274(A), Ile410(A), Asn214(A), Ala424(A), Gly426(A), His340(A),
Ile368(A), Thr412(A), Cys422(A), Ile448(A),
10 ZINC7538530 Arg411(A), Glu440(A), Thr277(A), Gly427(A), Lys444(A),
Gly276(A), Lys275(A), Ile274(A), Gly425(A), Trp391(A), Ile410(A),
Ser423(A), Ala424(A), Gly426(A), Asn214(A), Glu259(A), Cys422(A),
Val445(A)
[127]Open in a new tab
Conclusions
Analysis of pathogen-host integrated GMN models is a well-suited
approach in terms of identifying novel drug targets by considering
pathogen-host interactions. It allows tracking the response of pathogen
and host simultaneously during infection by mapping infection-induced
dual-transcriptome data. There are some pathogen-host GMN models
published [[128]23, [129]24], but to our knowledge, this is the first
study in the literature that determines drug targets by analysing
condition-specific pathogen-host integrated GMNs created by mapping
dual-transcriptome data. We here reconstructed and analyzed
condition-specific integrated GMN models to identify novel drug targets
for S. Typhimurium induced infections. We used prioritization steps to
identify best suitable novel drug targets. After prioritization
processes, we identified 28 putative drug targets, and pabB was chosen
as a high ranked drug target based on our prioritization pipeline,
literature information and novelty. Subsequently, homology and
molecular docking analyses were performed to identify candidate
compounds that inhibit pabB. The top ten compounds in terms of binding
free energy were identified and reported. Consequently, analysing S.
enterica metabolism inside the host cell has enabled us to comprehend
metabolic alterations in both HeLa cells and S. enterica along with
determining novel drug targets. Future studies may provide more
arguments for the proposed drug targets and inhibitors in this study.
This study can be used as a guideline for creating and analysing
condition specific pathogen-host GMN models.
Materials and methods
The flowchart of the pipeline followed in this study is given in
[130]Fig 4. Each step is detailed in the sections below.
Fig 4. The flowchart of the pipeline followed in this study.
[131]Fig 4
[132]Open in a new tab
Transcriptome data
The dual RNA-seq data of infected HeLa cells and S. Typhimurium strain
SL1344 [[133]25] was downloaded from NCBI Gene Expression Omnibus (GEO)
Database [[134]53]. The dataset ID in the GEO database is
[135]GSE117236. The samples collected at the beginning of infection and
the post-infection data at 8^th and 16^th hours were used in this
study. Each time point included duplicate samples. Principal component
analysis (PCA) was used to identify any possible outliers in the data,
and no outliers were detected ([136]S8 Fig).
Pathogen-host integrated genome scale metabolic network
Two different genome-scale metabolic models from the literature were
used to reconstruct an integrated pathogen-host genome scale metabolic
network (GMN). A genome-scale metabolic network of S. Typhimurium,
called stm_v1.0, consisting of 2,545 reactions controlled by 1,271
genes was used as the pathogen metabolic network [[137]21]. As the host
GMN, a recent reconstruction of human metabolism with a substantial
amount of curation, called iHsa, was used, which covered 8,336
reactions and 2,315 genes [[138]54]. S. Typhimurium is an intracellular
pathogen. Therefore, the GMN of the pathogen was placed inside the
cytoplasm of the host GMN as a separate compartment to create the
pathogen-host integrated GMN. Here, an extensive literature research
was performed to identify cytosolic host metabolites that can be
consumed by the pathogen during infection. An exchange reaction with
the extracellular environment must be available for these metabolites
in the S. Typhimurium metabolic network. 38 such metabolites were
identified [[139]26, [140]29, [141]55–[142]57], and they were allowed
to be taken up by the S. Typhimurium GMN from the cytoplasm of the host
([143]S6 Table). In addition, the pathogen was allowed to secrete all
its exchange metabolites to the host cytoplasm as defined by the model
secretion reactions ([144]S7 Table). As a result, the pathogen-host
integrated GMN consisting of 11,029 reactions controlled by 3,586 genes
was created by using COBRA Toolbox v.3.0 on MATLAB programming platform
[[145]58].
Flux balance analysis
Pathogen-host integrated GMN was analyzed using flux balance analysis
(FBA) method to predict fluxes associated with the infection times
studied. FBA searches the solution space defined by mass balance and
reaction reversibility constraints to find an optimal solution with the
help of an objective function. FBA assumes that the system is at steady
state, i.e. the concentrations of intracellular metabolites do not
change over sufficiently long time, leading to linear mass balance
constraints [[146]59]. The objective function of pathogen-host
integrated GMN was defined based on a weighted relationship between
host and pathogen biomass, and the biomass composition formulas were
taken from iHsa and stm_v1.0 models ([147]Eq 1) [[148]21, [149]54,
[150]60]. HB, PB and TB are host biomass, pathogen biomass and total
biomass respectively in Eq ([151]1), where α and β are maximum host and
pathogen biomass production rates in order. Using the constructed
pathogen-host GMN, α and β were calculated first with FBA via
maximization of HB and PB reactions separately. Later, maximum host and
pathogen biomass production rates (α and β), which were calculated as
0.39 and 0.28 respectively, were added to the equation as weights of HB
and PB to get TB, which represents the balanced effect of HB and PB.
[152]Eq 1 was added as a reaction to the pathogen-host integrated GMN
and later set as the objective function.
[MATH:
∝×HB+β×<
mi>PB=TB :MATH]
(Eq 1)
The upper bound of glucose uptake rate of S. Typhimurium was set to 5
mmol/gDW/h in simulations based on the studies of Thiele and her
coworkers [[153]21]. The maximum uptake rate of other available carbon
sources for S. Typhimurium inside the host cell was limited to 20% of
its glucose uptake rate. Oxygen uptake rate of S. Typhimurium was set
to 1 mmol/gDW/h to mimic the hypoxic environment during infection
[[154]61]. The host cell was allowed to utilize only metabolites that
were found in the Dulbecco’s Modified Eagle’s Medium (DMEM) since the
infection experiment for the dual RNA-seq data, which was used in
creating condition-specific integrated GMN models, was carried out in
this medium ([155]S8 Table). In GMN models, alternate optima can be an
issue in interpreting the results of FBA since there might be multiple
flux distributions that result in the same value for the objective
function. Minimization of the sum of squares of all flux values was
applied to prevent alternate optima [[156]62]. The principle of this
method was proposed based on the accomplishment of the cellular goals
with minimal resource expenditure since the flux values are an
indication of the amount of depletion of resources [[157]63].
Integrating transcriptome data with pathogen-host integrated GMN
Condition specific GMNs were generated by mapping dual RNA-seq data on
the pathogen-host GMN to simulate infection states at different time
points. Gene Inactivity Moderated by Metabolism and Expression (GIMME)
algorithm was used as the mapping algorithm to generate condition
specific GMNs [[158]22]. GIMME determines active reactions based on a
threshold put on the mRNA levels in the data, where reactions below the
thresholds are set as inactive. GIMME generates a GMN with the desired
functionality using the objective fraction parameter, and it adds the
reactions in the inactive set back if their removal affects the desired
functionality [[159]22]. The threshold value was determined as the
quarter of the mean of the transcriptome data. Since the average gene
expression values were much higher in HeLa cell compared to S.
Typhimurium in the utilized dual RNA-seq data ([160]S9 Fig), the
threshold value was separately determined for both organisms. Then,
since GIMME algorithm accepts a single threshold, the difference
between the organism-specific threshold values were added to the
reaction scores of S. Typhimurium, and the threshold value obtained
from the human transcriptome was used as the threshold value in GIMME
simulations. The objective fraction parameter was set to 0.2 in order
to ensure that the condition specific integrated GMN produces at least
20% of the maximum TB. Three different condition specific GMNs were
produced as a result to represent the start of infection (0^th hour)
and infections at 8^th and 16^th hours.
Identification of drug targets
Gene deletion analysis is a widely used approach in constraint-based
metabolic modeling to predict potential drug targets, and it is
performed by in silico deletion of genes in the GMN [[161]10]. The
analysis aims to obtain essential genes for the desired functionality
of a GMN, such as preventing growth of pathogen. FBA can be used to
predict essential genes in an organism by setting the rate of the
associated reaction(s) to zero for each gene. If inactivation of the
reaction(s) lead to zero growth rate, the gene is essential for the
pathogenic organism and it can be used as a drug target. In this study,
gene deletion analysis was performed to identify potential drug targets
that can restore the metabolic changes in the host cell caused by S.
Typhimurium induced infection. GIMME-based condition-specific GMNs were
used in the analysis. A nonzero rate for the production of HB together
with zero rate for PB was used as the desired functionality of the
condition specific GMNs in gene deletion analysis. Using the FBA
technique to maximize HB and PB separately, essential genes were
obtained, and the enzymes that catalyze these reactions were chosen as
potential drug targets.
The identified drug targets were further filtered based on the approach
applied elsewhere [[162]18]. Briefly, the steps in the approach
followed are as follows: (i) Homology analysis is a critical part of
drug target selection process, and the aim of the analysis is
determining drug targets that are not similar to host proteins in order
to avoid side effects of drugs. Homology analysis was performed using
BLASTp algorithm, and drug targets that are not similar to human
proteins were identified [[163]64]. As BLASTp parameters, cut off for
the expected value (E-value) was chosen as 1x10^-4, and the maximum
sequence identity for the determination of human-non-homolog drug
targets were chosen as 30% [[164]18]. Pathway enrichment analysis was
performed with the identified human-non-homolog drug targets by using
KOBAS V3.0 [[165]65] to elucidate pathways that are expected to be
crucial for the survival of S. Typhimurium inside the host. (ii)
Another important step in the model-based drug target selection process
is the determination of druggable proteins, and the goal of this
selection is identifying drug targets that can be targeted with
drug-like chemical compounds. Druggable targets were identified using
BLAST algorithm in Drugbank database [[166]51], and the E-value was
chosen as 10^−25 [[167]18]. (iii) Broad spectrum analysis was performed
for the identification of drug targets that are broadly distributed
among other bacteria. Broad spectrum analysis is very beneficial in
order to determine drug targets that can be effective against
co-infections or multiple infections. In addition, the proteins that
are broadly distributed among other bacteria may indicate low mutation
rate, so developing antibacterial resistance can be harder for the
targeted bacteria. Broad spectrum analysis was performed using PBIT web
browser [[168]66], which contains protein sequences of 181 pathogenic
organisms. E-value cut-off of 1x10^-5, bit score of 100 and sequence
identity of 35% were chosen as parameter values [[169]18, [170]66]. For
the broad spectrum target determination criteria, targets available in
at least 40 pathogenic strains were chosen [[171]18].
Homology modeling of target protein and active / binding pocket prediction
As the 3D structure of target protein was not available in the Protein
Data Bank (PDB) [[172]67], the homology modeling was performed to model
the target protein’s 3D structure. In the first step, the appropriate
template protein was searched by BLASTp against the PDB database. The
template that matches the criteria of query coverage ≥ 90% and percent
identity ≥ 70% was selected. The Modeller software v9.20 was used to
perform comparative protein structure modelling [[173]68–[174]71]. The
Modeller works by satisfying the spatial restraints along with
employing specific geometrical calculations generating possible
coordinate(s) for the location of each single atom of target protein
[[175]70]. The homology modeling was performed through Modeller by
implementing a series of python script(s), which resulted in generating
five models with their Discrete Optimized Potential Energy (DOPE)
values. The 3D model with the lowest DOPE value was selected as the
predicted 3D structure of the target protein. The predicted 3D model
was then validated through verify3D and Ramachandran Plot via PROCHECK
server to ensure its reliability and quality. The Verify3D program was
used to interpret the quality of the built model, as Verify3D computes
the compatibility of 3D model of target protein against its amino acid
sequence [[176]72]. Further, Ramachandran Plot was analyzed to assess
the stereo-chemical properties of the predicted 3D model [[177]73].
The built target protein model was subjected to DoGSiteScorer to
identify potential binding pockets. The DoGSiteScorer is an automated
grid-based program which utilizes difference of Gaussian filter to
identify the potential binding pocket present within the protein
[[178]74, [179]75]. The program provides ten potential binding pockets
with druggability scores. Out of the ten predicted pockets, the one
with the highest druggability score is supposed to be a rich binding
pocket and therefore, selected for study. The program also evaluates
the depth, surface area and volume for each predicted binding pocket.
Docking-based virtual screening
To carry out the rigorous virtual screening of drug-like molecules
against the target protein, a library of chemical compounds was
curated. The ZINC15 database was used to retrieve the drug-like
molecules following certain criteria of compound’s molecular weight and
logP values [[180]76]. The drug-like compounds having logP value ≤ 5
and Molecular Weight ≤ 375 Daltons were obtained from the database in
SDF format. A total of 54,000 drug-like compounds were compiled and
prepared into the required PDBQT format by using Open Babel. All the
54,000 compounds were minimized with force field MMFF94 via Open Babel
[[181]77].
The AutoDock-GPU [[182]78] was chosen to execute molecular docking and
virtual screening of curated drug-like molecules (ligands) library
against the target protein ([183]P12680). The AutoDock GPU was chosen
because of the prolonged execution times whilst using AutoDock4.
AutoDock-GPU, which is an OpenCL and cuda based implementation of
Autodock4, was created to utilize large number of GPU cores and speed
up docking by using parallel processing [[184]79, [185]80]. So, to
execute molecular docking, the target protein’s modelled 3D structure
was prepared by adding hydrogens and its conversion into the required
format of PDBQT. Afterwards, the grid search box dimensions were set
carefully to cover the predicted binding site retrieved from the
DoGSiteScorer. Further docking steps were carried out to create docking
parameter files, and, eventually, the grid maps FLD file, docking
parameters files, GPF and DPF were made. Once the necessary files were
created, the molecular docking was performed to screen all the 54,000
compounds with 20 Genetic Algorithm (GA) runs against target receptor’s
binding site. The virtual screening yielded binding free energy for 20
runs of all the 54,000 compounds. Ultimately, the lowest binding energy
of each compound was extracted for structural and functional
evaluation. Moreover, re-docking step was performed by execution of all
the Autodock4 steps utilizing only one ligand against the target
protein to validate predicted binding site.
LigPlot+ program was used to generate 2D diagrams of the target protein
and ligand complex. For the LigPlot+ analysis, protein-ligand complex
in PDB format was taken. The atomic interaction(s) within the diagram
is shown, where ligand and protein’s interactive residues are
represented in ball-stick format. From the diagram, the amino acid
residues of the target protein making chemical interactions with the
ligand can be identified. The LigPlot+ highlights the hydrogen bonding
within the atoms of protein and ligand with green dotted lines.
Supporting information
S1 Fig. Metabolic network of pabB related metabolites.
The red arrows indicate the reaction controlled by pabB (L-Glutamine +
chorismate -> L-Glutamate + 4-amino-4-deoxychorismate).
(TIF)
[186]Click here for additional data file.^ (1.4MB, tif)
S2 Fig. The predicted 3D structure of [187]P12680.
(TIF)
[188]Click here for additional data file.^ (342.7KB, tif)
S3 Fig. Ramachandran plot of the predicted 3D structure of target
protein ([189]P12680).
(TIF)
[190]Click here for additional data file.^ (223.6KB, tif)
S4 Fig. The predicted binding pocket is shown as surface in sand brown
color while the protein chain as ribbon in blue color.
From the figure, it can be seen that Mg2+ ion is submerged within the
predicted binding pocket.
(TIF)
[191]Click here for additional data file.^ (258.3KB, tif)
S5 Fig
A) is representing the LigPlot+ of [192]P12680 showing atomic
interaction between protein (residues/ Mg2+ ion) and ligand
(ZINC7879733). The atomic linkages due to hydrogen bonding can be
identified from the diagram. Similarly, B) represents the LigPlot+
analysis of [193]P12680 and ligand (ZINC15179659), C) represents the
LigPlot+ for [194]P12680 and ligand (ZINC14880941), and D) represents
the LigPlot+ for [195]P12680 and ligand (ZINC58542694).
(TIF)
[196]Click here for additional data file.^ (359.4KB, tif)
S6 Fig
A) is the LigPlot+ for [197]P12680 and ligand (ZINC1201089024), all the
atomic linkages occurring between the protein-ligand complex can be
analyzed. Likewise, B) represents the LigPlot+ for [198]P12660 and
ligand (ZINC27071723), C) is LigPlot+ for [199]P12680 and ligand
(ZINC7133393) and D) is LigPlot+ for [200]P12680 and ligand
(ZINC7879735).
(TIF)
[201]Click here for additional data file.^ (387.3KB, tif)
S7 Fig
A) is the LigPlot+ for [202]P12680 and ligand (ZINC58542238) complex
and B) is the LigPlot+ for [203]P12680 and ligand (ZINC7538530)
complex. The overall amino acid residues of [204]P12680 which interact
with each of the top ten compounds (ligands) making hydrogen bonds and
hydrophobic contacts.
(TIF)
[205]Click here for additional data file.^ (286.8KB, tif)
S8 Fig. Principal component analysis (PCA) of [206]GSE117236.
The red, blue and green dots represent beginning of infection, 8^th
hour of infection and 16^th hour of infection respectively.
(TIF)
[207]Click here for additional data file.^ (48.4KB, tif)
S9 Fig. Box plots of gene expression values of HeLa cell and S.
Typhimurium.
(TIFF)
[208]Click here for additional data file.^ (11.7KB, tiff)
S1 Table. 140 essential genes based on DEG database.
(DOCX)
[209]Click here for additional data file.^ (15.7KB, docx)
S2 Table. 89 potential drug targets that are not similar to human
proteins.
(DOCX)
[210]Click here for additional data file.^ (15.8KB, docx)
S3 Table. Potential drug targets that have high affinity to bind
drug-like molecules.
(DOCX)
[211]Click here for additional data file.^ (14.2KB, docx)
S4 Table. The list of final potential drug targets.
The last column indicates broad spectrum analysis results.
(DOCX)
[212]Click here for additional data file.^ (16KB, docx)
S5 Table. Binding energy, Zinc IDs, 1D and 2D structure of each of top
10 compounds.
(DOCX)
[213]Click here for additional data file.^ (566.2KB, docx)
S6 Table. Metabolites that can be consumed by S. Typhimurium inside
host cytoplasm in pathogen-host GMN model.
(DOCX)
[214]Click here for additional data file.^ (15.2KB, docx)
S7 Table. The exchange metabolites of S. Typhimurium that match with
the cytoplasmic metabolites of human cell.
(DOCX)
[215]Click here for additional data file.^ (20.8KB, docx)
S8 Table. DMEM medium constraints for the host GMN used in metabolic
model simulations.
(DOCX)
[216]Click here for additional data file.^ (15.3KB, docx)
S9 Table. Long names of metabolites given in [217]S1 Fig, the figüre
that reports metabolite-metabolite interactions around pabB catalyzed
reaction.
(DOCX)
[218]Click here for additional data file.^ (19.9KB, docx)
S1 File. The MATLAB codes to generate and analyze pathogen-host
integrated genome scale metabolic models of S. Typhimurium and human.
(ZIP)
[219]Click here for additional data file.^ (3.1MB, zip)
Data Availability
The codes are provided as a [220]supplementary file. The transcriptome
data was downloaded from Gene Expression Omnibus (GSE117236).
Funding Statement
This work was supported by TUBITAK, The Scientific and Technological
Research Council of Turkey (Project Code: 316S005) and by PSF, The
Pakistan Science Foundation [Project Code: PSF-TUBITAK/S-HEJ (04)]. The
funders had no role in study design, data collection and analysis,
decision to publish, or preparation of the manuscript.
References