Abstract
This study aimed to identify potential novel drug candidates and
targets for Parkinson’s disease. First, 970 genes that have been
reported to be related to PD were collected from five databases, and
functional enrichment analysis of these genes was conducted to
investigate their potential mechanisms. Then, we collected drugs and
related targets from DrugBank, narrowed the list by proximity scores
and Inverted Gene Set Enrichment analysis of drug targets, and
identified potential drug candidates for PD treatment. Finally, we
compared the expression distribution of the candidate drug-target genes
between the PD group and the control group in the public dataset with
the largest sample size ([38]GSE99039) in Gene Expression Omnibus. Ten
drugs with an FDR < 0.1 and their corresponding targets were
identified. Some target genes of the ten drugs significantly overlapped
with PD-related genes or already known therapeutic targets for PD. Nine
differentially expressed drug-target genes with p < 0.05 were screened.
This work will facilitate further research into the possible efficacy
of new drugs for PD and will provide valuable clues for drug design.
Subject terms: Neurology, Neurological disorders, Data mining
Introduction
Parkinson's disease (PD) is a pervasive, progressive, disabling
neurodegenerative disorder with motor and nonmotor features^[39]1. PD
places a significant burden on society and the affected individuals,
and approximately 6.1 million people worldwide had been diagnosed with
PD in 2016^[40]2. Dopamine depletion leading to hyperactivity of the
corticostriatal glutamatergic pathway is thought to be primarily
responsible for parkinsonian symptoms such as resting tremors,
rigidity, dyskinesia and postural instability^[41]3,[42]4. Levodopa is
the gold standard drug that provides symptomatic relief from motor
problems but it has some side effects, and its effectiveness is reduced
under long-term treatment. In the present context, medication
(including dopamine agonists and monoamine oxidase B inhibitors) and
invasive surgery (deep brain stimulation) are being used to reduce the
shortcomings of levodopa therapy^[43]5. These therapies are quite
helpful but are not always completely satisfactory. Thus, it is
imperative to identify and develop promising drugs for preventing and
treating PD.
Unfortunately, drug discovery is expensive and time-consuming. New drug
development is affected by many factors, and 85.1% of potential drugs
for PD tested thus far have failed in the clinical trial phase^[44]6.
Under such a situation, the repositioning of available drugs for other
disorders as potential novel therapeutic agents for PD becomes an ideal
approach. A well-known example of drug repositioning against cancer is
thalidomide, which was initially used as a sedative drug^[45]7. Network
pharmacology is a commonly applied strategy that analyzes biological
systems and establishes a drug-target-disease network for drug
repositioning^[46]8. Many computational methods based on transcriptomic
data have also been developed. Chuan et al. found 10 drugs that had
certain therapeutical effect on PD based on a handful of genes^[47]9.
Hindol et al. developed a bidirectional drug repositioning method to
find out new drugs for PD^[48]10. However, most previous studies have
mainly focused on certain specific genes and neglected the gene
expression signatures. Here, we present an integrated method for the
comparisons of gene expression signatures between a disease model and
drug-treated condition network, prediction of drug-protein
interactions, and large transcriptomic dataset mining.
The purpose of this scheme is to predict and identify new related drugs
and targets by applying integrated network pharmacology and
transcriptome analysis. Our work will facilitate further studies for
better preventive strategies for PD.
Results
PD-related genes
Genes from the five databases were integrated to include as many
PD-related genes as possible. Specifically, 249 genes were retrieved
from the Parkinson’s disease pathway (hsa05012) included in KEGG. By
querying the keyword “Parkinson’s disease”, 188 genes were retrieved
from OMIM, 9 genes from HGMD, 524 from Genotype, and 370 from DISEASE.
After filtering, a total of 970 genes were retained as the PD-related
genes. The expression levels of these genes in the PD and control
subjects ([49]GSE99039) are shown in the heatmap (Fig. [50]1a).
Detailed provenance information for the genes is given in a Venn
diagram (Fig. [51]1b).
Figure 1.
[52]Figure 1
[53]Open in a new tab
PD-related genes. (a) Heatmap for comparison between the control
samples and patients with PD. Colors correspond to standardized
log2-transformed expression values. Heatmap created using R and
“pheatmap” package^[54]48. (b) Veen diagram show the number of
PD-related genes collected from five databases. PD, parkinson’s
disease. Veen diagram created using R and “ggVennDiagram”
package^[55]49.
Protein–protein interactions between the PD-related genes
The PD-related genes were exported to the STRING, PINA, and HuRI
databases to construct the PPI network. STRING contained 1723
interactions among 188 genes/proteins after removing all interactions
with a combined score < 0.9; PINA contained 1127 interactions among 107
genes/proteins; and HuRI predicted 1411 experimental validation
interactions of 119 genes/proteins (see Supplementary Fig. [56]S1
online). We extracted 163 nonredundant genes/proteins and 1709
interactions by comparing the results from the three databases
(Fig. [57]2). Nodes represent PD-related genes/proteins and edges
represent interactions of these genes/proteins.
Figure 2.
[58]Figure 2
[59]Open in a new tab
The constructed PPI network with PD-related genes. The red nodes
represent up-regulated genes and the blue represent down-regulated
genes in [60]GSE99039. Visualization tool: Cytoscape (3.7.2).
[61]https://cytoscape.org/.
Enrichment analysis of the PD-related genes
In this study, based on GO and KEGG analysis of PD-related genes,
several enriched biological processes and metabolic pathways were
identified. The top 10 GO enrichment terms in the three GO categories
and the KEGG pathways were illustrated by a bubble diagram. GO
enrichment analysis of biological processes (BP) revealed that
PD-related genes were mainly involved in the ATP metabolic process,
energy derivation by oxidation of organic compounds, and oxidative
phosphorylation (Fig. [62]3a). Molecular function (MF) analysis showed
that these genes may take part in the cell adhesion molecule binding,
DNA-binding transcription activator activity, and RNA polymerase II
activity-specific processes (Fig. [63]3b). For cell component (CC)
analysis, our genes mainly enriched in mitochondrial inner membrane,
mitochondrial protein complex, and respirasome (Fig. [64]3c). KEGG
pathway analysis indicated their involvement in Parkinson, Alzheimer,
amyotrophic lateral sclerosis, prion disease, and Huntington disease
(Fig. [65]3d).
Figure 3.
[66]Figure 3
[67]Open in a new tab
GO and KEGG enrichment analysis of the PD-related genes. (a) Biological
process analysis. (b) Molecular function analysis. (c) Cellular
component analysis. (d) KEGG pathway analysis. The color of the node
represents significance, and the size of the node represents the number
of genes enriched into the function.
Network-based proximity between drugs and PD
We tidied up the list of proximal drugs and excluded drugs irrelevant
to PD by applying a network-based proximity analysis. The density plot
showed that the distance distribution of drugs to PD-related genes
overlapped but differed significantly from that of the reference data
in the range of − 10.0 ~ 5.0 (Fig. [68]4). The overlap can be observed
visually at the point of 1.0 ~ 2.0, suggesting that drugs in this range
were unlikely to be treatment candidates for PD while drugs with
distance < 1.0 might be effective for PD treatment. Thus, we took 1.0
as the threshold to screen candidate drugs for PD and to exclude any
irrelevant drugs.
Figure 4.
Figure 4
[69]Open in a new tab
Proximity between drugs and PD. Density plot showing the distance
distribution of all drugs to PD-related genes (pink) and the reference
data (blue).
Calculation of drug signatures
Ultimately, 46 drugs with an FDR < 0.25 were identified. They may
significantly influence PD-related genes. The drugs with FDR < 0.1 (ten
drugs) and the corresponding targets (fifty-one targets) are shown in
Table [70]1 and could potentially be options for therapies. Some target
genes of the ten drugs significantly overlapped with the PD-related
genes or the known therapeutic targets for PD. Additionally, we
explored and visualized the interactions between the ten predicted
drugs, their corresponding targets, and the PD-related genes
(Fig. [71]5).
Table 1.
Prediction of drug target and distance information by top10
(FDR < 0.1).
Sig-drug Proximity FDR Targets
Carfilzomib − 4.769 0.00 ABCB1, PSMB5, PSMB1, PSMB2, PSMB8, PSMB9,
PSMB10
Afatinib − 4.737 0.02 EGFR, ABCB1, ABCG2, ERBB4
Prednisolone acetate − 4.989 0.02 ABCB1, SLCO1A2, NR3C1, SERPINA6
Isosorbide − 4.989 0.03 BCL2, BCL2L1, MCL1
Arsenic trioxide − 4.557 0.04 CYP3A4, TXNRD1, AKT1, JUN, IKBKB, MAPK3,
CCND1, MAPK1, CDKN1A, HDAC1, PML
Vorinostat − 4.638 0.04 HDAC2, HDAC1, HDAC8, HDAC3, HDAC6, ACUC1
Tucatinib − 4.516 0.05 ABCB1, ABCG2, SLC22A2, CYP2C8, SLC47A1, SLC47A2,
ERBB3
Tazemetostat − 4.656 0.05 ABCB1, ABCG2, SLC47A1, SLC47A2, EZH2, EZH1
Avapritinib − 4.611 0.05 ABCB1, ABCG2, KIT
Tixocortol − 4.989 0.09 HDAC2, NR3C1
[72]Open in a new tab
Figure 5.
[73]Figure 5
[74]Open in a new tab
Building and analysis of drug-target-gene network. Purple nodes
represent drug candidates, orange diamonds represent drug targets, and
blue rectangles represent PD-related genes. Visualization tool:
Cytoscape (3.7.2). [75]https://cytoscape.org/.
Differentially expressed drug-target genes analysis
We took the intersection of DEGs and drug-target genes, and nine
differentially expressed drug-target genes with p < 0.05 were screened.
Then, we compared the expression distribution of these genes in the two
groups and visualized them with a box plot (Fig. [76]6a). In the final
step, the heatmap and clustering tree revealed a distinct expression
pattern of nine genes between the groups (Fig. [77]6b). PSMB10,
SLC47A2, HDAC8, and BCL2 were clustered into the same model, while KIT,
TXNRD1, JUN, AKT1, and PML were clustered into another model,
suggesting that these nine targets may act through two distinct
mechanisms. Further GO enrichment analysis of biological processes (BP)
revealed that the nine genes were mainly involved in the response to
oxidative stress. KEGG pathway analysis revealed that they were
enriched in the apoptosis, neurotrophin, estrogen, MAPK, and PI3K-Akt
pathway (see Supplementary Table [78]S1 online).
Figure 6.
[79]Figure 6
[80]Open in a new tab
Differentially expressed drug-target genes analysis. (a) A box plot
showing the differential expression distribution of nine drug-target
genes between two groups. (b) Heatmap and clustering of nine
differentially expressed drug-target genes.
Discussion
Recently, the repurposing of existing drugs has been proposed as a
strategy for new drug development^[81]11,[82]12. In the current work,
we selected a systematic computation framework to explore potential
treatment options for PD based on existing data about diseases, drugs
and drug targets.
Since drugs usually interact with specific targets to exert an effect
on biological processes, and drug targets always interact with
disease-related genes, we collected PD-associated genes. GO enrichment
analysis of all genes showed that the most enriched terms were
oxidative respiratory chain, energy metabolism and ion transport, which
are consistent with prior findings^[83]13–[84]15. These findings
established the foundation for further mechanistic studies and provided
novel targets for therapy. It is well recognized that PD, as a complex
disease, may be caused by mutations of multiple genes or by the
dysfunction of multiple biological processes. Earlier studies have
shown that disease genes tend to interact in cellular networks^[85]16.
We calculated a score to predict the proximity between the drug targets
and PD-related genes by integrating the information in the PPI networks
and kept the drugs with high proximity as candidates.
The major features of PD pathology are the loss of dopaminergic neurons
from the midbrain and the presence of αSyn protein inclusions. Hence,
we chose profiles of neural cell lines to perform the IGSEA procedure.
Eventually, ten drugs were kept after filtration.
Among the ten candidates, six drugs have been approved by the Food and
Drug Administration (FDA) for clinical use in cancer patients. For
instance, carfilzomib and afatinib are epidermal growth factor receptor
(EGFR) inhibitors. Interestingly, EGFR gene polymorphisms were reported
to be related to the susceptibility to PD^[86]17. Vorinostat (also
known as a histone deacetylase inhibitor, which is an effective
anti‐neoplastic agent for different types of tumors) has recently been
reported to be a potential novel candidate for treating PD^[87]18.
However, there are no reports about the correlations of tucatinib,
tazemetostat and avapritinib with PD pathology. Two of these candidates
are hormones. To date, few effective treatments for PD have been
reported, but a phase 1/2a clinical trial (Identifier: [88]NCT04127578)
is ongoing, the aim of which is to characterize the potential efficacy
of methylprednisolone for treating patients with PD who have at least
one GBA1 mutation, and we look forward to witnessing more promising
discoveries. Isosorbide mononitrate (ISMN) is a candidate treatment for
cerebral small vessel disease and lacunar ischemic stroke^[89]19.
Arsenic trioxide showed beneficial effects on patients with acute
promyelocytic leukemia^[90]20 and systemic lupus erythematosus (SLE) in
a mouse model^[91]21. Inflammasomes might be involved in the
therapeutic mechanism of arsenic trioxide in the above diseases.
However, these two drugs have not yet been investigated in PD.
Nine differentially expressed drug-target genes with p < 0.05 were
screened, and PSMB10, SLC47A2, HDAC8, and BCL2 were clustered into the
same model, while KIT, TXNRD1, JUN, AKT1, and PML were clustered into
another one, suggesting that these nine targets may act through
distinct mechanisms. Sun et al.^[92]22 found that three beta subunits
of immunoproteasome (PSMB9, PSMB10, PSMB8) all colocalized with α-syn,
and PSMB9 knockdown aggravated the accumulation of α-syn in a cell
model of PD. BCL2 overexpression protects dopaminergic neurons against
neurodegeneration^[93]23 and it may play a role in dopaminergic
development and PD^[94]24. Stéphane et al. showed that eliminating Jun
N-terminal kinases (JNKs) can prevent neurodegeneration and improve
motor function in an animal model of PD^[95]25. Another study suggested
that activating the Akt1-CREB pathway might halt neurodegeneration in
PD^[96]26. Furthermore, we found no experimental studies that have
focused on PD in association with SLC47A2, HDAC8, TXNRD1, PML, and KIT,
so this requires more observational data for verification.
Further KEGG pathway analysis revealed that these nine genes were
enriched in the apoptosis, neurotrophin, estrogen, MAPK, and PI3K-Akt
pathway. Some of them may represent novel targets for therapeutic
intervention. Apoptosis is considered the main mechanism of neuronal
death in PD, which could be targeted as possible therapies for
PD^[97]27. In a randomized control trial (RCT), the effects of glial
cell line-derived neurotrophic factor (GDNF) in Parkinson's disease
were investigated^[98]28. MAPK consists of 3 subfamilies, ERK, JNK and
P38. Results from the PD model implicate that selective inhibitors of
p38 may help preserve the surviving neurons in PD and slow down the
disease progression^[99]29.
In the last decade, multi-target drugs have attracted considerable
interest in the treatment of complex diseases. The multi-target ligands
have clear advantages, such as more predictive pharmacokinetics and
reduced risk of drug interactions. For example, the multi-receptor
approach for the Cannabinoid Receptor Subtype 2 was proposed for cancer
and neurodegeneration therapy^[100]30. Bromophenols is a ligand for
both dopaminergic receptors and human monoamine oxidase^[101]31. This
study may provide new insights for revealing novel potential drugs and
targets for multi-target drug screens. Additionally, the present
strategy based on repositioning drugs could provide valuable clues for
drug design and exploration for the treatment of other disorders.
However, there were also several limitations of this study. First, a
potential drawback of proximity analysis is that it relies heavily on
known information, including genes, drugs and targets, but this
information is still far from being completely understood. Second,
although some of the drugs extracted appear to be good candidates for
further investigation, it is uncertain whether any of them would
actually be effective for PD. More investigations are needed to
determine the best use of these drugs to minimize side effects and to
maximize patient benefit. Additionally, future studies should pay more
attention to the novel targets, not just the drugs themselves.
In summary, this network-based approach enabled us to identify several
novel drug candidates and targets that could been applied in treating
PD. Although these results are still in the preliminary stages, they
will provide clues for further experimental exploration. Additional
investigation of these drugs and gene networks could lead to better
preventive strategies for PD.
Materials and methods
The workflow of this study is presented in Supplementary Fig. [102]S2
online.
DrugBank and expression data preprocessing
The drug–target relationships were constructed based on the DrugBank
database^[103]32, which contained 13,680 drug entries and 4875
corresponding targets. The gene expression data were downloaded from
Gene Expression Omnibus (GEO, [104]http://www.ncbi.nlm.nih.gov/geo).
The dataset with the largest sample size ([105]GSE99039)^[106]33 was
selected as the reference group. The platform data is [107]GPL570
Affymetrix Human Genome U133 Plus 2.0 Array, and 233 healthy controls
and 205 PD patients are included. After log2‐ transformation, the data
were centered and scaled for differentially expressed genes (DEGs)
analysis by limma Package^[108]34 of R software (version
3.6.4)^[109]35. DEGs were defined by a p-value < 0.05.
Collection of PD-related genes
Genes associated with PD were retrieved from KEGG
([110]https://www.genome.jp/kegg/)^[111]36,[112]37, OMIM
([113]https://www.ncbi.nlm.nih.gov/omim), Genotype
([114]https://www.ncbi.nlm.nih.gov/gap/phegeni), DISEASES
([115]https://diseases.jensenlab.org/Search), and HGMD
([116]http://www.hgmd.cf.ac.uk/). This was done by querying the above
databases using the “Parkinson’s disease” keyword. Genes from the above
five databases were combined and mapped to their corresponding HUGO
gene nomenclature committee^[117]38 (HGNC)-based official gene symbols.
Duplicate genes and genes of unknown function were removed, and all
remaining genes were retained as the PD-related genes.
Protein–protein interactions between the PD-related genes
Proteins are the molecules that execute most cellular functions and
many regulatory processes take place at this level, and biomolecules
always achieve certain functions through extensive interactions with
other proteins. To evaluate the correlations between the PD-related
genes, we adopted the protein–protein interactions (PPI) network-based
approach. The PPI data were obtained from the STRING^[118]39,
PINA^[119]40 and HuRI databases. In the STRING database, the combined
score is computed by combining the probabilities from the different
evidence channels and corrected for the probability of randomly
observing an interaction^[120]41. In this study, the combined score is
calculated based on experiments, databases, co-expression,
neighborhood, co-occurrence, and co-expression. STRING interactions
with a combined score of 0.9 or higher were retained. All proteins
retrieved from PINA and HuRI were preserved. After outlier removal, a
PPI network was constructed based on the common interplayed
relationships of three databases, and then visualized using the
Cytoscape software (version 3.7.2)^[121]42.
Enrichment analysis of the PD-related genes
Functional enrichment analysis is often conducted to investigate the
potential mechanism of the gene set of interest. Gene Ontology (GO)
annotation and KEGG analysis are the most commonly used methods. GO
provides the classification of gene functions, the relationships
between genes of interest in three categories (GO: biological process,
GO: cellular component, and GO: molecular function). The KEGG analysis
is applied to explore potential signaling pathways that genes may
participate in^[122]43. The GO annotation and KEGG pathway enrichment
analysis were performed using an R package “clusterProfiler”^[123]44.
Only the terms/pathways with a false discovery rate (FDR) < 0.05 were
considered significantly enriched in this work.
Network-based proximity between drugs and PD
To uncover the targeting genes of drugs, we also drew upon the new
approach from prior studies^[124]45 to calculate the distance between
drugs and PD-related genes. Given G, the PD-related genes-set; T, the
set of drug targets, the distance d(g,t), namely the shortest path
length between nodes g (g[∈]G) and t (t[∈]T) in the network, was
calculated as below:
[MATH: dG,T=1T∑t∈T
ming∈Gdg,t+w :MATH]
1
w, the weighted-score of a target; w = − ln(D + 1) if a target is in
the PD-related genes-set; if not, w = 0. D, the PPI degree of
PD-related genes.
The significance of relatedness between a drug and PD was evaluated
using a reference distance distribution corresponding to the drug.
Specifically, a set of proteins (P) matched to the number of drug
targets was randomly selected in the network. The distance d(G,P)
between these proteins and PD-related genes was computed. We repeated
the randomization process 10,000 times and achieved the reference
distribution. The mean μ[d(G,P)] and standard deviation σ[d(G,P)] of
the reference distribution were used to calculate a z-score by
converting observed distance to a normalized distance, i.e., proximity
value:
[MATH: zG,T=dG,T-μdG,P
msub>σdG,P
msub> :MATH]
2
Calculation of drug signatures
The drug-perturbed gene expression profiles were derived from the
Library of Integrated Network-based Cellular Signatures
(LINCS)^[125]46, which is based on gene expression changes that
describe the response of various types of cells when exposed to
different agents. 165 nervous system-related datasets were obtained.
Then, each filtered dataset was subject to Inverted Gene Set Enrichment
Analysis (IGSEA) with the PD-related genes^[126]45,[127]47, the
enrichment score (ES), and the nominal p-value for quantifying
enrichment magnitude and statistical significance of the genes.
Finally, multiple comparisons among all the expression datasets were
performed using the Benjamini–Hochberg FDR method. The gene set with
FDR < 0.25 after performing 1,000 permutations was considered
significantly enriched for gene expression datasets, and the
corresponding drugs were deemed to be potential candidates for PD.
Differentially expressed drug-target genes analysis
A gene that acts as both an effective drug target and a differentially
expressed site could emerge as an important therapeutic target for the
treatment. So, we took the intersection of DEGs and drug-target genes,
i.e., differentially expressed drug-target genes, compared the
expression distribution of these genes between the two groups of
samples, and visualized with a box diagram.
Supplementary Information
[128]Supplementary Information.^ (177.1KB, pdf)
Author contributions
L.F.Y., K.W. and P.S.Q designed the study strategy. P.S.Q., S.Y. and
X.Y.Z. performed bioinformatics data analysis. P.S.Q. and J.W.C drafted
the manuscript. C.Q.W., S.R.W. and L.F.Y. critically revised the
manuscript. All gave final approval and agree to be accountable for all
aspects of work ensuring integrity and accuracy.
Funding
This study was funded by the National Natural Science Foundation of
China (62072143).
Data availability
The gene expression data analyzed during the present study are
available in the GEO with the accession number [129]GSE99039.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at
10.1038/s41598-021-92701-2.
References