Highlights

     * DREAMER is a network-based pipeline to explore the mechanism of
       clinical phenotypes
     * Approach uncovers shared mechanisms of adverse drug reactions and
       disease phenotypes
     * DREAMER provides insights for improving drug safety
     * Potentially accelerates the process of drug repurposing

Motivation

   Adverse drug reactions (ADRs) are a major concern in healthcare and
   drug development, often resulting in severe clinical outcomes and
   leading to drug withdrawals. Despite their impact, understanding the
   complex, multifactorial nature of ADR mechanisms remains challenging,
   leaving critical gaps in drug safety assessments. An important question
   is whether network-based approaches, emerging as a promising frontier
   in systems pharmacology, can help uncover these mechanisms by
   integrating diverse biological and pharmacological data. We explored
   this question and developed DREAMER, a network-based framework designed
   to elucidate the molecular pathways underlying ADRs, with the aim of
   enhancing drug safety and identifying new opportunities for drug
   repurposing.
     __________________________________________________________________

   Firoozbakht et al. introduce DREAMER, a network-based framework that
   elucidates shared molecular mechanisms underlying adverse drug
   reactions and disease phenotypes. By integrating diverse biological
   data, DREAMER identifies proteins that reveal therapeutic insights,
   advancing drug safety and repurposing opportunities across various
   clinical contexts.

Introduction

   Adverse drug reactions (ADRs) are important concerns in pharmacology
   and healthcare. They are a leading cause of mortality and drug
   withdrawals.[41]^1 Gaining a deeper understanding of ADRs is essential
   for enhancing drug safety profiles and making informed healthcare
   decisions as they can reveal the complexity of in vivo human phenotypic
   responses.[42]^2^,[43]^3 By understanding the underlying mechanisms of
   ADRs, we can gain insight into a drug’s mechanism of action, which can
   assist in identifying new drug targets, enhancing drug repurposing,
   predicting new therapeutic indications, and advancing personalized
   medicine.

   Although some ADRs cannot be explained by known pharmacology and may
   result from non-specific interactions of reactive metabolites, drug
   kinetics, and/or environmental exposures, most ADRs are caused by
   unintended consequences of on-target or off-target drug-protein
   interactions.[44]^4^,[45]^5^,[46]^6^,[47]^7 Thus, drug-target
   interactions serve as valuable resources for understanding ADR
   mechanisms. Previous studies have considered the comprehensive set of
   drug targets to identify specific proteins associated with ADRs. An
   initial computational method for identifying ADR-related pathways
   (i.e., biological pathways that can explain the mechanisms of an ADR)
   was developed by Wallach et al.[48]^7 who hypothesized that drugs
   modulating the same pathways may lead to ADRs with similar phenotypes.
   To establish ADR-pathway relationships, they employed a logistic
   regression model to predict ADRs by quantifying drug-pathway
   interactions based on the docking scores of drugs to proteins within
   each pathway. Mizutani et al.[49]^8 identified protein-associated ADRs
   by calculating the sparse canonical correlation between drug-protein
   relations and drug-side-effect relations. Kuhn et al.[50]^9 further
   defined the relationship of proteins to ADRs by searching for
   statistically significant overlap between the set of drugs linked to
   their associated proteins and the set of drugs linked to the given ADR.
   To establish the relationship between ADRs and their potential drug
   targets, Lounkine et al.[51]^6 calculated an enrichment score for each
   target-ADR pair based on their observed versus expected co-occurrence,
   and a statistical significance test was applied to find likely
   target-ADR associations. Lim et al.[52]^10 constructed a heterogeneous
   network including drug, gene, and ADR nodes. They employed the ADR-gene
   pairs identified by Lounkine et al.[53]^6 and applied a collaborative
   filtering-based algorithm to predict the missing links between ADRs and
   genes. Next, using a permutation-based algorithm, statistically
   significant genes for each ADR were ascertained and used for pathway
   enrichment analysis. Park et al.[54]^11 assumed that ADRs reported for
   drugs targeting a single protein are entirely derived from perturbing
   that specific protein. Accordingly, they hypothesized that predicting
   the likelihood that a single-target drug causes an ADR corresponds to
   the probability that the protein target is associated with the ADR.
   Based on this concept, they reduced the problem of ADR-protein
   association prediction to the problem of drug-single-protein target
   prediction. To solve this problem, they constructed a network of
   drug-target and protein-protein interactions (PPIs) and used the
   node2vec representation algorithm to embed proteins and drugs into a
   low-dimensional vector space. They further used a logistic regression
   classifier for each ADR to score ADR-protein pairs.

   Despite the importance of drug targets in understanding ADR mechanisms,
   relying solely on these targets can lead to false positives (or failure
   to identify the true causative pathways owing to a limited search
   space). In contrast, exhaustive human genetic research has identified
   numerous disease-related genes. These genes are often linked to disease
   phenotypes (DPs), which can be considered analogous to ADRs.[55]^12
   Such relationships can be leveraged to strengthen our confidence and
   reduce the false positives complicating the drug-target analysis
   approach. Nguyen et al.[56]^13 hypothesized that phenotypes caused by
   genetic variations could predict those caused by drug interactions with the
   encoded proteins. They demonstrated a significant correlation between
   the organ systems affected by genetic variations and those exhibiting
   ADRs when targeting the encoded proteins.

   Understanding the molecular mechanisms behind ADRs and DPs remains
   challenging. Existing approaches often treat ADRs and DPs separately,
   overlooking shared mechanisms. In this study, we introduce Drug Adverse
   Reaction Mechanism Explainer (DREAMER), a network-based method to
   uncover shared protein mechanisms between ADRs and DPs, which can
   enhance drug safety and repurposing[57]^14 efforts. Specifically, we
   hypothesize that equivalent ADR-DP pairs, representing the same
   phenotype, arise from variations in shared biological pathways
   ([58]Figure 1A).

Figure 1.

   Overview of the DREAMER pipeline

   (A) The basic hypothesis: phenotypically similar adverse drug reactions
   (ADRs) and disease phenotypes (DPs) might result from targeting of and
   variation in the same biological mechanisms and pathways.

   (B) To obtain ADR-related proteins, we diffuse from the drug targets
   and perform a statistical test for each protein.

   (C) To obtain DP-related proteins, we diffuse from the disease-related
   proteins and perform a statistical test for each protein.

   (D) Left: ADR-DP proteins are the intersection of the ADR-related and
   DP-related protein sets, retained when the overlap between the two
   sets is significant; right: an example of identified ADR-DP proteins
   for the dyspraxia phenotype (MedDRA: 10009696).

   (E) To limit potential confounding effects by organ/tissue-related
   indications, ADR-DP proteins are identified after removing the drugs
   with the same organ/tissue indication as the organ/tissue affected by
   the ADR.

   (F) To analyze the confounding effects of drug indications, protein
   scores are determined by diffusing from proteins related to the
   indications of drugs associated with a specific ADR, resulting in the
   identification of significant proteins called indication-related
   proteins.

   To explore this hypothesis, we constructed a comprehensive knowledge
   graph (KG) integrating drugs, diseases, ADRs, DPs, and proteins. Our KG
   links drugs to ADRs and targets, and diseases to DPs and related
   proteins, and includes PPIs. DREAMER applies a network diffusion
   algorithm to identify proteins associated with ADRs and DPs, reducing
   potential false positives by integrating proteins linked to equivalent
   ADR-DP pairs. This dual perspective enables a holistic view of
   molecular mechanisms underlying shared phenotypes.

   Key contributions of this study include the following:
     * (1)
       Constructing a KG that integrates diverse data sources, and
       particularly PPI networks, allowing analysis beyond individual
       proteins to capture broader molecular landscapes.
     * (2)
       Developing DREAMER, a network-based pipeline to identify proteins
       associated with clinical phenotypes.
     * (3)
       Providing a database of protein sets linked to phenotype
       mechanisms.

   Overall, this study offers a systems-level perspective on joint ADR and
   DP mechanisms through DREAMER, integrating ADR- and DP-associated
   proteins to advance systems pharmacology and enhance our understanding
   of molecular mechanisms.

Results

Dataset and network construction

   We constructed a heterogeneous network, also referred to as a KG, where
   drugs, diseases, proteins, ADRs, and DPs are represented as nodes.
   Links between nodes were established using various databases and
   comprise ADR-DP, drug-ADR, disease-DP, drug-target, and disease-gene
   associations,[61]^9^,[62]^15^,[63]^16^,[64]^17 PPIs from STRING,[65]^18
   and physical interactions.[66]^19 Unless otherwise specified, results
   presented in the main text are based on the STRING network. An overview
   of our KG and the framework used for its construction are shown in
   [67]Figure S2. Summary statistics and data sources for the network are
   provided in [68]Table S1, with further details available in the
   [69]STAR Methods section.
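
   As an illustration of the kind of data structure described above, the
   following minimal sketch stores a heterogeneous network as typed,
   undirected edges and derives, for one ADR, how often each protein is
   targeted by the drugs linked to that ADR. The class name, method names,
   and all node identifiers (drug_1, adr_x, P1, and so on) are hypothetical
   and are not taken from the DREAMER code base or its data sources.

```python
from collections import defaultdict

# Edge types named in the text. All node identifiers used below are hypothetical.
EDGE_TYPES = {"drug-ADR", "drug-target", "disease-DP", "disease-gene", "ADR-DP", "PPI"}


class KnowledgeGraph:
    """Tiny undirected, typed-edge graph kept as nested dictionaries."""

    def __init__(self):
        self.node_type = {}
        # edge type -> node -> set of neighbours
        self.edges = defaultdict(lambda: defaultdict(set))

    def add_edge(self, u, u_type, v, v_type, edge_type):
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        self.node_type[u] = u_type
        self.node_type[v] = v_type
        self.edges[edge_type][u].add(v)
        self.edges[edge_type][v].add(u)

    def neighbors(self, node, edge_type):
        """Return the neighbours of `node` reachable through one edge type."""
        return set(self.edges[edge_type].get(node, ()))

    def target_counts_for_adr(self, adr):
        """Count how many ADR-linked drugs target each protein."""
        counts = defaultdict(int)
        for drug in self.neighbors(adr, "drug-ADR"):
            for protein in self.neighbors(drug, "drug-target"):
                counts[protein] += 1
        return dict(counts)


# Toy example with made-up nodes.
kg = KnowledgeGraph()
kg.add_edge("drug_1", "drug", "adr_x", "ADR", "drug-ADR")
kg.add_edge("drug_2", "drug", "adr_x", "ADR", "drug-ADR")
kg.add_edge("drug_1", "drug", "P1", "protein", "drug-target")
kg.add_edge("drug_2", "drug", "P1", "protein", "drug-target")
kg.add_edge("drug_2", "drug", "P2", "protein", "drug-target")
kg.add_edge("P1", "protein", "P2", "protein", "PPI")
kg.add_edge("adr_x", "ADR", "dp_x", "DP", "ADR-DP")

print(kg.target_counts_for_adr("adr_x"))  # P1 is counted twice, P2 once
```

   Keeping one adjacency map per edge type makes it straightforward to walk
   ADR to drug to target without mixing in PPI edges; a graph library would
   serve equally well.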

DREAMER pipeline

   To explore the mechanisms underlying a specific phenotype, DREAMER
   identifies proteins related to an ADR and a DP that exhibit the same
   phenotype. The pipeline is summarized in the following steps:
     * (1)
       Identification of ADR-related proteins: we started by identifying
       proteins associated with the ADR. Using a network diffusion
       approach (i.e., personalized PageRank; see [70]STAR Methods), we
       diffused the signal from the protein targets of drugs with a given
       ADR over the PPI network. As the initial condition for network
       diffusion, each protein in the network was assigned a probability
       score based on how frequently it is targeted by the drugs
       associated with the queried ADR. This scoring process was
       followed by a permutation test (see [71]STAR Methods), generating p
       values for each protein. We considered proteins with adjusted p
       values below 0.05 as significantly related to the ADR
       ([72]Figure 1B).
     * (2)
       Identification of DP-related proteins: we applied the same network
       diffusion approach to identify proteins related to the DP with the
       same phenotype as the ADR in step 1, substituting drug targets with
       proteins related to the diseases linked to the queried DP. This
       step mirrors the ADR analysis and identifies DP-associated proteins
       ([73]Figure 1C).
     * (3)
       Intersection to minimize false positives: to enhance the
       specificity of proteins identified for each queried phenotype, we
       obtained the intersection of the corresponding ADR-related (step 1)
       and DP-related (step 2) protein sets ([74]Figure 1D). To evaluate
       the significance of the intersections, we applied the
       hypergeometric test, with Benjamini-Hochberg correction (adjusted p
       <0.05). Proteins present in both sets that pass the significance
       test were designated as ADR-DP proteins, which we hypothesized to
       be involved in the mechanisms linking the ADR and DP. Of the 649
       phenotypes in our network, 120 showed significant overlap between
       their ADR-related and DP-related proteins. These 120 phenotypes
       and their identified proteins are listed in [75]Tables S2 and
       [76]S3.
     * (4)
       Considering the confounding effect of drug indications: we
       addressed potential confounding by drug indications with the
       following two analyses:
          + (a)
            Controlling the effect of indication-ADR organ overlap:
            removing drugs with the same organ/tissue indication as the
            ADR to avoid false associations ([77]Figure 1E).
          + (b)
            Controlling the effect of indication-related proteins: scoring
            proteins by diffusing from those related to drug indications
            to identify and remove significant indication-related proteins
            ([78]Figure 1F).

   After controlling for these confounding effects, the number of
   significant phenotypes was reduced to 67; their proteins are listed in
   [79]Tables S4 and [80]S5.
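
   The diffusion and permutation steps of the pipeline can be sketched as
   follows. This is a minimal illustration rather than the authors'
   implementation: the damping value, the convergence settings, the toy
   network, and in particular the null model (random seed sets of the same
   size carrying the same weights) are assumptions, since the actual
   procedure is specified in the STAR Methods. Multiple-testing correction
   is omitted here.

```python
import random


def build_adjacency(edges):
    """Turn an undirected edge list into a dict: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def personalized_pagerank(adj, seeds, alpha=0.85, tol=1e-10, max_iter=500):
    """Power iteration for personalized PageRank on an undirected graph.

    adj   : dict mapping each protein to the set of its PPI neighbours
    seeds : dict mapping seed proteins to non-negative weights, e.g. how many
            ADR-linked drugs target each protein
    alpha : probability of following an edge (1 - alpha is the restart probability)
    """
    nodes = sorted(adj)
    total = sum(w for n, w in seeds.items() if n in adj)
    if total <= 0:
        raise ValueError("no seed protein is present in the network")
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    score = dict(restart)
    for _ in range(max_iter):
        new = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for u in nodes:
            if adj[u]:
                share = alpha * score[u] / len(adj[u])
                for v in adj[u]:
                    new[v] += share
            else:
                # Isolated protein: hand its mass back to the restart distribution.
                for n in nodes:
                    new[n] += alpha * score[u] * restart[n]
        delta = sum(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


def permutation_pvalues(adj, seeds, n_perm=200, seed=0, alpha=0.85):
    """Empirical p value per protein under an ASSUMED null of random seed sets."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    observed = personalized_pagerank(adj, seeds, alpha=alpha)
    weights = [w for n, w in seeds.items() if n in adj]
    exceed = {n: 0 for n in nodes}
    for _ in range(n_perm):
        # Reassign the observed seed weights to randomly chosen proteins.
        fake_seeds = dict(zip(rng.sample(nodes, len(weights)), weights))
        null = personalized_pagerank(adj, fake_seeds, alpha=alpha)
        for n in nodes:
            if null[n] >= observed[n]:
                exceed[n] += 1
    # Add-one correction keeps every empirical p value strictly positive.
    pvals = {n: (exceed[n] + 1) / (n_perm + 1) for n in nodes}
    return observed, pvals


# Toy network: a chain A-B-C-D-E-F plus a hub H attached to most of the chain.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"),
         ("H", "A"), ("H", "C"), ("H", "D"), ("H", "E"), ("H", "F")]
adj = build_adjacency(edges)
scores, pvals = permutation_pvalues(adj, {"A": 2, "B": 1}, n_perm=200, seed=0)
for protein in sorted(adj):
    print(protein, round(scores[protein], 3), round(pvals[protein], 3))
```

   Because the null scores are computed on the same network, a highly
   connected node such as H tends to score well for random seed sets too,
   which is why a high raw diffusion score alone does not imply
   significance.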

   As an example of the ADR-DP proteins identified in our study, the
   phenotype dyspraxia (MedDRA: 10009696) was associated with six
   proteins: three sodium channels (SCN1A, SCN9A, SCN1B) for action
   potential propagation, and three GABA_A receptor subunits (GABRB3,
   GABRG2, GABRA6) for synaptic transmission ([81]Figure 1D;
   [82]Table S2). GABA_A receptors enhance sodium channel activation at
   myelinated axon nodes, regulating sensory feedback. Dysregulation can
   lead to dyspraxia due to impaired motor coordination.

   To visualize the results, we propose the diffusion map, a scatterplot
   that represents each protein by its ADR-related and DP-related
   diffusion scores ([83]Figure 2A). The red points show proteins whose
   scores are statistically significant for both ADRs and DPs; these
   proteins usually have large diffusion scores on both axes. Note that
   proteins with high diffusion scores are not necessarily significant.
   In some cases, these proteins are hub proteins (i.e., highly
   connected), which also tend to score highly under the null model and
   therefore fail the permutation test. As predicted, this analysis
   identifies many proteins involved in disease processes. Next, we
   provide three examples
   of protein sets identified by DREAMER as significantly associated with
   ventricular arrhythmia, vasculitis, and thrombocytosis.

Figure 2.

   Diffusion map and reliability assessment of the identified protein set
   using the network diffusion algorithm on STRING

   (A) Diffusion map for ventricular arrhythmia, vasculitis, and
   thrombocytosis. The abscissa and ordinate values represent the
   diffusion scores of proteins initiated from the drug targets and
   disease-associated proteins, respectively. In the diffusion map, drug
   targets are represented by triangles, disease-associated proteins by
   squares, proteins that are both drug targets and disease-associated
   proteins by diamonds, and proteins that are neither drug targets nor
   disease-associated proteins by circles.

   (B) Number of phenotypes shared between our constructed KG and known
   databases.

   (C) Comparison of our diffusion-based method and the baseline method by
   identifying the number of ADRs and DPs with significant overlaps
   between proteins detected by different methods and those reported in
   the known databases (also see [86]Figures S3A and S3B).

   For ventricular arrhythmia ([87]Figure 2A), many of the significant
   proteins identified by the diffusion algorithm are ion channel
   proteins, such as those that contribute to Ca^2+ (CACNA1C,
   CACNA1C-IT2), Na^+ (SCN3A, SCN1B, SCN4B, SCN10A), or K^+ (KCNQ1, KCNE3,
   KCNH2) transport in the heart.[88]^20 Previously described mutations in
   KCNQ1 and KCNH2 are associated with dysfunction of the voltage-gated
   K^+ channel resulting in ventricular arrhythmias, such as long QT
   syndrome and ventricular fibrillation.[89]^21 Additionally, patients
   treated for ventricular arrhythmias often have their potassium (K^+)
   levels tested and receive supplements if their levels are low. This is
   because hypokalemia, or low potassium levels, is a well-known risk
   factor for arrhythmias.[90]^22 Mutations in Na^+-channel proteins
   (voltage-gated sodium channel [SCN] proteins) can result in long QT
   syndrome or atrial fibrillation.[91]^23 Some of these ion channels are
   also present in other tissues, including brain, muscle, stomach, and
   colon. For example, mutations in SCN1B can increase not only the risk of
   cardiac arrhythmia but also epilepsy.[92]^24 Therefore, drugs that
   alter their function can have cardiovascular, muscular,
   gastrointestinal, or neurological consequences, depending on which
   organs express the specific channels. Similarly, mutations in CACNA1C
   alter L-type voltage-gated Ca^2+-channels and are associated with long
   QT and short QT syndromes. An example is Timothy syndrome, the complex
   congenital syndrome caused by CACNA1C mutations,[93]^25 which involves
   cardiac manifestations such as long QT, along with one or more
   non-cardiac phenotypes such as skeletal, facial, and neurodevelopmental
   abnormalities.[94]^26

   Vasculitis encompasses a heterogeneous group of diseases involving
   large, medium, or small vessels depending on the underlying specific
   disease.[95]^27^,[96]^28 Hallmarks include damage or dysfunction of the
   endothelial cells that line blood vessels, and treatments vary
   depending on the specific type. The DP reflects the action of specific
   proteins that govern the inflammatory response ([97]Figure 2A),
   including PTGS1 and PTGS2, known as COX-1 and COX-2 enzymes. Kawasaki
   disease, a pediatric vasculitis, is treated with aspirin targeting
   these enzymes and thereby reducing inflammation. Similarly,
   methotrexate, which inhibits DHFR, is used in the treatment of other
   forms of vasculitis, having more potent anti-inflammatory effects than
   aspirin or non-steroidal anti-inflammatory drugs (NSAIDs). Activation
   of AHR, the aryl hydrocarbon receptor, is also associated with
   promoting vascular inflammation; however, downregulation of AHR can
   also exacerbate vascular injury by enhancing the function of monocytes
   and macrophages.[98]^29^,[99]^30 Statins, widely known for their
   ability to decrease cholesterol and reduce atherosclerosis via the
   inhibition of HMGCR, have beneficial anti-inflammatory effects on
   endothelial function and are being considered as additional therapies
   in some forms of vasculitis.[100]^31^,[101]^32

   Distinct from vasculitis, which involves inflammation of blood vessels,
   thrombocytosis is characterized by an elevated platelet count. This
   hematologic abnormality is reflected in the identified proteins that
   drive the phenotype, such as platelet-derived growth factor
   receptor-alpha and beta (PDGFR-α and -β) ([102]Figure 2A), which are
   present on both platelets and megakaryocytes, platelet
   precursors.[103]^33 Inhibition of these tyrosine kinases with imatinib
   and other related targeted therapies reduces megakaryocyte survival and
   proliferation and decreases platelet numbers by blocking PDGF
   signaling. Myeloproliferative syndromes, including essential
   thrombocythemia, can result from mutations in the JAK2, CALR, and MPL
   genes, each acting as drivers of the fusion protein BCR-ABL1 to
   increase cell (platelet as well as leukocyte) production.[104]^34

   In summary, these examples demonstrate how the DREAMER pipeline can
   identify proteins that are mechanistically known to be associated with
   specific phenotypes.

Reliability assessment of the network diffusion method

   In this section, we assess the reliability of our network diffusion
   algorithms in identifying relevant proteins compared to a baseline
   method. To do so, we examined the overlap between proteins identified
   by the algorithms (network diffusion-based and baseline) and proteins
   previously reported in the literature as associated with specific
   phenotypes (referred to as a priori known proteins). Specifically, we
   calculated how many phenotypes exhibit a significant overlap between
   ADR-related (or DP-related) proteins identified by our network
   diffusion-based method and known proteins. Significance was determined
   using a hypergeometric test with Benjamini-Hochberg correction
   (adjusted p <0.05). We then compared this count of significant
   phenotype overlaps with those observed using the baseline method. As a
   baseline, we implemented a published method[105]^9 that links proteins
   to phenotypes based on statistical testing (see [106]STAR Methods).
   Notably, both methods were applied to the same KG that we constructed
   for this study.
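
   The overlap test used throughout this assessment can be written down
   compactly. The sketch below computes the upper-tail hypergeometric
   probability of observing at least the seen overlap between two protein
   sets and then applies the Benjamini-Hochberg adjustment across
   phenotypes. The function names, the universe size, and the example sets
   are hypothetical; the choice of background universe in particular is an
   assumption not taken from the text.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    # math.comb returns 0 for impossible terms, so no extra bounds check is needed.
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(population, draws)


def overlap_pvalue(identified, known, universe_size):
    """Probability of an overlap at least this large by chance."""
    k = len(identified & known)
    return hypergeom_sf(k, universe_size, len(known), len(identified))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy example: three phenotypes, a universe of 100 proteins (all made up).
universe_size = 100
identified = {
    "phenotype_1": {"P1", "P2", "P3", "P4"},
    "phenotype_2": {"P1", "P9"},
    "phenotype_3": {"P5", "P6", "P7"},
}
known = {
    "phenotype_1": {"P1", "P2", "P3", "P8"},
    "phenotype_2": {"P20", "P21"},
    "phenotype_3": {"P5", "P6", "P30", "P31"},
}
names = sorted(identified)
raw = [overlap_pvalue(identified[n], known[n], universe_size) for n in names]
adj_p = benjamini_hochberg(raw)
significant = [n for n, p in zip(names, adj_p) if p < 0.05]
print(significant)
```

   Counting the entries of `significant` for each method gives the kind of
   per-method tally that the comparison with the baseline relies on.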

   Although curated resources of a priori known proteins for ADRs and DPs
   are limited, they can still serve as valuable benchmarks for assessing
   the reliability of our identified proteins. Accordingly, we compiled
   known proteins from various available sources to use as a reference
   standard, as described in the following.

Literature-based associations

   Lu et al.[107]^35 compiled DP-related proteins from the PubMed and
   SemMed databases using natural language-processing methods and manual
   curation, identifying co-occurrences of DP and protein keywords in
   abstracts published before January 2022. In our analysis, we identified
   134 phenotypes shared between the dataset of Lu et al.[108]^35 and our
   dataset ([109]Figures 2B and [110]S3A). We evaluated the performance of
   our network diffusion method against the baseline by counting the DPs
   whose identified proteins significantly overlapped with
   literature-based proteins ([111]Figures 2C and [112]S3B). As shown in
   [113]Figure 2C, our method achieved a significant overlap for 66 DPs,
   outperforming the baseline. Notably, we also observed a substantial
   overlap between our ADR-related proteins and literature-based proteins,
   despite the latter not covering ADRs.

Indirect associations

   Several DP terms are equivalent to disease terms, allowing genes
   associated with these diseases to serve as a priori known proteins. We
   term these “indirect associations.” This equivalence enables evaluation
   of our identified DP-related proteins by their overlap with proteins
   encoded by disease-related genes. Lu et al.[114]^35 compiled a set of such
   indirect DP-gene associations using phenotype-genotype databases. We
   identified 115 DPs common to both our dataset and theirs
   ([115]Figures 2B and [116]S3A). As shown in [117]Figures 2C and
   [118]S3B, our identified DP proteins exhibit a significantly greater
   overlap with these indirect DP-gene associations compared to the
   baseline.

Open Targets-derived data

   The Open Targets platform aggregates direct and indirect associations
   between targets and diseases from various sources, including genetic
   associations, somatic mutations, drugs, pathways, RNA expression, text
   mining, and animal models.[119]^36 Direct associations are based on
   evidence explicitly linking a target to a phenotype. Indirect
   associations leverage the hierarchical structure of the disease
   ontology.

   We identified 69 phenotypes with direct associations and 70 with
   indirect associations shared between Open Targets and our dataset
   ([120]Figures 2B and [121]S3A). As shown in [122]Figures 2C and
   [123]S3B, our method outperformed the baseline in identifying known
   proteins reported in the Open Targets dataset. Notably, none of the
   disease-gene associations in the Open Targets dataset was found in our
   KG.

   While our diffusion-based model outperformed the baseline in
   identifying ADR- and DP-related proteins, we do not expect perfect
   overlap with known proteins, because the available sets of known
   proteins are not comprehensive enough to serve as a complete ground
   truth. Therefore, although overlap between the identified ADR-related
   and DP-related proteins and the a priori known proteins is useful for
   validation, we also expect to identify novel proteins for each
   phenotype. Moreover, to reduce false positives, we identified proteins
   present in both the ADR-related and DP-related sets for each phenotype,
   termed ADR-DP proteins, as described in step 3 of the DREAMER pipeline.

Holdout validation

   DREAMER identifies proteins mechanistically related to specific
   phenotypes by analyzing the network proximity of proteins to drugs and
   diseases associated with that phenotype. This raises a question about
   the generalizability of DREAMER-identified proteins: would ADR-DP
   proteins remain relevant when our KG encounters new drugs and diseases?
   Accordingly, we hypothesized that any new drug and disease linked to a
   given phenotype would likely have at least one associated protein in
   closer proximity to the DREAMER-identified proteins than drugs and
   diseases without that phenotype ([124]Figure S3C).

   To test this hypothesis, we performed a holdout analysis to further
   validate our pipeline. For each phenotype, we split its associated
   drugs and diseases into two sets: 80% as the discovery set and 20% as
   the validation set. Additionally, for each phenotype, we randomly
   selected drugs and diseases not associated with the given phenotype,
   equal in number to those in the validation set. Here, we
   refer to the drugs and diseases in the validation set as “positive
   assets” and the randomly selected ones as “negative assets.”

   In the discovery phase, we used DREAMER to identify proteins related to
   each phenotype based on the drugs and diseases in the discovery set. In
   the validation phase, for each phenotype, we assessed the network
   proximity of the identified proteins to the positive and negative
   assets. We expected proteins identified in the discovery phase for a
   given phenotype to be closer in the network to the positive assets
   than to the negative assets. We measured network proximity by
   shortest-path length.

   Specifically, we counted the shortest paths of lengths less than or
   equal to X ∈ {0, 1, 2, …} for both positive and negative assets across
   all phenotypes. Using Fisher’s exact test, we evaluated whether the
   proportion of positive assets with a shortest path ≤X was significantly
   higher than that for negative assets. The results, shown in
   [125]Table 1, indicate that positive assets are indeed significantly
   closer to DREAMER-identified proteins than negative assets. This
   finding supports our hypothesis that ADRs and DPs that exhibit the same
   phenotype can arise from variation in the same proteins and pathways.
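
   The proximity comparison described above can be sketched as follows. The
   example measures, for each asset, the shortest-path distance from its
   closest protein to the nearest identified protein, counts assets within
   a threshold X, and compares positives with negatives using a one-sided
   Fisher's exact test. The one-sided alternative, the function names, and
   the toy network and assets are assumptions made for illustration.

```python
from collections import deque
from math import comb, inf


def distance_to_set(adj, sources):
    """Multi-source BFS: distance from every node to its nearest source."""
    dist = {s: 0 for s in sources if s in adj}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def count_within(assets, dist, x):
    """Count assets having at least one protein at distance <= x.

    assets: dict mapping a drug or disease to the set of its proteins
    """
    hits = 0
    for proteins in assets.values():
        best = min((dist.get(p, inf) for p in proteins), default=inf)
        if best <= x:
            hits += 1
    return hits


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(top-left cell >= a) with all margins held fixed.
    """
    total, row1, col1 = a + b + c + d, a + b, a + c
    numerator = sum(
        comb(col1, i) * comb(total - col1, row1 - i)
        for i in range(a, min(row1, col1) + 1)
    )
    return numerator / comb(total, row1)


# Toy PPI chain P1-P2-P3-P4-P5-P6; P1 plays the identified protein.
chain = ["P1", "P2", "P3", "P4", "P5", "P6"]
adj = {p: set() for p in chain}
for u, v in zip(chain, chain[1:]):
    adj[u].add(v)
    adj[v].add(u)

dist = distance_to_set(adj, {"P1"})
positives = {"drug_pos1": {"P1"}, "drug_pos2": {"P2"}, "drug_pos3": {"P2", "P5"}}
negatives = {"drug_neg1": {"P5"}, "drug_neg2": {"P6"}, "drug_neg3": {"P4"}}

x = 1
pos_in = count_within(positives, dist, x)
neg_in = count_within(negatives, dist, x)
p_value = fisher_exact_greater(pos_in, len(positives) - pos_in,
                               neg_in, len(negatives) - neg_in)
print(pos_in, neg_in, p_value)
```

   Repeating the last step for each threshold X yields one p value per
   column, which is the layout used in Table 1.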

Table 1.

   The p values obtained from Fisher’s exact test for the holdout
   validation
   Holdout type                          Thresholds
   ______________________________________________________________________________________

                                         X = 0    X = 1    X = 2    X = 3  X = 4  X = 5  X = 6
   Drug holdout                          5.7e−08  1.4e−08  0.07     0.1    –      –      –
   Disease holdout                       3.1e−09  1.2e−10  2.7e−07  0.1    0.03   0.3    0.5
   Drug holdout (after drug clustering)  0.003    0.003    0.9      0.9    0.9    –      –

   Additionally, to ensure a more rigorous assessment, we repeated the
   process above by splitting the drugs into discovery and validation sets
   based on their molecular dissimilarities using the DataSAIL package in
   Python.[127]^37 Specifically, DataSAIL employs an algorithm to minimize
   similarity between molecules in the discovery and validation sets. To
   measure drug similarities, DataSAIL calculates Tanimoto coefficients
   between molecular fingerprints derived from their Simplified Molecular
   Input Line Entry System (SMILES) representations. This approach reduces
   the risk of information leakage caused by structural similarity between
   discovery and validation sets. The results are presented in
   [128]Table 1 and are consistent with those from the previous holdout
   validation analysis.
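
   For readers unfamiliar with the similarity measure, the sketch below
   computes the Tanimoto coefficient between two binary fingerprints,
   represented here as sets of on-bit indices, and reports the largest
   similarity between a discovery and a validation set as a simple leakage
   check. It does not call DataSAIL and does not reproduce its splitting
   algorithm; the fingerprints, drug names, and helper names are
   hypothetical, and returning 0.0 for two empty fingerprints is a
   convention chosen for this example.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention for two empty fingerprints (assumption)
    return len(fp_a & fp_b) / union


def max_cross_similarity(discovery, validation):
    """Largest Tanimoto coefficient between any discovery/validation pair."""
    return max(
        tanimoto(fp_d, fp_v)
        for fp_d in discovery.values()
        for fp_v in validation.values()
    )


# Made-up fingerprints: each set lists the indices of bits that are switched on.
discovery = {
    "drug_a": {1, 2, 3, 8},
    "drug_b": {2, 3, 4, 9},
}
validation = {
    "drug_c": {10, 11, 12},
    "drug_d": {3, 12, 13, 14},
}

print(tanimoto(discovery["drug_a"], discovery["drug_b"]))
print(max_cross_similarity(discovery, validation))
```

   A split that keeps the cross-set maximum low gives less opportunity for
   a validation drug to be a near-duplicate of a discovery drug.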

Considering the confounding effect of drug indications

   The proteins identified by DREAMER for a specific phenotype may have
   been recognized under the influence of the indications of the drugs
   with the corresponding ADR. In this context, we consider two types of
   potential confounding effects:
     * (1)
       Organ/tissue overlap: when a drug is used to treat a condition in a
       specific organ/tissue, it might also have targets that seem
       relevant to its associated ADR affecting the same organ/tissue.
       However, these associations might only reflect the drug’s intended
       action in that organ or tissue, rather than the ADR’s underlying
       mechanism;
     * (2)
       Indication-related proteins: for drugs with a specific ADR, the
       conditions they treat may be associated with proteins that are
       closely related to our identified ADR-DP proteins. In such cases,
       the identified ADR-related proteins may be influenced by the drug’s
       therapeutic indications rather than a direct mechanistic link to
       the ADR itself.

   In the following, we focus on each of the mentioned confounding effects
   and describe the pipeline we employed to address them.

   To reduce organ/tissue-related confounding, for each ADR, we excluded
   all of its associated drugs with indications affecting the same
   organ/tissue as the ADR ([129]Figure 1E). For example, for the
   cardiovascular-related ADR phenotype tachycardia, we excluded all drugs
   that list tachycardia as an ADR and also have at least one
   cardiovascular-related indication. Specifically, we manually identified
   the relevant organs/tissues for the 120 ADR-DPs ([130]Table S6) and
   obtained organ indications for the drugs from a previous study.[131]^13
   Of the 465 drugs in our network, 328 were listed in that dataset. As a
   result, the number of ADRs with at least one associated drug was
   reduced to 97 (see [132]Tables S4 and [133]S5).
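
   The exclusion rule amounts to a set intersection per drug, as the
   following sketch shows. The function name and drug identifiers are
   hypothetical, and one detail is an assumption: drugs without any organ
   annotation are kept here, whereas the text does not state how such
   drugs were handled.

```python
def filter_drugs_by_organ(adr_to_drugs, adr_organs, drug_indication_organs):
    """Drop drugs indicated for an organ/tissue that the ADR itself affects.

    adr_to_drugs           : dict ADR -> set of drugs reporting that ADR
    adr_organs             : dict ADR -> set of organs/tissues the ADR affects
    drug_indication_organs : dict drug -> set of organs/tissues it is indicated for
    ADRs left without any drug are omitted from the result.
    """
    filtered = {}
    for adr, drugs in adr_to_drugs.items():
        organs = adr_organs.get(adr, set())
        kept = {
            drug for drug in drugs
            # ASSUMPTION: a drug with no organ annotation is kept.
            if not (drug_indication_organs.get(drug, set()) & organs)
        }
        if kept:
            filtered[adr] = kept
    return filtered


# Toy data: tachycardia is the example from the text; the drugs are made up.
adr_to_drugs = {
    "tachycardia": {"drug_1", "drug_2", "drug_3"},
    "adr_y": {"drug_1"},
}
adr_organs = {"tachycardia": {"cardiovascular"}, "adr_y": {"cardiovascular"}}
drug_indication_organs = {
    "drug_1": {"cardiovascular", "renal"},
    "drug_2": {"nervous system"},
    # drug_3 has no organ annotation in this toy example
}

print(filter_drugs_by_organ(adr_to_drugs, adr_organs, drug_indication_organs))
```

   In this toy case the second ADR loses its only drug and drops out, which
   mirrors how the number of ADRs with at least one remaining drug can
   shrink after filtering.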

   We then reapplied the DREAMER pipeline to obtain a new set of proteins
   for each phenotype. For 90 out of 97 phenotypes, we observed a
   significant overlap between the proteins identified before and after
   the organ/tissue-based drug removal described above (hypergeometric
   test, p < 0.05); these phenotypes are listed in [134]Tables S4 and
   [135]S5. We note that, after the removal of the drugs with indications
   in the same organ/tissue as the ADRs, the average number of drugs per
   ADR fell from 14.2 to 6.6. Although, on average, 30% of the drugs were
   excluded in this analysis, the results did not vary substantially,
   which further supports the robustness of our pipeline.

   To address the potential confounding effect of indication-related
   proteins, we investigated whether ADR-DP proteins for each phenotype
   interact with proteins associated with therapeutic indications of drugs
   linked to the same phenotype. Using the network diffusion algorithm
   over the PPI network ([136]Figure 1F; see [137]STAR Methods), we
   computed diffusion scores for proteins based on their PPI adjacencies
   with drug indications, as previously described for identifying ADR and
   DP proteins. Of the 120 phenotypes analyzed, 95 were linked to at least
   one drug with at least one indication that has at least one associated
   gene in our dataset (listed in [138]Tables S4 and [139]S5). Limiting
   our analysis to these 95 phenotypes, we obtained diffusion scores from
   the indications of drugs linked to each ADR. [140]Figure 3 shows the
   diffusion map, now incorporating a third dimension representing
   diffusion scores based on drug indications. After identifying
   the diffusion score of proteins with respect to the drug indications,
   we performed a permutation test (see [141]STAR Methods) to assign a p
   value to each protein. Proteins with corrected p < 0.05 were considered
   as indication-related proteins (see [142]STAR Methods). We then
   recognized phenotypes with significant overlap (hypergeometric test
   with Benjamini-Hochberg adjusted p < 0.05) between the
   indication-related proteins and ADR-DP proteins. For 84 out of 95
   phenotypes (listed in [143]Tables S4 and [144]S5), no evidence of
   significant overlap was found. As can be seen in [145]Figure 3A, in
   these phenotypes, the ADR-DP proteins (indicated in red) have smaller
   diffusion scores with respect to the third dimension, suggesting that,
   for these phenotypes, the subnetworks related to drug indications are
   far from those related to ADR-DP proteins. For example, intracranial
   hemorrhage, a critical condition involving bleeding within the brain,
   was linked to proteins of the coagulation and anticoagulation pathways
   such as protein C (PROC), factor X (F10), and prothrombin (F2)
   ([146]Table S2). Dysregulation of these proteins can impair hemostasis
   and formation of stable clots, leading to an increased risk of
   excessive bleeding events such as intracranial hemorrhage. These
   findings suggest that, at least for these phenotypes, the identified
   ADR-DP proteins have no significant association with the protein
   drivers of their clinical indications for their related drugs.
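
   The diffusion and permutation steps described above can be sketched as
   follows. This is only an illustration under stated assumptions: the
   diffusion is modeled as a random walk with restart, and the null
   distribution is built from random seed sets of the same size. The
   exact algorithm and permutation scheme are described in the STAR
   Methods and may differ; all function names and parameters here are
   hypothetical.

```python
import random


def diffuse(adj, seeds, restart=0.5, n_iter=100):
    """Random walk with restart on a graph given as {node: set(neighbors)}.
    Assumes every seed is a node of the graph."""
    nodes = list(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            if adj[n]:
                # Spread the walking mass evenly over the neighbors.
                share = (1.0 - restart) * p[n] / len(adj[n])
                for m in adj[n]:
                    nxt[m] += share
            else:
                # An isolated node keeps its own walking mass.
                nxt[n] += (1.0 - restart) * p[n]
        p = nxt
    return p


def permutation_pvalues(adj, seeds, n_perm=200, rng=None):
    """Empirical p value per node: how often a random seed set of the
    same size yields a diffusion score at least as high as observed."""
    rng = rng or random.Random(0)
    nodes = list(adj)
    observed = diffuse(adj, seeds)
    exceed = {n: 0 for n in nodes}
    for _ in range(n_perm):
        null = diffuse(adj, set(rng.sample(nodes, len(seeds))))
        for n in nodes:
            if null[n] >= observed[n]:
                exceed[n] += 1
    return {n: (exceed[n] + 1) / (n_perm + 1) for n in nodes}


# Toy path graph A-B-C-D with a single hypothetical seed protein.
toy_ppi = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
scores = diffuse(toy_ppi, {"A"})
pvals = permutation_pvalues(toy_ppi, {"A"}, n_perm=50)
```

   In this sketch, proteins close to the seed set receive higher scores,
   and the permutation step asks whether a score is higher than expected
   for arbitrary seeds of the same number.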

Figure 3.

   [147]Figure 3

   3D diffusion map and pathway enrichment results for the top phenotypes

   (A) Phenotypes whose identified ADR-DP proteins (indicated in red) are
   far from the subnetworks related to drug indications. In diffusion
   maps, the x, y, and z axes represent the diffusion scores of proteins
   from drug targets, disease-related proteins, and
   drug-indication-related proteins.

   (B) Phenotypes whose identified ADR-DP proteins (indicated in red) have
   larger values in the third axis, and their independence from
   indication-related proteins is not trivial (also see [149]Figure S4).

   (C) The ranked list of phenotypes based on the significance of their
   ADR-DP proteins.

   (D) Reactome and Gene Ontology over-representation analysis for ADR-DP
   proteins of the top three phenotypes.

   By contrast, for the remaining 11 phenotypes ([150]Figure 3B;
   [151]Figure S4), the ADR-DP proteins show higher values on the third
   axis, indicating that their independence from indication-related
   proteins is not trivial and requires further investigation by domain
   experts.

   Accordingly, the 3D diffusion map can help to inspect such ambiguities.
   Put another way, high z axis values may reflect reverse causality,
   where the ADR is a downstream consequence of treatment and not directly
   associated with the underlying phenotype mechanism. For example, in
   hyperuricemia, we identified ADR-DP proteins that included PDGFRA,
   FLT3, and ABL1 as having high values in the z axis. These proteins are
   commonly targeted by drugs for cancer treatment, including leukemia,
   and play roles in the differentiation, division, and growth of cells.
   Cell death induced by these treatments can lead to great increases in
   uric acid in the blood, overwhelming the body’s normal ability to clear
   that metabolite and ultimately causing renal dysfunction, further
   worsening the hyperuricemia. In vasculitis, proteins like PTGS2
   ([152]Figure 2B) are crucial in mediating inflammation via
   prostaglandin synthesis. Incorporating indication-related proteins, we
   observe an overlap in inflammatory activation between the indication (z
   axis) and vasculitis (y axis) ([153]Figure 3B). Similarly, ITGAL
   (CD11a/LFA-1), essential in leukocyte migration,[154]^38 may be
   detected due to the (1) modulation of immune filtration by the
   indicated drug/disease and/or (2) vasculitis itself affecting immune
   cell-endothelial interactions. For hypersomnia, VIP’s high z axis value
   may relate to VIPomas, where octreotide inhibits excessive VIP
   secretion. However, VIP is produced by neurons in the SCN of the
   hypothalamus where it maintains normal circadian rhythms,[155]^39
   supporting its potential involvement in hypersomnia too. VIP’s role in
   various phenotypes likely depends on its anatomic location and
   quantification.

   According to both analyses discussed above, we identified mechanisms
   for 67 phenotypes that show no evidence of association with drug
   indications ([156]Table S4). Additionally, when replacing the STRING
   PPI network with the physical PPI network, our analysis identified
   mechanisms for 56 phenotypes ([157]Table S5) that are not related to
   drug indications. Notably, there was an overlap of 29 phenotypes
   between these two analyses. The ADR-DP proteins identified from both
   analyses for all these 29 phenotypes showed significant overlap
   (hypergeometric test, adjusted using the Benjamini-Hochberg method,
   p < 0.05).
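
   For reference, the Benjamini-Hochberg adjustment applied to the
   hypergeometric p values can be sketched in a few lines. This is the
   standard step-up procedure written from its textbook definition; it is
   not taken from the DREAMER code, and the function name is an
   assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that adjusted values stay
    # monotone in the rank of the raw p values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four phenotypes.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q < 0.05 for q in adjusted]
```

   A phenotype would then be called significant when its adjusted value
   falls below the 0.05 threshold used throughout this section.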

Biological insights into phenotype mechanism of action

   In this section, we investigate the biological function of the protein
   sets identified by DREAMER for the 67 phenotypes with no evidence of
   association with drug indications. We draw on pathway enrichment
   analysis and prior literature to connect these findings to known
   physiological and pathological processes, while acknowledging the
   limitations of indirect connections. We first ranked these phenotypes
   based on the significance of the overlap between their ADR proteins
   and DP proteins, in line with our hypothesis that overlapping ADR and
   DP protein sets may indicate shared underlying biological mechanisms.
   [158]Figure S5 shows the ranking of all
   phenotypes, with the top 20 phenotypes shown in [159]Figure 3C. We then
   found the enriched pathways for the ADR-DP protein sets based on an
   over-representation analysis using the Reactome[160]^40 and Gene
   Ontology[161]^41 ([162]Table S7) databases. The results for the top
   three ranked phenotypes are illustrated in [163]Figure 3D.

   Pathway analysis highlights the physiological processes involved in
   these disorders. For example, chloride is an anion that is mostly found
   in the extracellular space. Its concentration is regulated by the
   gastrointestinal tract, where it is absorbed from food, as well as the
   kidney, where it is excreted in urine or reabsorbed in the proximal
   tubule. Chloride transport relies on transmembrane ion transporters and
   cotransporters as well as additional Na^+/K^+ ATP-dependent ion
   transporters that provide energetics for Cl^− transport
   ([164]Figure 3D). Thus, its concentration is dependent on that of other
   ions, such as Na^+, K^+, and bicarbonate (HCO[3]^−). Owing to its
   inverse relationship with bicarbonate, hypochloremia can result in
   metabolic alkalosis. Hypochloremia can occur due to gastrointestinal
   causes, such as vomiting; or renal loss of chloride due to the use of
   diuretics (hypochloremic metabolic alkalosis due to excessive fluid
   loss leading to volume contraction); and/or because of hyponatremia and
   hypokalemia, as the fluxes in sodium and potassium will affect chloride
   levels.[165]^42

   The coordinated movement of ions through voltage-gated ion channels is
   important to maintain the rhythmic beating of the heart
   ([166]Figure 3D). Disruption of these processes leads to abnormal
   action potentials, arrhythmias, and ventricular fibrillation. Many of
   these ion channels also play a role in other organs, including the
   brain.[167]^43

   In personality disorder (a complex, comparatively non-specific
   phenotype), the identified pathways are all known key mechanisms for
   various psychiatric conditions. The neurotransmitter receptors and
   postsynaptic signal transmission reflect the significant roles of
   dysregulated neurotransmitter systems implicated in a range of
   personality disorders such as mood and bipolar disorder. Altered phase
   0, representing rapid neuronal depolarization, can lead to
   epilepsy[168]^44 ([169]Figure 3D). Dysregulated membrane potential can
   be influenced by chloride transport and can impair GABAergic
   transmission ([170]Figure 3D). Impairment in GABAergic transmission
   plays a significant role in the pathophysiology of major depressive
   disorder (MDD),[171]^45 schizophrenia,[172]^46 bipolar
   disorder,[173]^47 and autism,[174]^48 and lower levels of GABA are
   often identified as the main endophenotype of MDD.[175]^49 The
   antidepressant effect of ketamine may also be related to its selective
   impact on GABAergic interneurons, blocking NMDA receptors and reducing
   inhibitory signals to enhance cortical excitation. Additionally, the
   interaction between L1CAM and ankyrins ([176]Figure 3D) guides neuronal
   adhesion and signaling, where abnormalities are associated with
   neurodevelopmental disorders like autism.[177]^50 These mechanistic
   phenotype pathways emphasize the interconnected roles of
   neurotransmitter signaling, synaptic function, and neuronal
   excitability in personality (and other psychiatric) disorders.

   Overall, these results provide preliminary insights into potential
   biological mechanisms connecting ADRs and DPs through shared protein
   pathways. We note that, while pathway enrichment and literature
   evidence support these findings, future experimental studies need to
   confirm these mechanistic connections.

Application of DREAMER for therapeutic potential

   The proteins identified for each phenotype using DREAMER can open new
   avenues for drug design and drug repurposing (i.e., an approach to
   identifying new therapeutic uses for drugs that are already approved
   for specific disorders).[178]^14 It can be hypothesized that targeting
   proteins identified for each phenotype is most likely either to induce
   or treat the phenotype, as one cannot determine directionality a priori
   from this analysis. Therefore, DREAMER can be leveraged for drug
   discovery in two ways: (1) predict possible ADRs for new drugs based on
   their known targets, and (2) design new drugs or suggest repurposing
   candidates, based on their targets, for a disease.

   In particular, to showcase the application of DREAMER for drug
   repurposing in the context of the second case, we focus on phenotypes
   for which there is evidence that targeting their ADR-DP proteins can
   treat the corresponding phenotype. For this purpose, we identified
   phenotypes whose ADR-DP proteins contain at least one protein targeted
   by a drug with an indication with the same terminology as the specified
   ADR. For example, cardiac arrest is a terminology that is assigned to
   an ADR (MedDRA: 10007515 in the SIDER dataset), a DP (with hpo:0001695
   in the Human Phenotype Ontology [HPO] dataset), and drug indication
   (with mondo:0000745 in Mondo Disease Ontology dataset). Interestingly,
   we found three drugs (diltiazem, carvedilol, and verapamil) that are
   indicated for cardiac arrest and target at least one of the proteins
   recognized by DREAMER for cardiac arrest ([179]Figure 4). Thus,
   hereafter we refer to such drugs as “indicated drugs.” In our dataset,
   we identified a total of eight such phenotypes, namely cardiac arrest,
   hypophosphatemia, precocious puberty, Torsades de Pointes,
   thrombocytosis, peptic ulcer, ventricular tachycardia, and ventricular
   fibrillation ([180]Table S8). [181]Figure 4 illustrates PPI subnetworks
   for five of those phenotypes restricted to their ADR-DP proteins along
   with the drugs that target them and have indications for those
   phenotypes. The complete list of the indicated drugs along with their
   targets among ADR-DP proteins is provided in [182]Table S8.

Figure 4.

   [183]Figure 4

   Subnetworks of identified protein sets along with indicated drugs that
   target them

   Pink nodes are the proteins that are targeted by indicated drugs, and
   blue nodes are the rest of the ADR-DP proteins.

   To find opportunities for drug repurposing for the above-mentioned
   phenotypes, we collected all drugs that have at least one target among
   the ADR-DP proteins of a phenotype and retained only those that have
   no ADR for the corresponding phenotype and have not previously been
   found to have an indication for it in our dataset ([185]Table S9). We
   refer to these drugs as candidate drugs for repurposing. For example,
   sotalol is recognized for its efficacy in
   treating various cardiac arrhythmias by targeting KCNH2 (hERG)
   channels. In our dataset, sotalol is recognized to have indications for
   ventricular fibrillation. However, sotalol can cause prolongation of
   the QT interval, leading to ventricular arrhythmias such as ventricular
   tachycardia, ventricular fibrillation, cardiac arrest, and, in
   particular (based on our dataset confirmed by the published
   literature), Torsades de Pointes.[186]^51^,[187]^52^,[188]^53 Although
   sotalol has not been reported to be related to cardiac arrest in our
   dataset, it is among the candidate drugs for the treatment of cardiac
   arrest in our analysis ([189]Table S9). Similarly, ranolazine, a drug
   used to treat angina pectoris, has been used off-label for the
   treatment of ventricular arrhythmias.[190]^54 In addition, we conducted
   a comprehensive search on the clinical trials website
   ([191]ClinicalTrials.gov) to find evidence for these candidate drugs.
   Using a customized Python script, we queried all pairs of phenotypes
   and their candidate drugs and then carefully inspected all the derived
   results. Accordingly, we found a number of such phenotype-drug pairs
   listed in [192]Table 2 along with their [193]ClinicalTrials.gov IDs.
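
   The candidate-drug selection described above amounts to a simple set
   filter. The sketch below reflects one reading of that description (a
   candidate must hit the ADR-DP protein set and have neither an ADR nor
   an indication for the phenotype). The data structures, identifiers,
   and function name are hypothetical placeholders and do not come from
   the DREAMER repository.

```python
def candidate_drugs(adr_dp_proteins, drug_targets, drug_adrs,
                    drug_indications, phenotype):
    """Drugs with a target in the ADR-DP set that neither cause nor are
    indicated for the phenotype, returned in alphabetical order."""
    candidates = []
    for drug, targets in drug_targets.items():
        hits_module = bool(targets & adr_dp_proteins)
        causes = phenotype in drug_adrs.get(drug, set())
        treats = phenotype in drug_indications.get(drug, set())
        if hits_module and not causes and not treats:
            candidates.append(drug)
    return sorted(candidates)


# Placeholder data: protein, drug, and phenotype names are made up.
module = {"P1", "P2"}
targets = {
    "drug_a": {"P1", "P9"},   # hits the module, no ADR, no indication
    "drug_b": {"P2"},         # excluded: has the phenotype as an ADR
    "drug_c": {"P2"},         # excluded: already indicated
    "drug_d": {"P7"},         # excluded: no target in the module
}
adrs = {"drug_b": {"phenotype_x"}}
indications = {"drug_c": {"phenotype_x"}}
found = candidate_drugs(module, targets, adrs, indications, "phenotype_x")
```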

Table 2.

   The phenotype-drug evidence for candidate drugs in the ClinicalTrials
   website
   Phenotype | Candidate drug | ClinicalTrials.gov ID
   Ventricular tachycardia | ranolazine | [194]NCT01590979
   Cardiac arrest | ranolazine, domperidone | [195]NCT00998218, [196]NCT04024865, [197]NCT02500108, [198]NCT01907633, [199]NCT00925457
   Peptic ulcer | baclofen | [200]NCT00414856, [201]NCT00461604, [202]NCT00978016
   Precocious puberty | anastrozole, letrozole | [203]NCT00094328, [204]NCT00055302
   Torsades de Pointes | progesterone, testosterone | [205]NCT01929083, [206]NCT02513940
   Ventricular fibrillation | ranolazine | [207]NCT01887353, [208]NCT01558830

   To repurpose novel drugs for these phenotypes, we scored and ranked all
   candidate drugs based on the ratio between the number of proteins they
   target within ADR-DP proteins and their overall number of targets.
   [210]Figure 5 shows these rankings for the top 20 drugs, with circle
   sizes representing the total number of targets of each drug within the
   ADR-DP protein set. The circles are colored red to indicate drugs that
   have been found to be under investigation in clinical trials as being
   relevant to the queried phenotype. Candidate drugs that target proteins
   also targeted by indicated drugs may have a potential for repurposing
   (marked in green), as targeting these proteins has already been shown
   to be effective in treating the phenotype, but experimental testing is
   required to establish their efficacy. Blue circles mark candidate
   drugs whose targets have not previously been targeted for the
   treatment of those phenotypes and may therefore represent novel
   repurposing opportunities. Moreover, we hypothesize that drugs with
   higher ranks are more likely to be associated with the corresponding
   phenotype.
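
   The ranking score described above is the fraction of a drug's targets
   that fall within the ADR-DP protein set. A minimal sketch follows; the
   tie-breaking rule (more targets in the set first, then drug name) is
   an assumption added for determinism, and all names are placeholders.

```python
def rank_candidates(adr_dp_proteins, drug_targets):
    """Rank drugs by (targets inside the ADR-DP set) / (all targets).
    Returns tuples of (drug, score, number of targets in the set)."""
    scored = []
    for drug, targets in drug_targets.items():
        in_module = len(targets & adr_dp_proteins)
        if in_module:  # keep only drugs that hit the set at least once
            scored.append((drug, in_module / len(targets), in_module))
    # Assumed tie-break: higher count first, then alphabetical order.
    return sorted(scored, key=lambda r: (-r[1], -r[2], r[0]))


# Placeholder data: protein and drug names are made up.
ranking = rank_candidates(
    {"P1", "P2", "P3"},
    {
        "drug_a": {"P1", "P9"},
        "drug_b": {"P1", "P2"},
        "drug_c": {"P8"},
        "drug_d": {"P3"},
    },
)
```

   The third element of each tuple corresponds to the circle size in the
   figure, that is, the number of targets of the drug within the ADR-DP
   protein set.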

Figure 5.

   [211]Figure 5

   Top-ranked drugs with at least one target in the identified ADR-DP
   proteins for the phenotypes

   Red circles indicate candidate drugs that are found to be under
   investigation in clinical trials. Green circles indicate candidate
   drugs that target proteins that are already targeted by indicated
   drugs. Blue circles indicate candidate drugs whose targets were not
   already targeted by any indicated drugs. The size of the circles
   indicates the total number of proteins the candidate drugs target
   within ADR-DP proteins. GHB∗, gamma-hydroxybutyric acid.

   In summary, DREAMER provides a framework for identifying potential
   therapeutic targets; however, further research is essential to validate
   these findings. Future studies should focus on experimental and
   clinical validation of these candidate drugs and explore the
   mechanistic basis of their associations with the specified phenotypes.

Discussion

   In this study, we present DREAMER, a network-based method designed to
   investigate the underlying mechanisms of DPs and ADRs. While previous
   efforts have explored the mechanisms of ADRs and DPs separately,
   focusing on ADR-target PPIs or DP-gene associations, our approach
   integrates both phenotypes, offering a more comprehensive insight into
   their mechanisms. By identifying interconnected modules between ADRs
   and DPs, DREAMER effectively uncovers shared molecular mechanisms
   underlying 67 phenotypes, supporting our hypothesis. Furthermore, our
   reliability assessments and validation analyses confirm the robustness
   of DREAMER’s approach, underscoring its contributions to systems
   pharmacology. Our pipeline has the potential to identify biomarkers for
   designing safer therapeutic strategies that minimize the need for drug
   discontinuation and enhance opportunities for drug repurposing, thereby
   supporting more effective and personalized treatments.

   Network-based methods have proved to be effective in understanding
   complex biology and systems pharmacology.[213]^55^,[214]^56^,[215]^57
   By representing biological entities as interconnected nodes in a
   network, these methods enable the identification of interactions and
   functional modules that may not be apparent when examining isolated
   entities. Unlike traditional approaches, which often rely on linear
   associations, network-based methods capture the interconnected nature
   of biological processes, providing a holistic view that can account for
   indirect and higher-order relationships across entities. This approach
   is particularly beneficial in exploring mechanisms of ADRs and DPs, as
   it allows us to understand how proteins and pathways implicated in
   therapeutic indications might overlap with or diverge from those
   contributing to adverse effects.

   Targeting proteins in the organ of interest with drugs provides the
   basis for in vivo experiments that can explain the relationship between
   the functionality of that protein, systemic effects, and phenotypic
   responses. These proteins can be both on target and off target of
   drugs. While they are primarily targeted to treat specific indications
   or their phenotypes, they can also induce unintended side effects.
   Currently, drug safety evaluation, to a large degree, relies on animal
   experiments, which do not always translate reliably to humans owing to
   inherent biological differences.[216]^12 In recent years, the increased
   availability of public databases including drug targets and ADRs has
   become a more reliable source of human-specific information. Genetic
   variations, by contrast, can be considered natural experiments,
   providing insights into the mechanism of phenotypes. Genome-wide
   association studies have been extensively utilized to identify novel
   therapeutic targets, with a greater probability of drug approval when
   these targets are corroborated by human genetic evidence for the
   desired indication. Additionally, there is a growing interest in
   harnessing human genetic studies to predict the risk of ADRs. The
   importance of applying this strategy is more pronounced where suitable
   animal models for drug safety assessment are lacking.[217]^12 Protein
   modules that are affected by both drugs with a specific ADR and
   diseases with a similar phenotype provide more evidence to explain the
   mechanism underlying the phenotype. Our study advances this concept by
   considering modules targeted by drugs and diseases exhibiting
   phenotypically similar ADRs and DPs.

   The ADR-DP proteins identified by DREAMER may be influenced by the drug
   indications in the PPI, which can be recognized by their large values
   along the z axis in the 3D diffusion map ([218]Figures 3A and 3B). Such
   instances may affect our interpretations and should be addressed
   carefully. Specifically, we encountered three scenarios: (1) ADR-DP
   proteins have an indirect association with the phenotype of interest,
   as seen in cases like hyperuricemia caused by cancer therapies; (2)
   higher-order relationships between the drug indications and the
   phenotype (such as both being related to the same tissue or organ); and
   (3) the phenotype mechanism stems from on-target effects. We
   acknowledge that, while the last scenario does not limit our
   interpretation, distinguishing it from the first two scenarios might be
   challenging or even infeasible.

   Some phenotypes are multifactorial and are not directly linked to the
   drug’s molecular effect(s) alone. However, it is still valuable to
   explore whether any of these phenotypes have a specific molecular
   mechanism that connects them to both ADRs and DPs simultaneously. For
   example, in the case of female infertility, which often stems from
   prior infections, DREAMER has identified proteins enriched in the
   metabolism of steroid hormone pathways ([219]Tables S2 and [220]S7). In
   contrast, dementia may arise from multiple factors beyond genetics,
   such as age, lifestyle, social engagement, and cognitive function,
   which would be discarded by DREAMER owing to the lack of significant
   overlapping proteins connecting the ADRs and DPs.

   In this analysis, we used the STRING network, which integrates diverse
   PPIs from various sources, including direct physical interactions and
   indirect associations such as genetic co-occurrences, co-expression,
   and computational predictions. This broad interaction dataset enables
   exploration of phenotype mechanisms through higher-order interactions.
   While STRING offers a comprehensive view of generalized interaction
   data, we also applied DREAMER using a physical PPI network, which
   focuses exclusively on high-confidence interactions validated by
   experimental methods such as yeast two-hybrid screening, NMR
   spectroscopy, X-ray crystallography, and cryoelectron microscopy.
   Shared phenotypes between the STRING and physical PPI networks are
   extensively studied, commonly encountered in clinical practice, and
   span multiple organ systems, such as cardiac arrest, bradykinesia,
   ventricular arrhythmia, and interstitial pneumonitis ([221]Figure S5).
   However, the physical PPI network uniquely identifies dementia and
   parkinsonism (cognitive and motor symptoms), while STRING uniquely
   identifies akinesia, a symptom of parkinsonism related to movement
   initiation difficulties. The reasons for these distinctions remain
   uncertain at this time but likely reflect the complexity of the
   phenotypes as well as their lack of specificity in some cases. Similar
   trends are observed in inflammatory phenotypes: both networks identify
   interstitial pneumonitis, but the physical PPI network uniquely
   captures systemic inflammatory conditions (e.g., rheumatoid arthritis,
   leukocytosis, involving direct interactions with mediators derived from
   circulating immune cells), while STRING identifies organ-specific
   inflammation (e.g., cholecystitis, cholangitis). Overall, STRING
   identifies more phenotypes and a broader range of conditions, including
   structural cardiac and metabolic abnormalities, reflecting its
   integrative approach. In contrast, the physical PPI network identifies
   disorders with better-characterized molecular mechanisms.

   To assess the validity of our identified proteins, we conducted several
   analyses: (1) reliability assessment, to demonstrate the overlap of our
   identified diffusion-based proteins with a priori known proteins and
   show their superiority over a baseline model; (2) holdout validation,
   to show that the identified proteins are generalizable to new drugs and
   diseases that are added to our KG; and (3) robustness assessment, to
   demonstrate that the identified proteins statistically remain
   consistent even when 30% of the drugs are excluded from our KG.
   Although in silico validation for computationally identified proteins
   is necessary, true validity of the identified proteins can only be
   confirmed through experimental validation. For example, gene-knockout
   studies in animal models allow researchers to assess whether
   eliminating genes identified by in silico models produces a phenotype
   that mirrors the ADR of interest. Following validation in animal
   models, clinical studies provide the strongest confirmation of these
   mechanisms in humans.

   One potential and interesting application of this work could be in drug
   design and repurposing. We identified eight phenotypes
   ([222]Table S8) for which drugs targeting their ADR-DP proteins have
   indications for diseases with the same phenotype.
   Extending this idea to other phenotypes, DREAMER can reduce the search
   space to find relevant protein targets for a particular phenotype.
   Moreover, DREAMER can be used for drug off-target prediction. Drugs
   with a particular ADR are expected to bind to a protein within (or
   close to) the identified protein sets that govern that ADR (side effect
   module).[223]^58 The reduced protein space can then be used to infer
   the potential off-target proteins of drugs using computational methods
   (e.g., Autodock and Autodock-vina[224]^59^,[225]^60) or experimental
   methods (e.g., based on established physicochemical methods).

Limitations of the study

   DREAMER explores the mechanism of phenotypes without considering the
   specific variations in individual molecular profiles, which are crucial
   for personalized medicine. To advance our understanding in personalized
   medicine, one will also require access to individual-specific
   information. The Food and Drug Administration (FDA) Adverse Event
   Reporting System[226]^61 provides extensive patient information,
   including ADRs, drug prescriptions, dosages, and demographic details,
   which can be leveraged to help elucidate the mechanisms of phenotypes
   in the context of personalized treatments but ultimately will require
   molecular-level information with which to generate individual
   PPIs.[227]^62

   A potential future direction is the investigation of the phenotype
   mechanism in the context of combination therapy. Drugs can be
   prescribed as monotherapies or combination therapies,[228]^63 with the
   latter offering synergistic benefits for complex or multiple disorders
   but potentially introducing unique ADRs. For instance, in Parkinson’s
   disease, levodopa is prescribed to increase dopamine levels and is
   combined with carbidopa, which reduces its peripheral conversion and
   thereby ADRs such as nausea. The TWOSIDES[229]^1
   database provides insights into ADRs related to drug combinations and
   can aid in identifying their mechanisms.[230]^64^,[231]^65
   Additionally, similarities among certain phenotypes can improve the
   reliability of mechanism identification. Phenotypes can be clustered
   based on shared drugs and diseases using techniques like
   biclustering[232]^66 or KG representation learning.[233]^67 In
   conclusion, DREAMER advances our understanding of ADR and DP
   mechanisms, offering a valuable tool for improving drug safety,
   repurposing, and personalized medicine.

Resource availability

Lead contact

   Requests for further information and resources should be directed to
   and will be fulfilled by the lead contact, Farzaneh Firoozbakht
   (farzaneh.firoozbakht@uni-hamburg.de).

Materials availability

   This study did not generate new unique reagents.

Data and code availability

     * •
       ADR-phenotype, drug-ADR, drug-protein, gene-disease, gene-protein,
       drug-indication, and phenotype-disease links and pre-processed
       STRING PPI network have been deposited at
       [234]https://doi.org/10.6084/m9.figshare.28254812. Access to
       drug-protein links requires a usage license from the DrugBank
       dataset.
     * •
       All original code has been deposited at
       [235]https://github.com/faren-f/DREAMER and is publicly available
       at [236]https://doi.org/10.6084/m9.figshare.28254812 as of the date
       of publication.
     * •
       Any additional information required to reanalyze the data reported
       in this paper is available from the [237]lead contact upon
       request.

Acknowledgments