Abstract

   Epilepsy is a complex neurological disorder and a significant health
   problem. The pathogenesis of epilepsy remains obscure in a significant
   number of patients and the current treatment options are not adequate
   in about a third of individuals which were known as refractory
   epilepsies (RE). Network medicine provides an effective approach for
   studying the molecular mechanisms underlying complex diseases. Here we
   integrated 1876 disease-gene associations of RE and located those genes
   to human protein-protein interaction (PPI) network to obtain 42
   significant RE-associated disease modules. The functional analysis of
   these disease modules showed novel molecular pathological mechanisms of
   RE, such as the novel enriched pathways (e.g., “presynaptic nicotinic
   acetylcholine receptors”, “signaling by insulin receptor”). Further
   analysis on the relationships between current drug targets and the
   RE-related disease genes showed the rational mechanisms of most
   antiepileptic drugs. In addition, we detected ten potential novel drug
   targets (e.g., KCNA1, KCNA4-6, KCNC3, KCND2, KCNMA1, CAMK2G, CACNB4 and
   GRM1) located in three RE related disease modules, which might provide
   novel insights into the new drug discovery for RE therapy.

Introduction

   Epilepsy is a collection of brain disorders, which is characterized by
   repeated, uncontrolled seizures. Seizures are abnormal firing of brain
   cells which may cause changes in behavior or attention, which affect
   over 65 million people in the world[[46]1]. The cause of many epilepsy
   cases remains unknown, although some are known to result from [47]brain
   injury, [48]stroke, [49]brain tumor, gene mutation, and/or
   [50]substance abuse disorders [[51]2–[52]4]. Seizure is correlated with
   the enhancement of glutamate responses mediated by N-methyl-D-aspartate
   (NMDA) receptor. The NMDA receptors are up-regulated when epilepsy
   occurs, and the corresponding ion channels are kept open causing the
   neurons discharge continuously[[53]5]. Studies show that brain’s neural
   circuits play a key role in controlling the balance between epileptic
   and antiepileptic factors[[54]6]. Studies also support the pathogenic
   role of neuroinflammation in RE [[55]7]. Microglia may initiate a cycle
   of inflammation-induced seizures and seizure-induced inflammation, and
   microglia-driven epilepsy may be a primary pathogenic process[[56]8].
   Antiepileptic drugs (AEDs), such as phenytoin sodium, phenobarbital and
   felbamate, are the first line treatment for controlling epileptic
   seizures [[57]9,[58]10], which are divided into three categories
   according to their mechanisms of efficacy. They work by inhibiting the
   voltage-gated ion channels, increasing the inhibitory effect of GABA,
   or inhibiting the conduction of glutamate-mediated excitability
   [[59]11–[60]14].The AEDs can inhibit the spread of abnormal firing
   patterns to distant sites, which are required for the expression of
   behavioral seizure activity [[61]15]. However, about a third of the
   patients do not respond to AEDs, and they are considered to have
   medically refractory epilepsies (RE) [[62]16].

   Furthermore, to fully address the drug resistance and side effects of
   AEDs, researchers have tried to investigate the underlying molecular
   mechanisms of RE to search for novel molecular targets and
   drugs[[63]17]. Meta-analysis studies on epilepsy have proposed new
   understanding of the genetic study of epilepsy [[64]18,[65]19].
   Furthermore, with the increase of the data sources on
   phenotype-genotype associations and protein-protein interactions (PPI),
   network medicine has provided an insightful approach to understand the
   molecular mechanisms of complex diseases [[66]20–[67]22].

   Molecular and genetic studies of epilepsy over the recent decades have
   produced an impressive list of disease-gene associations, together with
   the interactome network database, which can be used to investigate the
   molecular network mechanisms and potential drug targets related to
   epilepsy[[68]23]. Molecular networks disrupted in epilepsy have been
   discovered in the brain using an integrated systems-level analysis of
   brain gene expression data, which may lead to more effective treatments
   and help us identify new medications for epilepsy[[69]24]. In this
   study, we curated comprehensive phenotype-genotype associations related
   to RE to investigate the network module mechanisms and identify the
   potential drug targets for RE treatment. Here, we follow several main
   steps to perform the study ([70]Fig 1): 1) Using the RE terms from the
   Medical Subject Headings (MeSH) terminology, we extracted the seed
   genes from three databases to construct networks based on the
   interactions between the proteins they encode; 2) we utilized the a
   community detection method to obtain disease-related topological PPI
   modules; 3) Along with the existing AEDs from two databases, we
   identified three important epilepsy disease modules and ten potential
   drug targets by conducting gene ontology (GO) and pathway enrichment
   analysis.

Fig 1. Technical roadmap.

   [71]Fig 1
   [72]Open in a new tab

   The main analytic process was described in frame.

Materials and methods

Integrating disease-related genes

   The MeSH classification was defined by experts and provided a
   comprehensive vocabulary across all disease categories, which were
   systematically organized in a hierarchical tree[[73]25,[74]26]. Based
   on the 23 MeSH headings, which were extrapolated manually by trained
   experts, we searched the CoreMine PubMed search engine system[[75]27],
   the OMIM[[76]28], and the DiseaseConnect[[77]29] databases to extract
   disease-gene relationships. Moreover, we manually examined and verified
   disease-gene co-occurrence in the literature contained in the PubMed
   database, in order to ensure highly accurate relationships. We also
   checked the HGNC database for the approved name of the gene[[78]30].

Extraction of PPI network

   Detailed human PPI data was downloaded from the STRING 9.1
   database[[79]31]. The STRING database is a comprehensive
   protein-protein interactions (PPIs) data source, which aims to offer a
   critical assessment and integration of PPIs, including direct
   (physical) as well as indirect (functional) associations. The various
   data sources involved experimental, predicted and transferred
   interactions, together with interactions obtained through text mining.
   Known and predicted associations are scored and integrated (STRING10.0
   has expanded to 9.6 million proteins and 184 million interactions)
   [[80]31,[81]32]. The weight of each interaction between two proteins
   was defined by its correlation value. The interactions whose scores> =
   700 have high confidence or be considered as a high-quality
   subset[[82]33]. By filtering the links with the scores> = 700, we
   finally obtained a high-quality String 9 human subset with 218,157 PPI
   records and 14,379 distinct proteins.

Topological module identification

   Topological module refers to the subnetwork of one entire network,
   which has relatively dense links when compared with the links outside
   of the subnetwork[[83]20]. BGLL, a popular community detection
   algorithm which is based on modularity evaluation, was used to obtain
   the topological modules of the whole PPI network. Module size too small
   or too large is not benefit for the enrichment analysis. The module
   size limits made us iteratively divided the modules using BGLL,
   resulting in topological modules that contain between 5 and 400 member
   nodes [[84]34,[85]35]. The weight of the edges between the modules was
   calculated using the formula:
   [MATH:
   <mrow><mi>w</mi><mi>e</mi><mi>i</mi><mi>g</mi><mi>h</mi><mi>t</mi><mo>=
   </mo><mfrac><mi>c</mi><mrow><mi>a</mi><mo>×</mo><mi>b</mi></mrow></mfra
   c></mrow> :MATH]
   (1)

   where c is the real edge number, a and b represent the number of
   proteins in each module.

GO and pathway enrichment analysis

   Many online analysis platforms, as well as relevant analysis softwares,
   are available for GO enrichment analysis [[86]36,[87]37]. In this
   study, the results of GO enrichment analysis were performed using the
   BiNGO 2.44 plug-in for the Cytoscape 3.0.2 software[[88]38], according
   to the significance threshold (p-value<0.05). Application of BiNGO in
   molecular interaction networks, e.g. protein interaction networks,
   visualized in Cytoscape is better than other tools. BiNGO can map the
   dominant themes of the target gene based on GO hierarchy, and it also
   produce an intuitive and customiazable visual representative results by
   Cytoscape’s versatile visualization platform [[89]39].

   The BiNGO 2.44 plug-in can map the genes to GO terms using a
   hypergeometric distribution relationship. Then, we can obtain the GO
   terms with related genes. Through Bonferroni correction to control the
   false positive rate of analysis, this process will return a p-value.
   All the results of pathway enrichment analysis were obtained through
   the online analysis tool KOBAS 2.0[[90]40], according to the
   significance threshold (p-value<0.05). The KOBAS 2.0 can return the
   results of six pathway databases (Reactome, KEGG PATHWAY, BioCyc,
   PANTHER, BioCarta, and PID). We chose the Reactome database in our
   study because its results were the most comprehensive[[91]41]. By
   calculating the hypergeometric distribution relationship, we can pick
   out statistically significant pathways. Through Bonferroni correction
   to control the false positive rate of analysis, this process will also
   return a p-value. Network and module visualizations were designed with
   the help of the visualization software, Gephi [[92]42], while the heat
   map was created with the use of hemI software.

Function-based modules similarity

   A widely used method in both text mining and biomedical literature to
   quantify the similarity between two concepts is the cosine similarity
   of respective vectors [[93]26]. In this study, we applied the cosine
   similarity method to calculate function-based similarity. We used the
   equation −log(corrected p − value) to transform p-values to correlation
   values, since the smaller of the corrected p-value small the bigger of
   the correlation between modules and functional entities. The similarity
   between the vectors, A and B, of the two modules, A and B, is
   calculated as follows:
   [MATH:
   <mrow><mi>c</mi><mi>o</mi><mi>s</mi><mrow><mo>(</mo><mrow><mi>A</mi><mo
   >,</mo><mi>B</mi></mrow><mo>)</mo></mrow><mo>=</mo><mfrac><mrow><msubsu
   p><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>n</mi></msub
   sup><msub><mi>A</mi><mi>i</mi></msub><msub><mi>B</mi><mi>i</mi></msub><
   /mrow><mrow><msqrt><mrow><msubsup><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><
   mn>1</mn></mrow><mi>n</mi></msubsup><msubsup><mi>A</mi><mi>i</mi><mn>2<
   /mn></msubsup></mrow></msqrt><msqrt><mrow><msubsup><mo>∑</mo><mrow><mi>
   i</mi><mo>=</mo><mn>1</mn></mrow><mi>n</mi></msubsup><msubsup><mi>B</mi
   ><mi>i</mi><mn>2</mn></msubsup></mrow></msqrt></mrow></mfrac></mrow>
   :MATH]
   (2)

   where A[i] and B[i] are components of vector A and B respectively. The
   range of functional similarity is zero (no similarity of two modules)
   to one (two modules have same function), and a larger value for
   similarity represents more functional similarity between two modules.
   The combined score equals 0.5*((BP similarity+CC similarity+MF
   similarity)/3) +0.5*Pathway similarity.

Shortest paths between drug targets and seed genes

   We searched for relevant drugs by searching for disease keywords in the
   ‘indications’ field of drug information obtained from the DrugBank
   database, which combines detailed drug data with comprehensive
   drug-target information[[94]43,[95]44]. Shortest paths, a significant
   topological quantity, are often used for the analysis of social and
   biological networks, such as the well-known small world property of
   many complex networks [[96]26,[97]45,[98]46]. We used the Dijkstra’s
   algorithm to find the shortest path lengths between epilepsy drug
   targets and seed genes[[99]47]. To obtain random controls for the
   target-seed gene, we generated 100 independent randomized samples in
   the PPI network. Significant difference was calculated statistically
   using Student’s test.

Results

MeSH heading of RE subtypes

   Using epilepsy as the keyword, MeSH terms related to RE were extracted
   and evaluated from the MeSH vocabulary. As shown in [100]Table 1, 23
   MeSH headings were identified, which were typical and significant
   disease phenotypes, such as lissencephaly and myoclonic epilepsy
   (“Epilepsies, Myoclonic” as the MeSH heading of Dravet syndrome which
   was one of the best experimentally studied drug resistant primary
   epilepsy syndromes). The included syndromes are basically epilepsy
   syndrome. Individual syndrome belongs to always accompanied by
   epileptic seizures. Targets and pathways cross associated with epilepsy
   syndrome, but not a typical epilepsy syndrome.
   Anti-N-Methyl-D-Aspartate Receptor Encephalitis was known not long
   time, but Alexander Disease was known for a long time.

Table 1. Twenty three MeSH headings of RE subtypes.

   NO. MeSH heading Unique ID NO. MeSH heading Unique ID
   1 Alexander Disease D038261 13 Malformations of Cortical Development
   D054220
   2 Anti-N-Methyl-D-Aspartate Receptor Encephalitis D060426 14 MELAS
   Syndrome D017241
   3 Classical Lissencephalies and Subcortical Band Heterotopias D054221
   15 MERRF Syndrome D017243
   4 Epilepsies, Myoclonic D004831 16 Mitochondrial Encephalomyopathies
   D017237
   5 Epilepsy, Absence D004832 17 Myoclonic Epilepsies, Progressive
   D020191
   6 Epilepsy, Frontal Lobe D017034 18 Myoclonic Epilepsy, Juvenile
   D020190
   7 Epilepsy, Temporal Lobe D004833 19 Neuronal Ceroid-Lipofuscinoses
   D009472
   8 Fragile X Syndrome D005600 20 Rett Syndrome D015518
   9 Lafora Disease D020192 21 Spasms, Infantile D013036
   10 Landau-Kleffner Syndrome D018887 22 Tuberous Sclerosis D014402
   11 Leigh Disease D007888 23 Unverricht-Lundborg Syndrome D020194
   12 Lissencephaly D054082
   [101]Open in a new tab

   The MeSH headings which were listed in this manuscript all were
   according to the consensus of International League Against Epilepsy
   (ILAE) [[102]48,[103]49]. In this study, we focused on several key MeSH
   headings, such as Dravet syndrome, Lafora disease and refractory
   temporal epilepsy. Considering the complexity of epilepsy syndrome,
   other MeSH headings were all important related disorders or diseases
   (e.g. curable temporal epilepsy), which were used for comparable
   purposes. We hope to further understand and identify the new general or
   individual targets of refractory epilepsies through the comparable and
   extended research.

   Twenty-three MeSH headings were integrated in order to make comparisons
   to understand the reliability of target screening for correlation with
   disease phenotype.

RE associated genes and their functional characteristics

   After filtering the relationships with significant correlations (i.e.
   p-value < 0.05), total 3,219 disease-gene relationships were obtained
   with the corresponding number of occurrences in the CoreMine PubMed
   search engine system[[104]27]. All literatures related to the 3,219
   relationships in the PubMed database were analyzed and only 1,852
   disease-gene relationships with 1,065 distinct genes were identified.
   There were additional 24 disease-gene relationships with 21 distinct
   genes, which were not included in the CoreMine data sources by checking
   the Online Mendelian Inheritance in Man (OMIM)[[105]28] and
   DiseaseConnect[[106]29] databases. All together we finally obtain 1,086
   RE-related genes for further analysis. To further validate the
   reliability of the genes, we also conducted the external validation
   analysis of epilepsy disease-gene associations using the latest data
   from the high-quality genotype-phenotype associations of Human
   Phenotype Ontology (HPO) database [[107]50]. Using “seizure or
   seizures” as keywords, we obtained 662 genes from HPO disease-gene
   associations that are related to disease phenotypes with seizure
   manifestations. There were 215 (215/662 = 32.5%) genes were overlapped
   with the 1086 RE-related genes, which has 5.70-folds over random
   expectations (p-value<9.39E-100; binomial test). This result indicates
   that our curated disease-gene associations have significant overlap
   with the gene list associated with seizure phenotypes and thus further
   means that our data is reliable for further analysis. The disease-gene
   association network with 1876 links was visualized accordingly and it
   showed clearly that the 23 RE related disease phenotypes (in terms of
   MeSH headings) both have their own distinct genes and many shared genes
   among them ([108]Fig 2A). In addition, we found that most (711/1086 =
   65.47%) of the genes related to just one disease subtype ([109]Fig 2B).
   However, substantial ratio (~35%) of the genes is associated with
   multiple disease subtypes. For example, SCN1A associates with 10
   disease subtypes, while the four genes: KCNQ2, NHLRC1, PCDH19 and SCN1B
   associate with 9 subtypes.

Fig 2. Refractory Epilepsies (RE) disease-gene network and relationships
statistics.

   [110]Fig 2
   [111]Open in a new tab

   (A) Disease-gene network showing the relationships of 1,086 genes
   corresponding to 23 MeSH Headings. The long label big nodes denote
   diseases in MeSH terminologies; the small nodes denote genes. The size
   of the node is positively related to the number of linked genes.
   Different colors represent different diseases. Genes located in the
   center of the network are associated with several diseases (e.g. SCN1A,
   KCNQ2). Genes located at the periphery of the network are associated
   with a single disease (e.g. CHRNA5, CHRNB4). (B) The gene-related MeSH
   heading numbers are distributed from 1 to 10, with different numbers
   for different colors. The total number of seed genes is 1086, of which
   711 (accounting for 65.47%) are only associated with one disease
   subtype (MeSH heading).

   To obtain the functional descriptions of RE associated genes, we
   performed the enrichment analysis of the GO and pathways of these
   genes[[112]51]. GO contains three hierarchically structured
   vocabularies which describe gene products in terms of their connected
   biological processes (BP), cellular components (CC) and molecular
   functions (MF)[[113]39]. We obtained 1014 enriched BP terms, 214 CC
   terms, and 176 MF terms ([114]S1–[115]S3 Tables) of the RE associated
   genes. For example, the specific level enriched GO terms that
   characterize the functions of these genes include, BP terms: ion
   transport (fifth level), generation of neurons (seventh level), and
   regulation of transmission of nerve impulse (sixth level); CC terms:
   neuron projection (fifth level), ion channel complex (fifth level), and
   postsynaptic membrane (fifth level); and MF terms: excitatory
   extracellular ligand-gated ion channel activity (eighth level),
   neurotransmitter receptor activity (fifth level), and NADH
   dehydrogenase (ubiquinone) activity (seventh level). This indicates
   that the well-known molecular mechanisms of RE that involve nervous
   system development and intercellular signaling transduction
   [[116]52–[117]54].

   In addition, it also showed the potential novel mechanisms of RE, which
   included the enriched GO terms, such as “regulation of programmed cell
   death” (sixth level), “apoptotic process” (sixth level), “response to
   ethanol” (sixth level) and “response to nicotine” (sixth level). Recent
   study suggested that neuronal death seems closely linked to
   epileptogenesis, but the effect of seizures on neuronal death and the
   role of seizure-induced neuronal death in acquired epileptogenesis need
   further investigation[[118]55,[119]56]. Similarly, cigarette smoking
   and alcohol misuse were considered as the behavioral risk factors
   associated with epilepsy [[120]57,[121]58], which indicated that the
   mechanism for nicotine-induced and alcohol-induced seizure would be
   valuable for further investigation.

   Furthermore, we identified 55 enriched pathways (corrected
   p-value<0.05) in the Reactome database ([122]S4 Table), in which most
   pathways were consistent with the results of GO enrichment analysis.
   For example, the pathways, such as “mTOR signaling pathway”, “highly
   calcium permeable postsynaptic nicotinic acetylcholine receptors”,
   “potassium Channels”, “GABA A receptor activation”, “unblocking of NMDA
   receptor” and “glutamate binding and activation”, have close
   relationships to RE. It is well established that the abnormal
   activities of ion channels, neurotransmitters, and NMDA receptors lead
   to synaptic euphoria, which may also be involved in the onset of
   epileptic seizures[[123]59]. Furthermore, the citric acid cycle (TCA)
   and respiratory electron transport, which had important roles in RE
   development, were also identified to be enriched pathways. It is
   well-known that epilepsy has a strong genetic background with the
   combined risk related to hundreds or thousands of genes. Identifying
   the gene network or pathways that underlying epilepsy is important for
   detection of new targets for anti-epilepsy medications. Other enriched
   pathways, such as “presynaptic nicotinic acetylcholine receptors”,
   “signaling by insulin receptor”, may play a role in the development of
   RE and its therapeutic responses [[124]60–[125]62].

Disease modules that reveal the underlying molecular mechanisms of RE

   Disease associated genes are not distributed randomly on the molecular
   interaction network and they tend to work together in similar
   biological modules or pathways[[126]63]. To investigate the underlying
   molecular modular mechanisms of RE, we obtained high quality human PPI
   network (filtered by a significant score > = 700) from the STRING 9.1
   database [[127]64,[128]65]. Finally, using a widely used community
   detection algorithm (see [129]Methods), we identified 314 topological
   modules covering 13,733 proteins and 136,800 edges in the human PPI
   network. To obtain the RE disease modules, we calculated the overlap
   ratio (in terms of Relative Risk, RR)) between RE associated genes and
   the genes of each topological module. There were 921 (921/1086 = 84.8%)
   associated genes distributed in 185 (185/314 = 58.9%) modules, which
   contained 12,084 (12,084/13,733 = 88.0%) genes/proteins in total.

   We combined two filtering conditions to obtain the significant
   associated modules for RE disorders. Firstly, we calculated the
   correlation between 185 modules and RE disorders by Chi-Square test
   (with Bonferroni correction) and finally we found that there are five
   modules with corrected p-value<0.01, in which module M37 is the last
   top ranked module with corrected p-value ~3.67e-4 ([130]S5 Table). It
   suggested that those five modules may have a significant correlation
   with RE. Interestingly, all proteins (i.e., GRM1-8) of module M145 were
   seed genes of RE, which form a dense network corresponding to the
   glutamate receptor complex. Module M37 has 305 genes and overlapped 45
   (45/305 = 14.8%) seed genes with RE associated gene list and thus has a
   RR value 1.94. In fact, there are many modules with less number of
   genes, which although have not passed the most conserved type of
   multiple testing correction method, have much larger RR values over the
   RR value of M37. Therefore, finally we consider the RR value of M37 as
   the threshold to filter the significant associated modules of RE
   disorders, which include 42 modules ([131]S6 Table). It is these 42
   primary modules are used for further analysis.

   To distinguish the core modules from those 42 significant modules, we
   constructed a network with nodes representing the modules and links
   representing shared PPI interactions between them ([132]Fig 3). It
   showed that there are six modules (M37, M65, M80, M114, M155, and M197)
   with significantly higher degrees than that of the other modules
   ([133]Table 2), which means that these six modules are the central
   modules underlying RE molecular pathologies[[134]66].

Fig 3. Interactions between 42 topological modules.

   [135]Fig 3
   [136]Open in a new tab

   Red nodes: seed genes of modules > = 10; blue nodes: seed genes of
   modules < 10; green nodes: extensional modules. Node size corresponds
   to degree of module. The thickness of the edge is proportional to the
   weight, and the weight corresponds to strength of interactions.

Table 2. Top one enriched GO terms in the main ten modules.

   Module Size Degree BP CC MF
   GO term p-value GO term p-value GO term p-value
   M37 305 10 Oxidative phosphorylation < 1E-100 Mitochondrial membrane
   part <1 E-100 Hydrogen ion transmembrane transporter activity < 1E-100
   M65 184 16 Ion transport 2.29E-43 Calcium channel complex 9.34E-47
   Gated channel activity 3.87E-53
   M80 52 15 Regulation of TOR signaling cascade 8.69E-13 Cytosol 1.31E-07
   Protein serine/threonine kinase activity 4.76E-05
   M114 53 11 Regulation of fatty acid oxidation 4.68E-22 AMP-activated
   protein kinase complex 1.90E-11 Protein serine/threonine kinase
   activity 2.68E-15
   M145 7 3 Negative regulation of cyclase activity 8.16E-10 Integral to
   plasma membrane 2.81E-07 Glutamate receptor activity 8.27E-19
   M155 141 13 Ion transport 4.02E-69 Ion channel complex 6.63E-83 Gated
   channel activity 6.01E-98
   M188 43 6 Synaptic transmission, cholinergic 1.10E-28 Nicotinic
   acetylcholine-gated receptor-channel complex 5.90E-38 Nicotinic
   acetylcholine-activated cation-selective channel activity 9.95E-35
   M197 50 11 Membrane invagination 1.35E-09 Coated pit 5.83E-07
   Lipoprotein binding 1.68E-09
   M208 34 6 Muscle organ development 3.62E-09 Dystrophin-associated
   glycoprotein complex 1.09E-42 Calcium ion binding 8.94E-04
   M230 45 8 Protein localization 6.44E-15 Endosome 3.62E-29
   [137]Open in a new tab

   Size means the number of proteins in each module. Degree corresponds to
   number of edges connected with module. The function of a module is very
   similar to those of the proteins in it. The p-value is negatively
   related to the enrichment.

   We further examined the enriched GO terms of the 42 modules ([138]Table
   2, [139]S7 Table), in which we only list the top enriched specific GO
   terms. For example, the three modules, namely M37, M155 and M65, have
   enriched terms with lowest p-values, which include the BP terms:
   oxidative phosphorylation, ion transport, and synaptic transmission;
   the CC terms: mitochondrial inner membrane, channel complex, and
   synapse part; and the MF terms: hydrogen ion transmembrane transporter
   activity, ion channel activity, and gated channel activity.

   Most modules (except module M139 and M150) have their enriched pathways
   using six pathway databases in KOBAS 2.0. However, we only obtained 317
   enriched pathways (corrected p-value<0.05) for 28 modules using the
   Reactome database. The 317 pathways were classified into two types
   according to whether it was located in a single module or multiple
   modules. We call the former as “individual pathway” and the latter as
   “common pathway” for RE. Statistical analysis yielded 223 individual
   pathways located on 27 modules, and 94 common pathways distributed
   across 20 modules. We generated a heat map to display the 94 common
   pathways corresponding to their enrichments in 20 modules ([140]Fig 4).
   It showed that the common enriched pathways also referred to the branch
   of transmission across chemical synapses, neurotransmitter receptor
   binding and downstream transmission in the postsynaptic cell.
   Neurotransmitter secretion is triggered by the influx of Ca^2+ through
   voltage-gated channels, which gives rise to a transient increase in
   Ca^2+ concentration within the presynaptic terminal. Module M37, M65
   and M155 had significant enrichment pathways (red color) among the 20
   modules. For module M37, the highly enriched individual pathway was
   “Respiratory electron transport, ATP synthesis by chemiosmotic
   coupling, and heat production by uncoupling proteins”. Module M65 was
   closely related to the “Activation of NMDA receptor upon glutamate
   binding and postsynaptic events” “Glutamate Binding, Activation of AMPA
   Receptors and Synaptic Plasticity” “Trafficking of AMPA receptors” and
   “Unblocking of NMDA receptor, glutamate binding and activation”. Module
   M155 was most related to the “activation of voltage-gated potassium
   channels”, “GABA receptor activation”. NMDA receptor-mediated signal
   transduction is critical for synaptic plasticity. In acute and chronic
   seizures, a selective NMDA receptor antagonist has broad clinical
   application prospects[[141]5]. It has been shown that CaV3.2 channels
   regulate NMDA receptor mediated transmission and subsequent NMDA
   receptor dependent plasticity of AMPA-R-mediated transmission[[142]67].

Fig 4. Heat map of 94 common pathways corresponding to enrichment in 20
modules.

   [143]Fig 4
   [144]Open in a new tab

   X-axis: pathway; Y-axis: module. Color key from blue to red is
   representative of low to high enrichment (p value from large to small).
   If the cross of module and pathway form a black cell, it refers to that
   the pathway in the module is not enriched.

   To further detect the functional similarity between two module pairs,
   the vectors corresponding to each module were constructed using their
   GO terms and pathways as features, and the cosine similarity of the
   module vectors was calculated to quantify the similarity between the
   two modules pairs. A combined score was then used to comprehensively
   evaluate the functional similarity between these modules. The combined
   scores of the top seven module pairs are shown in [145]Table 3 ([146]S8
   Table). As we can see, module M65 and M155 exhibited the most
   functional similarity among the 42 modules.

Table 3. Combined score of top seven pair of modules.

   Module1 Module2  GO-BP  GO-MF GO-CC  Pathway Combined score
     M65    M155    0.615  0.814 0.557   0.364      0.513
    M279    M155   0.00572 0.323 0.0173  0.753      0.434
     M80    M114   0.0966  0.644 0.240   0.515      0.421
     M65    M188    0.358  0.490 0.356   0.268      0.335
    M271    M225    0.446  0.323 0.175   0.332      0.323
    M271     M24   0.0747  0.347 0.424   0.317      0.299
    M188    M155    0.306  0.417 0.386   0.178      0.274
   [147]Open in a new tab

   Only 36 (36/42 = 85.7%) modules that had functional similarity were
   classified into five clusters using the community detection algorithm
   in Gephi software. Each cluster corresponds well to the RE MeSH
   headings related to the PPI topological modules (RR>1.0). These
   allocations were based on the observation that genes causing similar
   diseases tend to link to each other in the interactome
   [[148]68,[149]69]. The patients with similar clinical manifestations
   may have different underlying disease mechanisms. Moreover, gene
   products linked to the same phenotype have a strong tendency to
   interact with each other and to cluster in the same network
   neighborhood[[150]70]. It showed that most of the MeSH headings map
   multi-modules, especially Myoclonic epilepsies ([151]Table 4).

Table 4. The number of modules corresponded to MeSH headings.

   MeSH heading Number of Module Module MeSH heading Number of Module
   Module
   Alexander Disease 1 M80 Fragile X Syndrome 5 M88,M145,M230
   M234,M271
   Leigh Disease 1 M37 Myoclonic Epilepsy, Juvenile 6 M83,M105,M140
   M145,M155,M266
   MERRF Syndrome 1 M208 Epilepsy, Temporal Lobe 7 M63,M105,M145,M150
   M155,M225,M232
   Mitochondrial Encephalomyopathies 1 M300 Spasms, Infantile 7
   M53,M63,M80,M139
   M155,M208,M253
   Epilepsy, Frontal Lobe 2 M188,M266 Epilepsy, Absence 8
   M65,M105,M145,M155
   M188,M266,M271,M279
   MELAS Syndrome 2 M37,M300 Rett Syndrome 8 M26,M50,M53,M145
   M225,M234,M252,M271
   Classical Lissencephalies and Subcortical Band Heterotopias 3
   M34,M80,M188 Tuberous Sclerosis 8 M24,M80,M83,M114
   M129,M179,M230,M234
   Unverricht-Lundborg Syndrome 3 M27,M63,M129 Malformations of Cortical
   Development 9 M34,M65,M80,M88,M112
   M145,M165,M208,M225
   Lafora Disease 4 M27,M114,M155,M230 Myoclonic Epilepsies, Progressive 9
   M27,M53,M114,M129,M150
   M155,M190,M225,M232
   Lissencephaly 4 M80,M114,M197,M208 Epilepsies, Myoclonic 12
   M63,M105,M112,M136,M139,M142
   M155,M179,M188,M219,M266,M300
   Neuronal Ceroid-Lipofuscinoses 4 M27,M96,M188,M229
   [152]Open in a new tab

   We have further investigated the associated modules corresponded to
   each MeSH headings and actually we found that there are distinct
   modules associated to some specific RE MeSH headings ([153]Table 4,
   e.g. module M197 for Lissencephaly, M136 for Myoclonic Epilepsies),
   which means that some RE disease subtypes have their corresponding
   distinct molecular mechanisms. Furthermore, we found that different
   MeSH headings have substantial number of shared modules, which makes
   some modules such as M155 and M80 associated to multiple MeSH headings.
   This suggested that these modules (e.g.M155) would be the common
   underlying molecular network of different RE subtypes. These
   indications could be further confirmed from the results of the shared
   genes (using Jaccard similarity) between each RE MeSH headings ([154]S9
   Table). We found that most RE MeSH headings have some degree of shared
   genes between each other.

   Moreover, frontal lobe epilepsy and temporal lobe epilepsy have very
   different gene distributions within their modules. Frontal lobe
   epilepsy with 49 associated genes correlates to two modules when
   RR>1.0, while temporal lobe epilepsy with 185 associated genes
   correlates to seven modules when RR>1.0. The therapeutic effect of
   different brain regions affected by disease varies considerably, which
   can, to some extent, be demonstrated from the difference in module
   distribution. Although the module distribution is disparate, there were
   functional similarities between the modules, which might indicate
   shared similar molecular pathologies between them.

The underlying mechanisms of AEDs from the perspective of drug-target-gene
interactions

   Related antiepileptic terms, such as anticonvulsant, antiseizure, and
   antiepileptic, were used to obtain the names of current AEDs from the
   DrugBank and Sider databases [[155]43]. Finally, 68 drugs and their 119
   corresponding targets were identified and obtained ([156]Fig 5A). Most
   drugs target only a few proteins, but some, like Zonisamide and
   Diazepam, have many targets. It has been suggested that the eight GABA
   receptors such as GABRAx, GABRB1, and GABRD are targeted by more than
   20 drugs each, and were the top eight proteins targeted by these drugs.
   The top seven drugs with most targets included Diazepam, Nitrazepam,
   and Primidone. They all had more than 20 target proteins, which were
   GABA receptors, and had significant overlap between each other. The
   distribution of drugs per target and the distribution of targets per
   drug are shown in [157]Fig 5B and 5C. There were 18 (18/68 = 26.5%)
   drugs with only one target, and 57 (57/119 = 47.9%) target proteins
   targeted by only a single antiepileptic drug. This suggested that new
   drugs tend to bind known target proteins, which had shown obvious
   limitations in the treatment of epilepsy. The 119 drug targets were
   mapped to 13,733 proteins, with an overlap for 105 (105/119 = 88.2%)
   drug targets. The minimum shortest distances between 105 drug targets
   and 921 seed genes were measured to analyze their interaction path
   lengths. The actual mechanism through which a drug acts may be unknown,
   however, the number of molecular steps between a drug target and the
   corresponding disease cause can be estimated by the shortest distance
   [[158]44]. A clear enrichment was observed in regions with the lower
   shortest distances, in comparison with randomized gene groups of
   similar size and pairing ([159]Fig 5D; p-value = 3.63E-126, t-test),
   showing the high direct-target effect of current AEDs. Moreover, this
   provided evidence that most AEDs directly corresponding molecular
   markers that aid the understanding of the cause of RE.

Fig 5. Refractory Epilepsies (RE) existing drug-target network, statistical
analysis about drug- target and minimum shortest path analysis.

   [160]Fig 5
   [161]Open in a new tab

   (A) Drug-target network. Blue nodes: drugs; red nodes: targets. The
   network was generated by using the known associations between drugs and
   their targets from the DrugBank and Sider databases. The size of each
   drug (target) node is proportional to the number of targets that the
   drug has (the number of drugs targeting the protein), respectively. A
   link is placed between a drug node and a target node if the protein is
   a known target of that drug. One drug can target multiple proteins, and
   one protein can be targeted by multiple drugs. (B) Distribution of
   drugs with respect to the number of their targets; (C) Distribution of
   targets with respect to the number of effective drugs. (D) Distribution
   of the shortest distances of the actual data and random protein groups.

   Drugs do not target diseases equally, but are clearly enriched in some
   modules[[162]44]. To quantify this effect and investigate the
   distribution of drug targets, the shortest path length between the 105
   drug targets and the 314 modules was calculated. The results showed
   that M155 had the greatest number of targets (19 direct targets and 8
   targets as the first neighbors of seed nodes), followed by M65 (16
   direct targets and 8 targets as the first neighbors of seed nodes). The
   other modules had <7 targets ([163]S10 Table).

   The most important modules that represented the greatest correlation
   were selected as the modules that included more drugs that are
   available. To understand the degree of a module in the RE drug
   discovery, it would be relevant to normalize the number of drugs by the
   number of proteins of each module. The number of drugs corresponding to
   each module was determined statistically ([164]S10 Table). M155 had the
   greatest number of drugs (48 drugs), followed by M65 (20 drugs). Module
   M95 was not in the primary 42 modules, although it had 21 drugs. The
   other modules had <11 drugs. The results indicated that modules M155
   and M65 had the greatest potential with RE pharmacology and were
   considered the most important modules for drug discovery. The two
   modules had the proteins, such as KCNA2, GABRA1, and GRIN2A, which had
   been recently be discovered as novel epilepsy genes [[165]71].

Novel potential drug targets detected in RE disease modules

   A disease module, a local neighborhood of the interactome whose
   perturbation is associated with epilepsy, can be mechanistically linked
   to a particular disease phenotype[[166]70,[167]72]. The precise
   identification of such disease modules could help with the elucidation
   of molecular mechanisms, identification of new disease genes, and
   related signaling pathways, and aid with rational drug target
   identification[[168]69]. Although module M37 had the greatest number of
   seed genes, there were no available AEDs for module M37, which had only
   10 (10/305 = 3.3%) proteins targeted by drugs for the treatment of
   non-epileptic disorders. Module M155 had the most seed genes and the
   greatest number of drugs when RR> = 2.0. From the aforementioned
   functional analysis, Module M155 can be identified as one of the most
   important modules for drug discovery. The most enriched terms are ion
   transport for BP, ion channel complex for CC, gated channel activity
   for MF, and the enriched pathway is the activation of voltage gated
   potassium channels, which is the only pathway under voltage gated
   potassium channels in the Reactome database. Potassium channels are
   important determinants of seizure susceptibility by modulating the
   electrical activity of neuronal and non-neuronal cells in the brain.
   The 141 proteins of module M155 were classified into six types on the
   basis of whether seed gene or drug target ([169]Fig 6A) and there were
   43(43/141 = 30.5%) proteins located in the pathway of activation of
   voltage-gated potassium channels.

Fig 6. Disease module network.

   [170]Fig 6
   [171]Open in a new tab

   (A) Protein classification network of module M155. The protein nodes
   are color-coded. Light pink nodes: currently known epilepsy drug
   targets which are not seed genes; green nodes: currently known drug
   targets, not specific for epilepsy, which are not seed genes; red
   nodes: currently unknown drug targets which are also not seed genes;
   blue nodes: currently known epilepsy drug targets which are also seed
   genes; yellow nodes: currently known drug targets—seed genes, not
   specific for epilepsy; purple nodes: currently unknown drug
   targets—seed genes, not specific for epilepsy. (B) Protein
   classification network of module M65. The node coloring is the same as
   that of module M155.

   In order to be considered as a novel drug target for epilepsy, the
   protein still needs to fulfill the following two conditions: 1) no
   known anti-epilepsy drugs target this protein; 2) The target should
   exist in the most enriched pathways related to epilepsy. Considering
   these two conditions, we finally identified six proteins (i.e. KCNA1,
   KCNA4, KCNA5, KCNA6, KCNC3 and KCND2) from M155 as the potential drug
   targets ([172]Fig 6A).

   The aforementioned functional similarity analysis revealed that module
   M155 and M65 had the largest correlations. Drugs distribution showed
   that module M65 had the second highest number of drugs when RR > = 2.0.
   Similarly, we, obtained four proteins (i.e. KCNMA1, CAMK2G, CACNB4 and
   KCNMB3) from module M65 as potential drug targets ([173]Fig 6B), and
   one (GRM1) from module M145 (actually M145 is a protein complex). The
   expression of 11 proteins at the tissue and cellular levels are shown
   in [174]Table 5. Specific gene mutations occurring exclusivelyin the
   brain can lead to RE, which was identified as one of the causes of RE
   [[175]4]. Hippocampal sclerosis and dysplasia of the cerebral cortex
   have been considered vitally important pathogenic factors of
   RE[[176]73,[177]74]. KCNMB3 is not expressed in the cerebral cortex and
   hippocampus; therefore, it could not be considered as a candidate drug
   target. In conclusion, ten potential drug targets were finally
   identified for RE in this study.

Table 5. Expression of potential drug targets in tissues and cells.

   Tissue Cerebral Cortex Hippocampus
   Cell Neuronal cells Glial cells Endothelial cells Neuropil Neuronal
   cells Glial cells
   M155 KCNA1 KCNA5 KCNA1 KCNA1 KCNA1 KCNA6
   KCNA4 KCNA6 KCNA4 KCNA4 KCNA4
   KCNA5 KCNA5 KCNA5 KCNA6
   KCNA6 KCNA6 KCNA6 KCNC3
   KCNC3 KCND2 KCNC3
   KCND2
   M65 CAMK2G CAMK2G CACNB4 KCNMA1 CAMK2G CACNB4
   CACNB4 CACNB4 CAMK2G CACNB4
   CACNB4
   M145 GRM1 GRM1 GRM1 GRM1 GRM1
   [178]Open in a new tab

   The target tissue information can be searched in the human protein
   atlas database.

Discussion

   In this study, network medicine approaches were used to integrate the
   data from multiple databases to investigate the molecular mechanisms
   and possible drug targets of RE. Total 1086 genes associated with RE
   were curated after validating each disease-gene relationships from
   medical literature. This gene list might be the most comprehensive
   phenotype-genotype association data repository for studying RE
   molecular mechanisms. Forty two primary disease-related gene networks
   (modules) were constructed and selected based on the interaction
   information of gene-encoded proteins in which about 35 modules had
   significant interactions and functional similarity between each other.
   These architectures of interaction module indicated the complicated
   molecular mechanisms of RE and the possible pharmacological targets of
   RE personalized treatment. The detected RE-related modules provided a
   potential subtyping of RE from molecular network perspectives. For
   example, three significant core modules (M155, M65 and M145) may
   contain potential drug targets. The protein components of these three
   modules were consistent with the current understanding of pathological
   mechanisms and pharmacology of epilepsy. Besides, the significant
   enriched pathways, for example, “GABA A receptor activation”,
   “unblocking of NMDA receptor, glutamate binding and activation”, are
   likely the main biological mechanisms of RE and deliver highly
   insightful information for its drug discovery.

   The ten potential targets for antiepileptic medications could be
   classified into two groups including the ion channel (potassium and
   calcium channel) and glutamate receptors. Consistently, the KCNAs,
   components of potassium channel, were detected as the potential drug
   target for RE in this study. Ezogabine or retigabine was an
   antiepileptic drug that reduces neuronal excitability by enhancing the
   activity of the KCNQ [179]potassium channels; KCNAs (e.g. KCNA1 and
   KCNA6) and KCNQ2 are associated with peripheral nerve hyperexcitability
   in humans[[180]75]. Based on the aforementioned network analysis, KCNA1
   and KCNA6 belong to M155 which is the most important module for
   epilepsy. Both KCNA1 and KCNA6 involved in cell signal transduction.
   Taken together, KCNA1 and KCNA6 might have the potential to become the
   best novel targets for antiepileptic drug discovery. CACNB4 could
   directly couple electrical activity to gene expression, which was
   responsible for a type of juvenile myoclonic epilepsy[[181]76]. Recent
   studies had shown that ganoderma lucidum polysaccharides may inhibit
   calcium overload and promote CaMK II α expression to protect epileptic
   neurons[[182]77].

   However, we searched the drugs for these ten potential targets from the
   Drugbank database, and found that most of the drugs were general
   anesthetics (Enflurane, Methoxyflurane, and Sevoflurane, etc.) or
   potassium channels blockers (Dalfampridine, Amitriptyline). Although
   the ten potential drug targets that were identified in this study have
   approved drugs but they are not antiepilepsy drugs. This means that
   these ten drug targets might be the novel targets for epilepsy
   treatment since they located in the significant associated modules of
   RE disorders. In addition, these approved drugs for other diseases
   might have the opportunity to be repurposed for RE treatment as well.
   As a matter of fact, six of the ten potential targets (i.e. KCNA1,
   KCNA4, KCNA5, KCNA6, KCNC3 and KCND2) have been already used in
   clinical practice. These proteins are the targets of Dalfampridine,
   which is a potassium channel blocker used to help multiple sclerosis
   patients walk [[183]78]. Dalfampridine is a neurofunctional modifier
   and the first drug that was specifically approved to help with mobility
   in these patients. However, Dalfampridine may cause seizure as its
   serious side effect [[184]79,[185]80]. This meant that there were still
   no particular drugs aiming for these ten potential targets to treat
   epilepsy. Nevertheless, our results indicate that further studies to
   elucidate the precise intervention of these targets might be valuable
   for novel drug development of RE.

   Our results showed that the curated RE-related 1086 genes were enriched
   in the nicotine addiction pathway. It reported that nicotine addiction
   could cause seizures in human subjects and reduction in the activity of
   the glutamate transporter type 3, leading to a decrease in glutamate
   uptake[[186]57,[187]81]. GRM1 was closely related to the RE
   pathological mechanism, which made it a potential new target for
   antiepileptic medications. The nicotine addiction pathway was also
   closely tied to the GABAergic synapse [[188]82]. More than half of GABA
   receptors were distributed in module M155. Therefore, module M155 may
   provide the potential molecular network mechanism for nicotine-induced
   seizures.

   Furthermore, the modules: M37, M80, M114, and M197, which had no
   available AEDs targets, also played central roles in the molecular
   network of RE. Functional analysis of these modules showed that M80 and
   M114 were mainly relevant to mTOR signaling pathway. This is consistent
   with that mTOR signaling pathway plays a role in RE
   physiopathology[[189]83]; Module M37 was another significant network
   module because its most important function was oxidative
   phosphorylation—a function that was crucial for the development of
   epilepsy[[190]84]; In our previous studies[[191]23], four proteins
   (MT-CYB, UQCRB, UQCRC1 and UQCRH) were identified to be potential drug
   targets for RE. All these four proteins related to oxidative
   phosphorylation are the components of module M37. The genes of module
   M197 were mainly enriched in lipoprotein metabolism and HDL-mediated
   lipid transport pathways. A previous study showed that long-term AEDs
   therapy could significantly increase the total cholesterol and
   high-density lipoprotein cholesterol (HDLC). Moreover, the effect was
   more pronounced with HDLC[[192]85]. Therefore, the further exploration
   of possible targets that interact with these modules would be much
   valuable for RE related drug pharmacological research.

   However, our results would also be influenced by incomplete data and
   may contain bias introduced by available publications. As for the
   topology of modules, different algorithms may produce different
   community detection results that might influence some specific results.
   In addition, this study did not experimentally validate the 10
   potential drug targets in vivo or vitro. This study is also not
   currently involved in clinical practice, but similar bioinformatics
   analysis has already applied to clinical cases [[193]86]. However, we
   believe that our results provide novel insights on molecular mechanisms
   of RE. Furthermore, the curated RE-related genes and identified modules
   serve a useful resource for the discovery of potential drug targets for
   RE.

   In the further research, we will validate some potential effective
   targets through molecular biology and animal model. We will also apply
   computer aided drug design (CADD) related software, such as DOCK, Auto
   Dock, and MOE, to facilitate potential drug design through docking
   computation[[194]87]. It is well known that both RE and cancer have the
   issue of drug resistance in clinical practice. Previous studies showed
   that there were possible similar molecular mechanisms underlying the
   drug resistance in both RE and cancer [[195]88]. We may reveal the
   potential molecular mechanism of drug resistance in RE and make
   comparison between RE and cancer through network analysis.

Supporting information

   S1 Table. GO enrichment analysis- Biological Processes (BP) terms of RE
   seed genes.

   Here we just showed the top 200 BP terms (total 1014 BP terms).

   (DOCX)
   [196]Click here for additional data file.^ (25.6KB, docx)
   S2 Table. GO enrichment analysis- Cellular Components (CC) terms of RE
   seed genes.

   Here we listed all the 214 CC terms.

   (DOCX)
   [197]Click here for additional data file.^ (27.1KB, docx)
   S3 Table. GO enrichment analysis- Molecular Functions (MF) terms of RE
   seed genes.

   Here we listed all the 176 MF terms.

   (DOCX)
   [198]Click here for additional data file.^ (24.9KB, docx)
   S4 Table. Reactome Pathway enrichment analysis of RE genes.

   (DOCX)
   [199]Click here for additional data file.^ (19.8KB, docx)
   S5 Table. Correlation between 185 modules and RE disorders.

   (DOCX)
   [200]Click here for additional data file.^ (27.1KB, docx)
   S6 Table. Primary 42 modules filtered by RR values.

   (DOCX)
   [201]Click here for additional data file.^ (18.2KB, docx)
   S7 Table. GO enrichment analysis of all 42 modules.

   We listed the smallest p-value terms of each module. If the smallest
   p-value >0.05, we leave it blank.

   (DOCX)
   [202]Click here for additional data file.^ (20.7KB, docx)
   S8 Table. Combined score of module pair.

   (DOCX)
   [203]Click here for additional data file.^ (49.1KB, docx)
   S9 Table. Jaccard similarity between MeSH headings.

   Here we listed all the 193 records.

   (DOCX)
   [204]Click here for additional data file.^ (27.8KB, docx)
   S10 Table. Drugs and targets distributed in modules.

   (DOCX)
   [205]Click here for additional data file.^ (16.8KB, docx)

Acknowledgments