Abstract

   Mental health disorders emerge from complex interactions among
   neurobiological processes across multiple scales, which poses
   challenges in uncovering pathological pathways from molecular
   dysfunction to neuroimaging changes. Here, we proposed a multiscale
   fusion (mFusion) method to evaluate the relevance of each gene to the
   neuroimaging traits of mental health disorders. We combined
   gene-neuroimaging associations with gene-positron emission tomography
   (PET) and PET-neuroimaging associations using protein-protein
   interaction networks, where various genes traced by PET maps are
   involved in neurotransmission. Compared with previous methods, the
   proposed algorithm identified more disease genes on both simulated and
   empirical data sets. Applying mFusion to eight mental health disorders,
   we found that these disorders formed three clusters with distinct
   associated genes. In summary, mFusion is a promising tool for
   prioritizing genes for mental health disorders by establishing
   gene-PET-neuroimaging pathways.

   Subject terms: Computational models, Gene expression
     __________________________________________________________________

   We introduced mFusion, a method that integrates gene and neuroimaging
   data to identify disease-related genes in mental disorders. By
   analyzing gene interactions and PET data, mFusion successfully clusters
   disorders and highlights critical gene pathways.

Introduction

   Mental health disorders, constituting 16% of the global burden of
   diseases, rank among the leading causes of disability worldwide^[30]1.
   In severe cases, they can diminish life expectancy by 10 to 20
   years^[31]2. Despite substantial progress in understanding the molecular
   mechanisms of brain function in animal models, the rate of successful
   clinical translation to humans remains notably low^[32]3. The primary
   obstacle lies in the current knowledge gap between molecular
   processes^[33]4 and psychiatric symptoms. There exist many complex
   interactions across multiple scales from genes, through
   neurotransmitters, to neural networks. This complexity is compounded by
   the challenge of concurrently collecting multiscale data within the
   human brain. As human brain data accumulate rapidly but separately at
   various scales, there is an urgent need for dedicated analytic methods
   that integrate these data comprehensively, enabling new insights into
   mental health disorders.

   At present, public curated databases such as DisGeNET^[34]5 and CTD
   (Comparative Toxicogenomics Database)^[35]6 catalog disease-related
   genes, but they lack the capacity to establish connections with
   neurotransmitter systems or pathways. Both differential gene expression
   analysis and genome-wide association study (GWAS) analysis fall short
   in addressing this challenge^[36]7, with limited coverage of disease
   phenotypes. Partial least squares (PLS) regression analysis can
   establish associations between genes and imaging phenotypes based on
   spatial molecular distribution patterns in the brain^[37]8,[38]9.
   However, it can only perform pairwise correlation analysis, so a
   method is needed to establish cross-scale pathway associations.

   Neuroimaging studies have identified various alterations in
   neuroimaging features of human brains associated with mental health
   disorders, i.e., spatial distributions of alterations across different
   brain regions in psychiatric patients compared with healthy
   controls^[39]10. Leveraging transcriptomic data from postmortem brain
   tissues^[40]11, researchers have initiated efforts to correlate
   neuroimaging features with gene expressions, prioritizing relevant
   genes and molecular pathways^[41]12. In this way, genes associated with
   neurodevelopment, neuroplasticity, and neurotransmission have been
   implicated in autism spectrum disorder (ASD)^[42]9 and schizophrenia
   (SCZ)^[43]13. Despite this progress, a significant knowledge gap
   persists between gene expression and neuroimaging traits. Recently,
   positron emission tomography (PET) studies have started to reveal
   spatial associations between neurotransmitter receptors/transporters
   and structural/functional traits of mental health disorders in the
   human brain^[44]14,[45]15. Leveraging neurotransmission revealed by
   PET images, this study aims to bridge the gap between gene expression
   and neuroimaging traits for mental disorders. Disease-related genes
   are defined by the four curated disease gene databases listed in
   Table [46]1.

Table 1.

   Four gene-disease databases
   Database | # of SCZ risk genes | # of ASD risk genes | Collection date | URL
   DisGeNet | 2872 (score > 0) | 1071 (score > 0) | June, 2020 (v7.0) | [47]https://www.disgenet.org/
   CTD | 2875 (score > 15.28) | 1071 (score > 29) | June 30, 2023 (17123) | [48]https://ctdbase.org/
   DISEASES | 1548 (Z > 3) | 211 (Z > 3) | March, 2015 | [49]https://diseases.jensenlab.org/Downloads
   PGC-GWAS | 380 (p < 5e-8) | 56 (p < 5e-4) | SCZ: 2022^[50]57 / ASD: 2019^[51]53 | [52]https://pgc.unc.edu/for-researchers/download-results/

   This study proposes a multiscale fusion (mFusion) method to bridge
   genes to mental disorders through establishing links between gene
   expressions in brain tissues, neurotransmissions, and neuroimaging
   traits of these disorders. Leveraging the knowledge in the
   protein-protein interaction (PPI) network made available by the STRING
   database^[54]16, mFusion provides a tool for integrating 15,408 gene
   expression maps from the Allen Human Brain Atlas (AHBA)^[55]17,[56]18,
   45 PET maps across various neurotransmitter
   systems^[57]14,[58]19,[59]20, and neuroimaging traits associated with
   mental disorders. The performance of mFusion was first evaluated by
   numerical simulations, and then demonstrated by application to
   neuroimaging traits of two mental disorders (i.e., autism^[60]9 and
   schizophrenia^[61]13). The ENIGMA (Enhancing NeuroImaging Genetics
   through Meta-Analysis) consortium has reported neuroimaging traits for
   mental disorders by analyzing thousands of neuroimaging scans^[62]21.
   Using these neuroimaging traits, mFusion enabled us to reveal the
   clustering structure for eight major mental disorders.

Results

Overview of mFusion framework

   In this study, mFusion integrated gene expression in brain tissues
   and PET maps for specific proteins (related to the receptors,
   transporters, or release of neurotransmitters) within a PPI network, to
   link neuroimaging traits to genes (Fig. [63]1; Additional file 1:
   Fig. [64]S1) through proteins (measured by PET maps; Table [65]2;
   Supplementary Table [66]S1). First, we computed Z-scores of genes
   or proteins from three types of PLS association independently:
   gene-trait, PET-trait, and gene-PET associations. Second, we
   used the Z-transform test, also referred to as “Stouffer’s
   method”^[67]22, to combine the multiscale Z-scores of a gene. Meanwhile,
   the neighborhood information of the PPI network from the STRING
   database was used to boost the ability to identify disease-related
   genes. Finally, disease category^[68]5 and Gene Ontology (GO)^[69]23
   term enrichment analyses were conducted on the top-ranked genes
   determined by the mFusion methods, to identify important biomolecular
   pathways or processes related to the candidate genes. Further details
   are provided in Methods and Supplementary Fig. [70]S1.
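
   The Z-transform step can be illustrated with a minimal sketch of the
   generic Stouffer formula, Z = sum(w_i * z_i) / sqrt(sum(w_i^2)). The
   function name, the equal default weights, and the three example
   Z-scores are assumptions made for illustration; how mFusion weights the
   three association types and the PPI neighbors is specified in Methods,
   not here.

```python
import math


def stouffer_z(z_scores, weights=None):
    """Combine Z-scores with the (optionally weighted) Z-transform test.

    Hypothetical helper: equal weights are an assumption, not the
    weighting scheme used by mFusion.
    """
    if not z_scores:
        raise ValueError("at least one Z-score is required")
    if weights is None:
        weights = [1.0] * len(z_scores)
    if len(weights) != len(z_scores):
        raise ValueError("weights and z_scores must have the same length")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


# Made-up Z-scores of one gene from three association types.
example_z = [2.0, 1.5, 2.5]
combined_z = stouffer_z(example_z)
print(round(combined_z, 3))
```

   With equal weights the combined score is the sum of the Z-scores
   divided by the square root of their count, so several moderately
   strong associations can yield a larger combined score than any single
   one.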

Fig. 1. The framework and working interface of the “mFusion” method.

   By using partial least squares association to integrate spatial
   correlations of gene expression in the human brain with information
   about neurotransmission and neuroimaging, the mFusion method yields a
   relevance score for each gene and pathway associated with a mental
   disorder, facilitating the identification of top-ranked genes and
   pathways. The fusion method additionally provides potential
   explanations for how the neurochemical architecture (neurotransmission)
   captured in PET images influences gene scores. Subsequent enrichment
   analysis of the top genes identifies biological processes and pathways
   related to the mental disorder.

Table 2.

   Neurotransmission-related PET maps included in analyses
   Protein | Neurotransmitter | Tracer | Measure | n | Age | Reference
   HTR1A | Serotonin | [^11C]CUMI-101 | BP[ND] | 8 (5) | 28.4 ± 8.8 | Beliveau et al.^[73]75
   HTR1A | Serotonin | [^11C]WAY-100635 | BP[ND] | 35 (17) | 26.3 ± 5.2 | Savli et al.^[74]76
   HTR1B | Serotonin | [^11C]AZ10419369 | BP[ND] | 36 (12) | 27.8 ± 6.9 | Beliveau et al.^[75]75
   HTR1B | Serotonin | [^11C]P943 | BP[ND] | 23 (8) | 28.7 ± 7.0 | Savli et al.^[76]76
   HTR1B | Serotonin | [^11C]P943 | BP[ND] | 65 (16) | 33.7 ± 9.7 | Gallezot et al.^[77]77
   HTR2A | Serotonin | [^18F]altanserin | BP[ND] | 19 (8) | 28.2 ± 5.7 | Savli et al.^[78]76
   HTR2A | Serotonin | [^11C]Cimbi-36 | BP[ND] | 29 (14) | 22.6 ± 2.7 | Beliveau et al.^[79]75
   HTR2A | Serotonin | [^11C]MDL100907 | BP[ND] | 3 (1) | 35 ± 9 | Talbot et al.^[80]78
   HTR4 | Serotonin | [^11C]SB207145 | BP[ND] | 59 (18) | 25.9 ± 5.3 | Beliveau et al.^[81]75
   HTR6 | Serotonin | [^11C]GSK215083 | BP[ND] | 30 (0) | 36.6 ± 9.0 | Radhakrishnan et al.^[82]79
   SLC6A4 | Serotonin | [^11C]DASB | BP[ND] | 100 (71) | 25.1 ± 5.8 | Beliveau et al.^[83]75
   SLC6A4 | Serotonin | [^11C]DASB | BP[ND] | 18 (6) | 30.5 ± 9.5 | Savli et al.^[84]76
   SLC6A4 | Serotonin | [^11C]MADAM | BP[ND] | 10 (2) | range: 51–67 | Fazio et al.^[85]80
   SLC6A4 | Serotonin | [^11C]MADAM | BP[ND] | 16 (2) | range: 21–67 | Dukart et al.^[86]20
   CNR1 | Cannabinoid | [^18F]FMPEP-d2 | V[T] | 22 (11) | male: 27 ± 6; female: 28 ± 10 | Laurikainen et al.^[87]81
   CNR1 | Cannabinoid | [^11C]OMAR | V[T] | 77 (28) | 30.0 ± 8.9 | Normandin et al.^[88]82
   DRD1 | Dopamine | [^11C]SCH23390 | BP[ND] | 13 (7) | 33 ± 13 | Kaller et al.^[89]83
   DRD2 | Dopamine | [^11C]FLB457 | BP[ND] | 55 (29) | 32.5 ± 9.7 | Hansen et al.^[90]14
   DRD2 | Dopamine | [^11C]FLB457 | BP[ND] | 6 (2) | 39.5 ± 6.8 | Sandiego et al.^[91]84
   DRD2 | Dopamine | [^18F]fallypride | BP[ND] | 58 (22) | 18.5 ± 0.6 | Jaworska et al.^[92]85
   DRD2 | Dopamine | [^11C]FLB457 | BP[ND] | 37 (20) | 48.4 ± 16.9 | Smith et al.^[93]86
   DRD2 | Dopamine | [^11C]raclopride | BP[ND] | 7 (0) | 24 ± 2 | Alakurtti et al.^[94]87
   SLC6A3 | Dopamine | [^123I]FP-CIT | SUVR | 174 (65) | 61 ± 11 | Dukart et al.^[95]88
   SLC6A3 | Dopamine | [^123I]Ioflupano | SUVR | 26 (--) | range 35 ~ 65 | García-G et al.^[96]89
   SLC6A3 | Dopamine | [^18F]FE-PE2I | SUVR | 10 (0) | 28.1 ± 6.9 | Sasaki et al.^[97]90
   GABRA1 | GABA | -- | -- | 26 (0) | 26 ± 5 | Dukart et al.^[98]88
   GABRA1 | GABA | [^11C]flumazenil | B[max] | 16 (9) | 26.6 ± 8 | Nørgaard et al.^[99]91
   HRH3 | Histamine | [^11C]GSK189254 | V[T] | 8 (1) | 31.7 ± 9.0 | Gallezot et al.^[100]92
   OPRM1 | Opioid | [^11C]carfentanil | BP[ND] | 204 (72) | 32.3 ± 10.8 | Kantonen et al.^[101]93
   OPRM1 | Opioid | [^11C]carfentanil | BP[ND] | 39 (19) | 37.0 ± 4.9 | Turtonen et al.^[102]94
   SLC6A2 | Norepinephrine | [^11C]MRB | BP[ND] | 77 (27) | 33.4 ± 9.2 | Ding et al.^[103]95
   SLC6A2 | Norepinephrine | [^11C]MRB | BP[ND] | 20 (8) | 33.3 ± 10.0 | Hesse et al.^[104]96
   KIF17 | Glutamate | [^18F]GE-179 | V[T] | 29 (8) | 40.9 ± 12.7 | Galovic et al.^[105]97
   SV2A* | -- | [^11C]UCB-J | BP[ND] | 10 (3) | 36 ± 10 | Finnema et al.^[106]98
   VAT1L | Acetylcholine | [^18F]FEOBV | SUVR | 5 (4) | 68.4 ± 3.4 | Hansen et al.^[107]14
   VAT1L | Acetylcholine | [^18F]FEOBV | SUVR | 6 (3) | 67.0 ± 11.1 | Aghourian et al.^[108]99
   VAT1L | Acetylcholine | [^18F]FEOBV | SUVR | 4 (1) | 37 ± 10.2 | PI: Lauri Tuominen & Synthia Guimond
   VAT1L | Acetylcholine | [^18F]FEOBV | SUVR | 18 (13) | 66.8 ± 6.8 | Hansen et al.^[109]14
   VAT1L | Acetylcholine | [^18F]FEOBV | SUVR | 5 (1) | 68.3 ± 3.1 | Bedard et al.^[110]100
   CHRM1 | Acetylcholine | [^11C]LSN3172176 | BP[ND] | 24 (11) | 40.5 ± 11.7 | Naganawa et al.^[111]101
   GRM5 | Glutamate | [^11C]ABP688 | BP[ND] | 22 (10) | 67.9 ± 9.6 | PI: Rosa-Neto, P. & Kobayashi, E.
   GRM5 | Glutamate | [^11C]ABP688 | BP[ND] | 28 (13) | 33.1 ± 11.2 | DuBois et al.^[112]102
   GRM5 | Glutamate | [^11C]ABP688 | BP[ND] | 74 (49) | 20 ± 3.0 | Smart et al.^[113]103
   GRM5 | Glutamate | [^11C]ABP688 | BP[ND] | 22 (10) | 67.9 ± 9.6 | Hansen et al.^[114]14
   CHRNA4 | Acetylcholine | [^18F]Flubatine | V[T] | 30 (10) | 33.5 ± 10.7 | Hillmer et al.^[115]104

   The Protein column indicates the protein names in the STRING database.
   Supplementary Table [117]S1 includes more extensive methodological
   details, such as Excitatory/Inhibitory, Ionotropic/Metabotropic, and
   Source toolkit. Values in parentheses (under n) indicate the number of
   females.

   BP[ND] parametric and regional non-displaceable binding potential,
   B[max] density (pmol ml^−1) converted from binding potential (5-HT) or
   distributional volume (GABA) using autoradiography-derived densities,
   V[T] tracer distribution volume, SUVR standardized uptake value ratio.

   *The synaptic vesicle glycoprotein 2A (SV2A) is targeted by PET imaging
   to quantify synaptic density in human brains^[118]98.

mFusion outperformed the traditional method on simulation data

   We compared performance on simulated data between the traditional
   partial least squares (PLS) association method and five fusion methods
   proposed in this study (i.e., meanGP, meanGPT, meanPPI, maxGPT, and
   maxPPI; see “Methods”). Evaluation metrics included the correlation
   between estimated gene scores and real gene weights, the number (or
   rate) of hits, the area under the curve (AUC) of the receiver operating
   characteristic (ROC), and the AUC of the precision-recall (PR) curve
   (see “Methods”).
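
   Two of these metrics can be sketched in a few lines. The following is
   a minimal illustration, not the evaluation code of the study: the
   function names and the toy scores are hypothetical, the hit rate is
   taken as the fraction of truly active genes among the top K, and the
   AUC-ROC is computed through its pairwise (Mann-Whitney) form, which is
   quadratic in the number of genes and therefore only suitable for small
   examples.

```python
def hit_rate_at_k(scores, is_active, k):
    """Fraction of truly active genes among the k highest-scoring genes."""
    if k <= 0 or k > len(scores):
        raise ValueError("k must be between 1 and the number of genes")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(1 for i in order[:k] if is_active[i]) / k


def auc_roc(scores, is_active):
    """AUC-ROC as the probability that an active gene outscores an
    inactive one; tied pairs count as one half."""
    positives = [s for s, a in zip(scores, is_active) if a]
    negatives = [s for s, a in zip(scores, is_active) if not a]
    if not positives or not negatives:
        raise ValueError("both active and inactive genes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for five genes; True marks a simulated active gene.
toy_scores = [0.9, 0.8, 0.3, 0.7, 0.1]
toy_active = [True, True, False, False, True]
print(hit_rate_at_k(toy_scores, toy_active, 2))
print(round(auc_roc(toy_scores, toy_active), 4))
```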

   We found that gene scores given by the meanPPI and maxPPI methods
   demonstrated higher correlations with the real gene weights defined in
   the simulation model than those given by the other methods
   (Fig. [119]2a, unpaired Wilcoxon test, 500 simulations), higher hit
   rates of active genes in the simulation (Fig. [120]2b), and larger AUCs
   of both the ROC (Fig. [121]2c) and PR (Fig. [122]2d) curves. These
   curves were all generated from the mean values over the 500
   simulations.

Fig. 2. Evaluation of fusion methods from simulated datasets.

   a The correlation between real gene weights and fusion weights measured
   by different fusion methods across 500 simulated experiments. The lower
   whisker extends from the first quartile (Q1) to the smallest data point
   that is within 1.5 * interquartile range (IQR) below Q1. The upper
   whisker extends from the third quartile (Q3) to the largest data point
   that is within 1.5 * IQR above Q3. The number next to each bar
   represents the median of the population (using unpaired Wilcoxon test).
   b Average hit rates of genes across all 500 simulations. The hit rate
   was measured as the proportion of truly active genes among the top K
   genes ranked by a specific fusion method. c ROC (receiver operating
   characteristic) curves of different fusion methods on simulation data.
   In simulation experiments,
   $\tilde{\boldsymbol{w}}_{X} \times \tilde{\boldsymbol{w}}_{M}^{T}$
   is a completely accurate connection matrix, and this noiseless PPI
   information greatly improves the performance of the maxPPI and meanPPI
   methods, so the AUC-ROC of maxPPI is 1. d PR (precision-recall) curves
   of different fusion methods on simulation data. e AUC-ROC values of
   different fusion methods as the number of active genes changed. f
   AUC-ROC values of different fusion methods as the covariance between
   latent variables changed.

   We tested the performance of mFusion under different conditions,
   defined by both the sparsity of active genes and the strength of the
   gene-PET covariance (Methods). The AUC-ROCs of both meanPPI and maxPPI
   exceeded that of the PLS method at all sparsity levels of active
   genes (Fig. [125]2e). In addition, the results presented in
   Fig. [126]2f indicate that the two fusion methods, meanPPI and maxPPI,
   were insensitive to changes in the covariance between gene expression
   and neurotransmission PET maps.

   Next, three kinds of perturbation were applied to the PPI network,
   each with 500 repetitions, to illustrate the influence of PPI
   information on the mFusion method: (1) randomly shuffle 30% of the
   elements within the adjacency matrix
   $\tilde{\boldsymbol{w}}_{X} \times \tilde{\boldsymbol{w}}_{M}^{T}$
   ; (2) set the smallest 30% of the elements in the adjacency matrix to
   zero; (3) randomly shuffle 30% of the elements, and then set the
   smallest 30% of the elements in the adjacency matrix to zero. We
   found that the meanPPI and maxPPI methods consistently outperformed
   their counterparts under all three conditions (Fig. [127]S2).
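
   The perturbations can be sketched as follows. This is one possible
   reading of the description, not the authors' implementation: the
   function names are hypothetical, "shuffle" is taken to mean permuting
   the values of a randomly chosen 30% of entries among themselves, and
   "smallest 30%" is taken over all matrix entries. Perturbation (3) is
   the composition of the two functions.

```python
import random


def shuffle_fraction(matrix, fraction, rng):
    """Permute the values of a random `fraction` of entries among
    themselves; all other entries stay in place. Symmetry of the matrix
    is not preserved (an assumption of this sketch)."""
    cells = [(i, j) for i in range(len(matrix)) for j in range(len(matrix[0]))]
    chosen = rng.sample(cells, int(round(fraction * len(cells))))
    values = [matrix[i][j] for i, j in chosen]
    rng.shuffle(values)
    perturbed = [row[:] for row in matrix]
    for (i, j), value in zip(chosen, values):
        perturbed[i][j] = value
    return perturbed


def zero_smallest_fraction(matrix, fraction):
    """Set the smallest `fraction` of entries to zero. Entries tied with
    the cutoff value are zeroed as well."""
    flat = sorted(value for row in matrix for value in row)
    n_zero = int(round(fraction * len(flat)))
    if n_zero == 0:
        return [row[:] for row in matrix]
    cutoff = flat[n_zero - 1]
    return [[0.0 if value <= cutoff else value for value in row]
            for row in matrix]


# Toy 10 x 10 "adjacency" matrix with distinct positive weights.
toy_matrix = [[float(10 * i + j + 1) for j in range(10)] for i in range(10)]
rng = random.Random(0)
shuffled = shuffle_fraction(toy_matrix, 0.3, rng)        # perturbation (1)
pruned = zero_smallest_fraction(toy_matrix, 0.3)          # perturbation (2)
both = zero_smallest_fraction(shuffled, 0.3)              # perturbation (3)
print(sum(1 for row in pruned for value in row if value == 0.0))
```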

   Third, we simulated brain maps at three distinct spatial resolutions.
   Specifically, the number of brain regions (n) was set to 100, 200, or
   500 (see “Methods” for further details), as shown in Fig. [128]S3. The
   results demonstrated a positive relationship between the spatial
   resolution of the X, Y, and Z matrices and the ability of the methods
   to identify active genes. Notably, the meanPPI and maxPPI methods
   consistently performed better than the other methods, with a level of
   stability that highlights their robustness in high-resolution brain
   mapping analyses.

mFusion outperformed the traditional method on empirical data

   We used SCZ morphological similarity differences and ASD cortical
   thickness differences as the traits and obtained gene Z-scores from
   the different fusion methods, as described in Methods. Compared with
   the traditional PLS regression method and the other fusion methods,
   the meanPPI and maxPPI methods achieved larger AUCs on the DisGeNet
   database (SCZ: Fig. [129]3a and Table [130]S2; ASD: Fig. [131]3b and
   Table [132]S3), demonstrating superior identification of
   disorder-related genes. We also compared the number of hits in the top
   K genes given by the various methods. When we varied the parameter K
   from 41 to 1541, where 1541 is 10% of the total of 15,408 genes, we
   found that the proposed methods consistently had more hits than the
   other algorithms (Fig. [133]3c–j). Notably, when referencing the
   DisGeNet database, the meanPPI method significantly outperformed all
   the other methods in identifying SCZ-related hit genes (Fig. [134]3c;
   p < 0.001, paired Wilcoxon test for the meanPPI and PLS methods; for
   gene scores, see Supplementary Table [135]S4). Among the ASD-related
   genes in the DisGeNet database, the number of hit genes in the top K
   gene sets identified by the meanPPI method was also significantly
   greater than that identified by the other five methods (Fig. [136]3g;
   p < 0.001, paired Wilcoxon test for the meanPPI and PLS methods; for
   gene scores, see Supplementary Table [137]S5). Furthermore, compared
   with fusion methods lacking PPI information, such as meanGPT and
   maxGPT, their PPI-informed counterparts, meanPPI and maxPPI,
   consistently demonstrated superior performance (Fig. [138]3c–j).

Fig. 3. Performance of fusion methods on SCZ and ASD under different
disease databases.

   a ROC curves of different fusion methods on the DisGeNet database for
   SCZ. b ROC curves of different fusion methods on the DisGeNet database
   for ASD. c–j Number of overlapping genes for SCZ (c–f) and ASD (g–j)
   in different standard databases: DisGeNet, CTD, DISEASES, and PGC-GWAS
   (corresponding to Table [141]1). Line types denote different fusion
   methods.

Sensitivity analysis on empirical data

   To identify optimal parameters for fusion methods, we compared
   performances of these methods with different network depths (d) and
   edge confidences (c) for the PPI. We observed that the meanPPI method
   exhibited superior performance (i.e., a larger number of hit genes,
   AUC-ROC, or AUC-PR) when its PPI depth d was set to 1 rather than 2
   (Fig. [142]4 and Fig. [143]S4). This trend was
   consistent across various edge confidence values ranging from 0.3 to
   0.7. When the PPI depth was set as 2, meanPPI performed similarly to
   other methods (Fig. [144]S5). Meanwhile, we noted that the meanPPI’s
   performance was less sensitive to the edge confidence of PPI when it
   varied from 0.3 to 0.7 (Fig. [145]4e, f). However, when it increased to
   0.8 or 0.9, the meanPPI’s performance declined mainly owing to the fact
   that too few PPIs remained effective at such high confidence levels
   (Fig. [146]4 and Fig. [147]S4). Using the physical subnetwork (i.e.,
   with evidence of binding or forming a physical complex) instead of the
   full STRING PPI network, the meanPPI method exhibited a decrease in the
   number of hits. Nevertheless, it consistently outperformed other
   methods that did not incorporate the PPI information (Fig. [148]S6).
   Consequently, we opted for d = 1 and c = 0.5 in subsequent analyses.
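
   The roles of the two parameters can be made concrete with a small
   sketch that prunes edges below a confidence threshold c and then
   collects the neighbors of a gene within depth d. The edge list, the
   gene labels, and the function name are hypothetical, and confidences
   are assumed to be already scaled to the range 0 to 1; how the
   neighbors' scores enter the fusion is described in Methods.

```python
from collections import defaultdict


def neighbours_within_depth(edges, gene, min_confidence=0.5, depth=1):
    """Genes reachable from `gene` in at most `depth` steps, using only
    edges whose confidence is at least `min_confidence`."""
    adjacency = defaultdict(set)
    for gene_a, gene_b, confidence in edges:
        if confidence >= min_confidence:
            adjacency[gene_a].add(gene_b)
            adjacency[gene_b].add(gene_a)
    seen = {gene}
    frontier = {gene}
    for _ in range(depth):
        # Expand one step outward, skipping genes already visited.
        frontier = {n for g in frontier for n in adjacency.get(g, ())} - seen
        seen |= frontier
    return seen - {gene}


# Made-up edges: (gene, gene, confidence in [0, 1]).
toy_edges = [
    ("geneA", "geneB", 0.9),
    ("geneB", "geneC", 0.6),
    ("geneA", "geneD", 0.4),
]
print(sorted(neighbours_within_depth(toy_edges, "geneA", 0.5, 1)))
print(sorted(neighbours_within_depth(toy_edges, "geneA", 0.5, 2)))
print(sorted(neighbours_within_depth(toy_edges, "geneA", 0.3, 1)))
```

   Raising c removes weakly supported edges, which explains why very
   high thresholds leave few usable interactions, while raising d pulls
   in more distant and presumably less specific neighbors.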

Fig. 4. Performance of the meanPPI method on the DisGeNet database with
different thresholds for pruning the PPI network.

   a, b Number of hit genes for SCZ with different PPI depths d and
   confidence scores c; d = 1 in a and d = 2 in b. c, d Number
   of hit genes for ASD with different PPI depths and confidence scores;
   d = 1 in c and d = 2 in d. e ROC curves at different PPI
   confidences for SCZ. f ROC curves at different PPI confidences for ASD.

   To evaluate the importance of PPIs in the mFusion-meanPPI method, a
   comparative analysis was conducted on the SCZ and ASD phenotypes
   separately. For each disease, we evaluated 500 randomly generated PPI
   networks (see “Methods”); the resulting null distributions of the
   number of hit genes are presented in Fig. [151]S7A, B. The results
   demonstrated that applying the meanPPI method with the real PPI data
   markedly increased the capacity to identify hit genes compared with
   random PPI networks. In addition, a similar permutation was applied to
   the 45 PET maps (see “Methods”), and the SCZ and ASD analyses were
   repeated. The results in Fig. [152]S7C, D revealed a marked reduction
   in the ability of the meanPPI method to pinpoint disease-associated
   genes, indicating that the real PET maps are pivotal to the meanPPI
   method.
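
   A comparison against such a null distribution is often summarized by
   an empirical p-value. The sketch below shows one common convention
   (adding one to numerator and denominator so that the estimate is never
   zero); the text here does not state how the null distributions were
   summarized, so the function and the toy numbers are assumptions.

```python
def empirical_p_value(observed, null_values):
    """One-sided empirical p-value: share of null statistics that are at
    least as large as the observed one, with a +1 correction."""
    if not null_values:
        raise ValueError("the null distribution must not be empty")
    at_least = sum(1 for value in null_values if value >= observed)
    return (at_least + 1) / (len(null_values) + 1)


# Made-up hit counts: one observed value and a small null sample.
toy_null_hits = [210, 198, 225, 240, 205, 219, 230, 201]
print(empirical_p_value(300, toy_null_hits))
print(empirical_p_value(220, toy_null_hits))
```

   With 500 permutations, the smallest attainable value under this
   convention is 1/501.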

   To assess the effect of the quality of PET maps on the results, the 45
   partly redundant maps were averaged into 20 unique maps
   (Fig. [153]S8). Subsequently, the SCZ and ASD traits were
   reanalyzed (Figs. [154]S9, [155]S10). The meanPPI results were highly
   consistent with the primary findings regarding the identification of
   disease risk genes, with Spearman correlations between gene scores of
   r = 0.97 (p < 2e-16) and r = 0.98 (p < 2e-16) for SCZ and ASD,
   respectively (Fig. [156]S9). Furthermore, both the meanPPI and maxPPI
   methods remained the most effective approaches (Fig. [157]S10).
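
   Collapsing maps that target the same protein can be sketched as a
   region-wise mean per protein. This is a minimal illustration under the
   assumption of an unweighted mean over maps defined on the same
   regions; the study may have normalized the maps or weighted them
   (for example by sample size), which is not stated here. The function
   name and the toy values are hypothetical.

```python
from collections import defaultdict


def average_maps_by_protein(maps):
    """Average all maps that share a protein label, region by region.

    `maps` is a list of (protein, regional_values) pairs.
    """
    grouped = defaultdict(list)
    for protein, values in maps:
        grouped[protein].append(values)
    averaged = {}
    for protein, group in grouped.items():
        n_regions = len(group[0])
        if any(len(values) != n_regions for values in group):
            raise ValueError("maps for %s differ in length" % protein)
        averaged[protein] = [sum(column) / len(group) for column in zip(*group)]
    return averaged


# Made-up regional values for three maps covering two proteins.
toy_maps = [
    ("DRD2", [1.0, 3.0]),
    ("DRD2", [3.0, 5.0]),
    ("HTR4", [2.0, 2.0]),
]
print(average_maps_by_protein(toy_maps))
```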

Top-ranked genes enriched in the relevant diseases

   As an analysis module of mFusion, we performed enrichment
   analysis on the top 1541 (10% of 15,408) genes that had negative
   relevance scores for SCZ or ASD given by the different methods (see
   “Methods”). After FDR correction across 30,170 diseases, traits, and
   phenotypes in DisGeNet (Fig. [158]5a, b), genes prioritized by the
   meanPPI method for SCZ/ASD were enriched in the corresponding disease
   gene sets. In contrast, the top genes identified by the PLS method
   showed no such enrichment (Tables [159]S8, [160]S9).
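
   The multiple-testing step can be illustrated with the
   Benjamini-Hochberg procedure. The text says only "FDR correction", so
   the choice of Benjamini-Hochberg, the function name, and the example
   p-values are assumptions of this sketch.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up enrichment p-values for four gene sets.
toy_p_values = [0.01, 0.04, 0.03, 0.005]
print([round(q, 4) for q in benjamini_hochberg(toy_p_values)])
```

   With 30,170 tested gene sets, a raw p-value must be far smaller than
   0.05 to survive the correction, which is why enrichment in the
   matching disease gene set is a demanding criterion.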

Fig. 5. Enrichment analysis of top-ranked genes related to SCZ and ASD
traits.

   a, b Disease enrichment results in DisGeNet diseases for the top 1541
   trait-related genes for SCZ (a) and ASD (b). The Y-axis lists diseases
   with categories in alphabetical order. c–f Clusters of GO term
   enrichment results for the top 1541 genes for SCZ (overlapped terms in
   c, terms uniquely enriched by the meanPPI method in d) and ASD
   (overlapped terms in e, terms uniquely enriched by the meanPPI method
   in f). The size and color of the dots are proportional to the number
   of pathway genes and the enrichment significance, respectively. The
   p-values were adjusted using Bonferroni correction. Clusters were
   generated from enriched GO terms by the aPEAR (Advanced Pathway
   Enrichment Analysis Representation) package, which exploits the
   similarities between pathway gene sets and represents them as a
   network of interconnected clusters. Each cluster is assigned a
   meaningful name that highlights the main biological theme of the
   experiment.

Top-ranked genes enriched in more biological pathways

   For SCZ, the meanPPI and PLS methods shared enrichment in 92 GO terms,
   while meanPPI was enriched in 837 additional GO terms. The shared terms
   included the establishment of protein localization to the membrane
   (GO_BP:0090150), regulation of synapse structure or activity
   (GO_BP:0050803), channel inhibitor activity (GO_MF:0008200), etc.
   (Fig. [163]5c; Table [164]S6). Newly enriched terms of meanPPI included
   the calcium ion transport (GO_BP:0060402), cation channel activity
   (GO_MF:0022843), GABA-A receptor activity (GO_CC:1902711), etc.
   (Fig. [165]5d). Importantly, these unique biological processes have
   been implicated in SCZ^[166]24,[167]25.

   For ASD, these two methods shared enrichment in 38 GO terms, including
   the synaptic membrane (GO_CC:0097060), neuron projection terminus
   (GO_CC:0044306), positive regulation of protein transport
   (GO_BP:0051222), etc. (Fig. [168]5e). In comparison to the PLS results,
   the meanPPI results introduced new enrichments in 795 GO terms,
   including the gated channel activity (GO_MF:0022836), neurotransmitter
   secretion (GO_BP:0001956), GABA-A receptor activity (GO_MF:0004890),
   etc. (Fig. [169]5f; Table [170]S7).

Top-ranked genes had more hits in a disease-related gene database

   To characterize differences between genes prioritized by the proposed
   method (i.e., mFusion-meanPPI) and the traditional PLS method, we
   compared the top 1541 (10% of 15,408) genes identified by different
   ranking methods (Fig. [171]6). By comparing gene scores with
   disease-related genes listed in the DisGeNet database, we observed that
   higher meanPPI fusion scores were associated with higher hit rates.
   Since PLS regression is essentially a multivariate approach, which
   is prone to overfitting, we found more false positives among the genes
   with high PLS regression weights. In contrast, the mFusion-meanPPI
   approach reduced the false positive rate by combining information
   across multiple scales. Among the top 10% genes, the meanPPI
   method identified 534 SCZ-related genes listed in the DisGeNet
   database, which was significantly more than the 235 genes identified by
   traditional PLS method (p < 2.2e-16, Chi-squared test; Fig. [172]6a;
   Tables [173]S4, [174]S6). Similarly, among the 1071 ASD risk genes
   listed in the DisGeNet database, the meanPPI method identified 221 of
   them within the top 10% genes, which was significantly more than the 98
   genes identified by the PLS method (p = 5.42e-13, Chi-squared test;
   Fig. [175]6b; Tables [176]S5, [177]S7). Therefore, the proposed
   approach identified more genes that have already been implicated in
   mental disorders than the traditional PLS method did.
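
   The comparison of hit counts can be reproduced approximately with a
   chi-squared test on a 2 x 2 table (method by hit or non-hit among the
   top 1541 genes). The table layout and the use of Yates' continuity
   correction are assumptions of this sketch, and the function name is
   hypothetical. For one degree of freedom, the p-value equals
   erfc(sqrt(chi2 / 2)), so only the standard library is needed.

```python
import math


def chi2_two_by_two(hits_a, hits_b, n_top, yates=True):
    """Chi-squared test comparing two hit counts out of n_top genes each.

    Returns (statistic, p_value) for one degree of freedom.
    """
    table = [[hits_a, n_top - hits_a], [hits_b, n_top - hits_b]]
    total = 2 * n_top
    column_sums = [hits_a + hits_b, total - hits_a - hits_b]
    statistic = 0.0
    for row in table:
        for observed, column_sum in zip(row, column_sums):
            expected = n_top * column_sum / total
            difference = abs(observed - expected)
            if yates:
                difference = max(difference - 0.5, 0.0)
            statistic += difference * difference / expected
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hit counts reported in the text for the top 1541 genes.
scz_stat, scz_p = chi2_two_by_two(534, 235, 1541)
asd_stat, asd_p = chi2_two_by_two(221, 98, 1541)
print(scz_p < 2.2e-16)
print(asd_stat, asd_p)
```

   Under these assumptions the ASD comparison yields a p-value near the
   reported 5.42e-13, and the SCZ comparison falls well below 2.2e-16.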

Fig. 6. Differential plot of genes by different fusion methods and
neurotransmissions for SCZ and ASD.

   a, b Gene scores from the meanPPI method and the PLS method. Black
   dots: genes shared among the DisGeNet standard database, the top 10%
   genes from the meanPPI method, and the top 10% genes from the PLS
   method. Blue triangles: genes shared between the DisGeNet database and
   the top 10% genes from the PLS method. Magenta triangles: genes shared
   between the DisGeNet database and the top 10% genes from the meanPPI
   method. The bar chart at the edge shows the hit rates of these
   disease-related genes. c, d Associations, measured by PLS Z-score,
   between all PET maps of various neurotransmission processes and the
   disease trait (c: SCZ; d: ASD). e, f Top 20 candidate genes identified
   by the meanPPI method, and the gene-PET effects measured by PLS
   Z-score for the SCZ (e) and ASD (f) disease traits. Point shapes of
   genes in (e, f) have the same meanings as in (a, b).

   We examined the neurotransmission-trait and gene-neurotransmission
   associations for SCZ and ASD (Fig. [180]6c, d). We found that the top
   20 genes prioritized for SCZ by mFusion-meanPPI had two patterns of
   correlation with five neurotransmitter receptors: 17 genes had
   positive correlations with HTR1A, CNR1, DRD1, DRD2, and OPRM1, and
   3 genes had negative correlations with these receptors (Fig. [181]6e).
   Similar patterns were observed for ASD (Fig. [182]6f).
   Gene-neurotransmission PLS association analysis revealed that the
   majority of the top 20 genes were linked to these neurotransmissions
   (Fig. [183]6e, f). Specifically, 14 of the top 20 genes identified by
   the mFusion-meanPPI method were listed as SCZ-related genes in the
   DisGeNet database, and five of these 14 genes were not detected by the
   PLS method.

Comparison of correlations among multiple brain disorders

   We applied the mFusion-meanPPI algorithm to the neuroimaging traits of
   eight disorder cohorts separately (Fig. [184]7a, see “Methods”) and
   prioritized the top 10% of genes based on their Z-scores. Spearman
   correlation analysis of these genes was performed to assess the
   similarity between each pair of disorders. Hierarchical clustering was
   then applied to the Spearman correlation coefficients among these
   diseases, which identified three distinct clusters. These clusters
   reflected the expressional association among these diseases, as
   inferred from the gene Z-scores. The first cluster comprised ASD, EPI,
   and PD, the second included ADHD and DEP, and the third encompassed
   OCD, SCZ, and BIP (Fig. [185]7b). This clustering structure was
   supported by both morphological (Fig. [186]7c) and genetic
   (Fig. [187]7d) correlations. In particular, the OCD-SCZ-BIP cluster
   and the EPI-PD cluster were present in all three clustering
   structures, which is consistent with previous studies of
   cross-disease similarity at different levels^[188]10,[189]26,[190]27.
   In the other two clusters, the EPI-PD correlation was consistently
   stable. However, while ASD was genetically more similar to the
   DEP-ADHD cluster, its neuroimaging traits placed it closer to the
   EPI-PD cluster. Likewise, the DEP-ADHD correlation was more pronounced
   genetically but less evident in the imaging trait correlations. Our
   identification of the clustering structure for eight major mental
   disorders revealed a notable concordance of these disorders across
   multiple scales (Supplementary Table [191]S8, Table [192]S9, and
   Table [193]S10).
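
   The correlation-then-clustering step described above can be sketched
   as follows. This is a minimal illustration rather than the authors'
   code: the function name, the use of 1 - rho as the dissimilarity, and
   average linkage are assumptions, and the data are synthetic toy
   values for six hypothetical disorders (D1 to D6).

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr


def cluster_disorders(z_scores, names, n_clusters=3):
    """Group disorders by the Spearman similarity of their gene Z-scores.

    z_scores: array of shape (n_genes, n_disorders), one column per disorder.
    """
    rho, _ = spearmanr(z_scores)      # disorder-by-disorder correlation matrix
    rho = np.asarray(rho)
    dist = 1.0 - rho                  # assumed dissimilarity measure
    dist = (dist + dist.T) / 2.0      # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)
    tree = linkage(squareform(dist), method="average")  # assumed linkage
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return rho, dict(zip(names, labels.tolist()))


# Synthetic toy data: three latent gene profiles, each shared by two disorders.
rng = np.random.default_rng(0)
profiles = rng.normal(size=(200, 3))
toy_z = np.column_stack(
    [profiles[:, i // 2] + 0.1 * rng.normal(size=200) for i in range(6)]
)
toy_names = ["D1", "D2", "D3", "D4", "D5", "D6"]
rho, clusters = cluster_disorders(toy_z, toy_names)
```

   In this toy example, disorders that share a latent profile fall into
   the same cluster. With real data, the choice of linkage and of the
   gene set entering the correlation would need to follow the paper's
   Methods.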

Fig. 7. Correlation of eight brain disorders from multiple biomolecular
levels.

   a Cohen’s d maps of cortical thickness difference for eight disorders
   on Desikan–Killiany atlas regions. b Heatmap of expressional
   correlations across eight disorders (Spearman’s r value). c Heatmap of
   morphological correlations across eight disorders (Pearson’s r value).
   d Heatmap of genetic correlations across eight disorders (LDSC $r_g$
   value). e Overlap of the top 10% genes among the three disease
   clusters, shown as a Venn diagram. f GO:MF (molecular function) term
   enrichment results for the three groups of cluster-specific genes
   (Cluster 1: 102; Cluster 2: 410; Cluster 3: 109). g GABRA1-related
   pathway scores across different neurotransmissions. ADHD
   Attention-deficit/hyperactivity disorder, ASD Autism spectrum
   disorder, BIP Bipolar disorder, DEP Depression, EPI Epilepsy, OCD
   Obsessive-compulsive disorder, PD Parkinson’s disease, SCZ
   Schizophrenia.

   Comparing the top 10% genes across disorders, we identified three
   cluster-specific gene sets comprising 102, 410, and 109 genes for the
   three clusters, respectively (Fig. [196]7e; Table [197]S11). The genes
   related to cluster 1 were enriched in a wide range of pre- and
   post-synaptic functions, whereas the genes for cluster 2 were enriched
   mainly in postsynaptic functions (Fig. [198]7f). Notably, GABRA1 was
   the only gene associated with all eight disorders, but through
   distinct gene-transmission pathways (Fig. [199]7g, Table [200]S12).
   The GABRA1-GRM5 or -CNR1 pathway was prioritized for PD, while the
   GABRA1-HRH3 pathway was prioritized for OCD. This is consistent with
   the literature reporting that CNR1 agonists help relieve symptoms in
   PD patients^[201]28–[202]30.
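
   One way to derive cluster-specific gene sets from per-disorder
   rankings is sketched below. The text does not spell out the exact
   rule, so this assumes that a gene is cluster-specific when it is in
   the top fraction for at least one disorder of that cluster and for no
   disorder of any other cluster. The function names, gene labels, and
   disorder labels are hypothetical.

```python
def top_fraction(z_by_gene, fraction=0.10):
    """Return the genes with the highest Z-scores (top `fraction` of all genes)."""
    k = max(1, int(len(z_by_gene) * fraction))
    ranked = sorted(z_by_gene, key=z_by_gene.get, reverse=True)
    return set(ranked[:k])


def cluster_specific(top_sets, clusters):
    """Genes prioritized within one cluster but in no other cluster.

    top_sets: {disorder: set of prioritized genes}
    clusters: {cluster name: list of member disorders}
    """
    # Pool the prioritized genes of all disorders belonging to each cluster.
    pooled = {
        name: set().union(*(top_sets[d] for d in members))
        for name, members in clusters.items()
    }
    specific = {}
    for name, genes in pooled.items():
        others = set().union(*(g for k, g in pooled.items() if k != name))
        specific[name] = genes - others
    return specific


# Toy example with hypothetical gene and disorder labels.
toy_z = {f"g{i}": float(10 - i) for i in range(1, 11)}
toy_top = top_fraction(toy_z, fraction=0.2)

toy_sets = {"A": {"g1", "g2"}, "B": {"g2", "g3"}, "C": {"g3", "g4"}}
toy_clusters = {"c1": ["A", "B"], "c2": ["C"]}
toy_specific = cluster_specific(toy_sets, toy_clusters)
```

   Under this assumed rule, a gene shared by two clusters (g3 in the toy
   data) is excluded from both cluster-specific sets.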

   In total, 43,126 gene-neurotransmissions-trait pathways among
   15,408 genes, 20 neurotransmissions, and 29 disease traits were listed
   in a queryable database
   ([203]https://xomicsbio.shinyapps.io/mfusion_shiny/) and summarized in
   Supplementary Fig. [204]S12.

Discussion

   To make use of human brain data, which have been accumulating rapidly
   but are collected separately at various scales, this study proposed an
   analytical method, mFusion, to bridge neuroimaging traits and genes
   for mental disorders. Unlike previous methods that examine pairwise
   associations across two scales, mFusion establishes
   gene-neurotransmissions-trait pathways across three scales. The
   advantage of mFusion over previous methods was demonstrated on both
   simulated and experimental datasets. Both well-known genes and new
   candidate genes were identified by this method for mental disorders.
   To our knowledge, it is the first method to prioritize cross-scale
   pathways for mental health disorders, providing a richer and more
   comprehensive perspective on disease exploration. In the current
   study, we demonstrated the performance of mFusion as a tool for
   finding gene hits in mental disorders using PET maps. It is worth
   noting that the method could be applied to any brain maps, such as
   those from functional MRI, magnetoencephalography, or single-photon
   emission computed tomography.

   mFusion also suggested new disease-related genes that are not listed
   in the reference database (e.g., DisGeNet, Fig. [205]6E, F). For
   example, the gene CNR1 was prioritized for SCZ by mFusion-meanPPI but
   not by the traditional PLS method (Fig. [206]6E). CNR1 (cannabinoid
   receptor 1) encodes a cannabinoid receptor and is implicated in the
   pathophysiology of SCZ. Decreased expression of this gene has been
   reported in the DLPFC of patients with schizophrenia^[207]31. The
   prioritization of this gene by the proposed method was driven by its
   gene-PET association with DRD2, which is supported by its physical
   interaction with DRD2 to form CB1R–DRD2 heteromers^[208]32.

   Another example is the gene KCNC1 (Potassium Voltage-Gated Channel
   Subfamily C Member 1, see Supplementary Fig. [209]S11A for its PPI
   network), which is involved in monoatomic ion channel activity and
   delayed rectifier potassium channel activity^[210]33. The level of
   KCNC1 channel protein was reported to be decreased in the neocortex of
   SCZ model mice compared with the control group^[211]34,[212]35.
   A further example is GABRA3, which has already been associated with
   both dopamine transporter transcripts and the disinhibition of
   nigrostriatal dopamine neurotransmission in the literature^[213]36. A
   recent study using peripheral blood-mesenchymal stem cells has
   reported its transcriptomic association with ASD^[214]37.

   Furthermore, the influence of gene-PET-trait pathways mediated by
   different neurotransmissions varied greatly across disorders
   (Fig. [215]S11B, Table [216]S12). For example, the neurotransmission
   GRM5 has a strong effect on PD (average pathway score = 4.92, see
   Table [217]S12) but not on SCZ (score = 1.64) or BIP (score = 1.95).
   Among the pathways in Table [218]S12, SNCA has stronger pathway scores
   in PD, mediated by neurotransmissions including GRM5 (score = 5.76),
   CHRNB4 (score = 5.00), and CNR1 (score = 4.87), than in the other
   diseases (where these pathway scores are all below 3). SNCA (the
   alpha-synuclein gene) has been widely reported to be involved in the
   onset of Parkinson’s disease, especially in the formation of Lewy
   bodies^[219]38–[220]40.

   Nevertheless, the multiscale fusion analysis framework has its
   limitations. First, the currently available 45 PET maps of
   neurotransmissions cover only 9 neurotransmitter systems and
   synaptic density; PET maps of further neurotransmitters remain
   unavailable owing to numerous methodological and data-sharing
   challenges. The present study could be strengthened in the future by
   advanced biomolecular imaging techniques. Second, the choice of
   processing parameters can influence the AHBA gene expression
   estimates^[221]41. To mitigate this challenge, we normalized the
   expression values and focused only on analyses related to the
   relative rank of genes rather than their absolute values. Third, the
   gene expression data for brain tissues are restricted to a finite set
   of samples. As additional data covering a broader range of genes
   become available, the proposed method can be applied to these
   expanded datasets.

Conclusion

   In this study, we proposed an analytical method to integrate
   information across multiple scales, including genes, neurotransmitters,
   and neuroimaging. This method uses neurotransmission as a bridge
   linking neuroimaging traits to genes in the human brain for mental
   disorders. The mFusion method identified both well-known genes and new
   candidate genes for SCZ and ASD separately, demonstrating its
   advantages for mental disorder phenotypes. This novel method also
   prioritizes cross-scale pathways related to mental disorders,
   providing a richer and more comprehensive perspective on disease
   exploration.

Methods

Data preprocessing

Gene expression in human brain tissues

   Microarray expression data for brain tissues were sourced from the
   Allen Human Brain Atlas (AHBA)^[222]11,[223]17, featuring samples from
   six neurotypical donors aged 26 to 54 years (five males and one
   female). The database encompasses probe expressions from a total
   of 3702 samples, which have been normalized across all brains. Given
   the limited availability of right hemisphere samples from only two
   donors, our analysis focused on 2664 samples from the left hemisphere
   across all six donors. Following recommended preprocessing steps
   outlined by Arnatkevičiūtė et al. ^[224]18 and consistent with
   procedures detailed in our prior publication^[225]42, the data
   underwent re-annotation, intensity filtering, probe selection based on
   mean values, and normalization. This process yielded a matrix of gene
   expression comprising 2664 samples × 15,408 unique genes.
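
   The probe selection and normalization steps can be illustrated with a
   small sketch. It assumes that "probe selection based on mean values"
   keeps, for each gene, the probe with the highest mean expression
   across samples, and it uses a plain z-score as a stand-in for the
   normalization; the actual pipeline may differ on both points. All
   probe and gene labels are hypothetical.

```python
from statistics import mean, pstdev


def select_probes_by_mean(probe_expr, probe_to_gene):
    """Keep one probe per gene: the one with the highest mean expression.

    probe_expr: {probe id: list of expression values across samples}
    probe_to_gene: {probe id: gene symbol}
    """
    best = {}
    for probe, values in probe_expr.items():
        gene = probe_to_gene[probe]
        if gene not in best or mean(values) > mean(probe_expr[best[gene]]):
            best[gene] = probe
    return {gene: probe_expr[probe] for gene, probe in best.items()}


def zscore(values):
    """Standardize one gene's expression across samples (assumed normalization)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


# Toy data: two probes map to gene G1, one probe maps to gene G2.
toy_probes = {
    "p1": [1.0, 2.0, 3.0],
    "p2": [2.0, 3.0, 4.0],
    "p3": [5.0, 5.0, 8.0],
}
toy_map = {"p1": "G1", "p2": "G1", "p3": "G2"}
toy_genes = select_probes_by_mean(toy_probes, toy_map)
toy_norm = {gene: zscore(values) for gene, values in toy_genes.items()}
```

   The result is a gene-by-sample table with one row per unique gene,
   which mirrors the shape of the 2664 samples × 15,408 genes matrix
   described above.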

Neurotransmission images

   PET imaging has proven invaluable for noninvasively mapping the in vivo
   spatial distributions of neurotransmissions within the human brain. In
   this study, we curated a comprehensive database comprising 45
   neurotransmission-related PET maps for 9 neurotransmitter systems and
   synaptic density. Among them, 36 maps were provided in the neuromaps
   toolbox
   ([226]https://netneurolab.github.io/neuromaps/index.html)^[227]19, 6
   were available through the JuSpace toolbox
   ([228]https://github.com/juryxy/JuSpace)^[229]20, and 3 were available
   at the PET imaging database provided by Hansen et al. ^[230]14
   ([231]https://github.com/netneurolab/hansen_receptors/tree/main/data/PET_nifti_images).
   These systems encompass serotonin, cannabinoid,
   dopamine, gamma-aminobutyric acid, histamine, mu-type opioid,
   norepinephrine, N-methyl-D-aspartate, synaptic vesicle membrane
   protein, acetylcholine, glutamate, and nicotinic-acetylcholine
   (Table [232]2 and Supplementary Table [233]S1).

Protein-protein interaction (PPI) network

   Recognizing the collaborative nature of proteins coded by genes in
   performing various functions^[234]43, our study employed the STRING
   Protein-Protein Interaction (PPI) network (Version 11.5, August 12,
   2021)^[235]16. This repository stands as one of the largest and most
   widely utilized sources of PPI data, encompassing both direct
   (physical) and indirect (functional) interactions. These interactions
   are derived from a range of sources, including experimental data, gene
   co-expression, and text-mining. Within the PPI network, the strength of
   an edge is quantified by the confidence score (c), while the distance
   between two nodes is measured by the depth (d). Specifically, a larger
   c and a smaller d contribute to a PPI network that is substantiated by
   stronger evidence.
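
   The roles of the confidence score c and the depth d can be
   illustrated with a breadth-first search over a thresholded network.
   This is a sketch under assumptions: the function name and the toy
   edges are hypothetical, and scores are written on the 0 to 1000 scale
   used for STRING combined scores.

```python
from collections import deque


def neighbors_within(edges, seed, min_conf, max_depth):
    """Return {gene: depth} for genes reachable from `seed`.

    Only edges with confidence >= min_conf are used, and the search stops
    at `max_depth` steps from the seed.
    edges: iterable of (gene_a, gene_b, confidence) tuples.
    """
    adjacency = {}
    for a, b, score in edges:
        if score >= min_conf:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)

    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand beyond the allowed depth
        for nxt in adjacency.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    depth.pop(seed)
    return depth


# Toy network with hypothetical genes and confidence scores.
toy_edges = [
    ("G1", "G2", 900),
    ("G2", "G3", 750),
    ("G3", "G4", 950),
    ("G1", "G5", 300),
]
loose = neighbors_within(toy_edges, "G1", min_conf=700, max_depth=2)
strict = neighbors_within(toy_edges, "G1", min_conf=800, max_depth=2)
```

   Raising the confidence threshold or lowering the depth shrinks the
   neighborhood, which is the sense in which a larger c and a smaller d
   correspond to stronger evidence.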

Brain traits of mental disorders using the Desikan–Killiany (DK) atlas

   The ENIGMA consortium and ENIGMA toolbox
   ([236]https://enigma-toolbox.readthedocs.io/en/latest/index.html#)^[237]21
   have provided the structural case-control differences for eight
   mental disorders, including attention-deficit/hyperactivity disorder
   (ADHD)^[238]44, ASD^[239]45, bipolar disorder (BIP)^[240]46, common
   epilepsy syndromes (EPI)^[241]47, depression (DEP)^[242]48,
   obsessive-compulsive disorder (OCD)^[243]49, Parkinson’s disease
   (PD)^[244]50, and SCZ^[245]51. In this study, we employed maps
   of case-control differences in cortical thickness, represented
   by inverted Cohen’s d values^[246]14 (that is, larger values
   represent greater cortical thinning), for 68 DK brain regions
   (Table [247]S13).
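
   The sign convention of the inverted Cohen’s d can be made concrete
   with a short sketch. ENIGMA distributes the effect sizes as summary
   statistics, so this only illustrates the convention; the function
   name and the thickness values are made up for the example, and the
   pooled-standard-deviation form of Cohen’s d is assumed.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(cases, controls):
    """Cohen's d (cases minus controls) using the pooled standard deviation."""
    n1, n2 = len(cases), len(controls)
    pooled_var = (
        (n1 - 1) * variance(cases) + (n2 - 1) * variance(controls)
    ) / (n1 + n2 - 2)
    return (mean(cases) - mean(controls)) / sqrt(pooled_var)


# Made-up cortical thickness values (mm) for one region.
toy_cases = [2.0, 2.2, 2.4]
toy_controls = [2.4, 2.6, 2.8]

d = cohens_d(toy_cases, toy_controls)  # negative: cases are thinner
inverted_d = -d                        # positive: greater cortical thinning
```

   After inversion, regions where patients show thinner cortex than
   controls carry positive values.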

Brain traits of mental disorders in the DK308 Atlas

   We also incorporated a brain map of case-control differences in
   morphological similarity for schizophrenia, where morphological
   similarity is the correlation of seven morphological parameters
   (i.e., gray matter volume, surface area, cortical thickness, Gaussian
   curvature, mean curvature, fractional anisotropy, and mean
   diffusivity) derived from MRI and diffusion-weighted imaging data.
   This map is defined on the Desikan–Killiany 308 atlas (DK308)^[248]13,
   an improved version of the DK atlas that maintains the small-world
   properties of anatomical cortical networks while enhancing resolution
   to 308 regions^[249]8. We also employed a second map of case-control
   differences in cortical thickness for ASD, defined on the DK308
   atlas^[250]9.
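
   A morphological similarity matrix of the kind described above can be
   sketched as follows, assuming each region is represented by a vector
   of the seven parameters, each parameter is standardized across
   regions, and similarity is the Pearson correlation between regional
   vectors. The function name is hypothetical and the input is random
   toy data for five regions, not DK308 measurements.

```python
import numpy as np


def morphological_similarity(features):
    """Region-by-region Pearson correlation of morphological profiles.

    features: array of shape (n_regions, n_parameters).
    """
    features = np.asarray(features, dtype=float)
    # Standardize each parameter across regions so that units are comparable.
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    # np.corrcoef treats each row (region) as one variable.
    return np.corrcoef(z)


# Random toy data: 5 regions x 7 morphological parameters.
rng = np.random.default_rng(1)
toy_features = rng.normal(size=(5, 7))
toy_similarity = morphological_similarity(toy_features)
```

   A case-control difference map would then compare such matrices, or
   regional summaries of them, between patients and controls.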

GWAS summary statistics for mental disorders

   We compiled GWAS summary results for six mental disorders from
   published research, drawing on the Psychiatric Genomics Consortium
   (PGC) datasets for ADHD^[251]52, ASD^[252]53, BIP^[253]54, DEP^[254]55,
   OCD^[255]56, and SCZ^[256]57. Additionally, we incorporated data from
   other relevant studies (EPI^[257]58, PD^[258]59). Table [259]S14
   provides details on the individual GWAS samples, including
   references, sample sizes, and SNP numbers.