Graphical abstract graphic file with name fx1.jpg [33]Open in a new tab Highlights * • The most comprehensive sQTL map of HIPPO was obtained by sQTL detection * • We decoded these sQTL SNPs function characteristics and function enrichment * • TWAS identified 16 significant genes associated with Alzheimer at the splicing level * • Our work offers new hope for understanding how AS regulation contributes to brain disease __________________________________________________________________ Molecular biology; Molecular mechanism of gene regulation; Neuroscience; Genomic analysis; Transcriptomics Introduction Alternative splicing (AS) is an essential post-transcriptional regulatory process that enhances the diversity and complexity at the RNA level through intron excision and exon ligation.[34]^1^,[35]^2 By alternative pre-mRNA splicing, multiple transcripts, and protein isoforms can be generated from a single gene with different biological functions.[36]^3 Emerging as one of the most important mechanisms regulating gene expression, AS events affects about 95% of human multi-exon genes,[37]^4 which often occur in a cell or tissue-type specific manner.[38]^5 Notably, the molecular mechanism of AS is particularly prevalent and intricate in the human brain,[39]^6 suggesting that such highly complex AS regulation in brain tissue can play a crucial role in the normal development and functional maintenance of the central nervous system.[40]^7^,[41]^8 Furthermore, accumulating studies have indicated that aberrant splicing of a specific gene affected by genetic variants may be involved in the etiology and pathogenesis of various brain disorders.[42]^9^,[43]^10 For instance, Xu et al. found that multiple de novo gene mutations with strong genetic evidence can confer the risk of schizophrenia by affecting AS.[44]^11 Multiple human postmortem brain studies have also reported AS dysregulation among individuals with Alzheimer’s disease,[45]^12^,[46]^13 Parkinson disease,[47]^14 and autism spectrum disorders.[48]^15^,[49]^16 Recent large-scale transcriptome analyses have identified thousands of genetic variants throughout the genome that play a regulatory role in AS, referred to as splicing quantitative trait loci (sQTL).[50]^9 Therefore, sQTL analysis is conducted to gain greater insight into the associations between genetic variants and AS regulation.[51]^17 The splicing level of a splice site or alternative exon can be considered as quantitative traits and maps to a genomic variant within a population in sQTL analysis. Previous studies have indicated that sQTLs have similar or even greater effects on diseases and complex traits compared to expression quantitative trait loci (i.e., eQTL, meaning genetic variants affecting gene expression), highlighting the utility of sQTLs as a crucial approach for uncovering genetic variants of AS regulation.[52]^9 Through sQTL analysis, several recent studies have successfully explored how functional genetic variation impacts AS (i.e., the splicing level) in human.[53]^18^,[54]^19 However, there are relatively few sQTL analyses based on RNA sequencing (RNA-seq) of the brain[55]^20^,[56]^21^,[57]^22 (i.e., most of these analyses mainly focused on non-neuronal tissues), and none include a comprehensive sQTL analyses of the postmortem hippocampal (HIPPO) brain tissue. Mounting evidence suggests that HIPPO region damage prominently implicated in the etiology and pathogenesis of brain disorders, including schizophrenia, Alzheimer, epilepsy, depression and anxiety, but there is less research accessible since current large consortia favor research on neocortical brain regions like the dorsolateral prefrontal cortex (DLPFC).[58]^23^,[59]^24 Here, to comprehensively investigate sQTLs in the HIPPO region of the human brain, we utilized transcriptome data and genotype data from 264 individuals in the second phase of the BrainSeq Consortium (i.e., BrainSeq2[60]^25). We eventually identified a total of 6,966 sQTLs involving in 14,559 AS events of 5,299 unique genes (defined as sGenes) that are expected to be independent of one another after undergoing stringent quality control. The functional characterization of these sQTL single nucleotide polymorphisms (SNPs) was examined, and we comprehensively described the implicated regulatory regions associated with sQTL variants as well as the genetic contribution of sQTL variants to the etiology of brain disorders. By comparing sQTL signals between HIPPO and DLPFC brain regions, we found a large genetic similarity between the two brain regions (r[g] = 0.87, calculated by ldsc). In addition, transcriptome-wide association studies (TWAS) integrating sQTLs summary data and genetic association signals from the genome-wide association studies (GWASs) of Alzheimer’s disease have been performed to unravel novel susceptibility genes in HIPPO tissue. In conclusion, these findings provide a comprehensive sQTLs map in the HIPPO brain, which can advance our understanding of the genetic architecture of complex brain disorders. Results Robust identification of cis-sQTLs in human HIPPO region Genetic variants with effects on RNA splicing are an important factor contributing to brain diseases risk. Considering that splicing regulation in the HIPPO region of the human brain has not been systematically explored, we performed a sQTL analysis of the HIPPO region to enhance the understanding of the link between splicing and brain diseases. All individuals (N = 264) from BrainSeq2 with no history of neurological and/or neuropsychiatric diseases were included in the sQTL analysis. We first employed LeafCutter to analyze RNA-seq data to comprehensively identify AS events in human HIPPO region. The regulation of brain splicing generates a large number of homologous genes, leading to incomplete isoform annotations in brain. LeafCutter is an annotation-free method that allows for alternative exon identification and therefore has important advantages in the context of brain. We then conducted cis-sQTLs (cis-window of ±100 kb around the AS event) discovery using the linear regression model implemented in FastQTL. After applying stringent quality control and normalization (see [61]STAR methods for details), we obtained a total of 11,157 significant sQTLs (FDR <0.05) involving in 16,492 AS events of 6,387 unique genes ([62]Table S1). Positional enrichment of sQTL SNPs showed that most of significant sQTL SNPs (i.e., with strong effects on splicing) were mainly enriched in exons or proximal introns regions (±100 bp of junction) ([63]Figure 1A). This observation is consistent with prior findings[64]^26^,[65]^27 that many variants within these regions have an impact on AS regulatory functions, indicating that sQTL SNPs closer to their splice sites had a significant effect on AS. Figure 1. [66]Figure 1 [67]Open in a new tab Characterization of identified sQTL SNPs (A) Each blue dot represents an SNP plotted according to its distance from the nearest AS event and the statistical significance of its association with AS (–log[10] p-value). (B) Pie charts indicating proportions of SNPs annotated with each functional category (splice acceptor, splice donor, splice region, synonymous, 5′ UTR, 3′ UTR, non-coding exon, intron and intergenic). SNPs in exonic regions (nonsense, readthrough, start-loss, frameshift, canonical splice site, missense, synonymous, splice region, 5′ UTR, 3′ UTR and non-coding exon) and SNPs in non-exonic regions (intron and intergenic) are indicated by red and blue colors, respectively. (C) Enrichment of sQTL SNPs in each different functional type of variants. Exonic variants are shown in red and non-exonic variants are shown in blue. p-values were calculated by two-tailed Fisher’s exact test with Bonferroni correction. Bars indicate 95% confidence intervals. Functional characterization of sQTL SNPs To characterize the function of each sQTL SNPs, we generated sQTL and non-sQTL datasets for comparative analysis. After sQTL SNPs extracting (p-value filtering) and LD pruning (see [68]STAR methods), we obtained sQTL and non-sQTL SNPs dataset, containing 6,966 SNPs (contributed to AS regulation independently) and 124,801 SNPs (not contributed to AS regulation independently), respectively. We first functionally classified these sQTL ([69]Table S3A) and non-sQTL SNPs ([70]Table S3B) with the variant types defined by SnpEff ([71]Figure 1B). As expected, we found that the sQTL SNPs were significantly enriched in exonic regions (correspond to red in [72]Figure 1B) compared to non-sQTL SNPs (p = 2.43 × 10^−76, odds ratio (OR) = 2.13, two-tail Fisher’s exact test). We then performed functional enrichment of sQTL SNPs among different variant types. Notably, the enrichment of variants located in the splice region was the most significant compared to non-QTL SNPs (p = 8.65 × 10^−62, OR = 4.83, two-tail Fisher’s exact test), suggesting that splice site variants may plays an important role in AS regulation ([73]Figure 1C). Moreover, there was significant enrichment of splice acceptor (p = 3.48 × 10^−18, OR = 17.94, two-tail Fisher’s exact test) and splice donor (p = 2.86 × 10^−15, OR = 14.23, two-tail Fisher’s exact test), and underrepresentation of intergenic variants ([74]Figure 1C). Functional enrichment of sQTLs SNPs in regulatory elements and RBP binding sites Previous studies have shown that regulatory elements (H3K4me1, H3K4me3, H3K9ac, H3K27ac, and DHS) and RBPs facilitate AS regulation during transcription,[75]^18^,[76]^21 so we explored whether genomic regions tagged by sQTL SNPs are enriched in regulatory elements and RBP binding sites to understand the mechanism by which sQTL SNPs influence AS regulation (see [77]STAR methods section for details). For regulatory elements, we found that sQTL SNPs are significant enriched in H3K4me3 (p = 2.04 × 10^−21, OR = 2.10, two-tail Fisher’ exact test with), H3K9ac (p = 1.69 × 10^−17, OR = 1.89, two-tail Fisher’s exact test with) and H3K27ac (p = 1.19 × 10^−18, OR = 1.91, two-tail Fisher’s exact test markers compared to non-sQTL SNPs ([78]Figure 2A). Increasing evidence indicates that H3K4me3 marks plays an important role in AS regulation.[79]^28^,[80]^29^,[81]^30 It was previously reported that H3K4me3 histone modification was involved in AS regulation through control of spliceosome recruitment.[82]^29 According to the scoring criteria defined by RegulomeDB ([83]Table S4), we found that sQTL SNPs were significantly enriched in SNPs with in the category with a score of one (i.e., these SNPs have been experimentally shown to affect TFs binding) compared to non-sQTL SNPs (p = 1.32 × 10^−42, OR = 2.51, two-tail Fisher’s exact test), indicating that these TFs affected by sQTL SNPs plays an important role in AS regulation ([84]Figure 2B). For RBP binding sites, as expected, we observed that sQTL SNPs were significantly enriched in 30 RBPs binding sites ([85]Figure 2C), providing further evidence that these sQTL SNPs fall in RBPs binding sites relevant to the AS regulation. Notably, identified RBPs with prominent known roles in brain RBPs include SRSF1,[86]^31 LIN28B,[87]^32 TARDBP,[88]^33 ELAVL1,[89]^34 DDX3X,[90]^35 and ATXN2,[91]^36 while the functional significance of the remaining RBPs is unclear. Figure 2. [92]Figure 2 [93]Open in a new tab Enrichment analyses of sQTL SNPs among genetic regulatory elements and RBP (A) Enrichment analysis of sQTL SNPs among variants in five types of regulatory elements (DNase I hypersensitive sites (DHS), H3K4 monomethylation marks (H3K4me1), H3K4 trimethylation marks (H3K4me3), H3K9 acetylation marks (H3K9ac), H3K27 acetylation marks (H3K27ac)). p-values were calculated by two-tailed Fisher’s exact test with Bonferroni correction. (B) Enrichment analysis of sQTL SNPs among TF binding sites. The lower the horizontal coordinate score, the greater the possibility of variants site affecting TF binding. p-values were calculated by two-tailed Fisher’s exact test with Bonferroni correction. (C) The blue dot represents the top 30 most enriched RBPs, and the red dotted line represents p = 0.05 (Bonferroni-Corrected-value threshold). (D) Enrichment analyses of sQTL SNPs among loci associated with 12 brain disorders. Enrichment analysis of sQTL SNPs among brain disease-associated loci To identify the potential contribution of sQTL SNPs to the risk of brain-related diseases, we utilized data mined from the GWAS catalog to perform enrichment analysis for identified sQTL and non-sQTL SNPs. By exploring whether sQTL SNPs are significantly enriched for GWAS loci associated with 12 brain disorders (see [94]STAR methods), we found the sQTL SNPs were significantly enriched in brain disorders compared to non-sQTL SNPs (p < 1.00 × 10^−3, OR = 4.84, one-tailed Fisher’s exact test). Enrichment analysis of sQTL SNPs using summary data from the largest GWAS to date for individual brain disorders was further performed. The most significant enrichment among psychiatric disorders was observed for schizophrenia (p = 2.06 × 10^−47, OR = 4.36), and the most significant enrichment among neurodegenerative disease was observed for Alzheimer (p = 6.48 × 10^−53, OR = 7.05) ([95]Figure 2D). These results suggested that some of these SNPs probably contribute to disease (including schizophrenia and Alzheimer) risk by affecting AS. Enrichment of sGenes in different cell types of the brain To further identify the cell types associated with AS regulation, EWCE was used to estimate the enrichment of these identified sGenes to various cell types. We found that sGene was significantly (Benjamini-Hochberg p < 0.05) enriched in a variety of cell types, including astrocytes ependymal, oligodendrocytes, and pyamidal CA1, and 10 cell subtypes ([96]Figures 3A and 3B). The results indicated that these identified sGenes may have an important role in these cell types. In addition, we explored the potential cell-type differences across the HIPPO (including CA1∼CA4 and DG) and cortex brain regions. We downloaded single-cell sequencing expression data (N = 2,031 samples) of adult mouse HIPPO and cortex region from Batiuk et al.,[97]^37 including AST1, AST2, AST3, AST4, AST5, microglia, neuron, oligodendrocyte, mural cell, and endothelial cell. Interestingly, we observed that sGenes were significantly enriched in AST1, AST2, AST3, AST4, AST5, and oligodendrocyte cell types in both HIPPO and cortex regions. ([98]Figure S2). Many studies have reported the association of astrocytes with diseases, including Alzheimer’s disease.[99]^37^,[100]^38 For example, one study has reported the identification of a population of disease-associated astrocytes in a mouse model of Alzheimer’s disease,[101]^39 suggesting that these results are consistent with our findings. Figure 3. [102]Figure 3 [103]Open in a new tab Enrichment of sGenes in different cell types of the brain and their comparison to DLPFC sQTLs (A) Associations between sGenes and cell types from somatosensory cortex and hippocampal CA1 in mice were marked with asterisks, indicating Benjamini-Hochberg significance (p < 0.05). The results show 3 significantly enriched cell types in the primary classification (p < 0.05), including astrocytes oligodendrocytes and pyamidal CA1. (B) The results show 9 significantly enriched cell types in the secondary classification (p < 0.05). (C) The Venn diagram shows the number and percentage of sQTL SNPs that overlap in the two brain regions. (D) Partitioning heritability enrichment of four brain disorders (Alzheimer’s disease, amyotrophic lateral sclerosis, schizophrenia, and bipolar disorder) in two brain regions, p < 0.05 was significant (Bonferroni-Corrected-value threshold). Among the four diseases, Alzheimer’s disease was the most significantly enriched, especially in the HIPPO. One asterisk (∗) indicates a p-value <0.05; two asterisk (∗∗) indicate a p-value <0.01, and three asterisk (∗∗∗) indicate a p-value <0.001. Tissue specificity of sQTLs between brain HIPPO and DLPFC regions Previous studies have showed that AS events associated with brain diseases tend to be brain region-specific,[104]^22 indicating that there are differences between HIPPO and DLPFC region sQTLs at the level of SNPs. By performing the same sQTL mapping analysis on the DLPFC region (the specific procedure is the same as HIPPO, see [105]STAR methods section for details), we observed approximately 6.8% of sQTL SNPs overlapped between the HIPPO and DLPFC (as shown in the [106]Figure 3C), and we noted that the p values of the overlapping sQTL SNPs were highly significant in both brain regions. To further measure the genetic similarity of sQTLs in two different brain regions, we calculated r[g] using the ldsc software[107]^40 (r[g] = 0.87), indicating a strong genetic similarity between the two brain regions. We further compared the enrichment patterns of 12 brain disease-associated GWAS in sQTL from HIPPO brain regions and DLPFC brain regions, and found significant enrichment in four brain disorders ([108]Figure 3D). Among them, Alzheimer-associated variants are the most significantly enriched in the HIPPO and DLPFC regions (p = 9.49 × 10^−6 and p = 8.15 × 10^−4, respectively) ([109]Figure 3D). Interestingly, sQTL SNPs showed more significant enrichment in HIPPO brain regions compared to DLPFC, suggesting that Alzheimer-related risk variants may be more relevant to HIPPO brain regions. Therefore, studying AS events in HIPPO brain regions for Alzheimer deserves in-depth study. Additionally, genetic variants associated with risk for amyotrophic lateral sclerosis (p = 2.28 × 10^−2 and p = 5.73 × 10^−3, respectively), schizophrenia (p = 1.45 × 10^−3 and p = 1.76 × 10^−4, respectively) and bipolar disorder (p = 1.22 × 10^−3 and p = 1.10 × 10^−4, respectively) were significantly enriched for both HIPPO and DLPFC regions, whereas depression GWAS was not significant for both regions ([110]Figure 3D). Overall, these results are consistent with previous findings[111]^41 that sQTL SNPs identified by HIPPO harbor substantial disease risk, suggesting AS regulation in the HIPPO brain region deserves further study in brain disorders development. sGenes with the context of splicing networks By constructing a splicing co-expression network using robust WGCNA, we identified a total of 14 functional modules of co-expressed sGenes in the HIPPO region ([112]Figure 4A). We performed GO and KEGG enrichment analysis of obtained sGenes modules to identify important biological processes and metabolic pathways ([113]Table S5A). One interesting splicing co-expressed module are turquoise module with the highest number of sGenes ([114]Figure 4B), which was enriched for regulation of mRNA metabolic and RNA splicing process ([115]Figure 4C). Specifically, GO analysis showed that the sGenes contained in the turquoise module were mainly enriched in splicing- and neurodevelopment-associated pathways, including RNA splicing, regulation of RNA splicing, modulation of chemical synaptic transmission, regulation of trans-synaptic signaling, axonal transport, axo-dendritic transport ([116]Figure 4D; [117]Table S5B). KEGG analysis showed that the sGenes contained in the turquoise module were mainly enriched in the pathways related to spliceosome and dopaminergic synapse ([118]Figure 4E; [119]Table S5B), suggesting that the sGenes have an important function in the AS regulation and synaptic transmission. Figure 4. [120]Figure 4 [121]Open in a new tab Splicing co-expression networks in HIPPO (A and B) (A) Network analysis dendrogram based on hierarchical clustering of sGenes by their topological overlap, identifies 14 modules, and (B) indicates the number of differential sGenes enriched in each module, with the most enriched sGenes in the turquoise module. (C–E) (C) GO analysis was performed on the sGenes of each module, and the first two most significant results were extracted (The p-value was corrected by Benjamini-Hochberg). The turquoise cluster module is involved in RNA-splicing processes and various neurodegenerative disease pathways as described in detail by (D) GO and (E) KEGG analyses. TWAS prioritize potential causal genes for Alzheimer The most significant enrichment among Alzheimer-associated loci observed above ([122]Figure 2D) suggested that some of these SNPs could confer risk for Alzheimer’s disease by affecting AS. To further leverage these sQTLs data (using intron excision levels as a reference panel) identified in HIPPO to prioritize potential causal genes for Alzheimer, we performed a TWAS incorporating cis-sQTLs and Alzheimer GWAS. After Bonferrroni correction, we identified 16 genes with intron associations whose imputed cis-regulated expression is associated with Alzheimer ([123]Figure 5A) ([124]Table S6). Of note, APOC1 showed the most significant association, strongly suggesting that APOC1 is a promising causal gene for Alzheimer. Our sQTL analysis showed that rs56131196 was associated with the splicing levels of APOC1 (P[corrected] = 1.56 × 10^−3) (i.e., differential intronic usage for chr19:45419582:45434257 stratified by rs56131196 genotypes) ([125]Figure 5B), suggesting that APOC1 could causally confer Alzheimer risk by affecting AS in the human HIPPO brain. Other top significant genes include PVR, ZNF138, ERCC1, CD2AP, KNSTRN, ANK3, STAG3, CRYL1, SLC12A9, SNRPC, HMGCLL1, LAMTOR4, CCDC18, SLC39A13, and PTK2B. These significant genes represent promising candidate causal genes for Alzheimer. To identify novel risk regions beyond the Alzheimer GWAS identification, we then examined the overlap between the 38 independent genome-wide significant Alzheimer GWAS loci and significant HIPPO TWAS associations. We observed seven GWAS loci harbor HIPPO TWAS associations (overlap of TWAS significant loci (±500 kb) with GWAS significant loci), and nine additional Alzheimer risk loci where significant TWAS loci do not overlap GWAS significant loci. Given the evidence that the HIPPO is associated with the development of Alzheimer’s disease, we believed that these HIPPO sQTLs data would provide new perspectives on Alzheimer risk, especially given the sensitivity of AS regulation to tissue source. Figure 5. [126]Figure 5 [127]Open in a new tab TWAS in Alzheimer’s disease (A) Manhattan plot of TWAS results from intron Alzheimer associations, highlighting the 16 significant gene. The red solid line represents Bonferroni corrected p-value threshold (i.e., p = 1.13 × 10^−5). (B) The violin diagram shows association of rs56131196 with the PSI of APOC1 in our dataset (P[corrected] = 1.56 × 10^−3 in our sQTL analysis). The distribution of PSI abundances in a genotype class, that is, AA (N = 10), AG (N = 66), or GG (N = 156). The superimposed boxplot is shown with horizontal black lines and white boxes for median and quartile distances, respectively. Outliers are shown as black dots. Discussion Our work provides the most comprehensive genome-wide sQTLs map of the HIPPO, a critical region in brain prominently implicated in the pathogenesis of brain disorders, significantly expanding our understanding of AS regulation in the HIPPO brain. We have comprehensively characterized properties and regulatory elements of the detected sQTL SNPs, enabling the incorporation of tissue specificity to previous brain genetic variant interpretation and functional genomic analysis. By comparing sQTLs of the HIPPO and DLPFC region from the same individuals, we found that approximately over 90% of significant sQTLs in HIPPO are distinct from DLPFC, indicating that sQTL detecting in HIPPO warrants more attention in future studies. By integrating sQTLs with GWAS data via TWAS, we identified novel putative mechanisms detected in HIPPO brain datasets, demonstrating the distinct contribution of sQTLs to the genetic architecture underpinning brain disorders variation. These results collectively confirmed that sQTL SNPs identified in the HIPPO is a key data resource for deciphering the genetic mechanisms underlying complex brain disorders. To further leverage these splicing to characterize Alzheimer loci with the specific brain region, we performed a TWAS analysis by integrating cis-sQTLs and GWAS to identify potential risk genes whose change in splicing are associated with Alzheimer. Our analysis corroborated seven known Alzheimer genes, including APOC1,[128]^42 STAG3,[129]^43 CRYL1,[130]^44 SNRPC,[131]^45 LAMTOR4,[132]^46 SLC39A13,[133]^47 and PTK2B,[134]^48 and also identified nine genes that have not previously been reported in suggestive Alzheimer loci. As a proof of principle, we highlight two of these genes: ERCC1 and ANK3. ERCC1 is a protein coding gene, which is involved[135]^45 in the recognition and nucleotide excision repair pathway.[136]^49 Interestingly, age-related motor neuron degeneration has been observed in Ercc1 (Delta/-) mice with impaired DNA repair systems.[137]^50 Considering that Alzheimer’s disease is the most important age-related neurodegenerative disorder, these results demonstrate that ERCC1 may confer risk of Alzheimer by affecting the function of age-related motor neurons. ANK3 (ankyrin 3, node of Ranvier) encodes ankyrin 3 or G, which functions as a scaffold linking plasma membrane proteins to the actin/β-spectrin cytoskeleton.[138]^51 Notably, ANK3 has been reported to contribute significantly in the brain specific isoforms found in axons, particularly associated with the voltage-dependent sodium channel.[139]^52 Evidence from linkage studies suggests that ANK3 are involved in the genetic architecture of late-onset Alzheimer’s disease,[140]^53 indicating that this gene is a particularly promising candidate for Alzheimer locus. Therefore, our results provide further support and guidance for future experimental validation and serve to elucidate the role of these sQTL SNPs in the etiology of Alzheimer. We also explored the potential function of allele-specific sQTL for gene ApoE4 or other key susceptible genes of Alzheimer summarized by Pasqualetti G et al.’s work.[141]^48 For gene ApoE4, we found that ApoE4 were not identified in our HIPPO sQTL analysis, possibly due to the use of normal human rather than Alzheimer patient samples for sQTL analysis and sample size limitations. For other key susceptible genes (N = 32) of Alzheimer, we found that several SNPs were significantly associated with the splicing levels of ZCWPW1, BIN1, CD2AP, CCELF1, HLA-DRB1, IQCK, PTK2B, SSLC4A4, and WWOX ([142]Figure S3), implying that these genes could causally confer Alzheimer risk by affecting AS in the human HIPPO brain. However, we found that the remaining 23 Alzheimer key susceptibility genes were not identified in our HIPPO sQTL analysis. Obtaining genotype and transcriptome sequencing data from HIPPO brain region samples of Alzheimer patients for sQTL analysis may help to reveal the potential functions of these Alzheimer genes in the future work. By comparing the enrichment patterns of brain disease-associated GWAS in sQTL between HIPPO and DLPFC brain regions, we found that Alzheimer-associated variants were more significantly enriched in the HIPPO regions ([143]Figure 3D). Our results suggest that the HIPPO region may be implicated in the pathogenesis of Alzheimer’s disease. Indeed, there is growing evidence that many AS events are highly brain tissue specific (i.e., highly heterogeneous).[144]^22^,[145]^54 The DLPFC is a neocortical region that is the hub of cognitive circuits and is affected in Alzheimer. Interestingly, previous studies have reported that the HIPPO region was one of the earliest brain regions to be affected by the neuropathology of Alzheimer’s disease, indicating a key role for HIPPO in Alzheimer’s disease pathology.[146]^55^,[147]^56 In addition, Alzheimer’s disease is also characterized by the pathological deposition of amyloid-β peptides in the HIPPO.[148]^57 These lines of evidence support that the HIPPO may be an important and promising brain region for Alzheimer treatment. For regulatory elements, we found that sQTL SNPs are significant enriched in H3K4me3 (p = 2.04 × 10^−21, OR = 2.10, two-tail Fisher’Q exact test), H3K9ac (p = 1.69 × 10^−17, OR = 1.89, two-tail Fisher’s exact test) and H3K27ac (p = 1.19 × 10^−18, OR = 1.91, two-tail Fisher’s exact test) compared to non-sQTL SNPs after performing Bonferroni correction ([149]Figure 2A). Increasing evidence indicates that H3K4me3 marks plays an important role in AS regulation.[150]^28^,[151]^29^,[152]^30 It was previously reported that H3K4me3 histone modification was involved in AS regulation through control of spliceosome recruitment. Interestingly, we also found that sQTL SNPs were significant enriched in 5' UTRs sites ([153]Figure 1C), implying that 5' UTRs are more prone to AS, which is consistent with results reported in the literature.[154]^58 Growing evidence suggests that sQTL SNPs occurring at 5' UTR are involved in transcriptional activation, as the H3K4me3 marks are enriched in the 5' end of gene bodies, which normally includes the 5′ UTRs.[155]^59 In addition, the sGenes contained in the turquoise module were also enriched in brain-related disorders ([156]Figure 4E), especially in multiple neurodegenerative diseases (including Alzheimer and Parkinson), which is consistent with the results of enrichment analyses of sQTL SNPs among disease-associated loci. In summary, we performed a well-powered sQTL analysis in the HIPPO of human brain and further decoded these sQTL SNPs’ function characteristics, regulatory elements and RBP enrichment, and associated brain disorders. Our results may advance the understanding of how functional genetic variants confer brain disease risk by affecting AS regulation in specific tissues. Limitations of the study Our study also has several limitations. First, although the sample size (N = 264) was sufficient to confidently detect sQTL in the HIPPO, most of brain disorders heritability remain unexplained. Additional significant sQTLs may be found with very large sample sizes. Second, we limited the current sQTL analysis to only tissues from a single brain region (i.e., HIPPO), which contains a mixture of multiple cell types. Since sQTLs are often occur at cell type-specific manner, this may mask the ability to detect cell-type-specific sQTLs, particularly affecting a minor cell population in the bulk tissues. Future analyses of sQTLs using large-scale RNA-seq data at cell types or single cells resolutions will provide additional informative data resource of genetically AS regulation in human brain. Third, our study identified multiple statistically significant genetic variants affecting splicing in the HIPPO, but further functional experimental validations are required to elucidate the molecular mechanisms of these significant sQTL SNPs. STAR★Methods Key resources table REAGENT or RESOURCE SOURCE IDENTIFIER Deposited data __________________________________________________________________ GWAS of SCZ Trubetskoy et al. [157]^60 [158]https://pgc.unc.edu/for-researchers/download-results/ GWAS of BP Mullins et al.[159]^61 [160]https://pgc.unc.edu/for-researchers/download-results/ GWAS of DEP Howard et al.[161]^62 [162]https://pgc.unc.edu/for-researchers/download-results/ GWAS of ALZ Douglas et al.[163]^63 [164]https://pgc.unc.edu/for-researchers/download-results/ GWAS of ADHD Demontis et al. [165]^64 [166]https://pgc.unc.edu/for-researchers/download-results/ GWAS of ALS Wouter et al.[167]^65 [168]https://www.ebi.ac.uk/gwas/efotraits/MONDO_0004976 GWAS of PD Jiang L et al.[169]^66 [170]https://www.ebi.ac.uk/gwas/efotraits/MONDO_0005180 GWAS of MS Jiang L et al.[171]^66 [172]https://www.ebi.ac.uk/gwas/efotraits/MONDO_0005301 GWAS of FTD Deerlin et al.[173]^67 [174]https://gwas.mrcieu.ac.uk/datasets/ieu-b-43/ GWAS of Stroke Jiang L et al.[175]^66 [176]https://www.ebi.ac.uk/gwas/efotraits/EFO_0000712 GWAS of Autism Jakob et al.[177]^68 [178]https://pgc.unc.edu/for-researchers/download-results/ GWAS of Epilepsy Jiang L et al.[179]^66 [180]https://www.ebi.ac.uk/gwas/efotraits/EFO_0000474 __________________________________________________________________ Software and algorithms __________________________________________________________________ R softwares R project [181]https://www.r-project.org/ clusterProfiler R package Yu et al.[182]^69 [183]https://mrcieu.github.io/TwoSampleMR fastp Chen et al.[184]^70 [185]https://github.com/OpenGene/fastp STAR Dobin et al.[186]^71 [187]https://github.com/alexdobin/STAR WASP van de Geijn et al.[188]^72 [189]https://github.com/bmvdgeijn/WASP LeafCutter Li et al.[190]^73 [191]https://github.com/davidaknowles/leafcutter FastQTL Halit et al.[192]^74 [193]https://github.com/francois-a/fastqtl BEDtools Quinlan et al.[194]^75 [195]https://github.com/arq5x/bedtools2 PLINK1.9 Purcell et al.[196]^76 [197]https://www.cog-genomics.org/plink/1.9/ LDSC Bulik et al.[198]^40 [199]https://github.com/bulik/ldsc WGCNA R package Peter et al.[200]^77 [201]http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpac kages/WGCNA FUSION Gusev et al.[202]^78 [203]https://github.com/gusevlab/fusion_twas EWCE Nathan et al.[204]^79 [205]https://github.com/NathanSkene/EWCE SnpEff Pablo et al.[206]^80 [207]https://pcingola.github.io/SnpEff/ [208]Open in a new tab Resource availability Lead contact Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact: Prof, Junfeng Xia ([209]jfxia@ahu.edu.cn). Materials availability This study did not generate new unique reagents. Method details RNA-seq data of the HIPPO tissue The RNA-seq FASTQ files for the HIPPO region of human brain were downloaded from BrainSeq2[210]^25 database ([211]http://eqtl.brainseq.org/phase2/) via Globus file transfer tool. Participants with no prior history of neurological and/or neuropsychiatric conditions were included in the study, and the final number of individuals included in this study was N = 264 (The detailed ancestry distribution is shown in [212]Figure S1). Briefly, total RNA was extracted by using RNeasy Lipid Tissue Mini Kit (QIAGEN)]. Paired-end sequencing libraries preparation was performed using the TruSeq Stranded Total RNA Library kit (Illumina), after which the libraries were sequenced using Illumina HiSeq 3000 platform. The procedures for sample preparation, RNA sequencing, quality check and data processing have been described in detail in the original study.[213]^25 Downloaded FASTQ files for mapped reads from each individual were pre-processed for quality control with fastp[214]^70 and then converted into BAM files by using STAR.[215]^71 SNP genotyping data Additional quality-controlled DNA genotyping data for the same HIPPO individuals (N = 264) were also obtained from the BrainSeq2 resource dataset. The genotyping of DNA samples was performed on the Illumina platform (build hg19/GRCh37). The human postmortem brain tissue was obtained by autopsy primarily from the Offices of the Chief Medical Examiner of the District of Columbia, and the National Institute of Child Health and Human Development Brain and Tissue Bank for Developmental Disorders. After imputation and quality control (SNPs with MAF >5% and Hardy–Weinberg p > 1 × 10^−6) steps, totaling 7,023,860 high-quality variants remained in this dataset. More detailed information about quality assessment, genotyping analysis, and SNPs identification can be found in the originally literature.[216]^25 Of these remaining SNPs, we extracted all SNPs that are within ±100 kb of the identified AS events and used them for cis-sQTL mapping. Detection and quantification of AS events To comprehensively perform splicing analyses of RNA-seq data from HIPPO tissue, we used the LeafCutter algorithm[217]^73 to calculate intron excision ratios, which represents the proportion of usage of an intron relative to other introns in the same cluster. The ‘percentage spliced-in’ value (PSI) was calculated as a quantification of the intron excision. Before running LeafCutter, the alignment and quantification of RNA-seq reads (i.e., converted BAM files) were processed using STAR[218]^71 and reads exhibiting potential mapping bias were removed with WASP.[219]^72 We then ran LeafCutter with default parameter settings to perform intron clustering, which allows maximum introns with length up to 500 kb. By using prepare_genetype_table.py script in LeafCutter, intron excision ratios were calculated and introns used in less than 40% of individuals or with almost no variation were filtered out. After normalization and quantile standardization of intron excision rates, the final splicing expression matrix (i.e., PSI matrix) was obtained, which could be used as the standard input data for FastQTL.[220]^74 Splicing QTL mapping To explore significant associations between SNP dosages and PSI of AS, we performed cis-sQTL mapping (within 100 kb of intron clusters) using linear regression implemented in the FastQTL software.[221]^74 Considering the influence of genetic, biological, and technical factors, we conducted covariance correction analysis to control potential confounding variables. For the selection of covariates, we first employed the first five principal components derived using GCTA[222]^81 of the genotype matrix to account for the influence of genetic ancestry, as well as the first 10 principal components of the PSI phenotype matrix to regress out the effect of known/hidden factors. The principal components analysis was then conducted to regress out the technical and biological covariates such as age, sex and RNA integrity number (RIN). Furthermore, we ran the permute comparison model in FastQTL and set 1,000–100,000 adaptive permutes to calculate the p-value (i.e., permutation p-value) for each significant sQTL SNP. The calculated Beta-approximated empirical p-values were corrected using the Benjamini-Hochberg correction to account for the multiple tests, and we detected all significant sQTLs (FDR <0.05) corresponding to AS events. sQTL and non-sQTL datasets generating To further characterize the sQTL SNPs, we generated two sets of sQTL and non-sQTL SNPs for comparative analysis by screening Benjamini-Hochberg corrected p-values and pruning SNPs. Specifically, we first selected the most significant SNP (i.e., the most representative sQTL SNP for each AS event) among the sQTL SNPs after correcting the p values. And the SNPs screened above with linkage disequilibrium (LD) were pruned using PLINK software (version 1.9)[223]^76 with the parameters “--indep-pairwise 50 5 0.5”, which represent window size, shift step and r^2 threshold, respectively. The haplotype of 1000 Genome Project Phase 3 (EUR)[224]^82 were used as the reference genotypes for LD-based pruning in PLINK. For the dataset of sQTL SNPs, we extracted a total of 6,966 SNPs that contributed to AS regulation independently of each other after screening the most significant SNPs among the sQTL SNPs with corrected p value <0.05 and LD pruning. For the controlled dataset of non-sQTL SNPs, we obtained a total of 124,801 SNPs that did not contribute to AS regulation and independent of one another through screening the most significant SNPs among the sQTL SNPs with uncorrected p value >0.05, and LD pruning with the same setting. Overall, we generated 6,966 sQTL SNPs and 124,801 non-sQTL SNPs strongly independent of each other for downstream analysis, including functional annotation and functional enrichment analysis. Functional annotation of sQTL and non-sQTL SNPs For sQTL and non-sQTL SNPs, we functionally annotated these SNPs using SnpEff v5.1.[225]^80 Information on SnpEff annotations was gathered using MyVariant.info ([226]http://myvariant.info/) and VeP (Variant Effect Predictor).[227]^83 Based on the annotation information, the variants (including sQTL and non-sQTL SNPs) were classified into 9 categories, and overlapping functional categories were prioritized as: synonymous >5' UTR >3' UTR > intergenic > intron > splice region (defined as the variant within 1–3 bases of exon and 3–8 bases of intron from the nearest intron–exon boundary) > non-coding exon > splice acceptor > splice donor. Based on the different functional classifications of sQTL SNPs, we then performed enrichment analysis for each functional class of sQTL SNPs using Fisher's exact test for 2 × 2 table (i.e., the rows listed as SNPs "belonged" and "not belonged" to the specific classification, and columns listed as sQTL SNPs and non-sQTL SNPs). Functional enrichment of sQTL SNPs among genetic regulatory elements and splicing factor binding sites To investigate the functional enrichment of sQTL SNPs in various regulatory elements in HIPPO tissue, we downloaded the histones annotation information with BED format (including monomethylated histone H3 lysine 4 (H3K4me1), trimethylated histone H3 lysine 4 (H3K4me3), acetylated histone H3 lysine 9 (H3K9ac), acetylated histone H3 lysine 27 (H3K27ac) and DNase I hypersensitive sites (DHS)) from the Encyclopedia of DNA Elements project (ENCODE portal[228]^84) and CistromeDB[229]^85 database. BEDTools[230]^75 was used to analyze whether SNPs are included in these regulatory elements. In addition to the histones, transcription factor (TF) is another important transcriptional regulator involved in transcriptional regulation. To determine the effect of SNPs on gene transcriptional regulation by TF, we scored these variants using RegulomeDB[231]^86 v2.0 ([232]http://www.regulomedb.org), which allowed us to focus on non-coding variants that may directly affect TF binding (i.e., categories 1 and 2 defined by RegulomeDB). Considering that AS events are largely regulated by RNA binding proteins (RBP), we also downloaded publicly available human RBP binding sites data (bed format files and built hg38) from CLIPdb database.[233]^87 We utilized liftOver[234]^88 to convert the coordinates of sQTL SNPs from hg19 to hg38 and used BEDTools to assess the enrichment of sQTL SNPs on RBP binding site. Ultimately, the enrichment analyses of sQTL SNPs among genetic regulatory elements and RBP binding sites were conducted by Fisher’s exact test with 2 × 2 table (i.e., the rows listed as SNPs within and not within the regulatory element and RBP binding sites, and columns listed as sQTL SNPs and non-sQTL SNPs). Functional enrichment of sQTLs among brain disease-associated loci SNPs in high LD with the sQTL SNPs (r^2 > 0.8) were calculated with PLINK software using the 1000 Genomes Project reference data. We downloaded GWAS variants information from the GWAS Catalog v.1.0.2[235]^89 (only SNPs with P values < 1 × 10^-5 were included in this list) and then used these GWAS variants information to annotate high LD SNPs. BEDTools was used to determine whether the SNPs obtained above falls within disease-associated loci. Moreover, we performed functional enrichment analysis of sQTLs among brain disease-associated loci by Fisher’s exact test with 2 × 2 table (i.e., the rows listed as SNPs within and not within the brain disease-associated loci, and columns listed as sQTL SNPs and non-sQTL SNPs). We conducted enrichment analyses for the following 12 brain disorders: Alzheimer, amyotrophic lateral sclerosis, Parkinson, frontotemporal dementia, epilepsy, autism, schizophrenia, bipolar disorder, depression, attention deficit hyperactivity disorder, multiple sclerosis, and stroke. DLPFC and HIPPO tissue comparison To compare effect sizes consistently between DPPFC and HIPPO brain regions, we also obtained RNA-seq data in the BrainSeq2 for DLPFC with the same samples as HIPPO, and the related data processing can be found in the original publication.[236]^25 The RNA-seq data processing procedure (including quality control, PSI calculating and sQTL detecting) of DLPFC is the same as that of HIPPO. For SNPs corresponding to multiple splicing events at a single mutation site, we selected the SNP with the smallest P value. Eventually, we identified 8,423 sQTL SNPs in DLPFC for subsequent comparative analysis, including partitioned heritability. Partitioned heritability LD Score software v1.0.1[237]^90^,[238]^91 was used to calculate partitioned heritability to identify enrichment of GWAS summary statistics in DLPFC and HIPPO brain regions. The sQTL annotations from both brain regions were created by using a window size of 500 bp (±250 bp) around each sQTL SNPs, resulting in 11,157 HIPPO sQTL annotations ([239]Table S1) and 13,383 DLPFC sQTL annotations ([240]Table S2). The annotation files were created by tagging all HapMap3[241]^92 SNPs that fell within the sQTL annotations. By using 1000 Genomes Phase 3 (EUR)[242]^93 as LD reference panel, we calculated LD-scores with an LD window of 1 cM. The enrichment of each annotation was calculated by dividing the proportion of heritability explained by each annotation by the proportion of SNPs in the genome that fell into that annotation category. The sQTLs in DLPFC and HIPPO tissue were examined separately for enrichment in genetic variants associated with 12 brain disorders mentioned in Section [243]functional enrichment of sQTLs among brain disease-associated loci. Expression-weighted cell type enrichment analysis (EWCE) To explore sQTLs-relevant cell types, we employed R package to perform cell type enrichment analysis for sQTL SNPs sets. We downloaded single-cell RNA-seq gene expression data of mouse somatosensory cortex and hippocampal CA1 region from Zeisel et al.,[244]^94 including 9 cell types and 47 cell subclasses. The EWCE R-package[245]^79 was used to investigate cell-type expression specificity of identified sQTL-related genes, and analysis results were considered to be significant at P < 0.05. Further details on the principles, model algorithm implementation of the EWCE analysis can be found in the publication.[246]^79 Weighted gene co-expression network analysis (WGCNA) We mapped intron clusters (obtained from the LeafCutter) to sGenes based on the hg19 reference genome annotation downloaded from GENCODE.[247]^95 The individual genes corresponding to multiple probe sets were merged, resulting in a final splicing matrix of 15,277 genes. Network analysis was performed by a robust consensus WGCNA (rWGCNA)[248]^77 to cluster sGenes into specific modules based on biased intermediate correlations among gene expression patterns. A soft-threshold power of 5 was chosen to achieve approximate scale-free topology of the network (r^2 > 0.9), and each co-expression network used the same estimated power parameter. Gene modules were defined by using the topological overlap dendrogram with specific module cutting parameters (deepSplit = 2, minimum module size = 30, merge threshold = 0.25), and sex and age were selected as phenotypes. For each module of genes, we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis using the clusterProfiler R package.[249]^69 The calculated P values were corrected with Benjamini-Hochberg correction and top two significant terms are reported. TWAS analysis We performed TWAS analysis using the FUSION package ([250]http://gusevlab.org/projects/fusion/) by integrating GWAS summary statistics of Alzheimers disease[251]^63 and PSI in HIPPO brain tissue. The SNP-based predictive weights of each intron in HIPPO were calculated by the computational weight pipeline described in FUSION, which represents the correlation between SNPs and AS regulation in the reference panel. Furthermore, TWAS was performed using the FUSION software[252]^78 with several regularized linear models (including Elastic Net, LASSO, and BLUP) to evaluate the imputation of the expressions. More details of the TWAS analysis can be found in our previous study.[253]^96 The absolute value of the Z-score calculated by TWAS reflects the strength of the association between the intron usage (i.e., AS events) of interest and the disease. The multiple correction was applied to the TWAS analysis results, and the significance level was set at 1.13 × 10^−5 (Bonferroni correction 0.05/4,420). Multiple comparison To avoid the false positive and false negative results, the Bonferroni correction and Benjamini-Hochberg correction for multiple testing were applied to the analysis results in our study. For all enrichment analysis and TWAS analysis, the P-value adjustments were performed using Bonferroni correction method. For sQTL mapping, EWCE and WGCNA analysis, the P-value was further corrected with Benjamini-Hochberg method. Quantification and statistical analysis All quantification and statistical analyses were performed as described in the [254]method details section of the [255]STAR methods. Acknowledgments