Abstract
GWAS of Alzheimer’s disease have been predominantly based on European
ancestry cohorts with clinically diagnosed patients. Increasing the
ancestral diversity of GWAS and focusing on imaging brain biomarkers
for Alzheimer’s disease may lead to the identification of new genetic
loci. Here, we perform a GWAS on cerebral β-amyloid deposition measured
by PET imaging in 3,885 East Asians and a cross-ancestry GWAS
meta-analysis with data from 11,816 European participants. Our GWAS
analysis replicates known loci (APOE4, CR1, and FERMT2) and identifies
a novel locus near SORL1 that is significantly associated with
β-amyloid deposition. Single-nucleus expression analysis shows that
SORL1 is differentially expressed according to β-amyloid positivity in
microglia. Our joint association analysis using the SORL1 lead variant
(rs76490923) and the APOE4 allele demonstrates that the risk of
β-amyloid deposition is reduced by up to 43.5% in APOE4 non-carriers
and up to 55.6% in APOE4 carriers, according to the allelic dosage of
the rs76490923 T allele. Our findings suggest that SORL1 may play an
important role in the pathogenesis of Alzheimer’s disease, particularly
in relation to β-amyloid deposition.
Subject terms: Genome-wide association studies, Alzheimer's disease,
Gene expression
__________________________________________________________________
A cross-ancestry GWAS meta-analysis of β-amyloid deposition in East
Asian and European populations identifies a novel SORL1 locus and
highlights its differential expression in microglia associated with
β-amyloid positivity.
Introduction
Alzheimer’s disease (AD) is a complex neurodegenerative disease with a
high genetic heritability estimated to be 60–80%^[62]1. Large
genome-wide association studies (GWAS) have identified genetic variants
associated with AD and highlighted important genes and pathways in AD
pathology^[63]2,[64]3. However, while over 70 genetic loci were
identified by leveraging up to 1,126,563 samples, they only explained
about 15% of the phenotypic variance in the diagnosis of AD^[65]3. This
may be partially due to the failure of outcome phenotypes of GWAS to
match the biology of the disease and the lack of sufficient ancestral
diversity in study populations.
Over the past decades, the paradigm of AD diagnosis has shifted from
clinical diagnosis to biomarker-based diagnosis. The recently revised
criteria for diagnosis and staging of AD emphasize abnormalities on
core biomarkers for the diagnosis of AD, of which amyloid positron
emission tomography (PET) is the best validated^[66]4. However, most
large GWAS on AD used clinically diagnosed AD or proxy AD using family
history as the outcome phenotype^[67]2,[68]3. Recently, GWAS analyses
using imaging or fluid biomarkers to identify genetic factors that are
more biologically relevant have been reported^[69]5–[70]9. For
instance, a GWAS meta-analysis using amyloid PET identified genome-wide
significant loci in genes previously known to be associated with
clinical AD, such as APOE, CR1, FERMT2, and ABCA7, along with a novel
African-specific variant in PTDSS1. This observation supports their
involvement in AD pathology via amyloid-related mechanisms and suggests
the importance of cross-ancestry biomarker-based research in expanding
our understanding of AD genetics^[71]5.
The ethnic diversity of study populations is also an essential factor
that could contribute to significant discoveries. The majority of GWAS
to date have been conducted primarily in European (EUR) populations,
not only limiting the generalization of the findings in non-EUR
populations but also inevitably missing many variants that are absent
or rare in EUR. Given that only a small fraction of variants is common
across different populations^[72]10, many biologically relevant but
rare variants may not have reached genome-wide significance due to the
insufficient statistical power. Therefore, multi-ethnic GWAS including
non-EUR populations may identify novel genetic loci.
In this study, we performed a cross-ancestry GWAS meta-analysis of
cerebral β-amyloid (Aβ) deposition in EUR and East Asian (EAS)
populations. We identified variants associated with a core pathological
biomarker of AD using quantified uptake on amyloid PET. We demonstrated
that the lead variant in the SORL1 region is associated with SORL1
expression specifically in microglia, based on single-nucleus RNA
sequencing (snRNA-seq) data from the ROSMAP (Religious Orders
Study/Memory and Aging Project)^[73]11. Finally, we investigated
whether SORL1 was differentially expressed according to Aβ pathology in
single-cell transcriptome data of postmortem brain samples.
Results
β-amyloid genetic association analysis and meta-analysis in EAS population
(stage 1)
For the discovery GWAS analysis (stage 1), Aβ and genome-wide genotype
data for 3387 EAS samples were obtained from the Korea-Registries to
Overcome and Accelerate Dementia research (K-ROAD) cohort from 2017 to
2023. The stage 1 participants had a mean age of 72.0 years (standard
deviation [SD] = 8.4), and 63.4% were female (Supplementary
Data [74]1). After stringent quality control (QC) and genotype
imputation, we performed a GWAS for Aβ with overlapped 5,651,782
variants in 2049 GWAS microarray chip data and 1338 whole-genome
sequencing (WGS) data (Supplementary Fig. [75]1). In addition, we
obtained external summary statistics-level GWAS data in EAS, including
498 samples and 5,619,153 variants from a previous multi-ethnic GWAS
performed by Ali et al.^[76]5. We then performed a genome-wide
fixed-effect meta-analysis for Aβ between K-ROAD (chip array and WGS
data) and the external GWAS summary statistics, which retained the
association results for 5,110,277 autosomal variants with minor allele
frequency (MAF) > 0.5% in both cohorts. A total of 3885 samples and
5,110,277 variants (stage 1) were included in the meta-analysis of EAS
population (Fig. [77]1a).
Fig. 1. Manhattan plots of EAS and cross-ancestry meta-GWASs for Aβ
deposition.
[78]Fig. 1
[79]Open in a new tab
a Manhattan plot of the GWAS meta-analysis for Aβ deposition in EAS
(stage 1, n = 3855). b Manhattan plot of the cross-ancestry GWAS
meta-analysis for Aβ deposition in EAS and EUR populations
(n = 15,701). p values for GWASs were calculated using a fixed-effect
inverse variance-weighted meta-analysis. The x-axis represents
chromosomal position, and the y-axis represents the −log[10](p value)
for the association of variants with Aβ deposition. Reported p values
are two-sided and not corrected for multiple testing. The light green
dots represent the genome-wide significant lead variants. The red text
highlights previously unreported loci along with their mapped genes,
while the black text indicates previously reported loci. The pink and
blue horizontal dashed lines indicate the genome-wide significance
level (p value = 5.00 × 10^−8) and the genome-wide suggestive level (p
value = 1.00 × 10^−5), respectively. EAS East Asian, EUR European, GWAS
genome-wide association study.
We identified three genome-wide significant loci (p < 5.00 × 10^−8),
comprising a total of 87 variants, from the GWAS meta-analysis for Aβ
in EAS. The most significant association was observed with rs429358, a
single-nucleotide polymorphism (SNP) involved in determining APOE4
status, which demonstrated a strong association with Aβ levels
(beta = 0.5589, standard error [SE] = 0.0244, p = 8.70 × 10^−116).
Moreover, 73 variants were in high linkage disequilibrium (LD) with
rs429358 and located near the APOE region on chromosome 19. On
chromosome 11, a set of 12 variants, led by rs76490923 (beta = −0.1661,
SE = 0.0279, p = 2.46 × 10^−9), exceeded genome-wide significance and
had not been previously reported in the largest Aβ GWAS^[80]5. These
variants were located within the intron region of the SORL1 gene and
demonstrated high LD with each other. In addition, the rs4851914 single
variant located intron region of the ECRG4 gene was identified on
chromosome 2 (beta = 0.3926, SE = 0.0712, p = 3.47 × 10^−8)
(Table [81]1). The λ[GC] was 1.093, and the SNP-based heritability for
Aβ was estimated to be 14.0% in the EAS GWAS meta-analysis.
Table 1.
Genetic variants associated with Aβ deposition in the meta-analysis of
East Asian population
Gene No. of variants^a Lead variant locus (hg38) NEA EA Beta SE p value
EAF (stage 1^b) EAS EAF (gnomAD^c)
APOE 74 rs429358 chr19:44908684 T C 0.5589 0.0244 8.70 × 10^−116 0.1865
0.0978
SORL1 12 rs76490923 chr11:121583074 C T −0.1661 0.0279 2.46 × 10^−9
0.2267 0.2104
ECRG4 1 rs4851914 chr2:106075677 A G 0.3926 0.0712 3.47 × 10^−8 0.0617
0.0815
[82]Open in a new tab
East Asian meta-GWAS for Aβ deposition was performed using a
fixed-effect inverse variance-weighted meta-analysis. All reported p
values are two-sided and unadjusted for multiple testing.
chr chromosome, NEA non-effect allele, EA effect allele, EAF effect
allele frequency, EAS East Asian, SE standard error, GWAS genome-wide
association study.
^aThe number of significant variants (p value < 5.00 × 10^−8) of GWAS
for stage 1 in each locus.
^bEAFs were derived from the 3387 individuals with genotype data in the
stage 1 GWAS.
^cEAFs were derived from the Genome Aggregation Database (gnomAD
version 4.1.0, [83]https://gnomad.broadinstitute.org/).
Replication analysis for novel associations in EAS population (stage 2)
To replicate the significantly associated SNPs in the SORL1 and ECRG4
regions with Aβ level, we genotyped an additional 753 samples in the
next phase of K-ROAD from 2023 to 2024. The stage 2 participants had a
mean age of 70.8 years (SD = 8.8), and 62.4% were female (Supplementary
Data [84]1). We estimated genetic relationships and selected 655
unrelated samples (stage 2) with the discovery samples (stage 1) that
were eligible for replication analysis. We then performed linear
regression analyses adjusted for age, sex, and five genetic principal
components (PCs) with Aβ as the dependent variable using the
significantly associated 12 SNPs in SORL1 and the rs4851914 variant in
ECRG4.
In the replication analysis, we found that the 11 SNPs identified in
the SORL1 region from the stage 1 GWAS were associated with Aβ levels.
The lead SNP, rs76490923, was significantly associated with Aβ levels
(beta = −0.1527, SE = 0.0561, p = 6.67 × 10^−3), showing a similar
effect size to the discovery results (beta = −0.1661 [stage 1] vs.
−0.1527 [stage 2]). The detailed results are shown in Supplementary
Data [85]2. In contrast, the variant rs4851914 in the ECRG4 region was
not significantly associated with Aβ levels in the replication sample
(beta = 0.0874, SE = 0.1100, p = 0.427).
Comparison of SORL1 variants for β-amyloid between EAS and EUR
To investigate ethnic differences in the identified SORL1 variants, we
obtained the summary statistics of EUR GWAS for Aβ, including 11,816
samples and 5,622,864 variants from a large-scale EUR GWAS performed by
Ali et al.^[86]4 and then compared these results with our GWAS signals.
In the EUR GWAS for Aβ, we found that rs76490923 had a marginally
significant association (beta = −0.1240, SE = 0.0404, p = 2.20 × 10^−3)
with Aβ in EUR population. rs76490923 was associated with decreased Aβ
levels and exhibited the same effect directionality (beta coefficient,
−0.1240 [EUR] vs. −0.1661 [EAS]) as in our EAS GWAS results
(Fig. [87]2). In contrast, the lead SNP, rs76490923, showed a notable
difference in MAF between the two populations. According to the gnomAD
database^[88]12, the MAF for rs76490923 differed nearly tenfold between
EUR (non-Finnish, MAF: 0.021) and EAS (MAF: 0.210).
Fig. 2. Regional plots for SORL1 signals associated with Aβ deposition.
[89]Fig. 2
[90]Open in a new tab
a Regional plot for SORL1 signals from the East Asian GWAS. b Regional
plot for SORL1 signals from the European GWAS. c Regional plot for
SORL1 signals from the cross-ancestry meta-GWAS. p values for GWASs
were calculated using a fixed-effect inverse variance-weighted
meta-analysis. Each dot is colored by r^2 of linkage disequilibrium
with the purple-colored lead SNPs indicated with texts. Reported p
values are two-sided and not corrected for multiple testing. The x-axis
represents chromosomal position, and the y-axis represents the
−log[10](p value) for the association of variants with Aβ deposition.
EAS East Asian, EUR European, GWAS genome-wide association study, SE
standard error, MAF minor allele frequency, gnomAD Genome Aggregation
Database.
Cross-ancestry GWAS meta-analysis in EAS and EUR populations
To maximize the power of gene discovery, we conducted a cross-ancestry
meta-analysis with Aβ GWAS summary statistics in EAS (stage 1
[n = 3885] in this study) and EUR (n = 11,816 from Ali et al.^[91]4)
using METAL^[92]13.
A total of 15,701 samples and 6,648,274 variants were included in the
cross-ancestry GWAS meta-analysis. The λ[GC] was 1.035. We identified
four genome-wide significant loci associated with Aβ levels
(Fig. [93]1b). The APOE allele, rs429358, demonstrated the most
significant association with Aβ levels (beta = 0.5982, SE = 0.0122,
p < 1.00 × 10^−320). The SORL1 signals on chromosome 11 (rs76490923,
beta = −0.1526, SE = 0.0230, p = 3.09 × 10^−11), initially identified
in the EAS GWAS (stage 1), demonstrated even a greater significance
level in the cross-ancestry meta-analysis (Fig. [94]2 and Supplementary
Fig. [95]2). The remaining two loci had been previously reported in EUR
GWAS (21 variants near the CR1 gene and 16 near the FERMT2 gene) and
also exhibited greater significance levels in the cross-ancestry
analysis than the EUR GWAS (Supplementary Data [96]3 and Supplementary
Figs. [97]3–[98]5). However, the ABCA7 locus on chromosome 19,
previously reported in EUR GWAS^[99]4, was not significantly associated
(rs12151021, beta = 0.0054, SE = 0.0317, p = 0.864) with Aβ in our EAS
GWAS (stage 1).
At the suggestive level of association (p < 1.00 × 10^−5), we
identified an additional 24 independent genome-wide suggestive loci
associated with Aβ (Supplementary Data [100]4). We then conducted a
pathway enrichment analysis using the 28 lead SNPs and their
corresponding mapped genes and gene ontology (GO) database (biological
process 2023)^[101]14. This analysis revealed 10 significant GO
pathways that met the Bonferroni correction threshold (p < 0.05/619)
and 55 pathways that passed the false discovery rate (FDR) < 0.05 using
Benjamini–Hochberg correction (Table [102]2 and Supplementary
Data [103]5). Notably, pathways associated with Aβ formation and
amyloid precursor protein (APP) catabolism were among the most
significant, alongside those implicated in tau protein kinase activity
and neuronal death.
Table 2.
Pathway enrichment analysis of candidate genes identified in
cross-ancestry meta-analysis for Aβ deposition
GO term (biological process 2023) Odds ratio Bonferroni-corrected
p value
Regulation of Amyloid-Beta Formation (GO:1902003) 201.5 2.83E−09
Negative Regulation of Amyloid Precursor Protein Catabolic Process
(GO:1902992) 333.8 1.93E−08
Negative Regulation of Amyloid-Beta Formation (GO:1902430) 277.2
3.42E−06
Regulation of Aspartic-Type Endopeptidase Activity Involved in Amyloid
Precursor Protein Catabolic Process (GO:1902959) 479.2 8.48E−05
Regulation of Amyloid Precursor Protein Catabolic Process (GO:1902991)
299.5 2.49E−04
Negative Regulation of Amide Metabolic Process (GO:0034249) 37.2
5.10E−03
Cholesterol Transport (GO:0030301) 49.8 3.03E−02
High-Density Lipoprotein Particle Assembly (GO:0034380) 219.4 4.19E−02
Regulation of Tau-Protein Kinase Activity (GO:1902947) 219.4 4.19E−02
Regulation of Neuron Death (GO:1901214) 43.5 4.46E−02
[104]Open in a new tab
Pathway enrichment analysis of candidate genes was performed using the
gene ontology (biological process 2023) database. The analysis was
based on the Fisher’s exact test. Reported p values are two-sided and
were corrected for multiple testing using the Bonferroni correction (p
value < 0.05/619).
GO gene ontology.
SORL1 signals in AD case-control GWAS
To investigate whether the Aβ-associated SORL1/rs76490923 variant is
also associated with AD risk, we performed a case-control association
analysis for the SORL1/rs76490923 ± 2 Mb region using individual-level
data from 4042 EAS participants across stages 1 and 2. After excluding
individuals with mild cognitive impairment (MCI), 2043 participants
were eligible for the association test, comprising 961 AD cases and
1082 cognitive unimpaired (CU) controls. The analysis revealed a
significant protective association between the SORL1/rs76490923 variant
and AD risk (odds ratio [OR] = 0.7657, 95% confidence interval
[CI] = 0.6119–0.9196, p = 6.79 × 10^−4) (Supplementary Fig. [105]6).
We also examined the association between SORL1/rs76490923 and clinical
AD risk in previously established large-scale EUR AD or AD proxy GWASs
with sample sizes ranging from 63,920 to 788,989
individuals^[106]2,[107]3,[108]15,[109]16. Although the significance
levels varied slightly, the SORL1/rs76490923 variant consistently
showed a protective association against AD in the EUR population.
Additionally, we identified rs11218343, an LD proxy variant for
SORL1/rs76490923 (r² = 0.62), which exhibited a strong association with
the EUR-based AD GWAS. rs11218343 (MAF = 0.039 in EUR) was a more
frequent variant than rs76490923 (MAF = 0.021 in EUR) (Supplementary
Fig. [110]7).
Joint association of SORL1 and APOE4 with Aβ deposition
We performed a joint association analysis to investigate the
complementary effects of the identified SORL1 signal and APOE4 on Aβ
deposition risk using the 4042 participants (combined stage 1 and 2
datasets). This analysis involved SORL1/rs76490923 genotype dosage (0
[CC], 1 [TC or CT], or 2 [TT]) and APOE4 status (carrier or
non-carrier) (Fig. [111]3).
Fig. 3. Forest plots of risk of Aβ deposition according to the APOE4 and
SORL1 status.
[112]Fig. 3
[113]Open in a new tab
a Aβ deposition risk between APOE4 carriers and non-carriers. b Aβ
deposition risk according to APOE4 status and SORL1/rs76490923
genotype. c Aβ deposition risk according to SORL1/rs76490923 genotype
stratified by APOE4 status. The joint association analyses were
conducted on 4042 study samples, combining data from stage 1 and stage
2 datasets. The number of Aβ-positive cases represents participants
classified as Aβ-positive according to the CERAD neuropathological
category. Boxes represent the adjusted odds ratio, with horizontal
lines around the boxes indicating 95% confidence intervals. The odds
ratios were estimated using logistic regression models, adjusted for
age, sex, cohort type, and the first five principal components of
ancestry. p values were determined using a Wald test. p values for the
trend were calculated using the Cochran-Armitage trend test. All p
values were derived from two-sided tests. Exact statistical values are
provided as a Source Data file. OR odds ratio, CI confidence interval,
CERAD Consortium to Establish a Registry for Alzheimer’s disease.
We observed a gradual decremental association between increasing
rs76490923 T allele dosage and a decreased risk of Aβ deposition.
Non-carriers for APOE4 with rs76490923 alternative homozygote (TT) had
the lowest risk for Aβ deposition (OR = 0.092, 95% CI = 0.059–0.143,
p = 1.80 × 10^−26) (Fig. [114]3b). In stratified analyses by APOE4
status, rs76490923 was associated with up to a 55.6% reduction in the
risk of Aβ deposition among APOE4 carriers (OR = 0.444, 95%
CI = 0.256–0.770, p = 3.85 × 10^−3) and up to a 43.5% reduction among
non-carriers (OR = 0.565, 95% CI = 0.370–0.862, p = 8.06 × 10^−3)
(Fig. [115]3c). All trend analyses showed significant (p < 0.001), but
no significant interactions between SORL1 and APOE4 genotypes were
identified (p = 0.391).
Brain single-nucleus RNA sequencing eQTL and colocalization analyses
To further investigate the link between SORL1 expression levels in the
brain and the identified GWAS signals, we conducted an expression
quantitative trait loci (eQTL) analysis using brain snRNA-seq
expression data from Columbia ROSMAP^[116]17. Our results revealed that
the lead SNP (rs76490923) was significantly associated with SORL1
expression levels in microglia (beta = 0.393, SE = 0.087,
p = 8.81 × 10^−6) among seven cell types in the dorsolateral prefrontal
cortex (Table [117]3 and Supplementary Fig. [118]8). Furthermore, we
found that GWAS signals near the SORL1 gene also overlapped with the
microglia-specific eQTL signals of SORL1 (posterior probability
of H4 = 0.56).
Table 3.
Brain single-nucleus RNA sequencing eQTL analysis of the SORL1 gene
Variant ID (non-effect allele; effect allele) Gene Cell type Beta
coefficient Standard error p value Frequency
rs76490923 (C;T) SORL1 Microglia 0.393 0.087 8.81E−06 0.021
Oligodendrocytes −0.171 0.154 0.269 0.021
Inhibitory neurons 0.196 0.158 0.213 0.021
Oligodendrocyte precursor cells 0.068 0.158 0.666 0.021
Endothelial cells 0.186 0.289 0.520 0.016
Astrocytes 0.021 0.119 0.863 0.021
Excitatory neurons 0.099 0.078 0.205 0.021
[119]Open in a new tab
The eQTL analysis of ROSMAP data used tensorQTL to test associations
between genotypes and gene expression matrix, with reported p values
derived from the permutation test. Reported p values are two-sided. The
frequency represents the minor allele frequency for individuals by each
cell type.
eQTL expression quantitative trait loci.
Brain single-nucleus RNA sequencing data and DEG analysis
We additionally investigated whether the SORL1 gene was differentially
expressed according to Aβ positivity using cell-type specific
differentially expressed genes (DEG) in the brain using the snRNA-seq
data derived from 15 EAS individuals. The expression levels of SORL1
(log fold change = −2.94, SE = 0.121, FDR corrected p = 0.017) were
lower in Aβ-positive patients in microglia cell type than in controls
(Fig. [120]4 and Table [121]4).
Fig. 4. Brain cell-type specific expression of the SORL1 from Korean
single-nucleus RNA-seq.
[122]Fig. 4
[123]Open in a new tab
a UMAP colored by major brain single-cell types. b UMAP plot colored by
Aβ positivity across single-cell types. The red represents Aβ-positive
and blue represents Aβ-negative. c SORL1 gene expression across
single-cell types colored by z-scored expression levels. The red
indicates higher gene expression, and the blue indicates lower
expression. d Box plots for differential gene expression of SORL1 by Aβ
positivity in single-cell types. Aβ positivity was defined based on the
CERAD neuropathological category. Differential gene expression analysis
was performed using MAST (v1.16.0) and brain cell-type-specific gene
expression data from 15 Korean individuals (nine Aβ-positive and six
Aβ-negative individuals). The asterisk indicates a significant result
obtained from the likelihood ratio test that meets the threshold of a
false discovery rate-corrected p value < 0.05, calculated using the
Benjamini–Hochberg correction from two-sided tests. This box plot
illustrates the differential expression of SORL1 across major brain
cell types, grouped by Aβ positivity (positive in red and negative in
blue). On the box plots, the horizontal line indicates the median, the
box indicates the first to third quartile of expression and whiskers
indicate 1.5 × the interquartile range. The x-axis represents various
cell types, and the y-axis represents SORL1 log normalization
expression. Source data are provided as a Source Data file. UMAP
uniform manifold approximation, and projection, UMI unique molecular
identifiers, MG microglia, OD oligodendrocytes, IN inhibitory neurons,
OPC oligodendrocyte precursor cells, End endothelial cells, Ast
astrocytes, Ext excitatory neurons, ns non-significant, CERAD
Consortium to Establish a Registry for Alzheimer’s disease.
Table 4.
Brain single-nucleus RNA sequencing DEG analysis for Aβ positivity of
the SORL1 gene (n = 15)
Gene Cell type Log fold-change p value FDR-corrected
p value
SORL1 Microglia −2.94 8.36E−04 0.017
Oligodendrocytes −0.23 0.021 0.125
Inhibitory neurons 0.13 0.199 0.548
Oligodendrocyte precursor cells 0.67 0.004 0.088
Endothelial cells 0.68 0.140 0.559
Astrocytes −0.30 0.151 0.369
Excitatory neurons 0.11 0.918 0.958
[124]Open in a new tab
Differential gene expression analysis was performed with Korean
single-nucleus RNA sequencing using MAST (v1.16.0), with reported p
values derived from the likelihood ratio test. False discovery rate
corrected p values were calculated using the Benjamini–Hochberg
correction. Reported p values are two-sided.
DEG differential gene expression, FDR false discovery rate.
Discussion
We performed a cross-ancestry GWAS using our Korean genotype-phenotype
cohort to identify genetic variants associated with amyloid pathology.
By including a substantial proportion of EAS samples with amyloid PET
data, we identified four loci with genome-wide significant association
and 24 additional suggestive loci. The SNPs identified in our analysis
were enriched in pathways related to Aβ formation and APP catabolism,
as well as pathways implicated in tau protein kinase activity and
neuronal death—key processes in AD pathogenesis. Ancestral diversity
and precise phenotype enabled us to identify SORL1 as the most
significant locus besides the well-known APOE gene. Our joint
association analysis using the SORL1 lead variant and APOE4 allele
showed that the risk of Aβ positivity was significantly diminished by
carrying the SORL1 variant.
Clinical diagnosis of AD is made based on clinical and
neuropsychological examination and the absence of other diseases on
structural brain imaging. However, because it takes more than 15 years
from the deposition of Aβ in the brain to the manifestation of
cognitive impairment^[125]18, many individuals with AD pathophysiology
can be included in the control group. Also, individuals with the same
amount of pathology can either be symptomatic or asymptomatic depending
on their cognitive reserve^[126]19. Furthermore, AD can present with
atypical symptoms and non-AD neurodegenerative diseases can mimic
AD^[127]20, making accurate clinical diagnosis challenging. Therefore,
GWAS using clinical diagnosis labels can be limited and biased by the
heterogeneity of study participants in both case and control groups. As
a promising alternative, genetic studies using phenotypes based on
imaging biomarkers are needed.
Previous GWAS meta-analysis using amyloid PET reported four genome-wide
significant loci associated with cerebral Aβ deposition. Our GWAS of
the largest EAS individuals to date identified significant associations
at a previously unreported locus near the SORL1 gene. While this locus
was not identified in previous EUR amyloid PET GWAS, the lead SNP
showed an association with amyloid PET uptake at a nominal significance
with the same effect direction (beta = −0.1240, SE = 0.0404,
p = 2.20 × 10^−3). In line with this, the association level was further
strengthened in meta-analyses with EAS and EUR GWAS. This suggests that
variants in SORL1 may have shared pathophysiological effects between
EAS and EUR but their associations were not identified in the EUR GWAS
partly due to insufficient statistical power because of their low MAF.
We further investigated whether the identified SORL1/rs76490923 was
associated with AD risk. Our findings confirmed that the
SORL1/rs76490923 variant or its LD proxy variant may have a potential
protective effect against AD risk across populations. Despite
variations in allele frequency, this protective association was
consistently observed in both EAS and EUR populations. These results
suggested that the identified SORL1/rs76490923 variant plays an
important role in the pathogenic mechanism of AD following that of Aβ
and exemplifies the necessity of studying ancestry-specific variants
and the rationale for conducting genetic studies in diverse non-EUR
populations.
We confirmed that the SORL1/rs76490923 T allele was significantly
associated with the upregulation of SORL1 expression levels in
microglia, among seven cell types in the dorsolateral prefrontal cortex
using brain snRNA-seq. In addition, DEG analysis revealed that SORL1
expression levels were significantly lower in microglia from
Aβ-positive individuals than in those from controls. Our results
aligned with this previous knowledge in that the levels of SORL1
expression were decreased in Aβ-positive individuals^[128]21,[129]22.
Since SORL1 is implicated in the reduction of amyloidogenic processing
of APP^[130]23,[131]24, this upregulation is particularly significant.
Our snRNA-seq DEG analysis supports this, demonstrating that higher
SORL1 expression correlates with a reduced Aβ burden. Therefore, the
protective effect of the rs76490923 T allele on amyloid positivity may
be mediated through enhanced SORL1 expression in microglia, leading to
decreased Aβ accumulation. This underscores the critical role of
microglial SORL1 in modulating Aβ pathology and highlights the
potential of targeting microglial pathways to develop therapeutic
strategies for AD.
Our joint association analysis using the SORL1 lead variant
(rs76490923) and the APOE4 allele demonstrated that the risk of Aβ
positivity was reduced by up to 43.5% in APOE4 non-carriers and up to
55.6% in APOE4 carriers, according to the allelic dosage of the
rs76490923 T allele. In the current clinical setting, the routine use
of amyloid PET is constrained by financial considerations and
accessibility. Among the tests currently performed in clinical
settings, genotyping APOE4 is considered the most important for
predicting the risk of Aβ positivity. However, given that the APOE4
genotype alone still does not explain a substantial portion of
variability in Aβ positivity, the identification of additional genetic
risk factors may enable more precise risk evaluation. These genetic
risk factors may be ancestry-specific and may need to be applied in
real-life practice under consideration of ethnicity.
Although SORL1 has not been identified in previous amyloid PET GWAS,
variants in SORL1 have been identified in relation to familial and
sporadic AD. Prior studies showed that rare variants in SORL1 can be
the major cause of familial AD without the well-known APP, PSEN1, and
PSEN2 mutations^[132]25,[133]26. Also, a recent whole-exome sequencing
study showed that the gene-based burden of rare damaging variants in
SORL1 had significant impact on AD^[134]27. In terms of sporadic AD,
SORL1 has been identified in large-scale case-control
GWASs^[135]3,[136]16. In EAS GWASs, association signals at SORL1 tended
to be more pronounced despite relatively small sample
sizes^[137]28,[138]29. The SORL1 gene encodes a transmembrane protein
SORLA which is involved in the trafficking and recycling of endosomal
cargoes including APP, which is a key protein in the development of
cerebral amyloidosis^[139]30–[140]33. Specifically, depletion of SORL1
leads to longer residence of APP in the early and recycling endosomes,
which is a favorable environment for amyloidogenic cleavage of APP by
BACE1^[141]34.
There are several limitations in our study. First, while we used
temporal validation scheme using samples having difference in calendar
time from the discovery set, replication in an independent cohort is
needed to confirm our finding regarding the SORL1 variant. Second, the
ROSMAP data used in the snRNA-seq eQTL and colocalization analyses are
derived from only EUR while our main analysis was performed in EAS
data. This may have contributed to rather marginal results in our
colocalization analysis. Despite the discrepancy, the significant eQTL
result in microglia was in line with the differential expression
analysis using Korean autopsy samples. Lastly, the autopsy data used in
differential expression analysis is relatively small in sample size.
Nevertheless, SORL1 showed significant differential expression in
microglia despite small sample size even when using the more
conservative DEG analysis tool demonstrating the biological impact of
this gene.
In summary, by utilizing the EAS genotype-phenotype cohort with
quantified amyloid PET measurements, our cross-ancestry GWAS
meta-analysis identified a previously unreported association of SORL1
variants with Aβ and replicated known associations. SORL1 demonstrated
significant differential expression in microglia between Aβ-positive
and Aβ-negative individuals. Our findings suggest that SORL1 may have
potential value in the diagnosis and treatment of AD, particularly
considering the growing importance of early diagnosis and aggressive
treatment of amyloidosis.
Methods
Ethics
This study complied with all relevant ethical regulations for research
involving human participants and was conducted in accordance with the
criteria set by the Declaration of Helsinki. This study received
approval from the institutional review board (IRB) of Samsung Medical
Center, and written informed consent was provided by all participants.
WGS and genotyping using microarray were conducted using blood samples
obtained from the participants. The collection, storage, and analyses
of biospecimens, genetic data, and data as part of the K-ROAD were
approved under the Samsung Medical Center; IRB No. 2022-07-092. All
data were handled in accordance with relevant data protection and
privacy regulations.
Study population
A total of 4381 individuals of Korean descent with either WGS or GWAS
microarray chip data were recruited from the Korean dementia
hospital-based cohort named K-ROAD. As an open cohort with ongoing data
accumulation, K-ROAD aims to develop a genotype–phenotype cohort to
accelerate the development of advanced diagnostic and therapeutic
techniques for AD and related other dementias. Overall, 25
university-affiliated hospitals in South Korea participated in the
K-ROAD cohort. Eligible participants were individuals on the spectrum
of Alzheimer’s clinical syndrome—CU, MCI, and DAT—who underwent amyloid
PET imaging. We used genomic data generated up to 2022 for the
discovery analysis (stage 1) and data generated from 2023 for the
replication analysis (stage 2). Detailed characteristics of the
participants by stage are shown in Supplementary Data [142]1.
Phenotype definitions
All participants were assessed through clinical interviews,
neurological examinations, neuropsychological testing, and brain
magnetic resonance imaging (MRI). After these evaluations, clinical
diagnoses were established by consensus among a multidisciplinary team.
CU was defined to have (1) no medical or psychiatric history that is
likely to affect cognitive function (2) no objective cognitive
impairment observed after a comprehensive neuropsychological test on
any cognitive domain (above the −1.0 SD of age- and education-matched
norms in memory and below −1.5 SD in other cognitive domains). MCI was
diagnosed as having: (1) subjective cognitive complaints by the
participants or caregiver; (2) objective cognitive impairment in any
cognitive domain (below the −1.0 SD of age- and education-matched norms
in memory or below −1.5 SD in other cognitive domains); and (3) no
significant impairment in activities of daily living. The participants
with dementia of the Alzheimer type (DAT) met the NIA-AA diagnostic
criteria for AD dementia^[143]35.
All K-ROAD participants underwent Aβ PET with either 18F-florbetaben
(FBB) or 18F-flutemetamol (FMM). To quantify the Aβ burden upon PET
scans as Centiloids (CL), we followed the methods described in our
previous study^[144]36. In brief, T1-weighted MR images were
co-registered onto the MNI-152 template, followed by co-registration of
individual PET images to the corresponding MRI images. The PET and MRI
images were spatially normalized using the transformation parameters of
the SPM8 unified segmentation method of T1-weighted MRIs. A whole
cerebellum mask downloaded from the GAAIN website
([145]http://www.gaain.org/) was used as the reference region. The
common cortical target volume of interest (VOI) for FMM and FBB PET was
defined by subtracting the average image of 18 Aβ-negative old controls
from the average image of Aβ-positive DAT participants in a
head-to-head study cohort^[146]37. After calculating the individual
standardized uptake values of the common cortical target VOI in all PET
images, we converted the values into CL units using the equation from
the original paper by Klunk et al.^[147]38:
[MATH: CL=100×SUVRind−SUVRY
C(SUVRD
AT−S
UVRYC<
/msub>) :MATH]
Here, SUVR[ind] refers to the individual standardized uptake value
ratio (SUVR) values, whereas SUVR[YC] and SUVR[DAT] denote the group
mean SUVR values from Aβ-negative young controls and Aβ-positive DAT
patients, respectively.
All imaging analyses for the K-ROAD study were conducted at the
laboratory of Samsung Medical Center, which served as a core center.
While centiloid values were used as the main outcome variable, Aβ
positivity as determined by expert visual assessment was used in our
joint association analysis. Visual assessment was performed by
board-certified nuclear medicine physicians according to the guidelines
of each radiotracer manufacturer^[148]39,[149]40.
Genotyping and imputation
DNA used for genotyping K-ROAD participants was collected from whole
blood. The majority of samples were genotyped using the Illumina Asian
Screening Array (ASA) BeadChip (Illumina, CA, USA) (n = 2314).
Additionally, 496 participants were genotyped on the customized Korea
Biobank array (KBA) chip (Affymetrix, CA, USA).
We performed QC for both types of SNP data. SNPs were removed using the
following criteria: (1) call rate of <98%, (2) MAF of <1%, or (3)
genotype frequencies significantly deviating from the Hardy–Weinberg
equilibrium (HWE) with a p value of <10^−6. After QC, the genotype data
were imputed to estimate genotypes for variants that were not directly
genotyped and to combine datasets of different genotyping arrays (ASA
chip and KBA chip). Imputation was conducted using the Minimac4
software with all available reference haplotypes from the Trans-Omics
for Precision Medicine (TOPMed) at the Michigan Imputation Server.
Consequently, we performed post-imputation QC with (1) an MAF of <1% or
(2) a low imputation quality (R2 < 0.8 for imputed SNPs). Genome
annotations were generated using the GRCh38 assembly.
Based on the genotype data, participants were excluded in accordance
with the following criteria: (1) call rate of <95%, (2) sex mismatch,
(3) heterozygosity excess (±5 SDs from the mean), or (4) one of the
related pairs of individuals with second-degree or closer relationships
estimated using the KING software^[150]41. After QC, 2049 samples
remained available for genetic association analyses in stage 1, and 526
samples remained in stage 2.
WGS data alignment and variant calling
Genomic DNA was extracted from blood samples using the QIAmp DNA Mini
Kit (QIAGEN). For sequencing, library preparation was performed with
the TruSeq® DNA PCR-Free Library Prep Kit (Illumina) with size
selection DNA performed with Covaris ultrasonication using 1 µg of
input DNA for an average insert size of 350 bp and sequencing was
performed at an average depth of 30X with paired-end sequencing using
NovaSeq™ 6000 instrument with an S4 flow cell.
The paired-end raw sequencing data were initially processed by quality
trimming, adapter trimming, removing short sequences, and hard-trimming
using trim galore software (RRID:SCR_011847)
([151]https://github.com/FelixKrueger/TrimGalore). Subsequently, the
sequenced reads were aligned to the hg38 reference genome using BWA-MEM
software^[152]42,[153]43. After alignment, duplicates were removed
using GATK MarkDuplicate^[154]44–[155]46. Base quality score
recalibration was conducted using BaseRecalibrator with WGS interval
contig, insertions and deletions (Indels) from Mills and 1000 G gold
standard, known Indels from Homo sapiens assembly^[156]41, and high
confidence SNPs from 1000 G phase1. Next, germline SNPs and Indels
calling were performed using HaplotypeCaller, and base quality score
recalibration was conducted using ApplyBQSR. GenotypeGVCFs was used to
produce gVCFs, and a DB folder was created using genomicsdbimport.
Finally, gVCFs from the DB folder were combined using CombineGVCFs.
WGS quality control
QC of variants and samples was mainly performed by hail
([157]https://github.com/hail-is/hail). Firstly, we conducted
pre-filtering and genotype QC. Pre-filtering included splitting
multi-allelic variants, Variant Quality Score Recalibration (VQSR)
filtering, allele count greater than 0, removing low complexity regions
(LCRs), and removing a duplicated sample (n = 1). Genotype QC was
performed using criteria including genotype quality (GQ) (GQ ≥ 20),
allele balance (AB) (hetero-variants AB ≥ 0.2 and ≤0.8, homo-variants
AB ≥ 0.9), and read depth (DP) (autosomal DP ≥ 10 and ≤200, chrX DP
(female) ≥10 and ≤200, chrX DP (male) ≥5 and ≤200, chrY DP ≥ 5 and
≤200).
For sample QC, we used high-confidence variants based on the criteria
including bi-allelic variants, high call rates (>0.95), and common
single-nucleotide variants (allele frequency >0.1%). We excluded
samples with low coverage (mean depth ≥15) and low sample-level call
rate (missingness ≥0.9). Samples with unmatched sex (f stat for female
<0.2, f stat for male >0.8) or ambiguous sex (f stat >0.3 and <0.8)
were excluded. For the next steps, we applied different variant QC
criteria including only autosomal variants, bi-allelic variants, high
call rate (>0.95), and allele frequency (>5%). Relatedness was
calculated using KING^[158]39, and samples up to the second degree were
excluded, keeping only one sample. The sample removal process followed
a specific prioritization order. Samples with a diagnosis of AD were
prioritized, followed by those of older age. Additionally, samples from
earlier sequencing batches were prioritized for inclusion. After
removing related samples, principal component analysis (PCA) was
performed using PLINK 2.0^[159]47.
After sample QC, we removed samples that failed sample QC from the raw
VCF file. We repeated pre-filtering and genotype QC, including VQSR,
LCRs, allele balance, GQ, and DP. We excluded variants with excess
heterozygosity (inbreeding coefficient < −0.3), high missing rate (call
rate <0.9), and high HWE of control samples (HWE >1e−09). We divided
variants into SNPs and Indels, and QC procedures were conducted
separately. SNPs were filtered based on QD ≥ 2, SOR ≤ 3, FS ≤ 60,
MQ ≥ 50, MQRankSum ≥ −12.5, and ReadPosRankSum ≥ −8.0. Indels were
filtered based on QD ≥ 2, FS ≤ 200, ReadPosRankSum ≥ −20, MQ ≥ 50,
MQRankSum ≥ −12.5.
In the final step, after merging SNP and Indel variants, we conducted
the final sample QC. Samples exceeding 5 SDs from criteria, such as the
number of SNPs, Indels, transition/transversion (Ti/Tv),
hetero/homo-variants, and the ratio of Indels, hetero/homo-variants,
and Ti/Tv, were excluded. After QC, 1338 samples remained available for
genetic association analyses in stage 1, and 129 samples remained in
stage 2.
Identification of the genetic ancestry group
To confirm the ancestral background of our study sample, we conducted
PCA for the 1000 Genomes Project samples and projected 4042
participants (K-ROAD) from stages 1 (n = 3387) and 2 (n = 655) onto the
PCA plot to check the ancestral distribution.
Genome-wide association study and meta-analysis
We conducted GWASs for Aβ using a linear regression model with age,
sex, and the first five PCs of ancestry as covariates. To account for
the non-normal distribution of each amyloid PET level and to ensure the
robustness of our results, the rank-based inverse normal transformation
was applied to each amyloid PET level. An inverse variance-weighted
fixed-effects meta-analysis of each EAS and meta-GWAS result was
performed using METAL^[160]13.
Cis-eQTL mapping and colocalization
To test for cis-eQTLs, we used the tensorQTL v.1.0.2 cis_nominal mode
with genotypes and gene expression matrix with cis-eQTL mapping
software ([161]https://github.com/RajLabMSSM/QTL-mapping-pipeline). We
filtered out the individuals under 10 nuclei by each major cell type.
Pseudo-bulk gene expression matrices were averaged in all counts for
each gene in each individual by each cell type. As input covariates to
analysis, we included PEER^[162]48 factors from 30 to 70 by each brain
cell type and the first four PCA of genotypes. Each SNP–gene pair used
a 1-Mb window within the transcription start site of a gene. We used
the tensorQTL cis permutation pass per gene with 1000 permutations. To
test whether SNPs from the GWAS colocalized with bulk RNA-seq or
single-nucleus RNA-seq expression QTLs^[163]14,[164]49, we used the
COLOC package (v3.2-1). We extracted a significant genome-wide locus
within 1 Mb on either side of the lead SNP (2 Mb-wide region total) in
the GWAS. In each QTL dataset, we filtered all SNPs by each gene
matched with a significant genome-wide locus within 100 kb to test
colocalization. Missing MAF was used by reference values from the 1000
Genomes (phase 3) EUR superpopulations. Matching sets of SNPs
colocalized by comparing the p value distributions.
Korean single-nucleus RNA sequencing
Korean single-cell transcriptomics data were generated using the
dorsolateral prefrontal cortex from 15 individuals who underwent
autopsy at Samsung Medical Center. All collected tissue samples were
frozen and stored at −80 °C. The tissue pieces were homogenized and
sorted using a FACSMelody cell sorter (BD) via flow cytometry
(Supplementary Fig. [165]9). The sorted nuclei were used in the library
construction following Chromium 3’ v3.1 protocol (10x Genomics,
Cat-No.100268). The quality of library was confirmed using Agilent
TapeStation (Agilent) and Qubit fluorometer (Thermo Fisher Scientific).
Sequencing was performed on a NovaSeq6000 platform (Illumina) under
conditions specified in the 10x Genomics protocol. Of the individuals,
9 were Aβ-positive and 6 were Aβ-negative by Consortium to Establish a
Registry for Alzheimer’s disease (CERAD) neuropathological category
(none to sparse: negative, moderate to frequent: positive). To obtain
the gene count, we used the Cell Ranger software^[166]50 (v.6.1.2) (10x
Genomics) with the GRCh38 assembly. The Cell Ranger count pipeline,
including the pre-mRNA, was processed to account for unspliced nuclear
transcripts. The gene-count matrix of all libraries was generated by
the Cell Ranger aggr pipeline with Cell Ranger 3.0 default parameters
to call cell barcodes.
To process snRNA-seq, we used SCANPY (v1.9.8)^[167]51. First, we
excluded the outlier nuclei (the range [Q1 − 3(Q3 − Q1),
Q3 + 3(Q3 − Q1)], with Q1 as the lower quartiles and Q3 as the upper
quartiles) in terms of the number of genes, total counts, and
percentage of mitochondrial genes. Second, we removed the doublets
labeled cells using Scrublet (v.0.2.3)^[168]52. After filtering out the
11,780 nuclei, 88,622 nuclei were retained. The integration method to
remove single-nucleus platform, dataset-specific batch effects was
performed by Harmony^[169]53, which is the external function of SCANPY
using individual and batches with normalized gene expression. After
that, to annotate major brain cell types based on published snRNA-seq
data, we projected annotations of brain cell types defined by the Allen
Brain Institute
([170]https://portal.brain-map.org/atlases-and-data/rnaseq/human-multip
le-cortical-areas-smart-seq) onto this study.
Differential gene expression analysis
Differential gene expression of genes comparing Aβ positivity by CERAD
neuropathological category was done using the MAST algorithm
(v.1.16.0)^[171]48, which implements a two-part hurdle model. The genes
with expression in greater than 10% of nuclei were required to be
considered for differential gene expression analysis by each cell type.
The fixed-effect covariates included batch, age, sex, and status (Aβ
positivity) on each cell type. The individuals are a random-effect
covariate in the model for the interindividual correlation of
expression data. We tested differential expression analysis across all
major cell types to estimate the effect of disease on expression. The
significant differential express genes filtered by FDR < 0.05 using
Benjamini–Hochberg correction. Using the discrete coefficient of MAST,
we defined the coefficient estimate of the effect of Aβ positivity on
expression.
Reporting summary
Further information on research design is available in the [172]Nature
Portfolio Reporting Summary linked to this article.
Supplementary information
[173]Supplementary Information^ (1.9MB, pdf)
[174]41467_2025_57751_MOESM2_ESM.pdf^ (95.6KB, pdf)
Description of Additional Supplementary Files
[175]Supplementary Dataset 1^ (13.3KB, xlsx)
[176]Supplementary Dataset 2^ (14.5KB, xlsx)
[177]Supplementary Dataset 3^ (14.4KB, xlsx)
[178]Supplementary Dataset 4^ (17.6KB, xlsx)
[179]Supplementary Dataset 5^ (16.7KB, xlsx)
[180]Reporting Summary^ (2.8MB, pdf)
[181]Transparent Peer Review file^ (3.1MB, pdf)
Source data
[182]Source Data^ (209.2KB, xlsx)
Acknowledgements