Abstract
Background
New genetic and genomic resources have identified multiple genetic risk
factors for late-onset Alzheimer’s disease (LOAD) and characterized
this common dementia at the molecular level. Experimental studies in
model organisms can validate these associations and elucidate the links
between specific genetic factors and transcriptomic signatures. Animal
models based on LOAD-associated genes can potentially connect common
genetic variation with LOAD transcriptomes, thereby providing novel
insights into basic biological mechanisms underlying the disease.
Methods
We performed RNA-Seq on whole brain samples from a panel of
six-month-old female mice, each carrying one of the following
mutations: homozygous deletions of Apoe and Clu; hemizygous deletions
of Bin1 and Cd2ap; and a transgenic APOEε4. Similar data from a
transgenic APP/PS1 model was included for comparison to early-onset
variant effects. Weighted gene co-expression network analysis (WGCNA)
was used to identify modules of correlated genes and each module was
tested for differential expression by strain. We then compared mouse
modules with human postmortem brain modules from the Accelerating
Medicine’s Partnership for AD (AMP-AD) to determine the LOAD-related
processes affected by each genetic risk factor.
Results
Mouse modules were significantly enriched in multiple AD-related
processes, including immune response, inflammation, lipid processing,
endocytosis, and synaptic cell function. WGCNA modules were
significantly associated with Apoe^−/−, APOEε4, Clu^−/−, and APP/PS1
mouse models. Apoe^−/−, GFAP-driven APOEε4, and APP/PS1 driven modules
overlapped with AMP-AD inflammation and microglial modules; Clu^−/−
driven modules overlapped with synaptic modules; and APP/PS1 modules
separately overlapped with lipid-processing and metabolism modules.
Conclusions
This study of genetic mouse models provides a basis to dissect the role
of AD risk genes in relevant AD pathologies. We determined that
different genetic perturbations affect different molecular mechanisms
comprising AD, and mapped specific effects to each risk gene. Our
approach provides a platform for further exploration into the causes
and progression of AD by assessing animal models at different ages
and/or with different combinations of LOAD risk variants.
Keywords: Alzheimer’s disease, Animal models, Transcriptomic analysis
Background
Alzheimer’s disease (AD) is the most common adult neurodegenerative
disorder and accounts for around 60–80% of all dementia cases [[35]1].
Neuropathologically, Alzheimer’s disease is generally characterized by
the presence of extracellular amyloid plaques composed of amyloid-β
(Aβ) surrounded by dystrophic neurites, neurofibrillary tangles (NFTs),
and neuronal loss [[36]2, [37]3]. Clinically, AD is classified into two
subtypes: early onset with Mendelian inheritance, and late onset (or
sporadic) AD [[38]1, [39]4]. Early-onset Alzheimer’s disease (EOAD)
strikes prior to the age of 65 and accounts for approximately 5% of all
AD cases, while the much more common late-onset Alzheimer’s disease
(LOAD) is diagnosed at later life stages (> 65 years) [[40]2, [41]5].
In comparison to rare casual variants in three genes: amyloid precursor
protein (APP), presenilin 1 (PSEN1), and presenilin 2 (PSEN2) that
contribute to EOAD [[42]1, [43]6, [44]7], the genetics factors
influencing LOAD are complex due to the interplay of genetic and
environmental factors that influence disease onset, progression and
severity [[45]8, [46]9]. Before the era of large-scale genome wide
association studies, the e4 allele of the apolipoprotein E (APOE) gene
was the only well-established major risk factor for LOAD, accounting
for about 30% of genetic variance [[47]10, [48]11]. APOEε4 was inferred
to have moderate penetrance [[49]11] with homozygous carriers having a
roughly five-times-increased risk compared to those who inherit only
one e4 allele of APOE [[50]1, [51]12].
Identification of new AD-related genes is important for better
understanding of the molecular mechanisms leading to neurodegeneration
[[52]7]. Genome-wide association studies (GWAS) have identified dozens
of additional genetic risk loci for LOAD, with candidate genes
including clusterin (CLU), bridging integrator 1 (BIN1), and CD2
associated protein (CD2AP) [[53]1, [54]2, [55]7, [56]13]. These novel
risk genes cluster in functional classes suggesting prominent roles in
lipid processing, the immune system, and synaptic cell function such as
endocytosis [[57]1, [58]14]. Although these risk variants are often of
small effect size, investigation of their functionality can reveal the
biological basis of LOAD [[59]1].
Despite recent advances in genetic and genomic resources to identify
genetic risk factors, the disease mechanisms behind LOAD remain opaque.
Most transgenic animal models are based on rare, early-onset AD genes
which do not reflect the complete neuropathology or transcriptomic
signatures of LOAD [[60]15]. Although these transgenic mouse models
were helpful to understand early molecular changes underlying Aβ and
tau pathology, the corresponding genetic factors only account for a
small fraction of AD. Thus, animal models based on LOAD-associated
genes are necessary to connect common genetic variation with LOAD
transcriptomes.
To better understand the molecular mechanism underlying LOAD, we
performed transcriptome profiling and analyses from brain hemispheres
of 6 month old female mice carrying mutations in LOAD-relevant genes
Apoe, Clu, Bin1, and Cd2ap. Weighted gene co-expression network
analysis identified several mouse modules significantly driven by
Apoe^−/− and Clu^−/− mouse strains. Moreover, we have compared mouse
modules with human postmortem brain modules from the Accelerating
Medicine’s Partnership for AD (AMP-AD) to determine the AD relevance of
risk genes. We observed enrichment of multiple AD-related pathways in
these modules such as immune system, lipid metabolism, and neuronal
system. This study of LOAD-relevant mice provides a basis to dissect
the role of AD risk genes in AD pathologies.
Methods
Mouse strains and data generation
All mouse strains were obtained from The Jackson Laboratory and
maintained in 12/12-h light/dark cycle (Table [61]1). All experiments
were approved by the Animal Care and Use Committee at The Jackson
Laboratory. RNA-Seq data were obtained from whole left hemisphere brain
samples from a panel of six-month-old female mice carrying one of the
following mutations in LOAD associated genes: homozygous deletion in
Apoe and Clu; heterozygous deletion in Cd2ap and Bin1; and a transgenic
APOEε4 driven by a GFAP promoter on a Apoe^−/− background (herein
referred to as Apoe^−/−, Clu^−/−, Cd2ap^+/−, Bin1^+/− and APOEε4)
(Table [62]1, [[63]16–[64]21]). There were six biological replicates
for each late-onset model and control B6 mice. To minimize gene
expression variation between mice, all mice in experimental cohorts
were bred in the same mouse room and were aged together (to the extent
possible). Cohorts were generated either by intercrossing heterozygous
mice or in the case of Bin1^+/− and Cd2ap^+/− by crossing heterozygous
mice to C57BL/6 J (B6) mice, as homozygosity in these two genes is
lethal. Data were also included from five whole left hemisphere brain
samples from 6-month-old female mice from an early-onset AD model
(APP/PS1, Table [65]1) [[66]22] as well as seven additional B6 control
replicates to account for batch effects.
Table 1.
Study population. Whole-brain left hemispheres were collected at 6
months of age from female mice
Model Genetic Construct JAX Strain name JAX Stock #
B6 none; control C57BL/6 J 000664
APOEε4 Homozygous transgene of human APOEε4 allele with GFAP promoter
B6.Cg-Apoe^tm1Un Cdh18^Tg(GFAP-APOE_i4)1Hol/J 004631
Apoe^−/− Homozygous gene knockout B6.129P2-Apoe^tm1Unc/J 002052
Clu^−/− Homozygous gene knockout B6.Cg-Clu^tm1Jakh/J 005642
Bin1^+/− Heterozygous gene knockout B6.129S6-Bin1^tm2Gcp/J 021145
Cd2ap^+/− Heterozygous gene knockout B6.129X1-Cd2ap^tm1Shaw/J 008907
APP/PS1 Homozygous transgenic B6.Cg-Tg (APPswe,PSEN1dE9) 85Dbo/Mmjax
MMRC stock # 34,832-JAX
[67]Open in a new tab
For sample collection, mice were anesthetized with a lethal dose of
ketamine/xylazine, transcardially perfused with 1X phosphate buffered
saline (PBS), brains carefully dissected and hemisected in the
midsagittal plane. The left hemisphere was snap frozen. RNA extraction
was performed using TRIzol (Invitrogen, cat #: 15596026) according to
manufacturer’s instructions. Total RNA was purified from the aqueous
layer using the QIAGEN miRNeasy mini extraction kit (QIAGEN) according
to the manufacturer’s instructions. RNA quality was assessed with the
Bioanalyzer 2100 (Agilent Technologies). Poly(A) selected RNA-Seq
sequencing libraries were generated using the TruSeq RNA Sample
preparation kit v2 (Illumina) and quantified using qPCR (Kapa
Biosystems). Using Truseq V4 SBS chemistry, all libraries were
processed for 125 base pair (bp) paired-end sequencing on the Illumina
HiSeq 2000 platform according to the manufacturer’s instructions.
Quality control of RNA-Seq data
Sequence quality of reads was assessed using FastQC (v0.11.3,
Babraham). Low-quality bases were trimmed from sequencing reads using
Trimmomatic (v0.33) [[68]23]. After trimming, reads of length longer
than 36 bases were retained. The average quality score was greater than
30 at each base position and sequencing depth were in range of 35–40
million reads.
Read alignments and gene expression
All RNA-Seq samples were mapped to the mouse genome (assembly 38) using
ultrafast RNA-Seq aligner STAR (v2.5.3) [[69]24]. First, a STAR index
was built from mm10 reference sequence (Ensembl Genome Reference
Consortium, build 38) for alignment, then STAR aligner output
coordinate-sorted BAM files for each sample was mapped to mouse genome
using this index. Gene expression was quantified in two ways, to enable
multiple analytical methods: transcripts per million (TPM) using RSEM
(v1.2.31) [[70]25], and raw read counts using HTSeq-count (v0.8.0)
[[71]26].
Differential expression analysis
Differential expression in mouse models was assessed using Bioconductor
package DESeq2 (v1.16.1) [[72]27].. DESeq2 take raw read counts
obtained from HTSeq-count as input and has its own normalization
approach. The significance of differential expression was determined by
the Benjamini-Hochberg corrected p-values. The threshold for
significance was set to an adjusted p = 0.05. We included batch as a
covariate in DESeq2 analysis to account for batch effect.
Principal component analysis and batch correction
We analyzed 48 RNA-Seq samples originating from three experimental
batches: 1) all late-onset genetic models (N = 36); 2) one biological
replicate of the APP/PS1 strain with seven biological replicates of B6
control mice (N = 8); and 3) four additional biological replicates of
APP/PS1 (N = 4). First, we filtered out genes with TPM less than 10 for
more than 90% of samples and then log-transformed to log2(TPM + 1) for
downstream analysis. We then used the plotPCA function of Bioconductor
package EDASeq [[73]28] to observe the differences in distribution of
samples due to batch effects. Finally, we implemented COMBAT [[74]29]
on above RNA-Seq datasets to remove known batch effects.
Network construction and mouse module detection
Modules (clusters) of correlated genes were identified using Weighted
gene co-expression network analysis (WGCNA) implemented in R [[75]30].
We used the step-by-step construction approach for network construction
and module identification, which allows customization and alternate
methods. The default unsigned network type was used, and a soft
thresholding power of 8 was chosen to meet the scale-free topology
criterion in the pickSoftThreshold function [[76]31]. For module
identification, WGCNA uses a topological overlap measure to compute
network interconnectedness in conjunction with average linkage
hierarchical clustering method. Modules correspond to branches of
resulting clustering and are identified by cutting branches using
dynamic tree cutting. To avoid small modules and ensure separation, we
set the minimum module size to 30 genes and the minimum height for
merging modules to 0.25. Each module is represented by the module
eigengene (ME), defined as first principal component of the gene
expression profiles of each module. Further, we have carried out
one-way ANOVA (R function: aov) tests to determine differential
expression between strains for each module eigengene. Modules with
significant (p < 0.05) strain differences were analyzed for
contributing strains using Tukey HSD (Tukey Honest Significant
Differences, R function: TukeyHSD) for multiple pairwise-comparison
between group means. The reported p-values were adjusted for multiple
comparisons with Benjamini-Hochberg false discovery rate.
Functional enrichment analysis
Functional annotations and enrichment analysis were performed using the
R package clusterProfiler [[77]32]. Gene Ontology terms and KEGG
pathways enrichment analysis were performed using functions enrichGO
and enrichKEGG, respectively, from the clusterProfiler package. The
function compareCluster from this package was used to compare enriched
functional categories of each gene module. The significance threshold
for all enrichment analyses was set to 0.05 using Benjamini-Hochberg
adjusted p-values.
Calculation and significance of Jaccard indices
Jaccard indices were computed to find overlap strengths between mouse
modules and AMP-AD human modules. The Jaccard index is measure of
similarity between sample sets and defined as ratio of size of the
intersection to the size of the union of two sample sets. Further, to
test the significance of the Jaccard index for each pair of mouse-human
module overlap, we performed permutation analysis by random sampling
the equivalent number of genes in each mouse module from the union of
all genes in the mouse modules. This was performed 10,000 times to
generate null distributions of Jaccard index values. Cumulative
p-values were then calculated empirically.
Mouse-human orthologous genes
Mouse-human orthologous genes were identified using the genomic
information on orthologous groups from the latest ENSEMBL build for the
human genome version GRCh38. All orthologous gene relationships were
retrieved from BioMart based on the Ensembl Compara Gene Tree
comparison with the latest mouse genome build ([78]biomart.org).
Phylogenetic gene trees represent the evolutionary history of distinct
gene families, which evolved from a common ancestor. Reconciliation of
these gene trees against the mouse genome was used to distinguish
duplication and speciation events across species, thus inferring
distinct orthologue and paralogue gene pairs based on the method
inferred by Cunningham et al. [[79]33].
Transcription factor analyses
Transcription factors in mouse module were identified using iRegulon
(v1.3) [[80]34] in Cytoscape (v3.2.0) [[81]35] and the Enrichr webtool
that contains ENCODE and ChEA consensus transcription factor
annotations from Chip-X library [[82]36].
Human post-mortem brain cohorts and co-expression module identification
Whole-transcriptome data for human post-mortem brain tissue was
obtained from the Accelerating Medicines Partnership for Alzheimer
Disease-(AMP-AD) consortium, which is a multi-cohort effort to
harmonize genomics data from human LOAD patients. Harmonized
co-expression modules from the AMP-AD data sets were obtained from
Synapse (DOI: 10.7303/syn11932957.1). The human co-expression modules
derive from three independent LOAD cohorts, including 700 samples from
the ROS/MAP cohort, 300 samples from the Mount Sinai Brain bank and 270
samples from the Mayo cohort. A detailed description on post-mortem
brain sample collection, tissue and RNA preparation, sequencing, and
sample QC has been provided elsewhere [[83]37–[84]39]. As part of a
transcriptome-wide meta-analysis to decipher the molecular architecture
of LOAD, 30 co-expression modules from seven different brain regions
across the three cohorts have been recently identified [[85]40].
Briefly, Logsdon et al. identified 2978 co-expression modules using
multiple techniques across the different regions after adjusting for
co-variables and accounting for batch effects (10.7303/syn10309369.1).
A total of 660 co-expression modules were selected based on a specific
enrichment in LOAD cases when compared to controls
(10.7303/syn11914606). Finally, multiple co-expression module
algorithms were used to identify a set of 30 aggregate modules that
were replicated by the independent methods [[86]40].
Correlation analysis
Standard gene set overlap tests are quick and easy, but do not account
for direction of gene expression changes or coherence of changes across
all genes in a module. To assess the directionality of genetic variants
in model mice, we have computed the Pearson correlation across all
genes in a given AMP-AD modules to determine human-mouse concordance.
To determine the effects of each genetic variant, we fit a multiple
regression model as:
[MATH: logexpr=β0+∑iβ
mi>i+ε :MATH]
Where i denotes the genetic variants (Apoe^−/−, APOEε4, APP/PS1,
Bin1^+/−, Cd2ap^+/−, and Clu^−/−), and expr represents gene expression
measured by RNA-Seq transcripts per million (TPM).
We have computed the Pearson correlation between log fold change gene
expression in human AD cases versus controls (Log[2]FC (AD/controls)
and the effect of each mouse perturbation as determined by the linear
model (β) for the mouse orthologs genes within an AMP-AD module.
Log[2]FC values for human transcripts were obtained via the AMP-AD
knowledge portal ([87]https://www.synapse.org/#!Synapse:syn11180450).
Correlation coefficients were computed using cor.test function built in
R as:
cor.test (log[2]FC (AD/control), β).
cor.test returns both the correlation coefficient and the significance
level (p-value) of the correlation. Resulting p-values were corrected
for multiple hypothesis testing using the Benjamini-Hochberg (BH)
procedure.
Results
Expression of target genes was modified by genetic perturbations
First, we have examined the relative expression (compared to control B6
mice) of LOAD associated genes to validate each strain. Expression of
the mouse Apoe gene was downregulated in Apoe^−/− mice
(p < 1.00 × 10^− 60) as well as in transgenic APOEε4
(p < 1.00 × 10^− 258) mice, which harbor human APOE4 transcript driven
by the GFAP promotor (Fig. [88]1a). Expression of Clu gene was also
downregulated (p < 1.00 × 10^− 30) in Clu^−/− mice, while change in the
expression of Bin1 was significant but very small (log[2]FC = − 0.3;
p = 8.72 × 10^− 12) in Bin1^+/− mice (Fig. [89]1a). The change in
expression of Cd2ap gene was not significant (log[2]FC = − 0.07;
p = 0.7) in Cd2ap^+/− mice (Fig. [90]1a). Overall, in each mouse
strain, we observed significant downregulation in the expression of
respective LOAD associated gene except in Cd2ap^+/− models.
Fig. 1.
[91]Fig. 1
[92]Open in a new tab
Expression of LOAD associated genes in mice. a Expression of AD
associated risk genes in LOAD-relevant mice and the APP/PS1 transgenic
model compared to B6 (control) mice. X-axis shows AD-associated risk
genes and Y-axis represents average log fold change expression of above
genes in genetically perturbed mice compare to controls. b Principal
component analysis of batch corrected RNA-seq data from mouse strains.
The APOEε4 (red circle) and Apoe KO (green circle) samples are most
similar to each other. Samples from mice carrying only one copy of
either Bin1 (magenta circle) or Cd2ap (orange circle) occupy similar
regions, which might be due to their related functions. APP/PS1 samples
(brown circle) were separated from mice with late-onset perturbations
by the first PC
Transcriptional signatures from mice carrying different mutations in
LOAD-relevant genes clustered into different groups by PCA
Principal component analysis (PCA) was performed on batch-corrected,
log-transformed, and mean-centered TPM for 10,704 genes (Methods). The
first principal component accounted for 13% of total variance and
separated models of different types of AD: LOAD associated models and
EOAD associated APP/PS1 transgenic models cluster separately (Fig.
[93]1b), and thus might be affecting different AD-related processes. In
other hand, within LOAD associated models, samples from the Clu^−/−
mice grouped together and separately from all other LOAD associated
models in the second principal component (10% of variance) (Fig.
[94]1b). Across all strains, APOEε4 transgenic and Apoe^−/− mice were
most similar to each other (Fig. [95]1b). Hemizygous Bin1^+/−, and
Cd2ap^+/− mice grouped closely to each another, suggesting functional
similarity, and were the mutant strains in closest proximity to control
(B6) mice (Fig. [96]1b).
Pathway analysis of differentially expressed genes identifies enrichment of
different LOAD-related pathways in each mouse model
A total of 120 genes were significantly differentially expressed
(p < 0.05) in APOEε4 transgenic mice, out of which 57 genes were
upregulated and 63 genes were downregulated (Table [97]2;
Additional file [98]1: Table S1). We did not observe any pathway
enrichment for differentially expressed genes in APOEε4 transgenic
mice. In Apoe^−/− mice, 219 genes were identified significantly
differentially expressed (p < 0.05), 154 genes were upregulated and 65
genes were downregulated (Table [99]2; Additional file [100]1: Table
S1). Inflammation/immune response related pathways were enriched in the
upregulated list of DE genes in Apoe^−/− mice (Additional file [101]2:
Table S2), as well as osteoclast differentiation that is related to
TREM2 and TYROBP. We did not observe any enrichment for downregulated
genes in Apoe^−/− mice. In Clu^−/− mice, a total of 1759 genes were
identified significantly differentially expressed (762 genes were
upregulated and 997 genes were downregulated) (p < 0.05; Table [102]2;
Additional file [103]1: Table S1). Pathway analysis of DE genes
identified spliceosome, RNA transport, and ubiquitin mediated
proteolysis as enriched pathways in downregulated genes of Clu^−/−
mice, while notch signaling as the enriched pathway in upregulated
genes of Clu^−/− mice (Additional file [104]2: Table S2). Only 16 and
34 genes were significantly differentially expressed (p < 0.05) in
Bin1^+/− and Cd2ap^+/− mice, respectively (Table [105]2; Additional
file [106]1: Table S1). Pathway analysis identified endocytosis,
phagosome, autoimmune, type I diabetes as enriched pathways in
downregulated genes of Cd2ap^+/− mice (Additional file [107]2: Table
S2), while there was no pathway enrichment in upregulated genes of
Cd2ap^+/− mice. Downregulated genes of Bin1^+/− mice were enriched in
endocytosis and FC gamma R-mediated phagocytosis pathways (Additional
file [108]2: Table S2). In the APP/PS1 transgenic mice, 250 genes were
differentially expressed (67 and 183 genes were up and downregulated,
respectively) (Table [109]2). Pathway analysis of these DE genes
identified ribosome, oxidative phosphorylation, and Alzheimer’s disease
as significantly enriched pathways (Additional file [110]2: Table S2).
Table 2.
Differentially expressed genes by strain. Number of differentially
expressed genes identified in each mouse strain compared to control
mice (B6)
Mouse Model Upregulated Downregulated
APOEε4 57 63
Apoe^−/− 154 65
Clu^−/− 762 997
Bin1^+/− 10 6
Cd2ap^+/− 21 13
APP/PS1 67 183
[111]Open in a new tab
Co-expression network analysis identified mouse modules enriched for multiple
LOAD-related pathways driven by APOE and CLU strains
Weighted gene co-expression network analysis (WGCNA) [[112]30]
identified 26 distinct modules of co-expressed genes (Fig. [113]2a,
Additional file [114]3: Table S3). Further, we have carried out one-way
ANOVA test followed by Tukey-HSD (see methods) to determine if there
was differential expression between strains for each module eigengene.
We identified that 13 out of 26 modules were significantly driven by
one or more of Apoe^−/−, APOEε4, Clu^−/−, and APP/PS1 models
(Additional file [115]3: Table S3). Pathway enrichment analysis
identified that multiple AD-related pathways were significantly
enriched in these mouse modules. Apoe^−/− mice were significantly
associated with ivory module (N = 64, p = 9.7 × 10^− 6), while the
skyblue3 (N = 80, p = 4.6 × 10^− 13) (Fig. [116]3; Fig. [117]4;
Additional file [118]3: Table S3) module were significantly associated
with both Apoe^−/− and APOEε4 strains. Pathway analysis identified that
the ivory mouse module was enriched in inflammation and microglia
related pathways such as osteoclast differentiation, staphylococcus
aures infection, phagosome, and endocytosis (Fig. [119]2b), implicating
an important role of Apoe in inflammatory and microglia related
functions [[120]41–[121]43]. Brown (N = 1778, p = 3.1 × 10^− 7),
lightcyan1 (N = 1206, p = 1.9 × 10^− 5), black (N = 685,
p = 2.0 × 10^− 2), plum1 (N = 80, p = 1.0 × 10^− 2), and brown4
(N = 55, p = 0.04) modules were significantly associated with Clu^−/−
(Fig. [122]3; Fig. [123]4; Additional file [124]3: Table S3). The
steelblue module was driven by both Clu^−/− (p = 5.02 × 10^− 13) and
Cd2ap^+/− models (p = 9.5 × 10^− 13) (Fig. [125]3; Fig. [126]4;
Additional file [127]3: Table S3). These mouse modules were enriched in
many different pathways particularly related to synaptic cell function,
endocytosis, and RNA transport (Fig. [128]2b). This suggest the role of
Clu gene in synaptic/neuronal related functions, which is in consistent
with findings that reduced expression of Clu may results to aberrant
synaptic development and neurodegeneration [[129]44]. The darkorange2
(N = 61, p = 1.0 × 10^− 6), darkorange (N = 312, p = 0.03), orange
(N = 142, p = 4.64 × 10^− 13), and lightgreen (N = 1456,
p = 1.0 × 10^− 12) modules were found to be driven by APP/PS1 (Fig.
[130]3; Fig. [131]4; Additional file [132]3: Table S3). The lightyellow
module (N = 163) was observed to be associated with both APP/PS1
(p = 8.7 × 10^− 5) and Clu^−/− mice (p = 1.4 × 10^− 2), but more
significantly with APP/PS1 (Fig. [133]3; Fig. [134]4; Additional file
[135]3: Table S3). APP/PS1-driven modules (lightyellow, lightgreen,
darkorange2) were enriched in lipid-processing and metabolism related
pathways (Fig. [136]2b). None of the modules were observed to be
associated with Bin1^+/− and Cd2ap^+/− mice alone.
Fig. 2.
[137]Fig. 2
[138]Open in a new tab
Mouse Modules Identified through WGCNA. a Twenty-six distinct mouse
modules were identified from 10,704 mouse genes using WGCNA. Mouse
modules of various sizes represented by different color names. b KEGG
Pathway enrichment analysis (p < 0.05) in mice using enrichKEGG
function build under clusterprofiler R package
Fig. 3.
[139]Fig. 3
[140]Open in a new tab
Mouse Modules Significantly driven by specific mouse strains.
Expression of module eigengenes in mouse modules significantly driven
by Apoe^−/−, APOEε4, Clu^−/− and APP/PS1 mice (arbitrary units)
Fig. 4.
[141]Fig. 4
[142]Open in a new tab
Overlaps between strain-associated mouse modules and human AMP-AD
modules. a Mouse modules significantly driven by one or more of
Apoe^−/−, APOEε4, APP/PS1, Cd2ap^+/− and Clu^−/− mouse strains. The
horizontal scale bar represents the average eigengene expression of
mouse strains in mouse modules. b Overlaps between mouse modules and 30
human AMP-AD modules. The vertical scale bar represents Jaccard indices
between mouse modules and AMP-AD modules. Jaccard indices were computed
between each mouse and AMP-AD human modules
Comparison of mouse and AMP-AD modules
Finally, we compared mouse modules with the 30 human postmortem brain
modules from the Accelerating Medicine’s Partnership for AD (AMP-AD).
We computed Jaccard indices and its significance for each mouse - human
module pair to identify which mouse module significantly overlap with
human modules in order to identify AD-relevance of risk genes
(Additional file [143]5: Table S5). Since each human module was derived
from a specific brain region and study cohort, there are significant
similarity between AMP-AD modules. Overlapping modules were therefore
grouped into Consensus Clusters [[144]40].
Apoe-driven mouse module overlapped with AMP-AD inflammation and microglial
consensus cluster
The ivory mouse module driven by Apoe^−/− significantly overlapped with
AMP-AD inflammation and microglia modules in Consensus Cluster B
[[145]40] (Fig. [146]4; p < 0.05) and ranked among top ten mouse-human
modules overlap (based on Jaccard indices) (Additional file [147]4:
Table S4). These findings imply the significant role of Apoe in
inflammation and microglia-related pathways. Furthermore, we identified
that 22 genes were present in all AMP-AD microglial modules in
Consensus Cluster B as well as in the Apoe^−/−-driven ivory module
(Fig. [148]5), as these genes were expressed from all human brain
regions and therefore might be playing the important role in
inflammation and microglia associated pathways. In order to identify
transcriptional changes in these genes due to any AD-relevance genetic
alteration, we assessed differential expression of these 22 genes in
each mouse model (Additional file [149]1: Table S1). Nine out of these
22 genes (TREM2, CSF1R, C1QA, C1QB, C1QC, PTGS1, AIF1, LAPTM5 and LY86)
were significantly upregulated (p < 0.05) in Apoe^−/− mice and one gene
(TYROBP) was significantly downregulated (p < 0.05) in Clu^−/− mice.
Some of these genes (TREM2, TYROBP, C1QA, and CSF1R) have been
associated with AD and reported to be potential drug targets
([150]https://agora.ampadportal.org/). We did not find a significant
overlap between the skyblue3 mouse module and any AMP-AD module.
Fig. 5.
[151]Fig. 5
[152]Open in a new tab
Overlaps between AMP-AD and key mouse modules: a Overlap between AMP-AD
microglia modules in Consensus Cluster B and Apoe^−/−-driven ivory
module (shown in blue). We identified 22 genes which were present in
all AMP-AD microglia modules in Consensus Cluster B and the mouse ivory
module (red vertical bar). b Overlap between AMP-AD neuronal modules in
Consensus Cluster C and Clu^−/− driven brown module (shown in blue). We
identified 122 genes which were present in all AMP-AD neuronal modules
in Consensus Cluster C and mouse brown module (red vertical bar)
Clu-driven modules overlapped with AMP-AD neuronal system consensus cluster
Clu^−/−-driven mouse modules (brown, lightcyan1, and plum1) prominently
overlapped with AMP-AD neuronal system modules in Consensus Cluster C
[[153]40], while black, lightcyan1, and brown modules overlapped with
organelle biogenesis associated AMP-AD modules in Consensus Cluster E
(Fig. [154]4; p < 0.05). The Clu^−/−-driven brown4 module showed
association with cell cycle associated AMP-AD modules in Consensus
Cluster D (Fig. [155]4; p < 0.05). Also, we have observed that the top
five mouse-human module overlaps (based on Jaccard indices) were
between the brown module and AMP-AD neuronal system modules in
Consensus Cluster C (Additional file [156]4: Table S4). Further, we
also identified that 122 genes were common between the Clu^−/−-driven
brown mouse module and all AMP-AD neuronal system modules in Consensus
Cluster C (Fig. [157]5b). We assessed these 122 genes for differential
expression in each mouse strain (Additional file [158]1: Table S1) and
found that 35 out of these 122 genes were differentially expressed (30
genes were upregulated and 5 genes were downregulated) only in Clu^−/−
mice, while three out of these 122 genes were differentially expressed
only in APP/PS1 transgenic mice (one gene was upregulated and two were
downregulated). One of these 122 genes (Syt7) was upregulated in both
Clu^−/− mice and the APP/PS1 transgenic mice. These finding support the
likely role of CLU in neuronal function.
APP/PS1-driven modules overlapped with inflammation, lipid-processing, and
metabolism AMP-AD modules
The APP/PS1-driven orange and darkorange modules overlapped with lipid
processing and metabolism associated AMP-AD modules in Consensus
Cluster E, the lightgreen module overlapped with immune system modules
Consensus Cluster B, and the lightyellow module overlapped with both
microglia and organelle biogenesis related AMP-AD modules in Consensus
Clusters B and E, respectively (Fig. [159]4; p < 0.05). We found
significant overlap for the darkorange2 mouse module with AMP-AD
modules in Consensus Cluster E, which are in turn enriched in organelle
biogenesis related pathways (Fig. [160]4; p < 0.05).
Correlation analysis provides directional coherence between mouse models and
AMP-AD consensus clusters
The gene set overlap analysis identified mouse modules that are
significantly overlapped with AMP-AD modules, but it does not assess
directional coherence between AMP-AD modules and the effects of genetic
perturbations in mice. To address this issue, we computed the Pearson
correlation between log fold change gene expression in human AD cases
versus controls (Log[2]FC) and the effect of each mouse perturbation on
mouse orthologs as determined by the linear model (β) for the genes
within an AMP-AD module. Apoe^−/− and APOEε4 mice showed significant
positive correlation (r = 0.1–0.3, p < 0.05) with immune associated
AMP-AD modules in Consensus Cluster B and significant negative
correlation (r = − 0.05, p < 0.05) with AMP-AD neuronal modules in
Consensus Cluster C (Fig. [161]6). Furthermore, Clu^−/− and Cd2ap^+/−
mice showed significantly positive association (r = 0.1, p < 0.05) with
AMP-AD neuronal modules in Consensus Cluster C and negative correlation
(r = − 0.15, p < 0.05) with AMP-AD immune related modules in Consensus
Cluster B (Fig. [162]6). Bin1^−/− and APP/PS1 mice showed significant
positive correlation (r = 0.1–0.2, p < 0.05) with AMP-AD immune
response associated modules in Consensus Cluster B as well as AMP-AD
neuronal modules in Consensus Cluster C. The cell cycle and RNA
non-mediated decay pathways enriched AMP-AD modules in Consensus
Cluster D were significantly negatively correlated (r = − 0.2,
p < 0.05) with Apoe^−/−, APOEε4, Clu^−/−, Cd2ap^+/, and APP/PS1 mice,
but Bin1^+/− mice showed significant positive correlation (r = 0.11,
p > 0.05) with AMP-AD cell cycle module in the cerebellum (Fig.
[163]6). Most of the AMP-AD modules in Consensus Cluster E that is
enriched for organelle biogenesis associated pathways showed
significant negative correlation (r = − 0.1, p < 0.05) with all strains
except the Apoe^−/− models (r = 0.12, p < 0.05), while the AMP-AD
modules of Consensus Cluster E in the frontal pole (FPbrown) and
parahippocampal gyrus (PHGblue) showed significant positive association
(r = 0.05–0.2, p < 0.05) with all strains (Fig. [164]6).
Fig. 6.
[165]Fig. 6
[166]Open in a new tab
Correlation between mouse strains and 30 AMP-AD modules. Pearson
correlation coefficients between 30 human AMP-AD modules and mouse
strains. AMP-AD modules are grouped into five previously-identified
consensus clusters describing the major functional groups of AD-related
alterations. The vertical axis represents AMP-AD modules and the
horizontal axis represents mouse strains. Positive correlations are
shown in blue and negative correlations in red color. Color intensity
and size of the circles are proportional to the correlation
coefficient. Correlations with adjusted p-value > 0.05 are considered
non-significant and not included
Apoe-associated modules are enriched in SPI1 regulatory targets
Transcription regulation play an important role in the initiation and
progression of AD [[167]45]. Our results provide evidence of the AD
relevance of risk genes, but it is also important to identify the
regulatory elements and transcriptional factors that regulate the
expression of these genes for molecular dissection of disease etiology
[[168]45, [169]46]. Recent study have shown that APOEε4 genotype
suppress transcription of autophagy mRNA’s by competing with
transcription factor EB for binding to coordinated lysosomal expression
and regulation(CLEAR) DNA motifs [[170]47]. TFs were identified for
each module with high normalized enrichment scores (NES ≥ 4) from
iRegulon (Methods), which correspond to an estimated false discovery
rate of less than 0.01 [[171]34] (Additional file [172]5: Table S5).
The SPI1 transcription factor was enriched for regulatory targets in
the Apoe^−/− driven ivory and skyblue3 modules (Table S6). It has been
previously reported that SPI1 responds to inflammatory signals and
regulates genes that can contribute to neurodegeneration in AD
[[173]48]. We also observed that transcription factors from ELF, ETS,
TCF, PEA3, GABP, and ERF sub-family of the E26 transformation-specific
(ETS) family were enriched in the Clu^−/−-driven modules (Additional
file [174]5: Table S5). ETS-domain proteins play a role in the
regulation of neuronal functions [[175]49]. ETS family members ELK1 and
ETS1 have been reported to expressed in neuronal cells and activate
transcription of early onset AD candidate gene PSEN1 [[176]45,
[177]46]. This transcription factor analysis was based solely on
bioinformatics and general data resources, and therefore require
experimental validation in specific AD-related contexts. Nevertheless,
understanding the role of these and other transcription factors in
regulating AD associated genes can provide a molecular basis for
potential therapeutic development.
Conclusions
In this study, we have performed transcriptomic analysis of mouse
strains carrying different mutations in genes linked to AD by GWAS to
better understand the genetics and basic biological mechanisms
underlying LOAD. We have also performed a comprehensive comparison at
the transcriptomic level between mouse strains and human postmortem
brain data from LOAD patients. This study of LOAD-relevant mouse models
provides a basis to dissect the role of AD risk genes in relevant AD
pathologies. We determined that different genetic perturbations affect
different molecular mechanisms underlying AD, and mapped specific
effects to each risk gene. In our study, we observed that Apoe^−/− and
Clu^−/− mice at the relatively early age of 6 months show
transcriptomic patterns similar to human AD cases. Pathway analysis
suggested that Apoe^−/− driven mouse modules specifically affect
inflammation/microglia related pathways, while Clu^−/− driven mouse
modules have affected neurosignaling, lipid transport, and endocytosis
related pathways. These findings suggest that APOE and CLU risk genes
are associated with distinct AD-related pathways. We have also
identified that 22 genes were co-expressed in the Apoe^−/−-driven ivory
mouse module and in AMP-AD modules from all human brain regions in
Consensus Cluster B that were enriched in inflammation and microglia
associated pathways. Further, some of these genes (Tyrobp, Trem2, and
Csf1r) were differentially expressed in Apoe^−/− mice. Previous studies
have already implicated the role of TREM2 in AD susceptibility due to
association of heterozygous rare variants in TREM2 with elevated risk
of AD [[178]50] and higher cortical TREM2 RNA expression with increased
amyloid pathology [[179]51]. TYROBP has been also previously reported
as key regulator of immune/microglia associated pathways, which is
strongly associated with LOAD pathology [[180]14]. These genes have
been also proposed as potential drug targets
([181]https://agora.ampadportal.org/) and our findings supports the
role of these genes with pathophysiology of LOAD.
Correlation analysis also identified that mice carrying different
mutations capture distinct transcriptional signatures of human LOAD.
Moreover, we have observed contrasting correlations of APOEε4,
Apoe^−/−, and Clu^−/− mice with AMP-AD modules, implicating that these
genetic perturbations might affect LOAD risk through different
physiological pathways. It has been speculated that absence of both
Apoe and Clu resulted in accelerated disease onset, and more extensive
amyloid deposition in the PDAPP transgenic mice brain [[182]52].
Furthermore, APOE and CLU proteins interact with amyloid-beta (Aβ) and
regulates its clearance from brain. In particular, the presence of CLU
and the APOEε2 allele promotes Aβ clearance from brain, whereas APOEε4
reduces the clearance process [[183]44]. These observations also
suggest a protective role of CLU [[184]44, [185]53, [186]54],
consistent with our transcriptome-based anti-correlation of Clu^−/−
mice LOAD modules (Fig. [187]6). Understanding of the complex
interaction between these genes is essential to interpret molecular
mechanisms underlying AD. Hence, it would be interesting to analyze
mice models carrying different combinations of genetic variants.
We did not observe any striking responses in brain gene expression
patterns in APOEε4, Bin1^+/−, and Cd2ap^+/− mice based on the small
subset of differentially expressed genes, as opposed to effects
observed in the Clu^−/− and Apoe^−/− models (Table [188]2). Nor did we
observe any mouse modules significantly driven by these perturbations
alone. We note that these models were limited to heterozygous mutations
in Bin1 and Cd2ap and astrocyte-specific expression of APOEε4. The
latter limitation may be insufficient to capture the role of APOE
variants in microglia and disease risk [[189]55]. However, our
human-mouse comparison revealed significant correlation of these mouse
models with multiple human-derived AMP-AD co-expression modules. We
interpret this as these models expression global changes relevant to
human cases, while few individual gene expression changes are large
enough to be captured by differential expression analysis. This may
suggest region-specific and/or cell-specific signals that are diluted
by our bulk whole-brain analysis. We have observed that Bin1^+/− models
were significantly associated with multiple AMP-AD co-expression
modules, which in turn were enriched in immune response, inflammation,
and synaptic functioning pathways, which is in concordance with other
studies [[190]56, [191]57]. Furthermore, Cd2ap^+/− mice captured
similar human AD signatures as Clu^−/− mice, it may be due to their
involvement in similar pathways like blood-brain carrier, and loss of
function in Cd2ap may contribute to genetic risk of AD by facilitating
age related blood-brain barrier breakdown [[192]58]. In-depth
investigation of the functional variants of these high-risk AD genes
will be essential to evaluate their role in LOAD onset and progression.
The molecular mechanisms of AD driven by rare mutations in APP, PSEN1,
and PSEN2 are relatively well understood, but the functional impact of
LOAD associated risk factors still remain unclear. Although early-onset
models have provided critical insights into amyloid accumulation,
pathology, and clearance, they do not reflect the full transcriptomic
signatures and complete neuropathology of LOAD. Indeed, the primary
transcriptomic signatures from mice carrying major early-onset and
late-onset genetic factors are distinct (Fig. [193]1b), although our
functional analysis in the context of human disease modules also
detected some common neuroimmune effects (Fig. [194]6). Many of these
differences are likely due to the presence of amyloid deposition in
APP/PS1 mice that drives gene expression signatures [[195]22]. In this
context, the common neuroimmune response suggests similar signatures
arising in the absence of amyloid. It therefore remains unclear whether
the relatively uncommon EOAD cases and the more common late-onset AD
cases proceed through similar disease mechanisms. Understanding these
distinctions motivates the development and characterization of new
models for the late onset of AD. In this study, we have analyzed mice
carrying alterations in LOAD candidate genes and found that different
AD risk genes are associated with different AD-related pathways. Our
approach provides a platform for further exploration into the causes
and progression of LOAD by assessing animal models at different ages
and/or with different combinations of LOAD risk variants. This study
highlighted that implementing state-of-the-art approaches to generate
and characterize LOAD-associated mouse models might be helpful to
identify variants and pathways to understand complete AD mechanisms and
ultimately develop effective therapies for AD.
Supplementary information
[196]13024_2019_351_MOESM1_ESM.xlsx^ (270.8KB, xlsx)
Additional file 1: Table S1. Differentially expressed genes in LOAD
mouse strains. The attached table depicts the differentially expressed
genes in LOAD mouse strains compared to C7BL/6 J mice.
[197]13024_2019_351_MOESM2_ESM.xlsx^ (19.9KB, xlsx)
Additional file 2: Table S2. KEGG pathway annotation in LOAD mouse
strains. The attached table depicts the KEGG pathway annotations for
the up and down-regulated genes in LOAD mouse strains compared to
C7BL/6 J mice.
[198]13024_2019_351_MOESM3_ESM.xlsx^ (1,017.8KB, xlsx)
Additional file 3: Table S3. Mouse modules of co-expressed genes.
Summaries of the 26 mouse modules of co-expressed genes. In sheet 1,
genes in each mouse modules were listed. Sheet 2 depicts mouse modules
were observed to be significantly (p < 0.05) driven by at-least one of
the mouse strains. Sheets 3 and 4 illustrate enriched KEGG pathways and
enriched GO terms in each mouse modules.
[199]13024_2019_351_MOESM4_ESM.xlsx^ (55.4KB, xlsx)
Additional file 4: Table S4. Jaccard indices between Mouse and AMP-AD
modules. The attached table contains Jaccard indices and its
significance (p-value) for each mouse-human module pair.
[200]13024_2019_351_MOESM5_ESM.xlsx^ (141.5KB, xlsx)
Additional file 5: Table S5. Transcriptional factor annotations in LOAD
mouse modules. The attached table illustrate transcriptional factor
enriched in each 26 mouse modules of co-expressed genes.
Acknowledgements