Abstract
Metabolites, lipids, and glycans are fundamental but interconnected
classes of biomolecules that form the basis of the metabolic network.
These molecules are dynamically channeled through multiple pathways
that govern cellular physiology and pathology. Here, we present a
framework for the simultaneous spatial analysis of the metabolome,
lipidome, and glycome from a single tissue section using mass
spectrometry imaging. This workflow integrates a computational
platform, the Spatial Augmented Multiomics Interface (Sami), which
enables multiomics integration, high-dimensional clustering, spatial
anatomical mapping of matched molecular features, and metabolic pathway
enrichment. To demonstrate the utility of this approach, we applied
Sami to evaluate metabolic diversity across distinct brain regions and
to compare wild-type and Ps19 Alzheimer’s disease (AD) mouse models.
Our findings reveal region-specific metabolic demands in the normal
brain and highlight metabolic dysregulation in the Ps19 model,
providing insights into the biochemical alterations associated with
neurodegeneration.
Subject terms: Metabolomics, Molecular imaging
__________________________________________________________________
Clarke et al. present a framework for spatial analysis of the
metabolome, lipidome, and glycome from a single tissue section using
mass spectrometry imaging. Applying this approach, they revealed
region-specific metabolic diversity and dysregulation in both normal
and diseased mouse brains.
Introduction
Metabolomics^[59]1–[60]3, lipidomics^[61]4,[62]5, and
glycomics^[63]6–[64]8 are three distinct facets of omics methodologies,
each offering a unique window into the connected and complex
biochemical processes collectively known as the metabolic network. The
current state of these fields lacks unified, integrated analyses that
offer a broad overview of the interconnected metabolic landscape. The
development of integrated, spatially resolved metabolomics,
lipidomics, and glycomics is crucial for advancing our knowledge of
biological systems and has the potential to transform our understanding
of complex tissue metabolic heterogeneity and to uncover novel hypotheses
for the etiology of a range of metabolic disorders. Nevertheless, the
development of such integrated approaches is challenged by the inherent
differences in physicochemical properties and analytical requirements
of each molecular class such as size, solubility and charge.
Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry
imaging has emerged as a powerful tool for spatially resolved multiomics
analysis, offering the possibility to overcome major limitations
associated with pooled sample analysis^[65]9,[66]10. Individual
developments in spatial-metabolomics^[67]11, -lipidomics^[68]12–[69]14,
and -glycomics^[70]15–[71]19 methodology are underway even at the
single cell level^[72]20,[73]21. To this end, recent advances in MALDI
imaging have enabled multiplexed analysis of diverse
biomolecules^[74]22, such as the co-analysis of lipids, N-linked glycans,
and proteins, or of N-linked glycans and storage carbohydrates such as
glycogen, by optimizing sample preparation, enzyme application, matrix
selection, and instrumental parameters^[75]23–[76]26. These
developments have poised MALDI imaging for broad adoption in
metabolism research, as they facilitate the acquisition of
comprehensive and spatially resolved omics datasets. Further, the
complexity and high-dimensional nature of these datasets call for a
robust computational pipeline to extract actionable and
biologically relevant information for hypothesis generation^[77]17. An
integrated workflow combining metabolomics, lipidomics, and glycomics
from a single tissue section would offer unprecedented insights into
the spatial and heterogeneous metabolic landscape of mammalian tissues,
driving the next wave of metabolism research in health and disease.
In this study, we present an experimental-computational workflow for
the integration of spatial metabolomics, lipidomics, and glycomics
datasets from a single mouse brain section. Our minimally destructive
sequential sample preparation strategy enables the collection of
sub-mesoscale spatial datasets for metabolites, lipids, and glycans
while preserving spatial integrity. Furthermore, we leverage the
Spatial Augmented Multiomics Interface (Sami) framework to perform
advanced clustering and pathway analyses. This approach facilitates the
identification of spatially resolved molecular patterns and key
metabolic pathways across brain regions, providing actionable insights
for investigators using a reductionist approach to dissect metabolism
at the regional and cellular level. The triple-omics workflow not only
enhances the resolution and scope of spatial biomolecular profiling but
also accelerates hypothesis generation for spatial biology research. We
anticipate that this platform will be widely adopted in metabolism
research, enabling the discovery of novel molecular associations and
driving the next wave of spatially resolved studies focused on
understanding metabolic pathways in complex biological systems.
Results
A workflow for spatial triple-omics by mass spectrometry imaging
Applying spatial metabolomics, lipidomics, and glycomics to a
single tissue section offers multiple benefits, primarily
in maintaining the spatial continuity of the original tissue
microenvironment and conserving tissue resources^[78]27–[79]30. A
spatially resolved multi-omics approach facilitates a more holistic
understanding of the metabolic network underpinning tissue
heterogeneity and biology. Using a mouse brain as a model, we
successfully acquired multiomics data from a single tissue section
(Fig. [80]1). A single 10 µm thick brain section coated with
N-(1-naphthyl) ethylenediamine dihydrochloride (NEDC) matrix was first
subjected to sequential spatial metabolome and lipidome scans using
MALDI mass spectrometry imaging (Fig. [81]1A). NEDC matrix and negative
ionization mode were used for both spatial metabolome^[82]31 and
lipidome^[83]32 analyses. Due to differences in the ion optics tuning
required for optimal transmission of low m/z metabolites (< 500 m/z)
and higher m/z lipids (500–1500 m/z), metabolomics and lipidomics were
performed as two separate runs (see Methods, Supplemental Fig. [84]1).
Following spatial lipidome imaging, the NEDC matrix was removed, and
the tissue was fixed and then digested with peptide-N-glycosidase F
(PNGase F) and isoamylase to release complex carbohydrates for spatial
glycome and glycogen analysis^[85]30,[86]31. Spatial glycomics was
performed in positive mode using α-cyano-4-hydroxycinnamic acid (CHCA)
as the ionization matrix (Fig. [87]1A). We annotated metabolites,
lipids, glycogen, and N-glycans by monoisotopic mass, isotope
pattern, and matching through MetaboScape as well as against our own
library^[88]17,[89]33 and previously published
assignments^[90]11,[91]34,[92]35 (Supplemental Dataset [93]1). We
successfully acquired spatial images for molecular
features in all three omics classes from both coronal and sagittal
mouse brain sections (Fig. [94]1B). These include metabolites, lipids,
and glycans with unique spatial distributions among different regions
of a coronally or sagittally cut mouse brain (Fig. [95]1B). This method
allows for a more holistic perspective of the metabolic network of the
tissue and combines three omic analyses on a single tissue section.
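The mass-based part of the annotation step can be illustrated with a short sketch. It matches an observed m/z value against library entries within a parts-per-million tolerance. The 10 ppm tolerance, the function names, and the library entries are assumptions made for illustration; they are not taken from MetaboScape or from the authors' library, and isotope-pattern scoring is not shown.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate(observed_mz, library, tol_ppm=10.0):
    """Return (name, ppm error) for library entries within tol_ppm, closest first.

    tol_ppm=10.0 is an assumed default, not a value reported in the text.
    """
    hits = [
        (name, ppm_error(observed_mz, mz))
        for name, mz in library.items()
        if abs(ppm_error(observed_mz, mz)) <= tol_ppm
    ]
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Placeholder library: names and m/z values are illustrative only.
toy_library = {"feature_A": 200.0000, "feature_B": 200.0025, "feature_C": 350.0000}
matches = annotate(200.0010, toy_library)
```

When several candidates fall inside the tolerance window, as in this toy example, the mass alone does not resolve the assignment, which is why the isotope pattern and curated assignments are used alongside it.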
Fig. 1. High-content spatial imaging of metabolome, lipidome, and glycome in
a single mouse brain tissue.
A Workflow for the sequential spatial analysis of the metabolome,
lipidome, and glycome in mouse brain tissue sections using MALDI
imaging. The process initiates with the slicing of fresh frozen mouse
brains into 10 μm thick sections, either coronal or sagittal, followed
by the application of the NEDC matrix for initial metabolite and lipid
imaging. The same tissue sections are subsequently processed for
glycomic analysis, involving isoamylase and PNGase F treatments before
the application of the CHCA matrix, facilitating the visualization of
complex carbohydrates like N-glycans and glycogen. Created in
BioRender. Sun, R. (2025) [97]https://BioRender.com/k02u528. B Spatial
heatmaps/ion images that highlight the distribution of select
biomolecules identified during the MALDI imaging of metabolome,
lipidome, and glycome (n = 2, 1 for coronal section and 1 for sagittal
section). The images represent both coronal and sagittal sections of
mouse brains and are generated from the same tissue section following
the workflow outlined in panel A. The biomolecule of interest is labeled
above each heatmap/ion image, including o-phosphoethanolamine (OPE),
Phosphatidic Acid (PA) (36:1), Taurine, Phosphatidylserine (PS)
(O-45:6), and Glycogen. The color scale is not absolute but is relative to the
highest local concentration of the biomolecule within each specific
image, allowing for spatial visualization within the sample but not
between different biomolecules or samples.
To assess the reproducibility of the workflow, we performed spatial
lipidomics on a sagittal brain section and compared it to lipidomics
following metabolomics from an immediately adjacent brain section
(Supplemental Fig. [98]2). Initial assessment revealed overall lower
ion abundance in the total ion spectra in lipidomics following
metabolomics compared to the lipidomics-only analysis, as expected
(Supplemental Figs. [99]2A, B). Despite the lower overall ion
abundance, the relative ratios of different m/z values
corresponding to lipid features remained consistent. For example,
the relative ratios of PI (38:4), LPE (22:6), and PE (38:4) when
normalized to PE (40:6) remained consistent between both assays
(Supplemental Fig. [100]2C). This is further highlighted by the similar
spatial distributions these lipids displayed in both assays after
normalization to PE (40:6). It is worth noting that PS (40:6) showed a
slight reduction even after normalization to PE (40:6), but its spatial
distribution remained consistent between the two assays. Further, in
parallel, we performed a similar comparison for glycomics. We again
observed overall lower ion abundances in the glycomics analysis after
metabolomics/lipidomics (Supplemental Figs. [101]3A, B), but the
spatial distribution as well as ratios of glycans in the pixel-by-pixel
analyses remained consistent between the two assays (Supplemental
Fig. [102]3C). Collectively, these data demonstrate that while relative
ion abundance decreases when performing the sequential workflow, the
ratios of molecular features and their spatial distributions within each
assay remain relatively consistent, with minimal impact on spatial
biological interpretation. To further demonstrate that metabolomics
should be performed before lipidomics, we performed another control
experiment with metabolomics after the lipidomics scan (in reverse order).
Metabolomics performed after lipidomics showed major reductions
in ion abundance across the board. Moreover, many metabolites such as
aspartic acid and N-acetyl-L-aspartic acid still trended lower even after
normalization (Supplemental Figs. [103]4A–C). It is worth noting that
we observed a reduction in ion abundance from repeat sampling in both
lipid and glycan scans as part of the sequential workflow. However,
this reduction appears to occur consistently across the m/z range,
preserving the relative ratios of lipids and glycans as well as their
spatial distributions, as demonstrated by comparisons to single-method
scans.
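The ratio comparison described above can be sketched as follows. Each lipid is divided pixel by pixel by a reference lipid, and the resulting ratios are compared between a single-method run and a sequential run. The intensities are synthetic, and the uniform 40% signal loss is an assumption chosen only to show that a uniform loss leaves the ratios unchanged; none of these numbers come from the study.

```python
import numpy as np


def normalize_to_reference(intensities, reference):
    """Divide each feature's per-pixel intensity by the reference feature's.

    Assumes the reference intensity is non-zero in every pixel.
    """
    ref = np.asarray(intensities[reference], dtype=float)
    return {
        name: np.asarray(values, dtype=float) / ref
        for name, values in intensities.items()
        if name != reference
    }


# Synthetic intensities for three pixels; the numbers are illustrative only.
single_run = {"PE(40:6)": [100.0, 200.0, 400.0], "PI(38:4)": [50.0, 120.0, 100.0]}
# Assumed uniform 40% signal loss in the sequential run.
sequential_run = {k: [0.6 * v for v in vals] for k, vals in single_run.items()}

ratios_single = normalize_to_reference(single_run, "PE(40:6)")
ratios_sequential = normalize_to_reference(sequential_run, "PE(40:6)")
ratios_match = np.allclose(ratios_single["PI(38:4)"], ratios_sequential["PI(38:4)"])
```

A feature whose ratio shifts after normalization, as reported for PS (40:6), would show up here as a mismatch between the two ratio arrays.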
Co-registration and integration of spatial multiomics datasets
With > 10^5 pixels scanned per biological sample, the translation of
MALDI multiomics datasets into actionable information to aid
hypothesis-driven research presents a significant computational
challenge. Recognizing this critical need, we developed a bioinformatic
pipeline termed Spatial Augmented Multiomics Interface (Sami) that
performs multiomics integration, high dimensionality reduction and
clustering, spatial clustering, annotation, and pathway enrichment in
one platform for the comprehensive analysis of spatial metabolome,
lipidome, and glycome datasets (Fig. [104]2A). The first step of Sami
is to perform multiomics integration, which consolidates the disparate
biological features of metabolomics, lipidomics, and glycomics data
into a singular, harmonized input suitable for high-dimensionality and
pathway analyses (Fig. [105]2A). The smartbeam3 laser technology within
the MALDI instrument registers the x/y coordinates of the laser shots and
preserves unified x and y coordinates for each pixel across all omics
modalities. By cross-referencing the x and y coordinates, Sami
can determine the exact spatial location of each pixel in
all datasets, effectively superimposing these multiple layers of omics
information on a single spatial location.
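The coordinate-based integration can be sketched as an inner join on shared (x, y) pixel coordinates. This is a minimal illustration and not Sami's actual code: the function name, the dictionary layout, and the toy feature names are assumptions, and feature names are assumed to be unique across modalities.

```python
def coregister(*modalities):
    """Join per-pixel feature dicts on shared (x, y) coordinates.

    Each modality maps (x, y) -> {feature_name: intensity}. Only pixels
    present in every modality are kept (an inner join). Feature names are
    assumed unique across modalities; duplicates would be overwritten.
    """
    shared = set(modalities[0])
    for modality in modalities[1:]:
        shared &= set(modality)
    merged = {}
    for xy in sorted(shared):
        row = {}
        for modality in modalities:
            row.update(modality[xy])
        merged[xy] = row
    return merged


# Toy data: feature names and intensities are placeholders.
metabolome = {(0, 0): {"met_1": 1.0}, (0, 1): {"met_1": 2.0}, (1, 0): {"met_1": 3.0}}
lipidome = {(0, 0): {"lip_1": 10.0}, (0, 1): {"lip_1": 20.0}}
glycome = {(0, 0): {"gly_1": 5.0}, (0, 1): {"gly_1": 6.0}, (2, 2): {"gly_1": 7.0}}

integrated = coregister(metabolome, lipidome, glycome)
```

Pixels missing from any one modality are dropped, so the integrated table contains only locations where all three omics layers were measured.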
Fig. 2. Spatial augmented multiomics interface (Sami) framework.
A Sami computational framework: integration and spatial analysis of
metabolomic, lipidomic, and glycomic data. The process begins with the
co-registration of individual omics datasets, utilizing their spatial
coordinates. Following this, correlation networks are established to
identify low and high positive correlations between molecules, which
then undergo spatial clustering to reveal the molecular architecture of
the tissue. The final phase involves pathway enrichment analysis, which
is complemented by manual annotation for data interpretation. Created
in BioRender. Sun, R. (2025) [108]https://BioRender.com/f55v050. B
Scatter plots showing the strong Pearson correlation coefficient
between paired metabolome and lipidome datasets after multiomics
integration, demonstrating the quantitative relationship between these
two omics layers. C Paired spatial heatmap images of selected lipids
and metabolites, such as 862.606 m/z and 256.995 m/z, reveal the
distribution similarities between these molecules within the brain
tissue. The color scale indicates the relative abundance from minimum
(blue) to maximum (red). The monoisotopic masses and annotations are
provided beneath each heatmap. The color scale corresponds to the
highest pixel intensity detected for each individual biomolecule in
each respective brain region. D Network analysis depicting the
intricate intra- and interdomain connections of the integrated
multiomics dataset. The metabolome (green), lipidome (blue), and
glycome (red) nodes are connected within their respective domains
(intradomain) by grey lines, while interdomain connections across
different omic layers are shown in yellow.
To test the degree of fit, the x and y coordinates were
cross-referenced using the enhanced correlation coefficient (ECC) for
geometric transformation, a statistical measure of image
alignment employed in computer vision^[109]36. We leveraged these data
by selecting m/z features in overlapping m/z regions of two modalities
(i.e., 400–700 m/z range between metabolomics and lipidomics)
(Supplemental Fig. [110]5A). Ion images of four m/z ratios were tested
that included on-tissue and off-tissue m/z features (Supplemental
Fig. [111]5B). We performed ECC analysis of ion images from
different modalities after pixel-to-pixel matching, or after manually
shifting one modality by either 1 or 5 pixels in both x and y directions
to demonstrate misalignment (Supplemental Fig. [112]5C). Direct
pixel-to-pixel matching yielded a near-perfect ECC score of 0.95 (out
of 1), while shifting by 1 or 5 pixels resulted in scores of
0.82 and 0.67, respectively (Supplemental Fig. [113]5D). Finally, we
overlaid the glycan signals on the original tissue scan
to show spatially unique regions and accurate co-registration
(Supplemental Figure [114]6). These data support that direct
pixel-to-pixel integration provides excellent alignment between
different modalities.
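The effect of a pixel shift on an alignment score can be illustrated with a simplified stand-in for ECC: the zero-mean normalized correlation between two images. This sketch uses plain NumPy on a synthetic image and is not the authors' implementation. The blob image, the wrap-around shift, and the function names are assumptions, and the scores it produces are not comparable to the values reported above.

```python
import numpy as np


def alignment_score(image_a, image_b):
    """Zero-mean normalized correlation between two equally sized images."""
    a = np.asarray(image_a, dtype=float).ravel()
    b = np.asarray(image_b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def shift_image(image, pixels):
    """Shift an image by `pixels` along both axes (wrap-around for simplicity)."""
    return np.roll(image, shift=(pixels, pixels), axis=(0, 1))


# Synthetic ion image: a smooth blob in the middle of a 64 x 64 grid.
yy, xx = np.mgrid[0:64, 0:64]
ion_image = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / (2 * 8.0 ** 2))

# Score the image against itself with no shift, a 1-pixel and a 5-pixel shift.
scores = {s: alignment_score(ion_image, shift_image(ion_image, s)) for s in (0, 1, 5)}
```

The score is highest for the unshifted pair and falls as the shift grows, which is the qualitative behavior the misalignment control is designed to show.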
Successful integration opens the opportunity for additional analyses
such as correlation and network analyses across modalities. We found that a
number of features in different modalities showed high levels of
co-expression in both abundance and spatial distribution (Fig. [115]2B,
C). For example, 256.995 m/z from metabolomics and 862.606 m/z from the
lipidomics analysis exhibit high levels of spatial co-expression
identified after multiomics integration (Fig. [116]2B). Finally, we
demonstrated multiomics integration through inter- and intradomain
connectivity via network analysis (Fig. [117]2D). Network analysis
reveals connectivity between metabolites across different modalities,
indicating effective integration of pixels across the three omics runs.
This integration will enable co-analysis of metabolomics, lipidomics,
and glycomics datasets, providing a systems-level understanding of
metabolic networks and allowing for downstream pathway analysis
(Fig. [118]2D). Collectively, our data support the generation of a
single robust, integrated, spatially resolved dataset from three
different omics modalities.
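A correlation network of this kind can be sketched by computing pairwise Pearson correlations across pixels and keeping pairs above a threshold as edges, labeled as intradomain or interdomain. The threshold of 0.8, the function name, and the synthetic data are assumptions for illustration; the source does not state the cutoff Sami uses.

```python
import numpy as np


def correlation_edges(matrix, names, domains, threshold=0.8):
    """List feature pairs whose Pearson r across pixels is at least `threshold`.

    matrix: pixels x features array; names and domains: one label per column.
    threshold=0.8 is an assumed cutoff, not a value from the text.
    """
    r = np.corrcoef(matrix, rowvar=False)
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if r[i, j] >= threshold:
                kind = "intradomain" if domains[i] == domains[j] else "interdomain"
                edges.append((names[i], names[j], kind, float(r[i, j])))
    return edges


# Synthetic pixel-by-feature matrix with one co-varying pair.
rng = np.random.default_rng(0)
base = rng.random(500)
matrix = np.column_stack([
    base,                           # met_1
    base + 0.01 * rng.random(500),  # lip_1, co-varies with met_1
    rng.random(500),                # gly_1, independent of the others
])
edges = correlation_edges(
    matrix,
    ["met_1", "lip_1", "gly_1"],
    ["metabolome", "lipidome", "glycome"],
)
```

Only the met_1 and lip_1 pair passes the threshold here, and because the two features come from different modalities the edge is labeled interdomain.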
High dimensionality reduction, spatial clustering, and annotation
High-dimensionality reduction as part of the Sami framework serves as a
powerful strategy to distill high-dimensional MALDI MSI datasets into
actionable and manageable information (Fig. [119]3). To this end, UMAP
has the highest computational efficiency without sacrificing the quality
of dimensionality reduction and clustering compared to other
methods^[120]37–[121]39 and has been demonstrated to perform well on
MALDI imaging datasets^[122]40. Therefore, we tested whether performing
UMAP clustering on the integrated triple-omics dataset is beneficial
over individual UMAP analysis of single modalities. Indeed, UMAP
analysis of the triple-omics dataset (metabolomics, lipidomics, and
glycomics) encompassed unique clusters present in individual UMAP
analysis of individual modalities (Supplemental Figure. [123]7),
highlighting the importance of combined triple-omics analysis. Further,
we confirmed that combining all three modalities did not reduce
clustering efficiency compared to a single modality, using the Silhouette
score^[124]41 (Supplemental Figure. [125]7E). Upon performing
spatial clustering of the coronal brain, we found that the spatial
clusters correspond well with the anatomically distinct brain regions,
similar to previous findings^[126]42–[127]44 (Fig. [128]3B). We manually
annotated each spatial cluster using the Mouse Brain Atlas from the Allen
Institute as a guide^[129]42 (Fig. [130]3B, C). This comparison
revealed a high degree of similarity between the spatial clusters and
the recognized brain regions, offering evidence of the biological
relevance of UMAP clustering that relates to the biochemistry of the
brain. To confirm this finding, we performed similar UMAP clustering
and spatial annotation in the sagittal brain section as well
(Fig. [131]3D–F). Similar to the coronal brain section, UMAP spatial
clustering revealed multiple brain anatomical regions (Fig. [132]3F).
The number of clusters displayed by UMAP in Sami can be adjusted by
modifying parameters such as the clustering resolution, number of
nearest neighbors and minimum distance, which influence the number and
size of clusters observed in the results (Supplemental Figure. [133]8).
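The Silhouette score used to compare clustering quality can be sketched in plain NumPy. For each point it contrasts the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b) and reports (b - a) / max(a, b), averaged over all points. The two-dimensional embedding below is synthetic and stands in for a UMAP embedding; the function is an illustrative implementation, not the one used in Sami.

```python
import numpy as np


def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance.

    Requires at least two clusters. Singleton clusters score 0 by convention.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    unique = np.unique(labels)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue
        # Mean distance to own cluster; the self-distance of 0 is excluded.
        a = dists[i, same].sum() / (n_same - 1)
        # Mean distance to the nearest other cluster.
        b = min(dists[i, labels == other].mean() for other in unique if other != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())


# Synthetic embedding: two well-separated groups of 50 points each.
rng = np.random.default_rng(0)
embedding = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
true_labels = np.repeat([0, 1], 50)
shuffled_labels = rng.permutation(true_labels)

good = silhouette_score(embedding, true_labels)
bad = silhouette_score(embedding, shuffled_labels)
```

Scores close to 1 indicate compact, well-separated clusters, while labels that ignore the structure of the embedding score near 0, so the same function can be used to compare cluster assignments obtained with different parameter settings.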
Fig. 3. Spatial dimensionality reduction, and manual annotation in brain
tissues.
A Application of high-dimensionality reduction via UMAP on a dataset
derived from a coronal brain section that has been processed through a
triple-omics workflow. The UMAP plot visualizes the clustering of
multidimensional omics data into a two-dimensional space, facilitating
the identification of distinct clusters within the brain tissue. B The
subsequent spatial clustering map displays the localized distribution
of these clusters based on the precise x and y coordinates captured by
the MALDI imaging laser spots. Each color on the map corresponds to a
specific cluster, with the legend on the right side denoting the
cluster numbers. C The clusters are then manually annotated with their
respective anatomical regions by a neuropathologist using the Allen
Brain Atlas as a reference. D Similar to panel A, a UMAP plot presents
the high-dimensionality reduction of omics data obtained from a
sagittal brain section subjected to the same triple-omics workflow,
delineating the diverse molecular landscapes across different brain
structures. E The subsequent spatial clustering map displays the
localized distribution of these clusters based on the precise x and y
coordinates captured by the MALDI imaging laser spots. Each color on
the map corresponds to a specific cluster, with the legend on the right
side denoting the cluster numbers. F Each cluster number is again
correlated with specific anatomical regions identified in the sagittal
section, guided by manual annotation by a neuropathologist with the
Allen Brain Atlas.
To further enhance the utility of the spatial multiomics map, we then
established a reference multiomics feature set for the annotation of
brain regions. Selected features for represented clusters are shown as
Radial trees and density plots (Supplemental Figs. [136]9A, B).
Annotated clusters can serve as a reference guide, aiding in the
assignment and interpretation of multiomics features across different
studies. To confirm the robustness of the annotations derived through
Sami, we first performed spatial triple-omics in a separate coronal
brain section (test brain) and performed unsupervised spatial
clustering (Supplemental Figs. [137]10A–D). Each cluster in the test
brain was matched to one cluster in the original reference brain
section through feature matching (see “Methods”), resulting in a
supervised cluster matching. Using this methodology, the annotation
from the clusters in the reference brain section can be directly used
for the test brain section (Supplemental Figs. [138]10A–G). We then
further tested this method in four different sagittal mouse brain
sections, and the method provided excellent matching among all clusters
for sagittal brain sections as well. We were able to match 100% of
clusters to the reference brain (Supplemental Figure. [139]10E).
Further, we used the Jaccard index to assess the degree of matching for
each individual cluster. We found an average Jaccard index score of 0.8
(maximum possible 1; range 0.55–0.89) for all clusters tested, indicating
a high degree of similarity among matched clusters and high
reproducibility of our methods (Supplemental Fig. [140]10G and
Supplemental Dataset [141]2).
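The Jaccard index is the size of the intersection of two sets divided by the size of their union. The sketch below applies it to cluster matching under the assumption that each cluster is represented by a set of marker features; the exact sets compared in Sami are described in the Methods and may differ. Cluster names and feature identifiers are placeholders.

```python
def jaccard_index(set_a, set_b):
    """|A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty."""
    a, b = set(set_a), set(set_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_clusters(test_clusters, reference_clusters):
    """Map each test cluster to the reference cluster with the highest Jaccard index."""
    matches = {}
    for test_id, features in test_clusters.items():
        best = max(
            reference_clusters,
            key=lambda ref_id: jaccard_index(features, reference_clusters[ref_id]),
        )
        matches[test_id] = (best, jaccard_index(features, reference_clusters[best]))
    return matches


# Placeholder marker-feature sets; names are illustrative only.
reference = {"cortex": {"f1", "f2", "f3", "f4"}, "cerebellum": {"f5", "f6", "f7", "f8"}}
test = {0: {"f1", "f2", "f3", "f9"}, 1: {"f5", "f6", "f7", "f8"}}

matched = match_clusters(test, reference)
```

In this toy case, test cluster 0 shares three of five distinct features with the cortex reference (index 0.6), while test cluster 1 is identical to the cerebellum reference (index 1.0).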
Spatially resolved pathway enrichment analysis with Sami
Subsequent to spatial clustering and annotation, Sami performs pathway
analysis, a key step to distill large omics datasets and to
provide actionable next steps for hypothesis-generating research.
First, we tested whether there are metabolic differences between
intra-brain regions highlighted by clustering analysis within the WT
brain. We performed differential expression analysis by comparing each
cluster against the rest of the clusters, as illustrated by volcano
plots (Fig. [142]4A). Based on differential expression analysis, the
top 50 annotated multiomics features of each cluster exhibiting
significant alterations were selected for metabolic pathway enrichment
analysis via the MetaboAnalyst 3.2 pipeline embedded in Sami.
Specifically, we conducted enrichment analyses using the Small Molecule
Pathway Database (SMPDB)^[143]45 within the MetaboAnalyst 3.2
pipeline (Supplemental Figure. [144]11). For example, pathway
enrichment analysis for the cluster corresponding to the CA3 region of
the hippocampus, which predominantly consists of excitatory pyramidal
neurons, highlights the Citric Acid Cycle and Glutamate Metabolism as
the top two enriched pathways (Fig. [145]4B, C). In contrast, the
Corpus Callosum showed unique enrichment
for gluconeogenesis, Warburg effect, and glycolysis (Supplemental
Figure. [146]11). These findings demonstrate that multi-omics
integration combined with spatial clustering can reveal
region-specific metabolic signatures within the brain. In addition,
Sami can export metabolites that are over- or underrepresented in
each pathway (see “Methods”) and calculate a score for the overall
directionality of each enriched pathway to facilitate data
interpretation. For example, the metabolites enriched for the citric
acid cycle are aconitate, malate, pyruvate, succinate, and ADP, and the
pathway is considered downregulated overall based on Sami output
(Supplemental Dataset [147]3).
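Differential expression across many features requires correction for multiple comparisons, and the figure legends report the Benjamini-Hochberg (BH) procedure with an adjusted p-value threshold of 0.05. The sketch below shows the BH adjustment itself in NumPy. The raw p-values are made-up inputs, and the rank-sum test that would produce them is not shown.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the original order."""
    p = np.asarray(p_values, dtype=float)
    n = len(p)
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(ranked, 1.0)
    return adjusted


# Made-up raw p-values for five features.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(raw)
significant = adjusted < 0.05
```

With these inputs, the two smallest p-values remain below 0.05 after adjustment, while 0.039 and 0.041 are both adjusted to 0.05125 and no longer pass the threshold.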
Fig. 4. Metabolic diversity in distinct brain regions of a normal mouse
brain.
A Spatial clustering map depicting anatomical regions in a normal mouse
brain, with a focus on the CA3 region, compared to the rest of the
brain. A corresponding volcano plot elucidates the significant
metabolic differences between the CA3 area and the rest of the brain,
using an adjusted p-value threshold of 0.05 to denote significance
(Wilcoxon Rank-Sum test, Benjamini-Hochberg (BH) procedure for multiple
comparisons). Metabolomic features that are elevated in the CA3 region
are indicated in red, while those that are decreased are shown in blue.
B Metabolic pathway enrichment analysis, constructed using the
MetaboAnalyst 3.2 R package integrated within Sami, pinpoints the
specific metabolic pathways that are significantly enriched in the CA3
region (Wilcoxon Rank-Sum test, Benjamini-Hochberg (BH) procedure for
multiple comparisons). The analysis is complemented by a metabolic
network visualization, which connects the pathways identified
by pathway enrichment analysis that share more than 25% of their
metabolites, enhancing our understanding of the metabolic interplay and
diversity within the CA3 region. Nodes represent biochemical pathways.
The size and color of the nodes correlate with the degree of enrichment. C
Relative abundance of metabolites changed in the citric acid cycle. Bar
graphs represent pixel-by-pixel data from CA3 (n = 527 pixels) and
non-CA3 (n = 20,000 pixels) regions of the same brain (n = 1)
(mean ± SEM; p-values indicated; two-tailed t-test). Source data are
provided as a Source Data file.
To further demonstrate the utility of the spatial pathway enrichment
analysis by Sami, we performed spatial metabolomics, lipidomics, and
glycomics and applied Sami to probe metabolic dysfunction in a mouse
model of tauopathy (Ps19) relative to WT. The Ps19 Alzheimer’s
disease mouse model expresses the human MAPT (P301S) mutant tau gene
under the prion promoter, leading to progressive tau pathology,
neurodegeneration, and cognitive deficits resembling frontotemporal
dementia^[150]46. This model is widely used to study tauopathies,
therapeutic interventions, and mechanisms of tau-driven
neurodegeneration^[151]47. We performed triple-omics followed by Sami
analysis, which included co-UMAP clustering followed by pathway mapping,
on 9-month-old female WT and Ps19 mice (Fig. [152]5A–C). All three
omics modalities were used for UMAP clustering as above, and
since both WT and Ps19 mouse brains were clustered together, both
brains give rise to the same 20 non-overlapping clusters
(Supplemental Figure. [153]12). In addition, similar brain regions can
be matched to the same cluster (visualized by color), e.g., cerebellum,
pink; cortex, blue; corpus callosum, brown. We performed metabolic
pathway enrichment (Ps19/WT) using the clusters corresponding to the
isocortex and cerebellum of both WT and Ps19 mouse brains. In the
isocortex, we identified changes in the citric acid cycle, Warburg
effect, carnitine synthesis, and transfer of acetyl groups to
mitochondria as major pathways altered in the mouse model of
tauopathy (Fig. [154]5D–F). In contrast, the cerebellum in the tauopathy
model showed changes in starch and sucrose metabolism, glycerolipid
metabolism, glycolysis, and triacylglycerol biosynthesis (Supplemental
Figure. [155]13). It is worth noting that metabolites such as
glucose-6-phosphate, glucose, and glycogen are classified as part of
starch and sucrose metabolism by the KEGG pathway database.
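Over-representation of a pathway among selected features is commonly assessed with a hypergeometric test, which is the test named in the legend of Fig. 5. The sketch below computes the upper-tail probability with the standard library only. The selection size of 50 follows the top-50 feature selection described earlier; the pathway size, universe size, and hit count are assumed values for illustration, and this is not the MetaboAnalyst implementation.

```python
from math import comb


def hypergeom_enrichment_p(hits, selected, pathway_size, universe):
    """P(X >= hits) when `selected` features are drawn without replacement
    from `universe` features, of which `pathway_size` belong to the pathway."""
    upper = min(selected, pathway_size)
    total = comb(universe, selected)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, selected - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Assumed example: 5 of the 50 selected features fall in a 20-member pathway,
# out of a universe of 1000 annotated features.
p_value = hypergeom_enrichment_p(hits=5, selected=50, pathway_size=20, universe=1000)
```

Under these assumed numbers only about one hit would be expected by chance, so observing five gives a small p-value. Across many pathways, the resulting p-values would then be adjusted for multiple comparisons.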
Fig. 5. Pathway enrichment analysis for Ps19 and WT mouse brains.
A UMAP plot displays the integration of data from Ps19 (brown) and
matched WT (blue) sagittal mouse brains (n = 1 each). B Separate UMAP
plots for Ps19 and WT brains are shown, detailing the full range of
clusters within each brain type. C Spatial clustering maps overlay UMAP
clusters onto the anatomical structures of both Ps19 and WT mouse
brains. D Spatial UMAP plots highlight a specific cluster corresponding
to the frontal cortex region in sagittal sections of both Ps19 and WT
brains. E Pathway enrichment analysis, using the MetaboAnalyst 3.2 R
package integrated within Sami, identifies metabolic pathways that are
differentially enriched in the Ps19 brain as compared to WT
(Hypergeometric test, Benjamini-Hochberg (BH) procedure for multiple
comparison). F A metabolic network diagram connects the enriched
pathways, highlighting the relationships and potential dysregulation
occurring in Ps19 brains relative to WT. Nodes represent biochemical
pathways. The network graphically summarizes the interconnected
pathways with more than 25% overlapping metabolites. G Relative
abundance of metabolites changed in the citric acid cycle. Bar graphs
represent pixel-by-pixel data from the isocortex region of the WT
(n = 2043 pixels) and Ps19 (n = 2295 pixels) brains. H Representative
immunofluorescence staining images show the localization of Citrate
Synthase, Aconitase, Isocitrate Dehydrogenase, and Glycogen in brain
sections from WT and Ps19 mice, scale bar = 400 µm. The whole-brain image
is shown in the inset on the top right. Each row displays the enzyme or
glycogen marker in both WT (top) and Ps19 (bottom) sections, with the
corresponding quantification bar graphs (% positive and positive area)
presented on the right (mean ± SEM, n = 3 WT and Ps19 animals; p-values
indicated; two-tailed t-test). Source data are provided as a Source
Data file.
Both the citric acid cycle and the Warburg effect were
highlighted in the intra-brain (Fig. [158]4B) and inter-brain regional
analyses (Fig. [159]5E). Similarly, aconitate, malate, pyruvate, and
succinate were all identified as altered in both analyses. However,
directionality analysis suggests that the citric acid cycle is down in
the intra-brain analysis but up in the inter-brain regional
analysis between WT and Ps19 mice (Supplemental Dataset [160]4). To
further demonstrate this, we extracted pixel-by-pixel relative
abundance data for metabolites associated with these pathways in the
intra-brain analysis of the WT brain and the WT vs. Ps19 comparison in
the isocortex. Notably, although both pathways were enriched, the
changes at the metabolite level differed. For example, in the CA3
region, citrate and aconitate levels were increased, while malate,
succinate, and GSH levels were decreased relative to the rest of the
brain (Fig. [161]4C). In contrast, in the Ps19 isocortex, citrate and
aconitate were decreased, and malate was increased compared to WT
(Fig. [162]5G). Additionally, glycogen, identified through glycomics
analysis, was significantly elevated in the Ps19 mouse isocortex. To
validate these spatial multi-omics findings, we performed multiplexed
immunofluorescence analysis targeting key enzymes in the citric acid
cycle and the metabolite glycogen, which is part of the starch and
sucrose metabolism pathway. This analysis revealed significant
reductions in the protein levels of citrate synthase, aconitase, and
isocitrate dehydrogenase in the Ps19 brain (Fig. [163]5H). Conversely,
glycogen levels were significantly increased in the Ps19 cortex, as
confirmed using a previously validated anti-glycogen
antibody^[164]39,[165]40. These results support the findings from
spatial multi-omics analyses and highlight the utility of integrating
spatially resolved multi-omics data with targeted validation approaches
for mapping region-specific metabolic changes in both healthy and
diseased brains.
Discussion
Metabolomics, lipidomics, and glycomics are cornerstone techniques that
have driven biological discovery over the last several
decades^[166]48–[167]50. In this study, we present a roadmap to perform
simultaneous assessments of the metabolome, lipidome, and glycome from
a single tissue section. This represents a major advancement compared
to traditional pooled omics analyses, providing a broad and
comprehensive assessment of the metabolic landscape or network. Our
approach overcomes several limitations of conventional pooled methods,
offering insights into the spatial distributions of a wide array of
biomolecules and enabling multimodal, integrated analysis, thereby fostering a
nuanced understanding of the metabolic processes within complex tissue
microenvironments. In addition, by direct sampling of whole tissue
sections, the triple-omics workflow minimizes sample handling (such as
tissue pulverization) and fractionation steps (phase
separation)^[168]51 of pooled omics methods. The entire pipeline can be
completed within 72 h when imaging at a resolution of 50 µm,
and Sami offers a promising platform for future high-throughput omics
studies.
In addition, we report the robust computational framework of Sami, a
series of tools designed specifically for the multimodal datasets
produced by triple-omics MALDI imaging runs. Sami integrates three
different modalities and reduces high-dimensional data into spatial
clusters that correlate with brain anatomy. This is followed by
metabolic pathway enrichment analyses for unique spatial clusters in
both healthy and diseased brain tissues. Collectively, the combination
of the triple-omics workflow and Sami represents a leap forward in the
quest to understand the metabolic diversity of different brain regions.
It is worth noting that these data are by nature hypothesis generating
and can be further
tested in human and genetic mouse models by targeting specific
metabolic pathways. One finding from our study is the efficacy of
metabolism as a classifier of brain regions, a concept that could drive
our understanding of brain functionality and neurological
disorders^[169]52–[170]54. The differential metabolic profiles unveiled
through our pathway enrichment analyses demarcate specific brain
regions, emphasizing the spatial heterogeneity of metabolic processes
within the brain. For example, the CA3 region of the hippocampus is
predominantly composed of excitatory pyramidal neurons. Our pathway
enrichment and network analyses highlight multiple pathways converging
on mitochondrial function, presumed to be associated with pyramidal
neurons; i.e., the citric acid cycle is a core component of mitochondrial
metabolism, which is critical for providing the energy required to
modulate pyramidal neuronal activity and excitability^[171]55,[172]56.
Notably, glutamate, the primary excitatory neurotransmitter in the
brain^[173]57, is synthesized from the citric acid cycle intermediate
α-ketoglutarate. This metabolic linkage emphasizes the dual role of
mitochondrial metabolism in both supporting the energy demands of
neurons and providing the substrates necessary for neurotransmitter
synthesis. Similarly, the cluster corresponding to the corpus callosum is
enriched for gluconeogenesis, the Warburg effect, and glycolysis. A
preference for glycolysis (i.e., the Warburg effect) has been
demonstrated in different cell types within the corpus callosum, namely
astrocytes^[174]58, oligodendrocytes^[175]59, and microglia^[176]60.
Together, these pathway analyses highlighted the potential unique
metabolic demands within different regions of the brain. These findings
take the first step in improving understanding of brain metabolism,
paving the way for novel research avenues into the complex interplay
between brain function and metabolic activity.
Sami offers spatial pathway enrichment analysis for uncovering
region-specific metabolic signatures within the brain or metabolic
dysregulation between WT and disease mouse brains. However, the lack of
directional information from pathway enrichment highlights the
importance of metabolite-level analysis. For instance, while the citric
acid cycle was highlighted in both intra-regional analysis within the
WT brain and WT/Ps19 isocortex analyses, the specific metabolite
changes were distinct, with opposing alterations observed in citrate,
aconitate, and malate. These findings underscore the necessity of
extracting and evaluating the most significantly altered metabolites
driving pathway enrichment to pinpoint the exact metabolic alterations
in each context. Detailed metabolite-level analyses are critical to
planning hypothesis-driven experiments that will advance these
observations into actionable insights, such as manipulating unique
metabolic enzymes in the Ps19 mouse model to ameliorate neurodegeneration.
The current study presents the Sami workflow, which integrates
metabolomics, lipidomics, and glycomics from a single tissue section
into a unified data analysis pipeline designed for
hypothesis-generating research. Using WT and Ps19 mouse brains as
examples, this pipeline focuses on exploring regional heterogeneities,
including intra-regional differences within WT brains and
inter-regional comparisons between WT and Ps19 brains. During the
sequential MALDI scan, there is a preferential loss of
lower-abundance ions, which can be mitigated through additional
targeted workflows that optimize acquisition parameters and focus on
specific molecular features based on the user's interest. Strategies to
address this issue include increasing laser energy, adjusting
pixel sizes, or modifying quadrupole settings to boost ion intensity in
targeted analyses. One limitation of this workflow is that it does not
yet provide cellular-level insights. Emerging multimodal imaging
approaches, such as the integration of MALDI imaging with cellular
functions through techniques like Imaging Mass Cytometry (IMC)^[177]61,
CODEX^[178]62, Xenium^[179]63, and VISIUM (spatial
transcriptomics)^[180]64, are beginning to bridge this gap.
Additionally, the development of MALDI immunohistochemistry holds
significant promise for advancing cellular annotation on a single
tissue section^[181]65. Future methodological developments will aim to
assign metabolomic, lipidomic, and glycomics data to specific cellular
annotations, enabling deeper biological insights into tissue function
and pathology. Additional future development will focus on expanding
omics feature coverage to include extracellular matrix proteins,
O-linked glycans, and other small molecule metabolites^[182]66, and on
reducing the laser raster size to 5 µm, which is near the single-cell
level^[183]67,[184]68 and will require better co-registration
algorithms. We anticipate 5 µm
multiomics imaging coupled to cellular annotation would offer deeper
insights into metabolic cross talk and heterogeneity at the single-cell
level. Additionally, while our current annotations focus on the brain,
future work should seek to identify metabolic classifiers for other
tissues, such as liver, kidney, and lung, and extend these analyses to
metabolic diseases such as cancers, neurological disorders, and inborn
errors of metabolism. The Sami framework is designed to accommodate
these updates, functioning as an expandable system that can be updated
as new improvements emerge. In conclusion, our data show that Sami is a
powerful tool for multiomics research; we envision an even brighter
future as we continue to refine and expand this innovative platform.
Methods
Chemicals, reagents, antibodies, and cell lines
High-performance liquid chromatography (HPLC)-grade acetonitrile,
ethanol, methanol, water, trifluoroacetic acid (TFA), N-(1-Naphthyl)
ethylenediamine dihydrochloride (NEDC), and recombinant isoamylase were
purchased from Sigma-Aldrich. α-cyano-4-hydroxycinnamic acid (CHCA)
matrix was purchased from Cayman Chemical. Histological-grade xylenes
were purchased from Spectrum Chemical. Citraconic anhydride for antigen
retrieval was obtained from Thermo Fisher Scientific. Recombinant
PNGaseF Prime was obtained from N-Zyme Scientifics (Doylestown, PA,
USA). IntelliSlides were obtained from Bruker Daltonics. The following antibodies
were used: Alexa Fluor® 488 anti-Citrate synthetase (1:200, Abcam,
ab197488); Alexa Fluor® 647 anti-Aconitase 2 (1:200, Abcam, ab198050);
Alexa Fluor® 594 anti-Isocitrate dehydrogenase (1:200, Abcam,
ab311797); mouse anti-glycogen (IV58B6) (1:500, in-house hybridoma).
The secondary antibody used was Alexa Fluor 647-conjugated anti-mouse
(1:200, Invitrogen, A21238).
Mouse models
Mice were housed in a climate-controlled environment with a 14 h
light/10 h dark cycle and with temperature (18–23 °C)
and humidity (50–60%) control. Water and solid diet (Tekad #2018) were
provided ad libitum throughout the study. Wild-type (WT) C57BL/6J mice and
Ps19 mice were purchased from Jackson Laboratory. The University of
Florida Institutional Animal Care and Use Committee has approved all of
the animal procedures under the protocol number IACUC202200000586.
Sample collection and preparation
Mice were euthanized by cervical dislocation and decapitation.
Immediately following euthanasia, the brains were surgically resected
within 30 s. Mouse brains were dissected into two hemispheres coronally
to expose regions to be scanned by MALDI. The brain tissue was first
rinsed in 1x Phosphate-Buffered Saline (PBS) and subsequently rinsed
twice with deionized water. The rinsed tissues were blotted dry, then
slow-frozen over isopentane chilled with dry-ice for seven minutes,
according to previously established protocols^[185]69,[186]70, to
ensure tissue stability and optimal preservation of analytes.
Post-freezing, the samples were stored at −80 °C until further
processing.
MALDI MSI tissue preparation
The frozen tissues were sectioned by a cryostat at an ambient
temperature of −15 °C, chilled to −20 °C to prepare for cutting. A
Leica CM1860 cryostat was utilized to obtain coronal sections of 10 µm
thickness. Tissues were mounted onto a frozen Leica chuck using Optimal
Cutting Temperature (OCT) compound as a binding agent for sectioning.
The brains were shaved until the target region was exposed. The
sections were then thaw-mounted onto Bruker IntelliSlides and stored
at −80 °C for future use. Post-sectioning, the tissue samples were
fixed by dehydration using a vacuum desiccator for one hour. The slides
were processed sequentially for small molecules, lipids, glycans and
glycogen for triple omics data acquisition by MALDI imaging.
Matrix and enzyme application for imaging
Following an hour of vacuum desiccation, N-(1-Naphthyl) ethylenediamine
dihydrochloride (NEDC) matrix was applied using an HTX M5
pneumatic sprayer, with a matrix solvent of HPLC-grade methanol and
water in a 70:30 mix. Slides were sprayed with 7 mg/mL NEDC matrix
in 70% methanol solvent, using 14 passes at a flow rate of 0.06 mL/min, a
3 mm offset, and a velocity of 1200 mm/min. The conditions were
maintained at 30 °C and 10 psi, with a heated tray set at 50 °C.
Following this stage, metabolites and lipids were imaged on the slides
using respective instrument methods and a standardized flexImaging
(v7.2) .mis file to ensure consistent pixel integration across all omic
runs. After lipid imaging, the slides were removed from the instrument
and the NEDC matrix was stripped off using an ice-cold, 100% methanol
wash for 5 min, and fixed overnight with 10% (v/v) neutral buffered
formalin (37% Formaldehyde and 1x PBS). The protocol for CHCA matrix
and enzyme application for glycans and glycogen was previously
described^[187]53,[188]71,[189]72. Briefly, the tissue sections were
dehydrated and rehydrated through a sequence of immersions once in 70%
ethanol (1 min) and twice in water (3 min each). Antigen retrieval was
performed at 95 °C for 25 min with citraconic anhydride buffer prepared using 25 μL
citraconic anhydride in 50 mL water, and pH adjusted to 3 using
hydrochloric acid (HCl). An HTX spray station (HTX) was used to coat
the slide with a 0.2 ml aqueous solution of isoamylase (3U/slide) and
PNGase F (20 mg total/slide). The spray nozzle was heated to 45 °C with
a spray velocity of 900 mm/min. Following enzyme application, slides
were incubated at 37 °C for 2 h in a humidified chamber, and dried in a
vacuum desiccator prior to matrix application (0.04 g CHCA in 5.7 ml
50% acetonitrile/50% water and 5.7 µL 25%TFA) applied with HTX sprayer.
MALDI imaging
All MALDI MSI experiments were performed on a Bruker timsTOF fleX using
a 46 μm × 46 μm laser raster to produce 50 μm × 50 μm pixels. All image
acquisition and setup were performed through the autopilot feature with
FlexImaging 6.0 to avoid human errors. All omic runs require a
standardized flexImaging .mis file to be used as a consistent masked
template denoting the regions for imaging. To generate this file,
slides were scanned, and high-resolution .tif files were imported to
flexImaging to manually mask the regions of tissue for imaging. To
optimize post-run data processing and ensure consistent pixel
integration, the same .mis file was employed for all runs, paired with
the appropriate omic-specific method. The metabolomics run initiated
the omics workflow, during which the first and only flexImaging.mis
file was created. This file was subsequently used for all future runs,
ensuring aligned pixel integration across conditions while modifying
the method to suit the respective omic. Small molecule metabolite
imaging was conducted in negative ion mode with the following detailed
settings: MS1 scan, scan range from m/z 20–750, negative polarity,
summation rate mode, and a rate value of 544. The laser configuration
included 50 μm × 50 μm pixels from a 46 μm × 46 μm raster scan range,
90% laser power, 1 burst of 396 shots, a frequency of 10,000 Hz, and a
transfer configuration of 30 V MALDI plate offset, −60 V deflection 1
delta, 200 Vpp Funnel 1 RF, 200 Vpp Funnel 2 RF, and 200 Vpp Multipole
RF. Collision energy was set to 7 eV, with a collision RF of 700 Vpp.
Lipidomic imaging was performed immediately after the metabolomics run,
maintaining negative ion mode. The lipidomics MS1 scan parameters
included a scan range of m/z 300–2000, negative polarity, and a similar
summation rate mode. The laser was configured identically with
50 μm × 50 μm pixels from a 46 μm × 46 μm raster scan range, 80% laser
power, 1 burst of 300 shots, a frequency of 10,000 Hz, and
corresponding transfer settings. Collision energy and collision RF
parameters mirrored those used in the metabolomics imaging. Glycomic
imaging data were collected after preparing the slides as described and
imaging in positive ion mode with these settings: MS1 scan, scan range
from m/z 700–4000, positive polarity, summation rate mode, and a rate
value of 210. The laser configuration featured 50 μm × 50 μm pixels
from a 46 μm × 46 μm raster scan range, 37% laser power, 1 burst of 320
shots, a frequency of 10,000 Hz, and transfer parameters of 50 V MALDI
plate offset, 70 V deflection 1 delta, 500 Vpp Funnel 1 RF, 500 Vpp
Funnel 2 RF, and 500 Vpp Multipole RF. Collision energy was set to
10 eV, with a collision RF of 4000 Vpp.
Peak annotations and export
MALDI imaging data analysis was conducted using MetaboScape® software
for exporting MS data acquired in negative ion mode into the .srd
format, followed by data upload from Bruker SCiLS™ Software
Solutions. The T-ReX® 2D algorithm was applied, focusing exclusively on MS
data. Subsequent analysis incorporated parameters such as an intensity
threshold of 500, speckle width and height of 3, and a maximum of 200
speckles, with ion deconvolution targeting [M-H]⁻, [M+Cl]⁻, and
[M-H2O]⁻ ions. Annotation employed a hierarchical approach, starting
with a curated list of 7k MS1 mammalian endogenous metabolites,
followed by lipid identification using the MCube spectral library
integrated into MetaboScape. Further annotation utilized a combination
of in-house and public spectral libraries, including an in-house
library of over 50,000 endogenous metabolites (curated through
standards), the latest HMDB Metabolite Library, MoNA, MassBank,
FiehnLib LipidBlast library, MetaboBASE® Personal Library 2023, the
Bruker Sumner MetaboBASE®, and NIST 2022. Annotation accuracy was
ensured through stringent m/z tolerances (5–10 ppm), mSigma values
(25–250), and comprehensive parallel library searches, with unannotated
features explored via SmartFormula for molecular formula prediction.
Results were exported in .mca format to SCiLS for peak width refinement
and ion image quality assessment before final export in .csv format.
Export to a pixel-by-pixel tabular format was performed using the SCiLS
Lab R package and corresponding API (available from Bruker). Peak area
was used and no normalization method was applied at this stage.
Exported tabular metadata for each omics MALDI imaging can be found as
part of the submission files.
Immunofluorescence
The tissue section immediately adjacent to the one used for MALDI imaging
was used for immunofluorescence analysis. Fresh frozen tissue sections were
fixed for 30 min at room temperature in 1% paraformaldehyde and were
rehydrated through a graded ethanol series (95%, 70%, 50%) followed by
water. Subsequently, antigen retrieval was performed by immersing
sections in 10 mM citrate buffer (pH 6.0) in a water bath for 1 h at
96 °C, followed by cooling at room temperature for 30 min. Sections were blocked
for 1 h in 2% normal goat serum and 5% (w/v) Bovine Serum Albumin (BSA)
in 0.05% TBS/Octyl β-D-Glucopyranoside. Primary antibodies were
incubated overnight at 4 °C, while secondary antibodies were incubated
for 1 h at room temperature. All images were acquired using the Akoya
PhenoImager slide scanner.
Data co-registration and integration
Untargeted spatial metabolomics, lipidomics, and glycomics datasets,
derived from MALDI imaging, were first loaded into SCiLS Lab software
to check image quality and assess overall data integrity. Features with
an m/z count > 400 were exported for further analysis. These selected
features were exported to a tabular format using the SCiLS Lab
software’s API. Exported individual tabular triple omics data files are
accessible as supplementary materials. Following the export, each omics
file was individually co-registered based on their X and Y coordinate
information embedded within the tabular file type. For the actual data
integration, the co-registered omics files were converted from tabular
data files (.csv) to annotated data matrices (.h5ad), normalized by
total ion count and log-transformation, and merged into a single
annotated data matrix by csv2h5ad() and pooldata() functions in a
custom script (preprocess.py) as part of the Spatial Augmented
Multiomics Interface (Sami) framework (available at
[190]https://github.com/XinBiostats/SAMI). The final output was a
comprehensive annotated data matrix containing the spatially aligned
m/z features from the metabolomics, lipidomics, and glycomics datasets.
Once the data integration was complete, we proceeded with further
analyses including feature correlation analysis, high-dimensional
clustering, spatial mapping, cluster marker identification, and pathway
enrichment analysis. More detailed methods for each of these analyses
follow in subsequent sections.
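The integration steps described above (coordinate-based co-registration, total ion count normalization, log transformation, and merging) can be illustrated with a minimal, dependency-free sketch. This is not the csv2h5ad() or pooldata() implementation in Sami: the function names, the feature names, the scaling constant, and the choice to normalize each omics layer before merging are all assumptions made for illustration.

```python
import math

def tic_log_normalize(pixels, scale=1e4):
    """Scale each pixel to a common total ion count, then apply log(1 + x).

    pixels: dict mapping (x, y) -> {feature_name: intensity}.
    scale is a hypothetical target total; the paper does not state one.
    """
    out = {}
    for xy, feats in pixels.items():
        total = sum(feats.values())
        if total == 0:
            # Empty pixel: keep the features but leave them at zero.
            out[xy] = {name: 0.0 for name in feats}
            continue
        out[xy] = {name: math.log1p(v / total * scale) for name, v in feats.items()}
    return out

def pool_by_coordinate(*omics):
    """Merge per-pixel tables, keeping only pixels present in every table."""
    shared = set(omics[0])
    for table in omics[1:]:
        shared &= set(table)
    merged = {}
    for xy in sorted(shared):
        row = {}
        for table in omics:
            row.update(table[xy])
        merged[xy] = row
    return merged

# Hypothetical toy data: two pixels are shared, one exists only in the lipid run.
metabolome = {(0, 0): {"metab_1": 30.0, "metab_2": 10.0},
              (0, 1): {"metab_1": 5.0, "metab_2": 15.0}}
lipidome = {(0, 0): {"lipid_1": 100.0},
            (0, 1): {"lipid_1": 50.0},
            (1, 1): {"lipid_1": 10.0}}

integrated = pool_by_coordinate(tic_log_normalize(metabolome),
                                tic_log_normalize(lipidome))
```

Because every run uses the same .mis template, matching on exact (X, Y) pairs is sufficient in this sketch; pixels missing from any modality are simply dropped.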
Correlation analysis
The correlation analysis was executed using the Pearson correlation
coefficient, with a specific lipid m/z value serving as the reference
variable. The calculate_corr() function was employed to compute the
correlation coefficients between the reference lipid m/z feature and
all the other metabolomic features in the integrated dataset across all
available pixels. Following the acquisition of correlation
coefficients, scatter plots were produced using corr_plot() to visually
interpret these correlations. Each pair of features was plotted, with
the m/z value of metabolite on the x-axis and the m/z value of lipid on
the y-axis. The correlation coefficient (r) was calculated and presented on
each graph. Both the calculate_corr() and corr_plot() functions are
available in the custom Python script “correlation.py” in Sami.
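As an illustration of the reference-feature correlation described above, the following sketch computes Pearson's r between one reference feature and every other feature across pixels. It is a standalone example and not the calculate_corr() function from Sami; the function names and the toy intensities are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined for a constant feature
    return sxy / math.sqrt(sxx * syy)

def correlate_with_reference(features, reference):
    """Correlate the reference feature with all other features.

    features: dict mapping feature name -> list of per-pixel abundances,
    with pixels in the same order for every feature.
    """
    ref = features[reference]
    return {name: pearson_r(ref, values)
            for name, values in features.items() if name != reference}

# Hypothetical per-pixel abundances for one lipid and two metabolites.
features = {
    "lipid_ref": [1.0, 2.0, 3.0, 4.0],
    "metab_a": [2.0, 4.0, 6.0, 8.0],
    "metab_b": [4.0, 3.0, 2.0, 1.0],
}
r = correlate_with_reference(features, "lipid_ref")
```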
Network analysis of inter- and intra-domain omics interactions
The elucidation of interactions between and within each omics domain
was achieved via a comprehensive network analysis. Firstly, the
DataFrame.corr() function in the “pandas” Python library version 1.5.3
was used to calculate the Pearson correlation coefficients among all
pairs of m/z features, both within and between each omics domain. This
provided a comprehensive matrix of correlation values that capture the
potential inter- and intra-omic associations. Next, Pearson correlations
larger than 0.5 on a pixel scale were retained. The corr_network()
function based on “networkX” Python library 3.0 was then utilized to
transform the correlation matrix into a network graph, with m/z
features represented as nodes and significant correlations as edges
between them. Nodes were annotated to indicate the omics domain to
which the feature belongs, providing a clear picture of the interplay
among metabolomics, lipidomics, and glycomics features. Both
DataFrame.corr() and corr_network() functions are available in custom
Python script “correlation.py” in Sami.
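The thresholding and graph-construction step can be sketched without any graph library. The example below keeps feature pairs with r > 0.5, builds an adjacency map, and labels each node with its omics domain. It is a simplified stand-in for corr_network(), not its actual code; the feature names and the convention of encoding the domain as a name prefix are assumptions.

```python
def correlation_edges(corr, threshold=0.5):
    """Keep feature pairs whose Pearson r exceeds the threshold.

    corr: dict mapping (feature_a, feature_b) -> r.
    """
    return [(a, b, r) for (a, b), r in sorted(corr.items())
            if a != b and r > threshold]

def build_adjacency(edges):
    """Undirected adjacency map: node -> set of neighbouring nodes."""
    adjacency = {}
    for a, b, _ in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def domain_of(feature):
    """Hypothetical convention: the omics domain is the name prefix."""
    return feature.split("_", 1)[0]

# Hypothetical pairwise correlations across three omics domains.
corr = {
    ("metab_1", "lipid_1"): 0.72,
    ("metab_1", "glycan_1"): 0.31,
    ("lipid_1", "glycan_1"): 0.55,
}
edges = correlation_edges(corr)
adjacency = build_adjacency(edges)
node_domains = {node: domain_of(node) for node in adjacency}
inter_domain = [(a, b) for a, b, _ in edges if domain_of(a) != domain_of(b)]
```

Only positive correlations above 0.5 are retained here, following the wording of the text; a variant using the absolute value of r would also capture strong negative associations.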
High dimensional clustering and spatial mapping
High-dimensional clustering was performed in two steps: 1) dimension
reduction and 2) clustering based on the reduced dimensions. To reduce
the dimensionality of the feature space while preserving critical
information, scanpy.pp.pca() in “scanpy” Python library 1.9.1
(n_comps = 20) was used to perform principal component analysis (PCA)
to retain top 20 principal components (PCs). Subsequently, the Leiden
algorithm was employed on the 20 PCs to assign cluster memberships to
pixels by using scanpy.tl.leiden() function in “scanpy” Python library
1.9.1 (resolution = 1.4). The uniform manifold approximation and
projection (UMAP) from scanpy.tl.umap() in “scanpy” Python library
1.9.1 was used to visualize the clusters. Spatial mapping of these
clusters was performed by linking the clusters back to their original
spatial coordinates (X and Y) within the sample. This allowed us to
visualize the spatial distribution of these omics clusters and to
explore their spatial relationships. Three functions that perform
high-dimensional clustering and spatial mapping are available in the custom
script clustering.py in Sami: the plot_umap_cluster() function generates
UMAP and spatial mapping in parallel; the plot_cluster() function creates
the spatial mapping only; and the plot_select_cluster() function selectively
plots clusters of interest in the spatial mapping. To relate these
spatial clusters with known anatomical structures, we annotated each
spatial cluster against the brain regions presented in the Allen
Institute Mouse Brain Atlas (Available from atlas.brain-map.org). This
was accomplished by overlaying the coordinates of our clusters with the
spatial coordinates of the brain regions in the Allen Mouse Brain
Atlas. Any spatial cluster that coincided with the coordinates of a
particular brain region in the Atlas was annotated with the name of
that region.
To demonstrate the benefits of multi-omics integration for clustering,
we evaluated the clustering results of single-omics and multi-omics data
using the Silhouette Score. The Silhouette Score is a widely used
metric to assess the quality of clustering by measuring the cohesion
and separation within and between clusters. For each data point i, the
Silhouette Score s(i) is defined as:
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}} \tag{1}$$
where a(i) is the average distance of sample i to all other points in
its own cluster and b(i) is the average distance of sample i to points
in the nearest neighboring cluster. We calculated the average
Silhouette Score across all data points as the final clustering
evaluation score. The Silhouette Score ranges from −1 to 1, where a
higher score indicates better clustering quality, as the data points
are more compact within their clusters and better separated from other
clusters. The Silhouette Score was computed using the
silhouette_score() function from sklearn library in Python.
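To make the definition concrete, the following sketch computes the mean Silhouette Score directly from a(i) and b(i) using Euclidean distance. It is a didactic re-implementation, not the sklearn routine used in the study; it assumes at least two clusters, and it assigns a score of 0 to points in single-member clusters, which is a common convention.

```python
import math

def mean_silhouette(points, labels):
    """Average of s(i) = (b(i) - a(i)) / max(a(i), b(i)) over all points."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    scores = []
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)  # singleton cluster: score set to 0 by convention
            continue
        # a(i): mean distance to the other members of the same cluster.
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        # b(i): smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for label, members in clusters.items() if label != labels[i]
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)

# Two well-separated groups of hypothetical 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = mean_silhouette(points, [0, 0, 1, 1])  # labels follow the geometry
bad = mean_silhouette(points, [0, 1, 0, 1])   # labels cut across the groups
```

A labeling that follows the geometry gives a score close to 1, whereas a labeling that splits each group gives a negative score, which is the behavior used above to compare single-omics and multi-omics clustering.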
Volcano plot, density plot and circular plot
The volcano plot was implemented by the volcano_plot() function. Specifically,
each point represents an m/z feature with its log2 fold change on the
x-axis and negative log10 p-value on the y-axis. Red points represent
overrepresentation, blue points represent underrepresentation, and gray
points indicate non-significant features. To visualize the distribution
and relative abundance of each m/z feature across different clusters,
we generated density plots using the density_plot() function, which are
similar to those used in single-cell RNA sequencing (scRNA-seq)
analyses. These plots depict the difference of pixel abundance of a
particular m/z marker in a specific cluster compared to its pixel
abundance in other clusters, providing an insight into the
cluster-specific distribution of these omics entities. Furthermore, top
5 markers of selected clusters from three different omics were
displayed as a circular tree. All three functions are available in
custom Python script “marker.py” in Sami.
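The coordinates and color classes of a volcano plot can be derived as in the short sketch below. It is not the volcano_plot() function in Sami, and the two cutoffs (p < 0.05 and an absolute log2 fold change of at least 1) are assumed values, since the text does not state the thresholds used.

```python
import math

def volcano_point(log2_fc, p_value, p_cutoff=0.05, fc_cutoff=1.0):
    """Return (x, y, color) for one m/z feature.

    p_cutoff and fc_cutoff are assumed thresholds, not values from the paper.
    p_value must be greater than 0 for the log transform to be defined.
    """
    y = -math.log10(p_value)
    if p_value < p_cutoff and log2_fc >= fc_cutoff:
        color = "red"    # overrepresented
    elif p_value < p_cutoff and log2_fc <= -fc_cutoff:
        color = "blue"   # underrepresented
    else:
        color = "gray"   # not significant
    return log2_fc, y, color

# Hypothetical features: (name, log2 fold change, p-value).
features = [("feat_1", 2.0, 0.001), ("feat_2", -1.5, 0.01), ("feat_3", 0.2, 0.5)]
plotted = {name: volcano_point(fc, p) for name, fc, p in features}
```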
Supervised clustering with reference tissue section
To validate that the protocol and computational pipeline were robust across
different tissue sections, we generated a new triple omics dataset from
MALDI imaging performed on a fresh brain section. Similar to the
previous steps, spatial clusters and cluster-specific markers were
delineated based on the new dataset. We denoted the previous tissue
section as the “reference” and the new tissue section as the
“validation section”. To evaluate whether the two tissue sections share a
similar clustering pattern, supporting reproducible findings, we used the
cluster-specific markers to align the clusters between the two sections,
as the cluster IDs were generated randomly. For each cluster in the
validation section, we computed the number of shared markers (50
markers for each omics set, a total of 150 markers for triple omics)
with each cluster in the reference (discovery) section. This process was
completed by the cluster_mapping() function in clustering.py in Sami. We
defined the matched cluster in the reference (discovery) section as the
cluster with the highest number of common markers.
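The matching rule can be expressed compactly with set intersections, as in the sketch below. This is an illustrative stand-in for cluster_mapping() rather than its source; the cluster IDs and marker names are hypothetical, and ties are broken by taking the first reference cluster in sorted order, which is an assumption.

```python
def map_clusters(reference_markers, validation_markers):
    """Match each validation cluster to the reference cluster sharing most markers.

    Both arguments map a cluster ID to a set of marker names.
    Returns {validation_id: (best_reference_id, number_of_shared_markers)}.
    """
    mapping = {}
    for v_id, v_set in validation_markers.items():
        overlaps = {r_id: len(v_set & r_set)
                    for r_id, r_set in reference_markers.items()}
        # max() returns the first maximum, so ties go to the lowest sorted ID.
        best = max(sorted(overlaps), key=overlaps.get)
        mapping[v_id] = (best, overlaps[best])
    return mapping

# Hypothetical marker sets (real sets would hold up to 150 markers each).
reference = {"ref_0": {"m1", "m2", "m3", "l1"},
             "ref_1": {"m4", "m5", "g1", "g2"}}
validation = {"val_0": {"m4", "g1", "g2", "l9"},
              "val_1": {"m1", "m2", "l1", "g7"}}
matches = map_clusters(reference, validation)
```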
Cluster marker identification
A differential abundance analysis was conducted to identify
cluster-specific m/z by contrasting the abundance of each m/z feature
in a specific cluster to the abundance of the same m/z feature in the
rest of the clusters combined. A Wilcoxon rank-sum test was performed to
compare the abundance in the two groups. The resulting p-values were
adjusted for multiple testing to control the false discovery rate
(FDR). For each cluster, the top 50 statistically significant features
(adjusted p-value < 0.05) ranked by absolute log2-transformed fold
change were selected as the cluster-specific markers. These markers
constituted a unique signature for each cluster, serving as identifiers
in subsequent analyses such as metabolic pathway enrichment analysis.
The Findmarkers() function for cluster marker identification is available
in the custom Python script “marker.py” in Sami.
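The selection step that follows the rank-sum test can be sketched as below: p-values are adjusted, filtered at 0.05, and ranked by absolute log2 fold change. The text does not name the FDR procedure, so the Benjamini-Hochberg method used here is an assumption, and the function names and toy values are hypothetical rather than taken from Findmarkers().

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_markers(names, pvalues, log2_fcs, top_n=50, alpha=0.05):
    """Keep significant features and rank them by absolute log2 fold change."""
    adjusted = benjamini_hochberg(pvalues)
    hits = [(name, fc, q) for name, fc, q in zip(names, log2_fcs, adjusted)
            if q < alpha]
    hits.sort(key=lambda hit: abs(hit[1]), reverse=True)
    return hits[:top_n]

# Hypothetical test results for four features in one cluster.
names = ["f1", "f2", "f3", "f4"]
pvalues = [0.001, 0.01, 0.03, 0.5]
log2_fcs = [0.5, -2.0, 1.2, 3.0]
adjusted = benjamini_hochberg(pvalues)
markers = select_markers(names, pvalues, log2_fcs, top_n=2)
```

In this toy case the feature with the largest fold change (f4) is excluded because it does not pass the significance filter, which shows why filtering precedes ranking.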
Pathway enrichment analysis
For each cluster, the top 50 features that could be annotated and that
exhibited significant alterations (adjusted p-value < 0.05, ranked by
absolute fold change) were selected for metabolic pathway enrichment
analysis. The pathway
enrichment analysis was performed via the MetaboAnalystR 3.2 pipeline
embedded in Sami. This pipeline employs a hypergeometric test for
over-representation analysis. Due to the absence of a comprehensive
database encompassing all three (metabolome, lipidome, and glycome) for
pathway mapping, we conducted separate enrichment analyses: one for
combined metabolomics and glycomics via the Small Molecule Pathway
Database (SMPDB)^[191]45, and another for lipids using the lipid
library integrated within MetaboAnalyst. After identifying significant
pathways via selected markers, we assessed their regulation by
computing the log₂ fold change (log₂FC) of all measured compounds
within each pathway. A pathway is classified as up-regulated if
log₂FC > threshold, down-regulated if log₂FC < -threshold, and neutral
if it falls within [-threshold, threshold], where the threshold (e.g.,
0.1) is predefined. The function plot_dot() was used to show the
enriched pathways. The function plot_bar_for_direction() was utilized
to display up-regulated and down-regulated pathways separately. We
further explored and visualized the relationships among cluster-specific
metabolic pathways by performing a pathway network analysis. In the
pathway network, nodes represent metabolic pathways and two nodes are
connected if the corresponding two pathways have more than 25% shared
features among combined sets. The network was generated by
the pathway_network() function based on the NetworkX package. All functions are
available in custom script pathway.py in Sami (available at
[192]https://github.com/XinBiostats/SAMI).
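The three computations described in this section (over-representation testing, direction assignment, and pathway network construction) can be illustrated with the sketch below. It is not the MetaboAnalystR pipeline or the Sami functions. Two points are assumptions: the per-pathway log2 fold change is summarized as the mean over measured compounds, and "shared features among combined sets" is interpreted as intersection over union. Pathway and compound names are hypothetical.

```python
from math import comb

def hypergeom_pvalue(universe, pathway_size, n_selected, n_hits):
    """Upper-tail hypergeometric p-value, P(X >= n_hits).

    universe: number of annotated compounds in the background.
    pathway_size: number of background compounds in the pathway.
    n_selected: number of marker compounds submitted.
    n_hits: number of submitted markers that fall in the pathway.
    """
    upper = min(pathway_size, n_selected)
    tail = sum(comb(pathway_size, k) * comb(universe - pathway_size, n_selected - k)
               for k in range(n_hits, upper + 1))
    return tail / comb(universe, n_selected)

def pathway_direction(log2_fcs, threshold=0.1):
    """Classify a pathway from the mean log2FC of its compounds (assumed summary)."""
    mean_fc = sum(log2_fcs) / len(log2_fcs)
    if mean_fc > threshold:
        return "up"
    if mean_fc < -threshold:
        return "down"
    return "neutral"

def pathway_edges(pathways, min_shared=0.25):
    """Connect pathways whose shared fraction (intersection / union) exceeds 25%."""
    names = sorted(pathways)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            union = pathways[a] | pathways[b]
            if union and len(pathways[a] & pathways[b]) / len(union) > min_shared:
                edges.append((a, b))
    return edges

# Hypothetical example: 3 of 4 submitted markers hit a 5-compound pathway
# drawn from a background of 20 annotated compounds.
p = hypergeom_pvalue(universe=20, pathway_size=5, n_selected=4, n_hits=3)
direction = pathway_direction([-0.4, -0.2, 0.1])
edges = pathway_edges({
    "pathway_A": {"c1", "c2", "c3", "c4"},
    "pathway_B": {"c3", "c4", "c5"},
    "pathway_C": {"c6", "c7"},
})
```

As noted in the Discussion, enrichment alone carries no sign, so the direction step is what distinguishes contexts in which the same pathway is enriched with opposite metabolite-level changes.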
Reporting summary
Further information on research design is available in the [193]Nature
Portfolio Reporting Summary linked to this article.
Supplementary information
[194]Supplementary Information^ (13.7MB, pdf)
[195]41467_2025_59487_MOESM2_ESM.pdf^ (86.4KB, pdf)
Description of Additional Supplementary Files
[196]Supplementary Data 1^ (26.5KB, xlsx)
[197]Supplementary Data 2^ (77.5KB, xls)
[198]Supplementary Data 3^ (157.4KB, csv)
[199]Supplementary Data 4^ (148.2KB, csv)
[200]Reporting Summary^ (2.4MB, pdf)
[201]Transparent Peer Review file^ (8.1MB, pdf)
Source data
[202]Source Data^ (6.6MB, zip)
Acknowledgements