Abstract

Background

   Recent advancements in spatially resolved transcriptomics (SRT) have
   opened up unprecedented opportunities to explore gene expression
   patterns within spatial contexts. Deciphering spatial domains is a
   critical task in spatial transcriptomic data analysis, aiding in the
   elucidation of tissue structural heterogeneity and biological
   functions. However, existing spatial domain detection methods ignore
   the consistency of expression patterns and spatial arrangements between
   spots, as well as the severe gene dropout phenomenon present in SRT
   data, resulting in suboptimal performance in identifying tissue spatial
   heterogeneity.

Results

   In this paper, we introduce a novel framework, spatially regularized
   deep graph networks (SR-DGN), which integrates gene expression profiles
   with spatial information to learn spatially-consistent and informative
   spot representations. Specifically, SR-DGN employs graph attention
   networks (GAT) to adaptively aggregate gene expression information from
   neighboring spots, considering local expression patterns between spots.
   In addition, the spatial regularization constraint ensures the
   consistency of neighborhood relationships between physical and embedded
   spaces in an end-to-end manner. SR-DGN also employs cross-entropy (CE)
   loss to model gene expression states, effectively mitigating the impact
   of noisy gene dropouts.

Conclusions

   Experimental results demonstrate that SR-DGN outperforms
   state-of-the-art methods in spatial domain identification across SRT
   data from different sequencing platforms. Moreover, SR-DGN is capable
   of recovering known microanatomical structures, yielding clearer
   low-dimensional visualizations and more accurate spatial trajectory
   inferences.

Supplementary Information

   The online version contains supplementary material available at
   10.1186/s12864-024-11072-w.

   Keywords: Spatial resolved transcriptomics, Graph attention network,
   Spatial regularization constraint, Cross-entropy loss, Spatial domains

Introduction

   Complex tissues are typically composed of cells performing different
   biological functions [[38]1]. Understanding the spatial arrangement of
   different cells within tissues is crucial for comprehending
   cell-to-cell communication and the tissue microenvironment [[39]2].
   Recent advances in spatially resolved transcriptomics (SRT) have
   enabled the detection of gene expression profiles while preserving the
   spatial context of cells [[40]3]. SRT techniques primarily fall into
   two categories: imaging-based and sequencing-based methods.
   Imaging-based SRT methods, such as MERFISH [[41]4], seqFISH [[42]5],
   osmFISH [[43]6], can detect hundreds to thousands of transcripts in
   tissue sections with high resolution at the cellular level.
   Sequencing-based methods, like 10x Visium [[44]7], Stereo-seq [[45]8],
   utilize spatial barcoding to cover the whole transcriptome. These
   techniques provide unprecedented opportunities to study spatial gene
   expression, aiding in a deeper understanding of cellular interactions
   and spatial heterogeneity within complex tissues.

   Spatial domains refer to higher-order functional units in tissues
   characterized by similar spatial gene expression patterns, which are
   coordinated by various cellular states with distinct gene expression
   patterns [[46]9, [47]10]. Identifying spatial domains is a standard
   initial step in downstream SRT data analysis, and accurately detecting
   different tissue domains contributes to better depict tissue anatomy
   and functional characteristics. Traditionally, the golden standard for
   identifying spatial domains remains manual annotation based on
   anatomical knowledge. This method is a time-consuming and
   labor-intensive process that cannot meet the growing data demands in
   clinical and biological fields. In contrast, computational methods
   based on SRT data can achieve rapid and automatic spatial domain
   identification.

   Currently, computational methods for detecting spatial domains in
   spatial transcriptomics can be divided into two categories. The first
   category includes non-spatial clustering methods, such as Louvain
   [[48]11] and SCANPY [[49]12], which perform unsupervised clustering
   based solely on gene expression within spots. While these methods are
   straightforward, they overlook the relationships between neighboring
   spots, resulting in spatially discontinuous domains. The second
   category involves spatial clustering methods that integrate spatial
   information with gene expression profiles to improve domain
   identification. For example, SpaceFlow [[50]13], GraphST [[51]14], and
   stGCL [[52]15] combine graph convolutional networks with contrastive
   learning to effectively consider spot interactions and learn latent
   embeddings. STAGATE [[53]9] and DeepDomain [[54]16] employ graph
   attention autoencoders to aggregate spatial and gene expression
   information for domain identification. MENDER [[55]10] relies on the
   representation of the cellular context to rapidly identify tissue
   structures. SEDR [[56]17] integrates variational graph autoencoder
   (VGAE) with a masked self-supervised learning mechanism to learn the
   low-dimensional latent representation of spots. SpaDAC [[57]18], stMMR
   [[58]19], and SpaGCN [[59]20] also incorporate histology images,
   identifying spatial domains by combining molecular-level gene
   expression features with morphological features.

   Despite the good performance of these methods in spatial domain
   identification, they still exhibit some limitations. First, most
   existing methods neglect spatial consistency. For instance, when
   distant spots of the same type exhibit strong expression similarity in
   different parts of a tissue slice, spatial clustering methods should
   ensure that the latent embeddings reflect both expression similarity
   and spatial proximity. Second, in SRT studies, the “dropout pattern”
   (undetected genes) of spots is as informative as quantitative
   expression levels [[60]21]. Therefore, modeling gene dropout can
   improve spatial domain identification.

   In this study, we propose a novel framework called spatially
   regularized deep graph networks (SR-DGN) for effective spatial domain
   detection. SR-DGN utilizes graph attention network (GATs) to aggregate
   gene expression data from neighboring spots. Simultaneously, it
   reconstructs gene expression profiles using a fully-connected decoder.
   This process takes into account the inherent gene expression and
   spatial at-tributes in SRT data. To retain spatial neighbor information
   and ensure the continuity of the spatial domain, SR-DGN introduces the
   spatial regularization constraint, bringing spatially neighboring spots
   closer and spatially non-neighboring spots further apart in the latent
   space. This allows the learned embeddings reflect both spatial
   proximity and gene expression similarity, achieving spatially
   consistent embeddings. Additionally, SR-DGN incorporates cross-entropy
   (CE) loss to learn the expression state information of spots, thereby
   reducing the impact of noisy gene dropouts. To validate the
   effectiveness of SR-DGN, we applied it to multiple SRT tissue datasets
   generated from different platforms, including the mouse olfactory bulb,
   hippocampus, somatosensory cortex, and embryos at various developmental
   stages. Experimental results demonstrate that SR-DGN outperforms
   existing methods in spatial clustering tasks. We further validated the
   identified spatial domains using known marker genes, confirming their
   biological significance. SR-DGN also enhances spatial visualization and
   trajectory inference. In summary, SR-DGN facilitates the accurate
   detection of complex spatial domains and promotes the acquisition of
   valuable biological insights.

Materials and methods

Overview of SR-DGN

   In this study, we propose a novel computational method, SR-DGN,
   leveraging spatially regularized deep graph networks to identify
   spatial domains. The framework of SR-DGN is illustrated in Fig. [61]1.
   For spatial transcriptomics data, we first perform data preprocessing
   to construct the gene expression matrix and the spatial location-based
   adjacency matrix. Using a GATs-based graph encoder, SR-DGN aggregates
   the gene expression information from neighboring spots to learn latent
   embeddings, and then a decoder is adopted to reconstructs the
   expression profiles. Next, spatial regularization constraint is
   employed to better retain expression similarity and spatial proximity.
   Additionally, SR-DGN employs CE loss to model gene dropout, ensuring
   that the learned embeddings contain information about gene expression
   states. Finally, SR-DGN generates accurate and robust
   spatially-consistent spot representations, which are used for spatial
   domain identification with mclust [[62]22], visualization with UMAP
   [[63]23], and trajectory inference with PAGA [[64]24].

Fig. 1.

   [65]Fig. 1
   [66]Open in a new tab

   Overview of SR-DGN. SR-DGN preprocesses the input spatial
   transcriptomics data to generate the gene expression matrix and the
   adjacency matrix. The GAT encoder generates embeddings by adaptively
   learning from its neighbors’ information. A fully connected network
   (FCN) decoder is used to reconstruct the original gene expression
   profiles. The spatial regularization constraints focus on the embedding
   distances between neighboring spots to ensure spatial consistency in
   the embeddings. To model gene dropouts, we use the CE loss, which
   further allows the latent features to contain information about gene
   status. The spatially-consistent spot representations learned by SR-DGN
   are applied to identify spatial domains, visualize and trajectory
   inference

Expression data preprocessing

   For all datasets, the raw gene expression counts are first normalized
   and log-transformed using the SCANPY package [[67]12]. Subsequently,
   the top 3000 highly variable genes (HVGs) are selected as inputs for
   SR-DGN.

Adjacency matrix construction for spatial transcriptomics data

   In spatial transcriptomics studies, the spatial location of spots or
   cells is crucial information. To represent this information, we
   constructed a spatial neighbor graph Inline graphic with its adjacency
   matrix denoted as Inline graphic . Specifically, if the Euclidean
   distance Inline graphic between spot Inline graphic and spot Inline
   graphic is less than a predefined radius Inline graphic , then Inline
   graphic . Empirically, we recommend setting Inline graphic such that
   each spot has an average of 3 to 20 neighbors. Therefore, the
   similarity between two given spots can be calculated as:
   graphic file with name M9.gif 1

Spatially regularized deep graph networks

   The proposed spatially regularized deep graph networks framework
   consists of two components: a GAT-based encoder and a fully connected
   network decoder. SR-DGN takes the gene expression matrix and the
   spatial adjacent matrix as inputs and generates low-dimensional
   representations of spots for identifying spatial heterogeneity within
   tissues.

Encoder

   We designed a GAT-based encoder to integrate spatial coordinates and
   gene ex-pression information. GAT [[68]25] is a neural network model
   for graph-structured data. It introduces an attention mechanism that
   allows each node to focus on its neighboring nodes with different
   weights, enabling more flexible capture of spot expression patterns and
   the local cellular microenvironment. The embedding of spot Inline
   graphic in the Inline graphic -th layer can be represented as:
   graphic file with name M12.gif 2

   where Inline graphic represents the weight matrix of the Inline graphic
   -th layer. Inline graphic is a nonlinear activation function. Inline
   graphic represents the neighbors of spot Inline graphic in the spatial
   graph Inline graphic . Inline graphic is the normalized edge weight
   between spot Inline graphic and spot Inline graphic in the Inline
   graphic -th graph attention layer, computed as:
   graphic file with name M23.gif 3

   where Inline graphic is the attention coefficient, which is formulated
   as:
   graphic file with name M25.gif 4

   where Inline graphic and Inline graphic denote trainable weight
   vectors. These vectors are crucial for computing the attention
   coefficient, which influences how neighboring spots contribute to the
   embedding of each spot.

Decoder

   To efficiently reconstruct the raw gene expression data, SR-DGN adopts
   a fully connected network as the decoder. By treating the output of the
   encoder as the input of the decoder, the embedding of spot Inline
   graphic in the Inline graphic -th layer can be represented as:
   graphic file with name M30.gif 5

   where Inline graphic is the trainable weight matrix, Inline graphic is
   the bias vector, and ReLU is the activation function ensuring that the
   reconstructed gene expression Inline graphic is positive.

Loss function

Reconstruction Loss

   SR-DGN considers the consistency between the original gene expression
   data and the reconstructed gene expression data, and the reconstruction
   loss is defined as:
   graphic file with name M34.gif 6

   where Inline graphic is the number of spots in the spatial
   transcriptomics dataset.

Spatial regularization constraint loss

   In real biological tissues, cells that are spatially distant, even if
   they exhibit similar gene expression profiles, usually belong to
   different functional areas or tissue domains. Therefore, SR-DGN needs
   to capture this biological difference in the embedding space. To
   enhance the spatial consistency of the embeddings, spatially
   neighboring spots are close in the embedding space and spatially
   non-neighboring spots are distant in the embedding space [[69]13]. The
   spatial regularization constraint loss is calculated by considering the
   embedding distance and spatial distance between spots:
   graphic file with name M36.gif 7

   where Inline graphic represents the similarity between spot Inline
   graphic and spot Inline graphic in the embedding space.

Cross-entropy loss

   In spatial transcriptomics data, gene dropout refers to the phenomenon
   where certain genes are not detected in some cells or spots, even
   though they might be expressed. This dropout is common in spatial
   transcriptomics due to technical limitations, and it poses a challenge
   for accurately modeling gene expression patterns. And gene dropouts are
   as informative as the measured genes [[70]26]. To handle gene dropout,
   SR-DGN transforms the original gene expression data into a binary
   matrix, where values are set to 1 if a gene is detected (i.e., non-zero
   expression), and 0 otherwise. This transformation allows us to model
   the dropout pattern effectively, where missing values (zeros) are
   treated as informative data. The binarization is performed as follows:
   graphic file with name M40.gif 8
   graphic file with name M41.gif 9

   where Inline graphic and Inline graphic are expression state matrices,
   representing the binarized original and reconstructed gene expression
   profiles respectively.

   To effectively model gene dropouts, SR-DGN employs the cross-entropy
   loss, which further allows the latent features to contain information
   about whether the gene is expressed. The cross-entropy loss is defined
   as:
   graphic file with name M44.gif 10

   where Inline graphic is the number of genes in the spatial
   transcriptomics dataset. By learning from gene dropout patterns, the
   cross-entropy loss mitigates the impact of technical noise on the
   model’s outcomes and enhances the robustness of SR-DGN.

Loss function of SR-DGN

   The embedding feature learning process of SR-DGN is trained by
   minimizing the reconstruction loss, spatial regularization constraint
   loss and cross-entropy loss. In short, the overall loss function of
   SR-DGN is defined as:
   graphic file with name M46.gif 11

   where Inline graphic , Inline graphic and Inline graphic are
   regularization parameters.

Results and discussion

Datasets

   We evaluated the performance of SR-DGN using six publicly available SRT
   datasets (Table [71]1). The first dataset is the mouse olfactory bulb
   [[72]8], measured by Stereo-seq platform. Xu et al. [[73]17] annotated
   this dataset into six laminar organization: rostral migratory stream
   (RMS), granular cell layer (GCL), internal plexiform layer (IPL),
   mitral cell layer (MCL), external plexiform layer (EPL), and olfactory
   nerve layer (ONL). The second dataset is the mouse hippocampus obtained
   using the Slide-seqV2 technology [[74]27], which allows spatial
   expression analysis at single-cell resolution. The third dataset, based
   on osmFISH technology [[75]6], represents the mouse somatosensory
   cortex and is manually annotated into 11 distinct tissue regions:
   Hippocampus, Internal capsule caudoputamen, Layer 2–3 lateral, Layer
   2–3 medical, Layer 3–4, Layer 4, Layer 5, Layer 6, Pia layer 1,
   Ventricle, White matter. The fourth dataset consists of mouse embryos
   at four developmental stages (E9.5, E10.5, E11.5, and E12.5). This
   dataset is obtained using Stereo-seq technology with subcellular
   resolution and the annotations provided by the original study [[76]8].
   The fifth dataset is from mouse olfactory bulb tissue [[77]27], which
   has a complex structure and was obtained using the Slide-seqV2
   platform. The final dataset comprises two slices of the mouse frontal
   cortex (Donor 3_0 and Donor 3_1), measured using MERFISH [[78]28].

Table 1.

   Characteristics of SRT datasets utilized in the study
   Datasets Organisms Platforms # of Spots # of Slices # of Genes
   Olfactory bulb Mouse Stereo-seq 19,109 1 14,376
   Hippocampus Mouse Slide-seqV2 20,139 1 21,220
   Somatosensory cortex Mouse OsmFISH 5328 1 33
   Embryos Mouse Stereo-seq

   5913,

   18,408,

   30,124,

   51,365
   4

   25,568,

   25,201,

   26,854,

   27,810
   Olfactory bulb Mouse Slide-seqV2 53,172 1 23,264
   Frontal cortex Mouse MERFISH

   19,503,

   15,844
   2

   374,

   374
   [79]Open in a new tab

Experiment settings

   We implemented the SR-DGN method on the PyTorch 1.11.0 platform,
   running on an Ubuntu 20.04 server equipped with an NVIDIA Quadro GV100
   GPU. For the predefined radius Inline graphic , we adjust it so that
   each spot has an average of 3 to 20 nearest neighbors, tailored to
   different SRT scenarios and platforms. We set hyperparameters based on
   development experience and a grid search algorithm, and strive to keep
   hyperparameters consistent across all datasets. For SR-DGN, the encoder
   is a two-layer neural network (512 − 30) with graph attention layers,
   and the decoder is a two-layer FCN (30–512). We used a learning rate of
   0.001, a weight decay of 1e-4, and optimized the model using the Adam
   optimizer [[80]29]. The number of iterations was set to a default of
   1000. Additionally, the regularization parameters were set as follows:
   Inline graphic , Inline graphic , and Inline graphic . For all
   benchmark methods, we used the default parameters provided in their
   respective original papers.

Evaluation metrics

   For the spatially annotated SRT datasets, we used the adjusted Rand
   index (ARI) [[81]30], Normalized Mutual Information (NMI) [[82]31],
   Silhouette Coefficient (SC) [[83]32] and the Davies-Bouldin (DB)
   [[84]32] as clustering metrics to quantify the performance of our
   method. ARI measures the consistency between true labels and predicted
   labels, and its range is Inline graphic . Higher values indicate better
   alignment of the clustering results with the ground truth. NMI measures
   the similarity between two clusterings, and ranges from 0 to 1. SC
   evaluates how well each spot is matched to its own cluster compared to
   other clusters. Higher SC values indicate better-defined clusters. DB
   assesses the ratio of within-cluster similarity to between-cluster
   separation. Lower DB scores reflect better clustering performance.

SR-DGN enhances the identification of tissue structures on the mouse
olfactory bulb dataset

   In the analysis of SRT data, the same tissue often exhibits significant
   discrepancies under different sequencing technologies. This variation
   arises not only from tissue heterogeneity but also from the inherent
   characteristics of each sequencing method. This problem directly
   impacts the accuracy of subsequent analyses. Therefore, to validate the
   robustness of SR-DGN, we applied it to mouse olfactory bulb tissues
   generated using the Stereo-seq and Slide-seqV2 platforms.

   Stereo-seq is an emerging spatial omics technology based on DNA
   nanoball-patterned arrays, achieving subcellular spatial resolution
   (∼220 nm) [[85]8]. In contrast, Slide-seqV2 captures the whole
   transcriptome with near single-cell resolution (∼10 μm) [[86]27].
   Additionally, we comprehensively compared SR-DGN with a non-spatial
   clustering method (SCANPY) and three state-of-the-art spatial
   clustering methods (STAGATE, GraphST, SpaceFlow, and SEDR).

   First, we tested SR-DGN on the mouse olfactory bulb dataset described
   by Stereo-seq. Figure [87]2A shows the structural annotations of the
   coronal olfactory bulb in a DAPI-stained image, including seven
   domains. SR-DGN identified the laminar tissue structures that
   correspond well with the annotation distribution (Fig. [88]2B). SR-DGN
   obtains higher SC (SC = 0.13) and lower DB (DB = 2.69) values, showing
   significant improvement compared to other methods. SR-DGN, SpaceFlow
   and SEDR clearly distinguish the outer layers ONL, GL, and EPL, whereas
   SCANPY, STAGATE, and GraphST mix them to varying degrees. Except for
   SCANPY, the remaining methods clearly identify the MCL structure.
   Notably, SR-DGN delineated the inner layers RMS, GCL, and IPL with less
   noise and clearer boundaries. Importantly, only SR-DGN clearly
   identifies the narrow tissue structure RMS. As expected, SR-DGN
   achieves the best clustering performance. We further validated the
   spatial domain identification results of SR-DGN with known
   domain-specific marker genes. We found that the expression patterns of
   marker genes were significantly correlated with the spatial domains
   detected by SR-DGN (Fig. [89]2C). For example, the mitral cell marker
   Gabra1 is highly expressed in the identified MCL domain [[90]33]. Mbp
   showed strong expression in the RMS structure, consistent with previous
   reports [[91]34]. The UMAP visualization based on SR-DGN embeddings
   reveals clear separation and well-organized clustering of spots within
   each spatial domain (Fig. [92]2D). The PAGA plot depicts that these
   domains form an approximately linear trajectory (Fig. [93]2E). We also
   conducted domain-specific expression analysis on the structures
   identified by SR-DGN. Figure [94]2F displays the top three
   domain-specific genes. We found that the expression of these genes
   varies across different spatial domains. For instance, the gene Ptn is
   highly expressed in the ONL and is associated with neuronal growth and
   synapse formation [[95]35]. These results indicate that SR-DGN
   effectively identifies spatial domains.

Fig. 2.

   [96]Fig. 2
   [97]Open in a new tab

   SR-DGN accurately detects spatial domains in the Stereo-seq mouse
   olfactory bulb dataset. (A) Annotation of the stereo-seq mouse
   olfactory bulb tissue in the DAPI-stained image. (B) Spatial domains
   identified by SR-DGN, SCANPY, STAGATE, GraphST, SpaceFlow, and SEDR in
   the stereo-seq mouse olfactory bulb dataset. (C) Visualizations of
   spatial domains identified by SR-DGN (top). Known marker gene
   expression pattern for each spatial domain (bottom). (D) Visualization
   of UMAP generated by SR-DGN. (E) Visualization of the PAGA graph
   generated by SR-DGN. (F) Dot plot of the gene expression of
   domain-specific markers

   Next, we applied SR-DGN to the mouse olfactory bulb dataset obtained by
   Slide-seqV2. We observed that the laminar structures partitioned by
   SR-DGN are more agree with the annotations in the Allen Reference Atlas
   (Fig. [98]3A, B). In contrast, the domains generated by SCANPY are
   mixed together, due to its lack of spatial information consideration.
   GraphST captures only a few laminar structures, such as the RMS, GCL,
   and EPL regions. SR-DGN outperforms STAGATE, SpaceFlow and SEDR in
   recognizing the complex structures of the olfactory bulb. SR-DGN
   distinctly identifies clusters 9 and 7 corresponding to the accessory
   olfactory bulb (AOB) and the accessory olfactory bulb granular layer
   (AOBgr). We observed that the domains identified by SR-DGN are strongly
   supported by known gene markers (Fig. [99]3C), indicating the
   biological significance of the structures partitioned by SR-DGN. SR-DGN
   identified spatial substructures named GCL1 and GCL2, which are
   predominantly characterized by the dominant expression of Pcp4 and
   Nrgn, respectively. Additionally, the UMAP and PAGA plots derived from
   SR-DGN reveal that the trajectories between the mouse olfactory bulbs
   are consistent with the spatial topological structure (Fig. [100]3D,
   E). The visualization of differential genes in Fig. [101]3F further
   aids in understanding these spatial patterns. Overall, the results on
   these two mouse olfactory bulb datasets demonstrate that SR-DGN can
   identify complex structures of the same tissue generated by different
   sequencing technologies, confirming its robustness in SRT analysis.

Fig. 3.

   [102]Fig. 3
   [103]Open in a new tab

   SR-DGN enhances the identification of known tissue structures in the
   Slide-seqV2 mouse olfactory bulb dataset. (A) Allen Reference Atlas of
   mouse olfactory bulb. (B) Spatial clustering results of six methods in
   the Slide-seqV2 mouse olfactory bulb dataset. (C) Visualization of
   spatial domains identified by SR-DGN and corresponding marker genes.
   (D) UMAP visualization of SR-DGN. (E) PAGA trajectory graph of SR-DGN.
   (F) Dot plot of the top 2 domain-specific genes

SR-DGN dissects fine-grained structures on the mouse hippocampus dataset

   In this section, we applied SR-DGN to the mouse hippocampus profiled by
   Slide-seqV2 [[104]27]. Figure [105]4A shows the annotation of
   hippocampus structures in the Allen Reference Atlas. As expected,
   SR-DGN accurately delineates the “cord-like” and “arrow-like”
   structures (Fig. [106]4B), which are the main structures of the
   hippocampus tissue. Specifically, the “cord-like” structure corresponds
   to the pyramidal layers within Ammon’s horn, and SR-DGN identifies the
   cornu ammonis 1 (CA1) and cornu ammonis 3 (CA3) domains. The
   “arrow-like” structure corresponds to the dentate gyrus (DG), and this
   fine structure is also captured by SR-DGN. However, SCANPY, SpaceFlow
   and SDER failed to distinguish between the CA and DG structures,
   mistakenly identifying them as a single cluster. Although GraphST and
   STAGATE can determine the CA1, CA3, and DG structures, the remaining
   domains identified contain noise and have unclear boundaries. In
   contrast, the domains delineated by SR-DGN exhibit clear spatial
   separation and more similar to the anatomical annotations (SC = 0.21,
   DB = 1.23). In addition, the expression of known gene markers confirmed
   the spatial clustering results of the SR-DGN (Fig. [107]4C). For
   example, the CA1 gene marker Wsf1 is specifically expressed in the
   identified CA1 domain [[108]36]. The secreted synaptic organizer
   molecule C1ql2, as expected, shows high expression in the DG structure
   [[109]37].

Fig. 4.

   [110]Fig. 4
   [111]Open in a new tab

   SR-DGN identifies complex tissue structures in the Slide-seqV2 mouse
   hippocampus dataset. (A) Annotation of the Slide-seqV2 mouse
   hippocampus structure from the Allen Reference Atlas. (B) Spatial
   clustering results of SR-DGN, SCANPY, STAGATE, GraphST, SpaceFlow, and
   SEDR in the Slide-seqV2 mouse hippocampus dataset. (C) Visualization of
   spatial domains identified by SR-DGN and the corresponding marker
   genes. (D) Violin plots of the two genes. (E) Functional enrichment
   results of cluster 2

   To further dissect the heterogeneity of hippocampal tissue, we
   conducted differential gene expression analysis between the domains
   identified by SR-DGN (Fig. [112]4D). We found that the Hpca gene is
   enriched in the CA1 and CA3 regions, where it binds calcium ions and
   regulates calcium-dependent signaling pathways affecting CA1 and CA3
   [[113]38]. The Mbp gene is highly expressed in cluster 4 (corpus
   callosum) and is involved in the formation and maintenance of myelin
   [[114]39]. Additionally, we focused on clusters 2 and 6, conducting
   pathway enrichment analysis (Fig. [115]4E). The significantly regulated
   pathways are related to neuron projection development and modulation of
   chemical synaptic transmission, suggesting that cluster 2 represents
   the cerebral cortex. These results demonstrate that SR-DGN can depict
   the fine-grained structure of hippocampus tissue and its spatial
   expression patterns.

SR-DGN uncovers the layered pattern on the mouse somatosensory cortex dataset

   We also evaluated the performance of different clustering methods on
   the mouse somatosensory cortex dataset measured by the osmFISH platform
   [[116]6]. Ground truth data and the results of the six different
   methods are visualized (Fig. [117]5A, B). SR-DGN achieves an ARI of
   0.47 and an NMI of 0.56, which are the highest among all the methods.
   This shows that SR-DGN has higher accuracy in segmenting anatomical
   regions and can better restore the ground truth data. Visually, the
   SR-DGN method clearly distinguishes the various layer structures, with
   well-defined boundaries for each anatomical region, closely matching
   the ground truth data, demonstrating its superiority in detecting
   spatial domains (Fig. [118]5C). All six methods successfully identified
   Pia layer 1 and Internal capsule caudoputamen. STAGATE and SpaceFlow
   achieved ARIs of 0.43 and 0.40, respectively, which are close to SR-DGN
   in terms of accuracy, but they did not accurately identify Layer 5. The
   ARIs of GraphST and SCANPY were 0.36 and 0.35, respectively, and their
   domain identification results were more scattered and their boundaries
   were not smooth. Interestingly, Layer 4 was divided into two
   substructures by SR-DGN. We further investigated the differential
   expression between cluster 1 and cluster 8 (Fig. [119]5D). And we
   observed that Lamp5 is highly expressed in cluster 1, which is
   consistent with the spatial distribution of Lamp5 enriched in the
   superficial layers of the cortex (such as Layer 1 and Layer 2/3)
   reported previously [[120]40]. This validates the significance of the
   substructures detected by SR-DGN. Additionally, we used diffusion
   pseudotime (DPT) [[121]41] on SR-DGN embeddings to calculate the
   progression process in the somatosensory cortex. Figure [122]5E shows
   that the internal capsule caudoputamen develops first, followed by
   outward development from the white matter towards both sides. These
   results highlight that SR-DGN can dissect spatial heterogeneity and
   further reveal spatial expression patterns.

Fig. 5.

   [123]Fig. 5
   [124]Open in a new tab

   SR-DGN improves spatial domain identification performance in the
   osmFISH mouse somatosensory cortex dataset. (A) Manual annotation of
   osmFISH mouse somatosensory cortex. (B) Barchart of clustered ARI and
   NMI values for six methods. (C) Visualization of spatial domains
   detected by SR-DGN, SCANPY, STAGATE, GraphST, SpaceFlow, and SEDR. (D)
   6 domain-specific genes between domains 1 and 8. (E) Spatial
   visualization of pseudotime calculated by SR-DGN

SR-DGN consistently performs well across all developmental stages of mouse
embryos dataset

   Finally, we applied SR-DGN to a mouse embryo dataset obtained by
   Stereo-seq, encompassing four developmental stages (E9.5, E10.5, E11.5,
   E12.5). Stereo-seq is a technique that measures a large number of cells
   at high spatial resolution. We first evaluated the clustering
   performance of SR-DGN and five advanced methods: SCANPY, STAGATE,
   GraphST, SpaceFlow, and SEDR (Supplementary Figure [125]S1).
   Figure [126]6A depicts the tissue domain annotations of mouse embryos
   from the original study. SR-DGN performed well at all developmental
   stages, especially achieving the highest ARI score (ARI = 0.48) at the
   E11.5 stage, and also reaching an ARI score of 0.39 at the E12.5 stage
   (Fig. [127]6B). Compared to other methods, SR-DGN demonstrated stable
   and reliable clustering performance at all stages. Specifically, SR-DGN
   was more visually consistent with manual annotations and was able to
   accurately identify various anatomical regions, including brain, heart,
   liver, and lung primordium. In contrast, SCANPY performed unevenly at
   various stages. Although it achieved an ARI of 0.36 at E11.5, it had
   ARI scores of 0.27 and 0.28 at E10.5 and E12.5, respectively, showing
   low clustering accuracy. STAGATE showed relatively stable performance
   but had the lowest ARI at the E9.5 stage (ARI = 0.25). Although it
   improved at E12.5 (ARI = 0.35), it still lagged behind SpaceFlow,
   GraphST, SEDR and SR-DGN overall. GraphST achieved its highest ARI
   score at the E11.5 stage (ARI = 0.47), surpassing SpaceFlow at this
   stage but had moderate performance at other stages, with ARI scores
   ranging from 0.29 to 0.35. Although SpaceFlow achieved good visual
   segmentation results, its clustering accuracy was not as good as that
   of SR-DGN. These results highlight the advantages of SR-DGN in
   processing and analyzing spatial transcriptome data, which not only
   provides highly accurate clustering results, but also shows stability
   and reliability at various developmental stages. The application of
   SR-DGN provides a powerful tool for developmental biology research,
   helping researchers to study complex tissue structures and
   developmental processes more accurately.

Fig. 6.

   [128]Fig. 6
   [129]Open in a new tab

   SR-DGN identifies different organs on the Stereo-seq mouse embryos
   dataset. (A) Manual annotation of mouse embryo data at four
   developmental stages (E9.5, E10.5, E11.5, E12.5). (B) Spatial domain
   identification results of six methods on four tissue sections

Ablation study

   To better understand the working mechanism of SR-DGN, we conducted
   ablation experiments to evaluate the contribution of its key
   components. We performed three ablation experiments by systematically
   removing the reconstruction loss, spatial regularization constraint,
   and cross-entropy loss:
     * SR-DGN-w/o-R: We removed the reconstruction loss to observe the
       impact of reconstructing gene expression on the overall
       performance.
     * SR-DGN-w/o-S: We removed the spatial regularization constraint to
       assess its effect on maintaining spatial consistency among
       neighboring spots in the embedding space.
     * SR-DGN-w/o-C: We removed the cross-entropy loss, which handles gene
       dropout, to evaluate its significance for the SR-DGN model.

   Figure [130]7 demonstrates that SR-DGN consistently outperforms its
   variants on the mouse somatosensory cortex dataset in terms of ARI and
   NMI. SR-DGN-w/o-S produced the worst spatial domain identification
   results (ARI = 0.374, NMI = 0.495). The poorer performance of
   SR-DGN-w/o-R compared to SR-DGN further validates the importance of the
   reconstruction loss. Meanwhile, SR-DGN-w/o-C, which does not account
   for gene dropout, resulted in reduced clustering accuracy and increased
   noise within domains. These findings indicate that the reconstruction
   loss, spatial regularization constraint, and cross-entropy loss are all
   crucial for enhancing SR-DGN’s ability to identify spatial domains. The
   combination of these components gives SR-DGN a significant performance
   advantage.

Fig. 7.

   [131]Fig. 7
   [132]Open in a new tab

   Spatial clustering results of SR-DGN and its variants on the osmFISH
   mouse somatosensory cortex dataset

Conclusions

   In this study, we developed a novel spatial regularized deep graph
   network (SR-DGN) method for accurate identification of spatial domains.
   By effectively utilizing gene expression and spatial information,
   SR-DGN achieves high-content low-dimensional embeddings. Firstly, by
   adopting a GAT-based encoder and a fully connected network decoder,
   SR-DGN captures the spatial dependencies between spots. The application
   of spatial regularization constraints ensures that the learned
   embeddings reflect both spatial proximity and gene expression
   similarity. SR-DGN uses cross-entropy loss to ensure that the learned
   low-dimensional embeddings capture essential information about gene
   expression states. Experimental results on six publicly available SRT
   datasets demonstrate that SR-DGN significantly outperforms existing
   methods. The spatial domains identified by SR-DGN are biologically
   meaningful, as verified by known marker genes. SR-DGN is also capable
   of producing clearer low-dimensional visualizations and more accurate
   spatial trajectory inferences. Additionally, we extended SR-DGN to
   integrate and analyze multi-slice data (Supplementary Figure [133]S2).
   Therefore, SR-DGN provides a powerful tool for identifying tissue
   heterogeneity and exploring spatial gene expression patterns.

   Since SR-DGN employs a GAT-based encoder, it faces challenges related
   to computational cost when applied to large-scale spatial
   transcriptomics datasets. To address this issue, we will explore
   techniques such as graph sampling or utilizing sparse adjacency
   matrices to reduce the computational burden. Additionally, significant
   amounts of noisy or incomplete spatial transcriptomics data may impact
   the performance of SR-DGN. In the future, we expect to further enhance
   the clustering performance of SR-DGN by integrating spatial omics data,
   such as proteomics and metabolomics. This integration will provide a
   more comprehensive understanding of tissue architecture and function.
   Additionally, we intend to extend SR-DGN to incorporate histology
   images by integrating multimodal information through heterogeneous
   graph networks [[134]42, [135]43]. This will help improve the
   performance of the model in spatial domain identification.

Electronic supplementary material

   Below is the link to the electronic supplementary material.
   [136]Supplementary Material 1^ (1.1MB, docx)

Acknowledgements