Abstract Background Recent advancements in spatially resolved transcriptomics (SRT) have opened up unprecedented opportunities to explore gene expression patterns within spatial contexts. Deciphering spatial domains is a critical task in spatial transcriptomic data analysis, aiding in the elucidation of tissue structural heterogeneity and biological functions. However, existing spatial domain detection methods ignore the consistency of expression patterns and spatial arrangements between spots, as well as the severe gene dropout phenomenon present in SRT data, resulting in suboptimal performance in identifying tissue spatial heterogeneity. Results In this paper, we introduce a novel framework, spatially regularized deep graph networks (SR-DGN), which integrates gene expression profiles with spatial information to learn spatially-consistent and informative spot representations. Specifically, SR-DGN employs graph attention networks (GAT) to adaptively aggregate gene expression information from neighboring spots, considering local expression patterns between spots. In addition, the spatial regularization constraint ensures the consistency of neighborhood relationships between physical and embedded spaces in an end-to-end manner. SR-DGN also employs cross-entropy (CE) loss to model gene expression states, effectively mitigating the impact of noisy gene dropouts. Conclusions Experimental results demonstrate that SR-DGN outperforms state-of-the-art methods in spatial domain identification across SRT data from different sequencing platforms. Moreover, SR-DGN is capable of recovering known microanatomical structures, yielding clearer low-dimensional visualizations and more accurate spatial trajectory inferences. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-024-11072-w. Keywords: Spatial resolved transcriptomics, Graph attention network, Spatial regularization constraint, Cross-entropy loss, Spatial domains Introduction Complex tissues are typically composed of cells performing different biological functions [[38]1]. Understanding the spatial arrangement of different cells within tissues is crucial for comprehending cell-to-cell communication and the tissue microenvironment [[39]2]. Recent advances in spatially resolved transcriptomics (SRT) have enabled the detection of gene expression profiles while preserving the spatial context of cells [[40]3]. SRT techniques primarily fall into two categories: imaging-based and sequencing-based methods. Imaging-based SRT methods, such as MERFISH [[41]4], seqFISH [[42]5], osmFISH [[43]6], can detect hundreds to thousands of transcripts in tissue sections with high resolution at the cellular level. Sequencing-based methods, like 10x Visium [[44]7], Stereo-seq [[45]8], utilize spatial barcoding to cover the whole transcriptome. These techniques provide unprecedented opportunities to study spatial gene expression, aiding in a deeper understanding of cellular interactions and spatial heterogeneity within complex tissues. Spatial domains refer to higher-order functional units in tissues characterized by similar spatial gene expression patterns, which are coordinated by various cellular states with distinct gene expression patterns [[46]9, [47]10]. Identifying spatial domains is a standard initial step in downstream SRT data analysis, and accurately detecting different tissue domains contributes to better depict tissue anatomy and functional characteristics. Traditionally, the golden standard for identifying spatial domains remains manual annotation based on anatomical knowledge. This method is a time-consuming and labor-intensive process that cannot meet the growing data demands in clinical and biological fields. In contrast, computational methods based on SRT data can achieve rapid and automatic spatial domain identification. Currently, computational methods for detecting spatial domains in spatial transcriptomics can be divided into two categories. The first category includes non-spatial clustering methods, such as Louvain [[48]11] and SCANPY [[49]12], which perform unsupervised clustering based solely on gene expression within spots. While these methods are straightforward, they overlook the relationships between neighboring spots, resulting in spatially discontinuous domains. The second category involves spatial clustering methods that integrate spatial information with gene expression profiles to improve domain identification. For example, SpaceFlow [[50]13], GraphST [[51]14], and stGCL [[52]15] combine graph convolutional networks with contrastive learning to effectively consider spot interactions and learn latent embeddings. STAGATE [[53]9] and DeepDomain [[54]16] employ graph attention autoencoders to aggregate spatial and gene expression information for domain identification. MENDER [[55]10] relies on the representation of the cellular context to rapidly identify tissue structures. SEDR [[56]17] integrates variational graph autoencoder (VGAE) with a masked self-supervised learning mechanism to learn the low-dimensional latent representation of spots. SpaDAC [[57]18], stMMR [[58]19], and SpaGCN [[59]20] also incorporate histology images, identifying spatial domains by combining molecular-level gene expression features with morphological features. Despite the good performance of these methods in spatial domain identification, they still exhibit some limitations. First, most existing methods neglect spatial consistency. For instance, when distant spots of the same type exhibit strong expression similarity in different parts of a tissue slice, spatial clustering methods should ensure that the latent embeddings reflect both expression similarity and spatial proximity. Second, in SRT studies, the “dropout pattern” (undetected genes) of spots is as informative as quantitative expression levels [[60]21]. Therefore, modeling gene dropout can improve spatial domain identification. In this study, we propose a novel framework called spatially regularized deep graph networks (SR-DGN) for effective spatial domain detection. SR-DGN utilizes graph attention network (GATs) to aggregate gene expression data from neighboring spots. Simultaneously, it reconstructs gene expression profiles using a fully-connected decoder. This process takes into account the inherent gene expression and spatial at-tributes in SRT data. To retain spatial neighbor information and ensure the continuity of the spatial domain, SR-DGN introduces the spatial regularization constraint, bringing spatially neighboring spots closer and spatially non-neighboring spots further apart in the latent space. This allows the learned embeddings reflect both spatial proximity and gene expression similarity, achieving spatially consistent embeddings. Additionally, SR-DGN incorporates cross-entropy (CE) loss to learn the expression state information of spots, thereby reducing the impact of noisy gene dropouts. To validate the effectiveness of SR-DGN, we applied it to multiple SRT tissue datasets generated from different platforms, including the mouse olfactory bulb, hippocampus, somatosensory cortex, and embryos at various developmental stages. Experimental results demonstrate that SR-DGN outperforms existing methods in spatial clustering tasks. We further validated the identified spatial domains using known marker genes, confirming their biological significance. SR-DGN also enhances spatial visualization and trajectory inference. In summary, SR-DGN facilitates the accurate detection of complex spatial domains and promotes the acquisition of valuable biological insights. Materials and methods Overview of SR-DGN In this study, we propose a novel computational method, SR-DGN, leveraging spatially regularized deep graph networks to identify spatial domains. The framework of SR-DGN is illustrated in Fig. [61]1. For spatial transcriptomics data, we first perform data preprocessing to construct the gene expression matrix and the spatial location-based adjacency matrix. Using a GATs-based graph encoder, SR-DGN aggregates the gene expression information from neighboring spots to learn latent embeddings, and then a decoder is adopted to reconstructs the expression profiles. Next, spatial regularization constraint is employed to better retain expression similarity and spatial proximity. Additionally, SR-DGN employs CE loss to model gene dropout, ensuring that the learned embeddings contain information about gene expression states. Finally, SR-DGN generates accurate and robust spatially-consistent spot representations, which are used for spatial domain identification with mclust [[62]22], visualization with UMAP [[63]23], and trajectory inference with PAGA [[64]24]. Fig. 1. [65]Fig. 1 [66]Open in a new tab Overview of SR-DGN. SR-DGN preprocesses the input spatial transcriptomics data to generate the gene expression matrix and the adjacency matrix. The GAT encoder generates embeddings by adaptively learning from its neighbors’ information. A fully connected network (FCN) decoder is used to reconstruct the original gene expression profiles. The spatial regularization constraints focus on the embedding distances between neighboring spots to ensure spatial consistency in the embeddings. To model gene dropouts, we use the CE loss, which further allows the latent features to contain information about gene status. The spatially-consistent spot representations learned by SR-DGN are applied to identify spatial domains, visualize and trajectory inference Expression data preprocessing For all datasets, the raw gene expression counts are first normalized and log-transformed using the SCANPY package [[67]12]. Subsequently, the top 3000 highly variable genes (HVGs) are selected as inputs for SR-DGN. Adjacency matrix construction for spatial transcriptomics data In spatial transcriptomics studies, the spatial location of spots or cells is crucial information. To represent this information, we constructed a spatial neighbor graph Inline graphic with its adjacency matrix denoted as Inline graphic . Specifically, if the Euclidean distance Inline graphic between spot Inline graphic and spot Inline graphic is less than a predefined radius Inline graphic , then Inline graphic . Empirically, we recommend setting Inline graphic such that each spot has an average of 3 to 20 neighbors. Therefore, the similarity between two given spots can be calculated as: graphic file with name M9.gif 1 Spatially regularized deep graph networks The proposed spatially regularized deep graph networks framework consists of two components: a GAT-based encoder and a fully connected network decoder. SR-DGN takes the gene expression matrix and the spatial adjacent matrix as inputs and generates low-dimensional representations of spots for identifying spatial heterogeneity within tissues. Encoder We designed a GAT-based encoder to integrate spatial coordinates and gene ex-pression information. GAT [[68]25] is a neural network model for graph-structured data. It introduces an attention mechanism that allows each node to focus on its neighboring nodes with different weights, enabling more flexible capture of spot expression patterns and the local cellular microenvironment. The embedding of spot Inline graphic in the Inline graphic -th layer can be represented as: graphic file with name M12.gif 2 where Inline graphic represents the weight matrix of the Inline graphic -th layer. Inline graphic is a nonlinear activation function. Inline graphic represents the neighbors of spot Inline graphic in the spatial graph Inline graphic . Inline graphic is the normalized edge weight between spot Inline graphic and spot Inline graphic in the Inline graphic -th graph attention layer, computed as: graphic file with name M23.gif 3 where Inline graphic is the attention coefficient, which is formulated as: graphic file with name M25.gif 4 where Inline graphic and Inline graphic denote trainable weight vectors. These vectors are crucial for computing the attention coefficient, which influences how neighboring spots contribute to the embedding of each spot. Decoder To efficiently reconstruct the raw gene expression data, SR-DGN adopts a fully connected network as the decoder. By treating the output of the encoder as the input of the decoder, the embedding of spot Inline graphic in the Inline graphic -th layer can be represented as: graphic file with name M30.gif 5 where Inline graphic is the trainable weight matrix, Inline graphic is the bias vector, and ReLU is the activation function ensuring that the reconstructed gene expression Inline graphic is positive. Loss function Reconstruction Loss SR-DGN considers the consistency between the original gene expression data and the reconstructed gene expression data, and the reconstruction loss is defined as: graphic file with name M34.gif 6 where Inline graphic is the number of spots in the spatial transcriptomics dataset. Spatial regularization constraint loss In real biological tissues, cells that are spatially distant, even if they exhibit similar gene expression profiles, usually belong to different functional areas or tissue domains. Therefore, SR-DGN needs to capture this biological difference in the embedding space. To enhance the spatial consistency of the embeddings, spatially neighboring spots are close in the embedding space and spatially non-neighboring spots are distant in the embedding space [[69]13]. The spatial regularization constraint loss is calculated by considering the embedding distance and spatial distance between spots: graphic file with name M36.gif 7 where Inline graphic represents the similarity between spot Inline graphic and spot Inline graphic in the embedding space. Cross-entropy loss In spatial transcriptomics data, gene dropout refers to the phenomenon where certain genes are not detected in some cells or spots, even though they might be expressed. This dropout is common in spatial transcriptomics due to technical limitations, and it poses a challenge for accurately modeling gene expression patterns. And gene dropouts are as informative as the measured genes [[70]26]. To handle gene dropout, SR-DGN transforms the original gene expression data into a binary matrix, where values are set to 1 if a gene is detected (i.e., non-zero expression), and 0 otherwise. This transformation allows us to model the dropout pattern effectively, where missing values (zeros) are treated as informative data. The binarization is performed as follows: graphic file with name M40.gif 8 graphic file with name M41.gif 9 where Inline graphic and Inline graphic are expression state matrices, representing the binarized original and reconstructed gene expression profiles respectively. To effectively model gene dropouts, SR-DGN employs the cross-entropy loss, which further allows the latent features to contain information about whether the gene is expressed. The cross-entropy loss is defined as: graphic file with name M44.gif 10 where Inline graphic is the number of genes in the spatial transcriptomics dataset. By learning from gene dropout patterns, the cross-entropy loss mitigates the impact of technical noise on the model’s outcomes and enhances the robustness of SR-DGN. Loss function of SR-DGN The embedding feature learning process of SR-DGN is trained by minimizing the reconstruction loss, spatial regularization constraint loss and cross-entropy loss. In short, the overall loss function of SR-DGN is defined as: graphic file with name M46.gif 11 where Inline graphic , Inline graphic and Inline graphic are regularization parameters. Results and discussion Datasets We evaluated the performance of SR-DGN using six publicly available SRT datasets (Table [71]1). The first dataset is the mouse olfactory bulb [[72]8], measured by Stereo-seq platform. Xu et al. [[73]17] annotated this dataset into six laminar organization: rostral migratory stream (RMS), granular cell layer (GCL), internal plexiform layer (IPL), mitral cell layer (MCL), external plexiform layer (EPL), and olfactory nerve layer (ONL). The second dataset is the mouse hippocampus obtained using the Slide-seqV2 technology [[74]27], which allows spatial expression analysis at single-cell resolution. The third dataset, based on osmFISH technology [[75]6], represents the mouse somatosensory cortex and is manually annotated into 11 distinct tissue regions: Hippocampus, Internal capsule caudoputamen, Layer 2–3 lateral, Layer 2–3 medical, Layer 3–4, Layer 4, Layer 5, Layer 6, Pia layer 1, Ventricle, White matter. The fourth dataset consists of mouse embryos at four developmental stages (E9.5, E10.5, E11.5, and E12.5). This dataset is obtained using Stereo-seq technology with subcellular resolution and the annotations provided by the original study [[76]8]. The fifth dataset is from mouse olfactory bulb tissue [[77]27], which has a complex structure and was obtained using the Slide-seqV2 platform. The final dataset comprises two slices of the mouse frontal cortex (Donor 3_0 and Donor 3_1), measured using MERFISH [[78]28]. Table 1. Characteristics of SRT datasets utilized in the study Datasets Organisms Platforms # of Spots # of Slices # of Genes Olfactory bulb Mouse Stereo-seq 19,109 1 14,376 Hippocampus Mouse Slide-seqV2 20,139 1 21,220 Somatosensory cortex Mouse OsmFISH 5328 1 33 Embryos Mouse Stereo-seq 5913, 18,408, 30,124, 51,365 4 25,568, 25,201, 26,854, 27,810 Olfactory bulb Mouse Slide-seqV2 53,172 1 23,264 Frontal cortex Mouse MERFISH 19,503, 15,844 2 374, 374 [79]Open in a new tab Experiment settings We implemented the SR-DGN method on the PyTorch 1.11.0 platform, running on an Ubuntu 20.04 server equipped with an NVIDIA Quadro GV100 GPU. For the predefined radius Inline graphic , we adjust it so that each spot has an average of 3 to 20 nearest neighbors, tailored to different SRT scenarios and platforms. We set hyperparameters based on development experience and a grid search algorithm, and strive to keep hyperparameters consistent across all datasets. For SR-DGN, the encoder is a two-layer neural network (512 − 30) with graph attention layers, and the decoder is a two-layer FCN (30–512). We used a learning rate of 0.001, a weight decay of 1e-4, and optimized the model using the Adam optimizer [[80]29]. The number of iterations was set to a default of 1000. Additionally, the regularization parameters were set as follows: Inline graphic , Inline graphic , and Inline graphic . For all benchmark methods, we used the default parameters provided in their respective original papers. Evaluation metrics For the spatially annotated SRT datasets, we used the adjusted Rand index (ARI) [[81]30], Normalized Mutual Information (NMI) [[82]31], Silhouette Coefficient (SC) [[83]32] and the Davies-Bouldin (DB) [[84]32] as clustering metrics to quantify the performance of our method. ARI measures the consistency between true labels and predicted labels, and its range is Inline graphic . Higher values indicate better alignment of the clustering results with the ground truth. NMI measures the similarity between two clusterings, and ranges from 0 to 1. SC evaluates how well each spot is matched to its own cluster compared to other clusters. Higher SC values indicate better-defined clusters. DB assesses the ratio of within-cluster similarity to between-cluster separation. Lower DB scores reflect better clustering performance. SR-DGN enhances the identification of tissue structures on the mouse olfactory bulb dataset In the analysis of SRT data, the same tissue often exhibits significant discrepancies under different sequencing technologies. This variation arises not only from tissue heterogeneity but also from the inherent characteristics of each sequencing method. This problem directly impacts the accuracy of subsequent analyses. Therefore, to validate the robustness of SR-DGN, we applied it to mouse olfactory bulb tissues generated using the Stereo-seq and Slide-seqV2 platforms. Stereo-seq is an emerging spatial omics technology based on DNA nanoball-patterned arrays, achieving subcellular spatial resolution (∼220 nm) [[85]8]. In contrast, Slide-seqV2 captures the whole transcriptome with near single-cell resolution (∼10 μm) [[86]27]. Additionally, we comprehensively compared SR-DGN with a non-spatial clustering method (SCANPY) and three state-of-the-art spatial clustering methods (STAGATE, GraphST, SpaceFlow, and SEDR). First, we tested SR-DGN on the mouse olfactory bulb dataset described by Stereo-seq. Figure [87]2A shows the structural annotations of the coronal olfactory bulb in a DAPI-stained image, including seven domains. SR-DGN identified the laminar tissue structures that correspond well with the annotation distribution (Fig. [88]2B). SR-DGN obtains higher SC (SC = 0.13) and lower DB (DB = 2.69) values, showing significant improvement compared to other methods. SR-DGN, SpaceFlow and SEDR clearly distinguish the outer layers ONL, GL, and EPL, whereas SCANPY, STAGATE, and GraphST mix them to varying degrees. Except for SCANPY, the remaining methods clearly identify the MCL structure. Notably, SR-DGN delineated the inner layers RMS, GCL, and IPL with less noise and clearer boundaries. Importantly, only SR-DGN clearly identifies the narrow tissue structure RMS. As expected, SR-DGN achieves the best clustering performance. We further validated the spatial domain identification results of SR-DGN with known domain-specific marker genes. We found that the expression patterns of marker genes were significantly correlated with the spatial domains detected by SR-DGN (Fig. [89]2C). For example, the mitral cell marker Gabra1 is highly expressed in the identified MCL domain [[90]33]. Mbp showed strong expression in the RMS structure, consistent with previous reports [[91]34]. The UMAP visualization based on SR-DGN embeddings reveals clear separation and well-organized clustering of spots within each spatial domain (Fig. [92]2D). The PAGA plot depicts that these domains form an approximately linear trajectory (Fig. [93]2E). We also conducted domain-specific expression analysis on the structures identified by SR-DGN. Figure [94]2F displays the top three domain-specific genes. We found that the expression of these genes varies across different spatial domains. For instance, the gene Ptn is highly expressed in the ONL and is associated with neuronal growth and synapse formation [[95]35]. These results indicate that SR-DGN effectively identifies spatial domains. Fig. 2. [96]Fig. 2 [97]Open in a new tab SR-DGN accurately detects spatial domains in the Stereo-seq mouse olfactory bulb dataset. (A) Annotation of the stereo-seq mouse olfactory bulb tissue in the DAPI-stained image. (B) Spatial domains identified by SR-DGN, SCANPY, STAGATE, GraphST, SpaceFlow, and SEDR in the stereo-seq mouse olfactory bulb dataset. (C) Visualizations of spatial domains identified by SR-DGN (top). Known marker gene expression pattern for each spatial domain (bottom). (D) Visualization of UMAP generated by SR-DGN. (E) Visualization of the PAGA graph generated by SR-DGN. (F) Dot plot of the gene expression of domain-specific markers Next, we applied SR-DGN to the mouse olfactory bulb dataset obtained by Slide-seqV2. We observed that the laminar structures partitioned by SR-DGN are more agree with the annotations in the Allen Reference Atlas (Fig. [98]3A, B). In contrast, the domains generated by SCANPY are mixed together, due to its lack of spatial information consideration. GraphST captures only a few laminar structures, such as the RMS, GCL, and EPL regions. SR-DGN outperforms STAGATE, SpaceFlow and SEDR in recognizing the complex structures of the olfactory bulb. SR-DGN distinctly identifies clusters 9 and 7 corresponding to the accessory olfactory bulb (AOB) and the accessory olfactory bulb granular layer (AOBgr). We observed that the domains identified by SR-DGN are strongly supported by known gene markers (Fig. [99]3C), indicating the biological significance of the structures partitioned by SR-DGN. SR-DGN identified spatial substructures named GCL1 and GCL2, which are predominantly characterized by the dominant expression of Pcp4 and Nrgn, respectively. Additionally, the UMAP and PAGA plots derived from SR-DGN reveal that the trajectories between the mouse olfactory bulbs are consistent with the spatial topological structure (Fig. [100]3D, E). The visualization of differential genes in Fig. [101]3F further aids in understanding these spatial patterns. Overall, the results on these two mouse olfactory bulb datasets demonstrate that SR-DGN can identify complex structures of the same tissue generated by different sequencing technologies, confirming its robustness in SRT analysis. Fig. 3. [102]Fig. 3 [103]Open in a new tab SR-DGN enhances the identification of known tissue structures in the Slide-seqV2 mouse olfactory bulb dataset. (A) Allen Reference Atlas of mouse olfactory bulb. (B) Spatial clustering results of six methods in the Slide-seqV2 mouse olfactory bulb dataset. (C) Visualization of spatial domains identified by SR-DGN and corresponding marker genes. (D) UMAP visualization of SR-DGN. (E) PAGA trajectory graph of SR-DGN. (F) Dot plot of the top 2 domain-specific genes SR-DGN dissects fine-grained structures on the mouse hippocampus dataset In this section, we applied SR-DGN to the mouse hippocampus profiled by Slide-seqV2 [[104]27]. Figure [105]4A shows the annotation of hippocampus structures in the Allen Reference Atlas. As expected, SR-DGN accurately delineates the “cord-like” and “arrow-like” structures (Fig. [106]4B), which are the main structures of the hippocampus tissue. Specifically, the “cord-like” structure corresponds to the pyramidal layers within Ammon’s horn, and SR-DGN identifies the cornu ammonis 1 (CA1) and cornu ammonis 3 (CA3) domains. The “arrow-like” structure corresponds to the dentate gyrus (DG), and this fine structure is also captured by SR-DGN. However, SCANPY, SpaceFlow and SDER failed to distinguish between the CA and DG structures, mistakenly identifying them as a single cluster. Although GraphST and STAGATE can determine the CA1, CA3, and DG structures, the remaining domains identified contain noise and have unclear boundaries. In contrast, the domains delineated by SR-DGN exhibit clear spatial separation and more similar to the anatomical annotations (SC = 0.21, DB = 1.23). In addition, the expression of known gene markers confirmed the spatial clustering results of the SR-DGN (Fig. [107]4C). For example, the CA1 gene marker Wsf1 is specifically expressed in the identified CA1 domain [[108]36]. The secreted synaptic organizer molecule C1ql2, as expected, shows high expression in the DG structure [[109]37]. Fig. 4. [110]Fig. 4 [111]Open in a new tab SR-DGN identifies complex tissue structures in the Slide-seqV2 mouse hippocampus dataset. (A) Annotation of the Slide-seqV2 mouse hippocampus structure from the Allen Reference Atlas. (B) Spatial clustering results of SR-DGN, SCANPY, STAGATE, GraphST, SpaceFlow, and SEDR in the Slide-seqV2 mouse hippocampus dataset. (C) Visualization of spatial domains identified by SR-DGN and the corresponding marker genes. (D) Violin plots of the two genes. (E) Functional enrichment results of cluster 2 To further dissect the heterogeneity of hippocampal tissue, we conducted differential gene expression analysis between the domains identified by SR-DGN (Fig. [112]4D). We found that the Hpca gene is enriched in the CA1 and CA3 regions, where it binds calcium ions and regulates calcium-dependent signaling pathways affecting CA1 and CA3 [[113]38]. The Mbp gene is highly expressed in cluster 4 (corpus callosum) and is involved in the formation and maintenance of myelin [[114]39]. Additionally, we focused on clusters 2 and 6, conducting pathway enrichment analysis (Fig. [115]4E). The significantly regulated pathways are related to neuron projection development and modulation of chemical synaptic transmission, suggesting that cluster 2 represents the cerebral cortex. These results demonstrate that SR-DGN can depict the fine-grained structure of hippocampus tissue and its spatial expression patterns. SR-DGN uncovers the layered pattern on the mouse somatosensory cortex dataset We also evaluated the performance of different clustering methods on the mouse somatosensory cortex dataset measured by the osmFISH platform [[116]6]. Ground truth data and the results of the six different methods are visualized (Fig. [117]5A, B). SR-DGN achieves an ARI of 0.47 and an NMI of 0.56, which are the highest among all the methods. This shows that SR-DGN has higher accuracy in segmenting anatomical regions and can better restore the ground truth data. Visually, the SR-DGN method clearly distinguishes the various layer structures, with well-defined boundaries for each anatomical region, closely matching the ground truth data, demonstrating its superiority in detecting spatial domains (Fig. [118]5C). All six methods successfully identified Pia layer 1 and Internal capsule caudoputamen. STAGATE and SpaceFlow achieved ARIs of 0.43 and 0.40, respectively, which are close to SR-DGN in terms of accuracy, but they did not accurately identify Layer 5. The ARIs of GraphST and SCANPY were 0.36 and 0.35, respectively, and their domain identification results were more scattered and their boundaries were not smooth. Interestingly, Layer 4 was divided into two substructures by SR-DGN. We further investigated the differential expression between cluster 1 and cluster 8 (Fig. [119]5D). And we observed that Lamp5 is highly expressed in cluster 1, which is consistent with the spatial distribution of Lamp5 enriched in the superficial layers of the cortex (such as Layer 1 and Layer 2/3) reported previously [[120]40]. This validates the significance of the substructures detected by SR-DGN. Additionally, we used diffusion pseudotime (DPT) [[121]41] on SR-DGN embeddings to calculate the progression process in the somatosensory cortex. Figure [122]5E shows that the internal capsule caudoputamen develops first, followed by outward development from the white matter towards both sides. These results highlight that SR-DGN can dissect spatial heterogeneity and further reveal spatial expression patterns. Fig. 5. [123]Fig. 5 [124]Open in a new tab SR-DGN improves spatial domain identification performance in the osmFISH mouse somatosensory cortex dataset. (A) Manual annotation of osmFISH mouse somatosensory cortex. (B) Barchart of clustered ARI and NMI values for six methods. (C) Visualization of spatial domains detected by SR-DGN, SCANPY, STAGATE, GraphST, SpaceFlow, and SEDR. (D) 6 domain-specific genes between domains 1 and 8. (E) Spatial visualization of pseudotime calculated by SR-DGN SR-DGN consistently performs well across all developmental stages of mouse embryos dataset Finally, we applied SR-DGN to a mouse embryo dataset obtained by Stereo-seq, encompassing four developmental stages (E9.5, E10.5, E11.5, E12.5). Stereo-seq is a technique that measures a large number of cells at high spatial resolution. We first evaluated the clustering performance of SR-DGN and five advanced methods: SCANPY, STAGATE, GraphST, SpaceFlow, and SEDR (Supplementary Figure [125]S1). Figure [126]6A depicts the tissue domain annotations of mouse embryos from the original study. SR-DGN performed well at all developmental stages, especially achieving the highest ARI score (ARI = 0.48) at the E11.5 stage, and also reaching an ARI score of 0.39 at the E12.5 stage (Fig. [127]6B). Compared to other methods, SR-DGN demonstrated stable and reliable clustering performance at all stages. Specifically, SR-DGN was more visually consistent with manual annotations and was able to accurately identify various anatomical regions, including brain, heart, liver, and lung primordium. In contrast, SCANPY performed unevenly at various stages. Although it achieved an ARI of 0.36 at E11.5, it had ARI scores of 0.27 and 0.28 at E10.5 and E12.5, respectively, showing low clustering accuracy. STAGATE showed relatively stable performance but had the lowest ARI at the E9.5 stage (ARI = 0.25). Although it improved at E12.5 (ARI = 0.35), it still lagged behind SpaceFlow, GraphST, SEDR and SR-DGN overall. GraphST achieved its highest ARI score at the E11.5 stage (ARI = 0.47), surpassing SpaceFlow at this stage but had moderate performance at other stages, with ARI scores ranging from 0.29 to 0.35. Although SpaceFlow achieved good visual segmentation results, its clustering accuracy was not as good as that of SR-DGN. These results highlight the advantages of SR-DGN in processing and analyzing spatial transcriptome data, which not only provides highly accurate clustering results, but also shows stability and reliability at various developmental stages. The application of SR-DGN provides a powerful tool for developmental biology research, helping researchers to study complex tissue structures and developmental processes more accurately. Fig. 6. [128]Fig. 6 [129]Open in a new tab SR-DGN identifies different organs on the Stereo-seq mouse embryos dataset. (A) Manual annotation of mouse embryo data at four developmental stages (E9.5, E10.5, E11.5, E12.5). (B) Spatial domain identification results of six methods on four tissue sections Ablation study To better understand the working mechanism of SR-DGN, we conducted ablation experiments to evaluate the contribution of its key components. We performed three ablation experiments by systematically removing the reconstruction loss, spatial regularization constraint, and cross-entropy loss: * SR-DGN-w/o-R: We removed the reconstruction loss to observe the impact of reconstructing gene expression on the overall performance. * SR-DGN-w/o-S: We removed the spatial regularization constraint to assess its effect on maintaining spatial consistency among neighboring spots in the embedding space. * SR-DGN-w/o-C: We removed the cross-entropy loss, which handles gene dropout, to evaluate its significance for the SR-DGN model. Figure [130]7 demonstrates that SR-DGN consistently outperforms its variants on the mouse somatosensory cortex dataset in terms of ARI and NMI. SR-DGN-w/o-S produced the worst spatial domain identification results (ARI = 0.374, NMI = 0.495). The poorer performance of SR-DGN-w/o-R compared to SR-DGN further validates the importance of the reconstruction loss. Meanwhile, SR-DGN-w/o-C, which does not account for gene dropout, resulted in reduced clustering accuracy and increased noise within domains. These findings indicate that the reconstruction loss, spatial regularization constraint, and cross-entropy loss are all crucial for enhancing SR-DGN’s ability to identify spatial domains. The combination of these components gives SR-DGN a significant performance advantage. Fig. 7. [131]Fig. 7 [132]Open in a new tab Spatial clustering results of SR-DGN and its variants on the osmFISH mouse somatosensory cortex dataset Conclusions In this study, we developed a novel spatial regularized deep graph network (SR-DGN) method for accurate identification of spatial domains. By effectively utilizing gene expression and spatial information, SR-DGN achieves high-content low-dimensional embeddings. Firstly, by adopting a GAT-based encoder and a fully connected network decoder, SR-DGN captures the spatial dependencies between spots. The application of spatial regularization constraints ensures that the learned embeddings reflect both spatial proximity and gene expression similarity. SR-DGN uses cross-entropy loss to ensure that the learned low-dimensional embeddings capture essential information about gene expression states. Experimental results on six publicly available SRT datasets demonstrate that SR-DGN significantly outperforms existing methods. The spatial domains identified by SR-DGN are biologically meaningful, as verified by known marker genes. SR-DGN is also capable of producing clearer low-dimensional visualizations and more accurate spatial trajectory inferences. Additionally, we extended SR-DGN to integrate and analyze multi-slice data (Supplementary Figure [133]S2). Therefore, SR-DGN provides a powerful tool for identifying tissue heterogeneity and exploring spatial gene expression patterns. Since SR-DGN employs a GAT-based encoder, it faces challenges related to computational cost when applied to large-scale spatial transcriptomics datasets. To address this issue, we will explore techniques such as graph sampling or utilizing sparse adjacency matrices to reduce the computational burden. Additionally, significant amounts of noisy or incomplete spatial transcriptomics data may impact the performance of SR-DGN. In the future, we expect to further enhance the clustering performance of SR-DGN by integrating spatial omics data, such as proteomics and metabolomics. This integration will provide a more comprehensive understanding of tissue architecture and function. Additionally, we intend to extend SR-DGN to incorporate histology images by integrating multimodal information through heterogeneous graph networks [[134]42, [135]43]. This will help improve the performance of the model in spatial domain identification. Electronic supplementary material Below is the link to the electronic supplementary material. [136]Supplementary Material 1^ (1.1MB, docx) Acknowledgements