Abstract

   The advent of single-cell sequencing has revolutionized the study of
   cellular dynamics, providing unprecedented resolution into the
   molecular states and heterogeneity of individual cells. However, the
   rich potential of exon-level information and junction reads within
   single cells remains underutilized. Conventional gene-count methods
   overlook critical exon and junction data, limiting the quality of cell
   representation and downstream analyses such as subpopulation
   identification and alternative splicing detection. We introduce
   DOLPHIN, a deep learning method that integrates exon-level and junction
   read data, representing genes as graph structures. These graphs are
   processed by a variational graph autoencoder to improve cell
   embeddings. DOLPHIN not only demonstrates superior performance in cell
   clustering, biomarker discovery, and alternative splicing detection but
   also provides a distinct capability to detect subtle transcriptomic
   differences at the exon level that are often masked in gene-level
   analyses. By examining cellular dynamics with enhanced resolution,
   DOLPHIN provides new insights into disease mechanisms and potential
   therapeutic targets.

   Subject terms: Machine learning, Computational models, Software, Data
   processing

   Single-cell RNA-seq analysis is conventionally limited to gene-level
   quantification, missing transcript diversity. Here, the authors present
   DOLPHIN, a deep learning method that enables exon- and junction-level
   analysis to improve cell representation and detect alternative
   splicing.

Introduction

   Single-cell RNA sequencing (scRNA-seq) has transformed transcriptomics
   by enabling the profiling of gene expression at the level of individual
   cells, a major advance in studying cellular diversity within complex
   tissues^[34]1. This technology has driven significant progress across
   fields such as developmental biology^[35]2,[36]3,
   immunology^[37]4,[38]5, and cancer research^[39]6,[40]7, revealing
   intricate cellular landscapes, elucidating developmental pathways, and
   identifying previously uncharacterized cell types linked to disease
   states^[41]8,[42]9. By enabling high-resolution dissection of cellular
   states and dynamics, scRNA-seq provides insights that bridge basic
   biological understanding with therapeutic applications, reshaping both
   basic and translational research.

   Despite these advancements, conventional scRNA-seq analyses are
   predominantly gene-level, relying on gene count tables for cell
   representation learning and downstream tasks such as cell clustering,
   differential gene expression, and pseudotime trajectory
   inference^[43]10. Numerous computational tools, including
   SCANPY^[44]11, Seurat^[45]12, scVI^[46]13, scGPT^[47]14,
   Geneformer^[48]15, scBERT^[49]16, scSemiProfiler^[50]17, and
   Cellar^[51]18, are designed to analyze this gene-level data. However,
   aggregating data at the gene level oversimplifies the transcriptomic
   landscape, as critical biological information encoded in exon-level
   reads and junction reads—reads spanning exon boundaries and capturing
   exon connectivity—is often lost^[52]19,[53]20. This simplification
   masks essential details, including exon-specific expression and
   splicing patterns, which are crucial for accurately representing
   cellular states. Consequently, gene-level aggregation may lead to an
   oversimplified view of cellular characteristics, limiting insights into
   cellular function and regulation and underscoring the need for
   approaches that preserve this fine-grained information^[54]21.

   In addition to cell representation learning, another critical task in
   scRNA-seq analysis is the detection and quantification of alternative
   splicing (AS) events. AS analysis at the gene level poses substantial
   challenges, as gene-level quantification obscures isoform-specific and
   exon-specific variations that are critical for capturing splicing
   dynamics. To address this, various computational tools have been
   developed for AS analysis in scRNA-seq data. Among junction read-based
   approaches, Outrigger^[55]22 constructs a de novo splicing event index
   by pooling junction-spanning reads across all cells and building a
   splice graph to identify and quantify AS events. scQuint^[56]23 adopts
   a different strategy by quantifying intron usage based on junction
   reads. To improve splicing quantification under sparse conditions,
   imputation-based methods such as BRIE2^[57]24 and SCASL^[58]25 have
   been developed. BRIE2 employs a Bayesian hierarchical model to borrow
   information across similar cells and infer more robust Percent
   Spliced-In (PSI) estimates, whereas SCASL uses an iterative weighted
   k-nearest neighbors (KNN) strategy to impute missing PSI values.
   Despite these methodological advances, major gaps remain. Most existing
   tools were developed and benchmarked primarily on full-length scRNA-seq
   datasets, and their performance degrades substantially when applied to
   droplet-based platforms such as 10X Genomics, where coverage is sparse
   and biased toward transcript ends. Furthermore, nearly all methods
   predominantly rely on junction-spanning reads for splicing
   quantification. This reliance can limit sensitivity and robustness,
   especially in the context of scRNA-seq, where sparse coverage and
   frequent dropout render junction reads insufficient for capturing the
   full extent of splicing variability. Additionally, the exclusion of
   exon body reads, which represent a more abundant yet underutilized
   source of information, can reduce the sensitivity of existing methods
   in detecting subtle or complex splicing events that may be missed due
   to the sparsity of junction reads in scRNA-seq data.

   To address these foundational limitations, we introduce DOLPHIN
   ($\underline{\mathbf{D}}$eep Ex$\underline{\mathbf{o}}$n-$\underline{\mathbf{l}}$evel
   Gra$\underline{\mathbf{ph}}$ Neural Network for
   S$\underline{\mathbf{i}}$ngle-cell Representatio$\underline{\mathbf{n}}$
   Learning and Alternative Splicing), a deep learning framework that
   advances scRNA-seq analysis beyond conventional gene-level
   quantification. DOLPHIN constructs a graph for each gene, representing
   exons as nodes and splice junctions as edges, to model gene
   architecture at single-cell resolution. By integrating exon-level reads
   and junction reads, DOLPHIN captures a richer and more detailed
   transcriptional landscape compared to traditional approaches that rely
   solely on gene-level counts. Built on a variational graph autoencoder
   (VGAE) framework^[59]26,[60]27, DOLPHIN learns cell embeddings that
   preserve fine-grained exon usage patterns and splicing information,
   enabling more accurate and informative representations of cellular
   states. These enhanced embeddings not only improve downstream analyses
   such as cell clustering and differential gene analysis but also support
   more sensitive AS detection. Specifically, DOLPHIN uses the learned
   embeddings to identify neighboring cells with similar exon and splicing
   profiles, aggregates junction reads across neighbors to amplify
   splicing signals, and substantially enhances AS detection under the
   sparse sequencing conditions typical of scRNA-seq. Following
   aggregation, PSI values are calculated using the Outrigger function
   from Expedition^[61]28, providing accurate and robust quantification of
   splicing events across diverse cell populations.

   We demonstrate DOLPHIN’s general applicability by validating its
   performance on a diverse set of scRNA-seq datasets^[62]29–[63]31 that
   encompass distinct sequencing technologies, including full-length and
   droplet-based approaches, as well as a broad spectrum of tissue types
   and biological conditions. These datasets span healthy tissues, normal
   tissues from patients with cancer, and malignant tissues, thereby
   representing a wide range of physiological and pathological contexts.
   This systematic validation highlights DOLPHIN’s robustness and
   adaptability, demonstrating its effectiveness in accurately capturing
   cell heterogeneity and refining complex downstream analyses across
   diverse experimental contexts. Across diverse scRNA-seq datasets and
   simulated data, our model consistently outperforms traditional gene
   count-based methods. By integrating exon-level and junction read
   information with advanced deep learning techniques, DOLPHIN enhances
   the resolution of single-cell transcriptomic analysis, improving cell
   embedding quality and enabling more detailed analyses of AS and
   differential gene expression. Ultimately, DOLPHIN provides an
   analytical framework that addresses the limitations of gene-count-based
   methods, enabling more precise insights into complex cellular processes
   and facilitating the study of disease mechanisms and therapeutic
   targets.

Results

Overview of DOLPHIN

   DOLPHIN is a deep learning framework for exon-level analysis of
   scRNA-seq data, offering higher transcriptomic resolution than
   traditional gene-count methods (Fig. [64]1). Each gene is modeled as an
   exon graph, where nodes represent exons and edges represent their
   connections via junction reads. By integrating exon and junction data,
   DOLPHIN generates integrative cell representations that support
   applications like cell clustering, differential exon analysis, and AS
   detection^[65]19,[66]32,[67]33.
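
   As a minimal sketch of the exon graph described above, the snippet
   below builds, for one gene in one cell, a node feature vector from
   exon read counts and a directed adjacency matrix from junction read
   counts. The function name, the input layout, and the row-wise
   normalization are illustrative assumptions; this section does not
   specify the exact normalization DOLPHIN applies.

```python
def build_exon_graph(exon_counts, junction_counts):
    """Build one gene's exon graph for a single cell.

    exon_counts: read counts per exon, ordered 5' to 3' (hypothetical layout).
    junction_counts: dict mapping (source_exon, target_exon) index pairs to
        the number of junction reads connecting those exons.
    Returns (features, adjacency, normalized_adjacency).
    """
    n = len(exon_counts)
    features = [float(c) for c in exon_counts]

    # Directed edges: entry [src][dst] holds the junction reads from src to dst.
    adjacency = [[0.0] * n for _ in range(n)]
    for (src, dst), reads in junction_counts.items():
        adjacency[src][dst] += float(reads)

    # Assumption: normalize each row so outgoing edge weights sum to 1.
    normalized = []
    for row in adjacency:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return features, adjacency, normalized


# Toy gene with three exons; exon 2 is mostly skipped in this cell.
features, adjacency, normalized = build_exon_graph(
    exon_counts=[10, 0, 7],
    junction_counts={(0, 1): 1, (0, 2): 3},
)
```

   In this toy case the first exon sends a quarter of its junction
   weight to the middle exon and three quarters directly to the last
   exon, which is the kind of connectivity signal a gene-level count
   would discard.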

Fig. 1. Method overview of DOLPHIN for exon-level single-cell RNA-seq data
analysis.

   a Preprocessing of single-cell RNA-seq data, including quantification
   of exon-mapped reads and exon-exon junction reads. b Construction of
   gene-specific exon graphs, where nodes represent exons and edges
   represent junctions, aggregated to form an exon graph for each cell. c
   Learning cell embeddings from exon-level quantification and junction
   reads through a Variational Graph Autoencoder (VGAE). Each exon graph
   is converted into feature matrices (X[i]) and normalized adjacency
   matrices (AN[i]), which are processed by a Graph Attention Network
   (GAT) layer to capture exon dependencies. The output (H[i]) from the
   GAT layer is then passed to a Variational Autoencoder (VAE) that
   projects graph representations into a latent space (Z), defined by mean
   (μ) and standard deviation (σ) parameters, with a KL divergence term
   weighted by a hyperparameter (β) to regularize the latent space. The
   decoders reconstruct both the feature matrix ($\mathbf{X}'_i$) and raw
   adjacency matrix ($\mathbf{A}'_i$), with losses weighted by a
   hyperparameter (λ) to minimize feature and
   adjacency reconstruction errors, thereby learning cell-specific
   embeddings. d Construction of a K-nearest neighbor (KNN) graph in the
   latent space for refining and aggregating junction reads from
   neighboring cells based on consensus (majority voting), which enhances
   junction coverage for downstream splicing analysis. e Calculation of
   percent spliced-in (PSI) values from aggregated junction reads, enabling
   accurate alternative splicing inference at the single-cell level. f
   High-resolution cell embeddings generated by DOLPHIN improve the
   characterization of cellular heterogeneity compared to conventional
   gene count-based methods. g Detection of exon-specific markers and
   identification of biological pathways that are often missed in
   gene-level analyses. Exon-level biomarkers were identified through
   differential expression analysis using MAST. h Extensive alternative
   splicing analysis enabled by DOLPHIN across diverse cellular
   populations. By default, PSI values and splicing modalities were
   quantified using Expedition. However, DOLPHIN can be adapted to work
   with other alternative splicing quantification tools.
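
   The training objective outlined in panel c can be written
   schematically as below. This is a sketch under stated assumptions,
   not the authors' exact formulation: the forms of the reconstruction
   terms $\mathcal{L}_X$ and $\mathcal{L}_A$, the placement of λ on the
   adjacency term, the standard normal prior, and the latent dimension
   symbol $d$ are assumptions. Only the roles of β, λ, μ, and σ are
   taken from the caption.

```latex
\mathcal{L}_i
  = \mathcal{L}_X\!\left(\mathbf{X}_i, \mathbf{X}'_i\right)
  + \lambda\, \mathcal{L}_A\!\left(\mathbf{A}_i, \mathbf{A}'_i\right)
  + \beta\, D_{\mathrm{KL}}\!\left(
      \mathcal{N}\!\left(\boldsymbol{\mu}_i, \operatorname{diag}(\boldsymbol{\sigma}_i^2)\right)
      \,\middle\|\, \mathcal{N}(\mathbf{0}, \mathbf{I})
    \right),
\qquad
D_{\mathrm{KL}}
  = \frac{1}{2} \sum_{j=1}^{d}
    \left( \mu_{ij}^2 + \sigma_{ij}^2 - \log \sigma_{ij}^2 - 1 \right)
```

   The closed-form KL term is the standard expression for a diagonal
   Gaussian against a standard normal prior; larger β pulls the latent
   space toward the prior, while λ trades feature reconstruction
   against adjacency reconstruction.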

   The method operates in three main steps. First, DOLPHIN constructs an
   exon graph for each gene by capturing gene-specific exon connectivity
   from junction reads (Fig. [70]1a, b)^[71]19,[72]33. Raw scRNA-seq reads
   are aligned to a reference genome to identify exon reads and junction
   reads, which are then used to build exon graphs. Each exon graph has
   nodes representing exons annotated with their read counts and
   directional edges weighted by normalized junction read counts. This
   setup forms a cell-level structure comprising exon graphs for each
   gene. Second, these cell-level exon graphs are processed through a
   VGAE^[73]26 to produce informative cell embeddings (Fig. [74]1c). Each
   cell-level exon graph is converted into adjacency and feature matrices,
   which are processed by a graph attention (GAT) layer^[75]34. The GAT
   layer dynamically assigns weights to neighboring exons, emphasizing
   biologically relevant exon connections informed by junction reads. The
   variational autoencoder (VAE) encoder then learns a latent
   representation Z that captures critical exon-junction relationships,
   optimized through a composite loss function that balances
   reconstruction of both exon-level features and adjacency
   matrices^[76]35,[77]36. Third, DOLPHIN addresses the limited detection
   of junction reads in scRNA-seq by aggregating junction reads from
   similar cells in the junction-aware latent space (Fig. [78]1d)^[79]37.
   Using a KNN approach, cells with similar exon and junction patterns are
   identified, and junction reads from those neighboring cells are
   aggregated based on majority voting. This aggregation step,
   schematically illustrated in Fig. [80]1e, enriches each cell’s profile
   with junction reads from consistent neighboring cells, enhancing
   detection sensitivity without introducing noise.
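
   The neighbor-based aggregation in the third step can be sketched as
   follows. The voting rule used here is an assumption made for
   illustration: a cell keeps its own junction reads and additionally
   receives its neighbors' reads for a junction only when a strict
   majority of its k nearest neighbors detect that junction. The
   function name, the Euclidean distance, and the default k are
   likewise hypothetical.

```python
def aggregate_junctions(embeddings, junctions, k=2):
    """Aggregate junction reads over nearest neighbours in latent space.

    embeddings: per-cell latent vectors (list of lists of floats).
    junctions: per-cell junction read counts, same junction order per cell.
    Returns a new per-cell junction count table.
    """
    n_cells = len(embeddings)
    n_junctions = len(junctions[0])
    aggregated = []
    for i in range(n_cells):
        # Squared Euclidean distance to every other cell, ties broken by index.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j])), j)
            for j in range(n_cells)
            if j != i
        )
        neighbours = [j for _, j in dists[:k]]

        row = list(junctions[i])  # the cell's own reads are always kept
        for q in range(n_junctions):
            votes = sum(1 for j in neighbours if junctions[j][q] > 0)
            # Assumed rule: borrow reads only if a strict majority agrees.
            if 2 * votes > len(neighbours):
                row[q] += sum(junctions[j][q] for j in neighbours)
        aggregated.append(row)
    return aggregated


# Two well-separated groups of three cells, two junctions each.
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
junctions = [[2, 0], [1, 0], [0, 1],
             [0, 3], [0, 0], [0, 2]]
aggregated = aggregate_junctions(embeddings, junctions, k=2)
```

   Here the third cell has no reads for the first junction, but both of
   its neighbors do, so it inherits their three reads; a junction seen
   in only one of two neighbors is not borrowed.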

   With these enhanced cell embeddings, DOLPHIN supports exon-level
   analyses such as refined cell clustering (Fig. [81]1f), differential
   gene analysis at the exon level (Fig. [82]1g), and AS detection
   (Fig. [83]1h). By integrating these embeddings with splicing detection
   tools like Outrigger from the Expedition suite, DOLPHIN can compute PSI
   values, providing detailed insights into exon usage and cell-specific
   splicing patterns^[84]22,[85]28.
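
   To illustrate why aggregation matters for PSI, the sketch below
   computes PSI for a skipped-exon event from three junction counts
   using the common definition: mean inclusion-junction reads divided
   by the sum of mean inclusion and exclusion reads. It does not
   reproduce Outrigger's implementation; the function name and the
   `min_reads` coverage threshold are hypothetical.

```python
def psi_skipped_exon(upstream_incl, downstream_incl, skipping, min_reads=10):
    """Percent spliced-in for one skipped-exon event.

    upstream_incl / downstream_incl: reads on the two junctions that
        include the cassette exon.
    skipping: reads on the junction that skips it.
    Returns None when total junction coverage is below min_reads
    (assumed threshold) or when there is no coverage at all.
    """
    coverage = upstream_incl + downstream_incl + skipping
    if coverage < min_reads or coverage == 0:
        return None
    # Two junctions support inclusion, one supports exclusion, so the
    # inclusion evidence is averaged to keep the two sides comparable.
    inclusion = (upstream_incl + downstream_incl) / 2.0
    return inclusion / (inclusion + skipping)


single_cell = psi_skipped_exon(1, 0, 1)         # too sparse to quantify
after_aggregation = psi_skipped_exon(8, 12, 10)  # pooled neighbour reads
```

   With a single cell's sparse junction reads the event is not
   quantifiable under the threshold, whereas pooled counts from similar
   cells yield a usable estimate.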

DOLPHIN enhances cell embeddings across diverse single-cell scenarios through
graph-based exon and junction read integration

   DOLPHIN enhances cell embeddings through the graph-based integration of
   exon-level and junction read quantification, leveraging both read types
   to improve the quality of cell embeddings and the accuracy of scRNA-seq
   clustering compared to traditional gene count methods. To demonstrate
   the general applicability of DOLPHIN, we validated its performance
   across diverse scRNA-seq datasets spanning different platforms, tissue
   types, and biological conditions. These included a full-length dataset
   from human peripheral blood mononuclear cells (PBMCs)^[86]29 and two
   10X Genomics Chromium Single Cell 3′ v2 datasets from normal epithelial
   colon and rectum tissues from gastrointestinal cancer patients^[87]30.

   For each dataset, we processed four inputs through the VAE framework: an
   exon feature matrix, a junction-based adjacency matrix, a gene count
   table, and the integrated feature and adjacency matrices from DOLPHIN.
   These components were assessed individually to evaluate their
   contributions and the enhancement achieved through integration.
   Clustering outcomes were compared to ground truth labels using Uniform
   Manifold Approximation and Projection (UMAP) visualizations^[88]38,
   with the ground truth annotations taken from the original publications,
   as shown in Fig. [89]2a–d and Supplementary Fig. [90]S1. Additionally,
   Adjusted Rand Index (ARI)^[91]39 and Normalized Mutual Information
   (NMI)^[92]40 scores were used for quantitative evaluation, as presented
   in Fig. [93]2e–g. DOLPHIN’s integrated embeddings consistently
   outperformed individual matrices and gene count tables, capturing cell
   type-specific information at finer resolution and achieving higher ARI
   and NMI scores, as demonstrated in Fig. [94]2.
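
   For readers unfamiliar with the ARI, the following from-scratch
   sketch computes it from the pair-counting contingency table between
   ground truth labels and cluster assignments. It is meant only to
   make the metric concrete; the function name is hypothetical, and in
   practice an established implementation (for example scikit-learn's
   `adjusted_rand_score`) would normally be used.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same cells."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: trivially identical partitions

    # Pairs of cells that fall in the same group under each labeling.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = same_true * same_pred / total_pairs  # agreement expected by chance
    max_index = (same_true + same_pred) / 2
    if max_index == expected:
        return 1.0
    return (same_both - expected) / (max_index - expected)


perfect = adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"])
mismatch = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

   The score is invariant to how clusters are named, equals 1 for
   identical partitions, and falls to about 0 (or below) when agreement
   is no better than chance.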

Fig. 2. DOLPHIN enhances cell embedding quality through exon and junction
read integration.

   a–d UMAP plots comparing the quality of cell embeddings generated by
   DOLPHIN, which integrates both exon and junction read counts, against
   conventional gene count-based methods across multiple single-cell
   RNA-seq datasets. Improved clustering and separation of distinct cell
   populations define higher-quality embeddings. Top panels: Human
   peripheral blood mononuclear cells (PBMCs) analyzed using full-length
   single-cell RNA-seq. Middle panels: Human colon cells analyzed using
   10X Genomics. Bottom panels: Human rectum cells analyzed using 10X
   Genomics. For each dataset, the following inputs are compared: a
   DOLPHIN integrating both exon and junction read counts, producing the
   most integrative and biologically informative embeddings. b DOLPHIN
   framework using gene count tables, reflecting a conventional gene-level
   analysis. c DOLPHIN using only exon read counts (feature matrix). d
   DOLPHIN using only junction read counts (adjacency matrix). e–g Box
   plots of Adjusted Rand Index (ARI) and Normalized Mutual Information
   (NMI) scores comparing embedding quality across three different
   datasets. DOLPHIN, through the integration of exon and junction read
   counts, achieves significantly higher scores than approaches using exon
   or junction data alone or conventional gene-count methods. These
   metrics confirm DOLPHIN's superior clustering accuracy and alignment
   with known biological cell types, highlighting its performance
   advantage. Each score is based on N = 50 bootstrapping replicates using
   different random seeds (technical replicates). Boxes indicate the
   interquartile range (IQR, 25th to 75th percentile), with the line
   inside each box representing the median. Whiskers extend to the most
   extreme data points within 1.5 times the IQR from the quartiles. P
   values from one-sided Student’s t-tests: *P < 0.05, **P < 0.01,
   ***P < 0.001, ****P < 0.0001; n.s. not significant. Exact P values are
   provided in the source data. Source data are provided as a [97]Source
   Data file.

   In the PBMC dataset, UMAP visualizations illustrate that DOLPHIN
   distinctly delineates cell clusters closely matching ground truth cell
   types (Fig. [98]2a, top panel). In contrast, gene count tables yield
   denser clusters, obscuring subpopulations, particularly within T cell
   subsets (Fig. [99]2b, top panel). The feature matrix and adjacency
   matrix are each able to resolve specific cell types, including
   monocytes (Mono), B cells, and natural killer cells, into distinct and
   well-defined clusters (Fig. [100]2c, d, top panels). This suggests that
   both matrices effectively capture biologically relevant variations,
   facilitating the accurate identification of cell populations.
   Furthermore, their integration in DOLPHIN provides the most refined
   results. Supplementary Fig. [101]S2a highlights the abundance of exon
   and junction reads in full-length data, sufficient for constructing
   robust exon-level graphs for cell representation learning. Quantitative
   analysis with ARI and NMI metrics (Fig. [102]2e) shows that DOLPHIN
   achieves median ARI scores 0.11 higher than gene count methods, with
   statistical significance (P = 1.98 × 10^−4).

   For UMI-based platforms with limited gene coverage, DOLPHIN was applied
   to two 10X Genomics datasets. In the human colon dataset, UMAP plots
   show that DOLPHIN mitigates batch effects and produces well-defined
   clusters for Paneth-like, Goblet, and transient amplifying (TA) cells
   (Fig. [103]2a, middle panel). By contrast, the gene count table
   exhibits batch effects, blurring cell type boundaries (Fig. [104]2b,
   middle panel). The batch effect was evaluated in Supplementary
   Fig. [105]S3a, b, where we show that the DOLPHIN method exhibits
   significantly less batch effect compared to the gene count table
   approach. Notably, the integration LISI (iLISI) score showed the most
   substantial improvement, increasing from 0.01 with the gene count table
   to 0.82 with DOLPHIN, with P = 1.38 × 10^−23. Across multiple
   evaluation metrics, DOLPHIN demonstrated superior performance in
   reducing batch effects relative to the gene count table method. The
   feature matrix delineates Goblet and Paneth-like cells, while the
   adjacency matrix captures broader cell-type patterns with slightly
   diffuse boundaries (Fig. [106]2c, d, middle panel). DOLPHIN’s
   integrated embeddings achieve the best clustering accuracy, with ARI
   and NMI improvements of 0.10 and 0.08, respectively, over gene count
   tables (Fig. [107]2f). These results were statistically significant
   (P = 4.56 × 10^−25 and P = 4.85 × 10^−42, respectively), highlighting
   DOLPHIN’s robustness for low-coverage datasets.

   Similarly, in the 10X rectum dataset, DOLPHIN improved clustering
   performance, effectively resolving Enterocyte and Goblet cell
   populations, as seen in UMAP plots (Fig. [108]2a–d, bottom panel). ARI
   and NMI metrics further confirmed its advantage, with improvements of
   0.11 (P = 1.98 × 10^−44) and 0.09 (P = 1.05 × 10^−33), respectively,
   compared to gene count tables (Fig. [109]2g). We also compared the
   batch effect between the gene count table and DOLPHIN, as shown in
   Supplementary Fig. [110]S3c, d, where the iLISI score increased from
   0.06 with the gene count table to 0.40 with DOLPHIN with
   P = 4.91 × 10^−13. These findings demonstrate DOLPHIN’s adaptability to
   diverse datasets and its ability to detect biologically meaningful
   patterns even on 10X tag-based platforms, where exon and
   junction reads are much less abundant, as shown by their distribution
   in Supplementary Fig. [111]S2b, c.

   Robustness of cell embeddings against batch effects is critical for
   accurately capturing biological variation in scRNA-seq data. We
   evaluated the robustness of DOLPHIN embeddings by conducting two
   complementary analyses. First, we assessed DOLPHIN’s default embeddings
   without applying any external batch correction to the input features.
   As shown in Supplementary Fig. [112]S3, DOLPHIN’s exon-level modeling
   inherently mitigates batch-driven separation, resulting in robust cell
   embeddings even under uncorrected conditions. To further strengthen
   this evaluation, we compared DOLPHIN embeddings against standard batch
   correction methods. Specifically, we applied Harmony^[113]41 and scVI
   to perform batch correction on the gene count matrix, and separately
   applied scVI to correct batch effects in the exon-level feature matrix
   prior to DOLPHIN embedding. In contrast, Harmony operates only on
   low-dimensional embeddings and is not compatible with exon-level
   feature correction before DOLPHIN. As shown in Supplementary
   Fig. [114]S4a–c, while all approaches reduced batch-driven separation,
   DOLPHIN embeddings derived from scVI-corrected exon inputs achieved the
   best batch mixing. This observation is further supported by
   quantitative metrics in Supplementary Fig. [115]S4d, which assess both
   biological conservation (ARI, NMI) and batch correction performance
   (batch average silhouette width (ASW), graph
   connectivity)^[116]42,[117]43. Notably, applying Harmony to gene-level
   embeddings improved the median ARI from 0.26 to 0.41
   (P = 1.71 × 10^−7), whereas DOLPHIN with batch-corrected exon inputs
   achieved a higher ARI of 0.49 compared to the Harmony-corrected gene
   count matrix (P = 2.09 × 10^−4), indicating superior preservation of
   biological structure. DOLPHIN embeddings also exhibited the highest
   median Batch ASW and comparable graph connectivity to Harmony,
   reflecting strong batch mixing while maintaining biological relevance.
   Together, these results demonstrate that DOLPHIN’s exon-level
   embeddings are inherently robust against batch effects and can achieve
   even greater performance when built upon batch-corrected exon-level
   inputs.

   Beyond clustering, DOLPHIN’s exon-level embeddings enable de novo cell
   type annotation by capturing transcriptomic differences often missed at
   the gene level. To systematically assess this, we compared gene-,
   isoform-, and exon-level expression across annotated cell types in
   three datasets. For each dataset, one well-established marker gene per
   cell type was selected^[118]30,[119]44–[120]46, and UMAP expression
   patterns were visualized for their corresponding isoforms
   (Supplementary Figs. [121]S5–[122]S7). While isoform expression
   generally resembled gene-level patterns, several isoforms revealed
   finer subcluster structures. For example, in the 10X colon dataset
   (Supplementary Fig. [123]S6a), among five isoforms of the enterocyte
   marker SLC26A3^[124]47, ENST00000453332 exhibited strong, localized
   expression, distinguishing subpopulations within enterocytes. Building
   on these observations, we emphasized exon-level features underlying
   DOLPHIN’s embeddings (Supplementary Figs. [125]S8–[126]S10). Exon-level
   expression further refined cell type-specific patterns beyond both
   gene- and isoform-level analyses. In the PBMC dataset, while CUX1 gene
   and isoform expressions broadly marked monocytes (Supplementary
   Fig. [127]S11), specific exons (e.g., exons 19 and 20) localized to the
   CD16 monocyte subcluster^[128]48 (Supplementary Fig. [129]S8b). These
   results demonstrate that DOLPHIN’s exon-level embeddings facilitate
   precise de novo annotation of cell types and subtypes, capturing
   biologically meaningful heterogeneity overlooked by conventional
   approaches.

   To explore the broader applicability of DOLPHIN for cell representation
   learning beyond short-read scRNA-seq data, we further applied it to
   single-cell long-read RNA-seq datasets^[130]49. In this analysis, we
   generated isoform-level counts and subsequently analyzed them with
   SCANPY and scVI to establish isoform-based baselines. In parallel,
   DOLPHIN was applied directly to exon-informed features to learn cell
   embeddings. As shown in Supplementary Fig. [131]S12, DOLPHIN
   consistently outperformed isoform-based approaches, achieving ARI
   improvements of 0.27 over SCANPY (P = 4.04 × 10^−18) and 0.31 over scVI
   (P = 4.17 × 10^−7). These results demonstrate that DOLPHIN can deliver
   enhanced clustering resolution even when applied to long-read datasets.

DOLPHIN outperforms conventional gene count tables in detecting
cancer-related marker genes

   The DOLPHIN framework leverages exon-level quantification in scRNA-seq
   to capture finer-grained transcriptomic details that conventional
   gene-level count methods often overlook. This approach enhances cell
   clustering accuracy and enables more insightful downstream analyses. We
   applied DOLPHIN to identify exon-level differentially expressed genes
   (EDEGs) in a pancreatic ductal adenocarcinoma (PDAC) dataset generated
   using the 10X Genomics Chromium Single Cell 3′ v2 chemistry^[132]31 and
   compared these findings to those obtained with conventional gene count
   tables, where differential genes are identified as differentially
   expressed genes (DEGs). Our analysis reveals significant improvements
   in sensitivity and biological relevance with DOLPHIN.

   Using a 10X PDAC dataset with cells from cancer and control
   conditions^[133]31, we first leveraged the latent cell embeddings from
   DOLPHIN, which integrate exon-level quantification and junction reads,
   for cell clustering. As shown in Fig. [134]3a, the clustering results
   closely aligned with cell-type annotations from the original study,
   reflecting DOLPHIN’s ability to capture distinct cellular identities.
   Focusing on cells within Leiden cluster 2, we performed differential
   gene expression analysis between cancer and control groups. For
   comparability, we applied the same cluster selection to the
   conventional gene count table approach to identify DEGs, ensuring that
   observed differences could be attributed to the method rather than
   clustering inconsistencies.

Fig. 3. DOLPHIN identifies exon-level differential genes undetectable by
gene-level analysis.

   a Clustering of the PDAC dataset using DOLPHIN, with clusters labeled
   by subject condition, Leiden clusters, and cell type. Leiden cluster 2,
   highlighted, is used as an example for subsequent analyses comparing
   cancer and control groups. b Enrichment analysis reveals that
   exon-level differentially expressed genes (EDEGs) identified by DOLPHIN
   are significantly enriched in pancreatic cancer-related terms with
   lower adjusted P-values compared to differentially expressed genes
   (DEGs) identified by conventional gene count-based methods. This
   indicates deeper biological insights. Terms marked as “n.s.” indicate no
   significant enrichment. The P values comparing DOLPHIN and conventional
   methods were calculated using a one-sided Wilcoxon test. c A Venn
   diagram shows that DOLPHIN identifies significantly more EDEGs than
   DEGs detected by conventional gene-level methods, highlighting its
   enhanced sensitivity in detecting biologically meaningful changes. d
   Heatmap of differentially expressed exons uniquely identified by
   DOLPHIN across cancer and control groups, alongside corresponding gene
   expression levels. The heatmaps illustrate that DOLPHIN captures subtle
   transcriptomic changes that remain undetectable at the gene level.
   P-values for cancer versus control comparisons were calculated using a
   two-sided Wilcoxon test. e Enrichment analysis of the 896 DOLPHIN-only
   EDEGs shows significant associations with pancreatic cancer-related
   terms. In contrast, 483 DEGs identified by conventional gene count-only
   methods, but not at the exon level, showed no significant enrichment in
   these terms. Adjusted P values for each enrichment term were calculated
   using one-sided hypergeometric tests, followed by multiple testing
   correction using the Benjamini–Hochberg method. f Volcano plot
   highlighting pancreatic cancer-related EDEGs identified by DOLPHIN,
   specifically from the disease term highlighted in part e. These EDEGs
   are not detected as DEGs by conventional gene count methods,
   demonstrating DOLPHIN's ability to uncover biologically important
   exon-level differential genes missed by traditional approaches.
   Non-significant differences are shaded in gray. P values were derived
   using MAST, which fits a hurdle model accounting for both detection
   rate and expression level, and were adjusted for multiple testing using
   the Benjamini–Hochberg method. See the “Methods” section for details.
   Source data are provided as a [137]Source Data file.

   In Fig. [138]3b, we present the results of disease and pathway
   enrichment analysis^[139]50 on EDEGs identified by DOLPHIN compared to
   DEGs identified using the gene count table. Here, pancreatic
   cancer-related terms show strong enrichment and lower adjusted P values
   when using EDEGs detected by DOLPHIN, underscoring the method’s
   sensitivity to relevant pathways and diseases; terms labeled “n.s.”
   (not significant) in the DEG analysis highlight the limited detection
   capacity of the conventional approach^[140]51,[141]52.
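
   As a minimal sketch of the statistics named in the figure legend (a
   one-sided hypergeometric enrichment test followed by
   Benjamini–Hochberg correction), the following Python snippet uses only
   the standard library. The background size, term sizes, and overlaps
   are hypothetical values chosen for illustration; they are not taken
   from the study, and the enrichment tool actually used may differ in
   its details.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n belong to the term."""
    upper = min(n, N)
    total = comb(M, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvals):
    """Return BH-adjusted P values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical inputs: background size, query size, and per-term
# (term size, overlap with the query list). None of these are real data.
background = 20000
query_size = 896
terms = {"term_A": (150, 25), "term_B": (300, 14), "term_C": (80, 3)}

raw = {t: hypergeom_sf(k, background, n, query_size) for t, (n, k) in terms.items()}
adj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
for term, p in adj.items():
    print(f"{term}: adjusted P = {p:.3g}")
```

   The upper-tail probability is what makes the test one-sided: only
   over-representation of a term in the query list counts as evidence.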

   A Venn diagram in Fig. [142]3c illustrates the overlap between EDEGs
   identified by DOLPHIN and DEGs detected using the conventional gene
   count table, revealing 896 unique EDEGs exclusively identified by
   DOLPHIN. These EDEGs correspond to genes that exhibit significant
   exon-level differential expression, which remain undetected when
   analyzed solely at the gene level using conventional methods. This
   highlights DOLPHIN’s enhanced sensitivity in capturing subtle,
   exon-specific variations that are otherwise masked in gene-level
   analyses. To further explore the biological significance of these
   uniquely identified EDEGs, we specifically examined the exons that
   contributed to their detection. From the 896 EDEGs, we selected exons
   that displayed differential expression, while their corresponding genes
   showed no significant differential expression at the gene level. The
   heatmap in Fig. [143]3d visualizes this subset, demonstrating that
   these exons exhibit robust differential expression when analyzed with
   DOLPHIN, yet are overlooked by the conventional gene count table
   approach. This underscores DOLPHIN’s ability to uncover exon-level
   regulatory changes that are critical but often missed by traditional
   gene-centric analyses.
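
   To make the masking effect concrete, the toy example below shows how
   opposite exon-level changes can cancel in a gene-level count. The read
   counts and the pseudocount are hypothetical and serve only to
   illustrate the arithmetic; the study itself tests significance with
   MAST rather than with raw fold changes.

```python
from math import log2

# Hypothetical mean read counts per exon for one gene (not from the paper).
control = {"exon1": 40.0, "exon2": 30.0, "exon3": 30.0}
cancer = {"exon1": 40.0, "exon2": 10.0, "exon3": 50.0}


def log2_fold_change(case, reference, pseudocount=1.0):
    """log2 ratio with a pseudocount to avoid division by zero."""
    return log2((case + pseudocount) / (reference + pseudocount))


# Exon-level view: exon2 drops and exon3 rises in the cancer group.
exon_lfc = {e: log2_fold_change(cancer[e], control[e]) for e in control}

# Gene-level view: the summed counts are identical, so the change vanishes.
gene_lfc = log2_fold_change(sum(cancer.values()), sum(control.values()))

print(exon_lfc)
print(gene_lfc)
```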

   Further exploration of EDEGs unique to DOLPHIN is shown in
   Fig. [144]3e, where disease and pathway enrichment analysis reveals
   significant enrichment of pancreatic cancer-related terms. To
   illustrate the specific gene-level differences, a volcano plot in
   Fig. [145]3f shows log2 fold changes and adjusted P values for key
   PDAC-associated genes identified as EDEGs by DOLPHIN but missed as DEGs
   by the gene count table. The selection of these genes was guided by the
   top highlighted pancreatic cancer term in Fig. [146]3e. Several of
   these genes have well-established roles in PDAC progression and therapy
   response, including SMAD4, a canonical tumor suppressor gene frequently
   mutated or lost in PDAC and associated with poor prognosis and
   treatment resistance^[147]53–[148]55; ERCC1, a marker implicated in
   chemotherapy response and DNA repair deficiency in
   PDAC^[149]56,[150]57; TGFBR2, a key component of TGF-beta signaling,
   which plays a dual role in tumor suppression and progression in
   pancreatic cancer^[151]58,[152]59; and ATM, a DNA damage response
   kinase frequently mutated in PDAC, where its loss impairs double-strand
   break repair and confers increased sensitivity to DNA-damaging agents
   and PARP inhibitors^[153]60,[154]61. The identification of these genes
   through exon- and junction-level resolution suggests that DOLPHIN can
   recover biologically and clinically meaningful signals that remain
   undetected by conventional pipelines, with potential implications for
   both diagnostic biomarker discovery and therapeutic targeting. The
   distribution of these genes underscores DOLPHIN’s enhanced sensitivity,
   with many exhibiting exon-level differential expression that does not
   translate to gene-level differences, making them undetectable by
   conventional methods.

   To assess the clinical relevance of the 896 DOLPHIN-unique EDEGs
   identified in this PDAC dataset, we conducted a Kaplan-Meier survival
   analysis using real patient survival data from The Cancer Genome Atlas
   (TCGA) PDAC cohort^[155]62, stratifying patients based on the
   expression of DOLPHIN-unique EDEGs. Given that pseudo-bulk expression
   profiles derived from single-cell data may introduce biases into
   downstream analyses, particularly due to dropout events and limited
   coverage of lowly expressed genes^[156]63, we instead validated the
   clinical relevance of our findings using matched bulk RNA-seq data to
   ensure more reliable interpretation of survival associations. This
   strategy moves beyond pseudo-bulk approximations and leverages
   orthogonal, external bulk datasets to provide a more robust assessment
   of the prognostic value of the identified genes. As shown in
   Supplementary Fig. [157]S13a, we stratified patients into high-risk and
   low-risk groups based on the expression of the top 100 and all 896
   EDEGs, where the genes were ranked by increasing adjusted P from our
   DOLPHIN-based differential analysis. Across all subsets, the separation
   between risk groups was statistically significant, with the strongest
   prognostic signal observed when using the full set of 896 EDEGs
   (P = 2.22 × 10^−39, log-rank-sum test^[158]64). To characterize how the
   prognostic signal accumulates with increasing numbers of EDEGs, we
   plotted the association P values across ranked gene sets (Supplementary
   Fig. [159]S13b). The resulting curve demonstrates a consistent and
   monotonic strengthening of survival association as more top-ranked
   EDEGs are included. These analyses collectively demonstrate that the
   EDEGs uniquely identified by DOLPHIN not only capture biologically
   relevant information missed by gene-level approaches but also exhibit
   strong clinical relevance when validated against independent datasets.
   Additionally, we conducted a similar analysis using the junction count
   table to identify junction-level differentially expressed genes
   (JDEGs), as shown in Supplementary Fig. [160]S14b–d. This analysis
   further reinforces DOLPHIN’s capability in capturing transcriptomic
   variations beyond gene-level limitations, particularly in exon and
   junction read usage.
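
   The survival stratification described above can be sketched as
   follows. This is an illustrative standard-library implementation of a
   Kaplan-Meier estimator and a standard two-group log-rank test, which
   may differ from the log-rank-sum test cited in the text. The patient
   tuples, the signature score, and the median split are all hypothetical
   assumptions, not the TCGA data or the authors' scoring scheme.

```python
from math import erfc, sqrt


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (event flag 1 = death)."""
    curve, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value)."""
    observed_a = expected_a = variance = 0.0
    pooled = zip(times_a + times_b, events_a + events_b)
    for t in sorted({t for t, e in pooled if e}):
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2.0))


# Hypothetical patients: (signature score, follow-up time, event flag).
patients = [(0.9, 2, 1), (0.8, 3, 1), (0.7, 4, 1), (0.75, 5, 1), (0.85, 6, 1),
            (0.2, 8, 1), (0.1, 9, 0), (0.3, 10, 1), (0.25, 12, 0), (0.15, 14, 1)]
cutoff = sorted(s for s, _, _ in patients)[len(patients) // 2]
high = [p for p in patients if p[0] >= cutoff]
low = [p for p in patients if p[0] < cutoff]

chi2, p_value = logrank_test([t for _, t, _ in high], [e for _, _, e in high],
                             [t for _, t, _ in low], [e for _, _, e in low])
print(kaplan_meier([t for _, t, _ in high], [e for _, _, e in high]))
print(chi2, p_value)
```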

   In addition to the results observed in Cluster 2, which contains a
   balanced number of cells between disease and control groups, we further
   examined other disease-relevant clusters to assess the robustness and
   generalizability of DOLPHIN under realistic group size imbalances.
   Given the biological relevance of ductal cells to PDAC, which
   originates from the epithelial lining of the pancreatic ducts, we
   additionally included Ductal Type 1 and Type 2 cell clusters in the
   EDEG and DEG comparison. Unlike Cluster 2, the ductal clusters exhibit
   pronounced imbalance in group sizes, reflecting a common feature of
   real-world single-cell datasets where cell-type abundance may vary
   across conditions. Specifically, this ductal cluster contains 1067
   cells, including 891 from cancer samples and 176 from healthy controls,
   providing a challenging and biologically meaningful setting to evaluate
   the robustness of differential analysis. Although downsampling has been
   proposed as a strategy to address group imbalance^[161]65, we did not
   employ it in this study, as doing so would further reduce the already
   limited number of cells in biologically relevant populations and
   diminish statistical power. Results are shown in Supplementary
   Fig. [162]S15. DOLPHIN identified 445 more significant genes than the
   conventional gene count-based method, as shown in Supplementary
   Fig. [163]S15b. Enrichment analysis (Supplementary Fig. [164]S15c)
   further demonstrates the biological relevance of these additional
   genes: the EDEGs identified by DOLPHIN yielded stronger enrichment for
   pancreatic-related terms compared to DEGs. Notably, the 1491 EDEGs
   uniquely identified by DOLPHIN were significantly enriched in the
   pancreatic cancer-related term, whereas the 1046 DEGs identified only
   by gene count-based analysis did not yield any enrichment for this term
   (Supplementary Fig. [165]S15d). These results highlight the added
   biological signal gained through exon-level analysis. We also analyzed
   JDEGs based on DOLPHIN’s junction reads. As shown in Supplementary
   Fig. [166]S15f, the 2867 JDEGs were significantly enriched for
   pancreatic disease-related terms. Furthermore, even when considering
   only the 1583 JDEGs that did not overlap with DEGs, enrichment analysis
   still revealed pancreatic cancer-related terms (Supplementary
   Fig. [167]S15g). These findings emphasize the additional biological
   resolution provided by junction-level modeling and demonstrate that
   DOLPHIN’s splicing-aware framework captures disease-relevant signals
   that are often missed by conventional gene expression analyses.

DOLPHIN effectively detects alternative splicing events through junction read
aggregation

   DOLPHIN integrates exon reads and junction reads to aggregate cells
   based on exon-junction read patterns, making it well-suited for AS
   analysis at the single-cell level. To evaluate its performance, we
   selected Outrigger as a baseline, as it is one of the most widely used
   tools for AS event detection in transcriptomics^[168]65–[169]67. This
   comparison underscores the advantages of DOLPHIN’s junction-read-aware
   aggregation, a key feature that enhances sensitivity and accuracy in
   detecting AS events at the single-cell level. Notably, DOLPHIN’s
   aggregation approach can be adapted to work with other AS tools (see
   benchmarking sections), showcasing its versatility.

   In the full-length PBMC dataset, DOLPHIN shows marked improvements over
   Outrigger in detecting AS events. The top panel of Fig. [170]4a
   illustrates the number of Exon Skipping (ES) and Mutually Exclusive
   Exon (MXE) events detected per cell using Outrigger with single-cell
   input versus aggregated cell input generated by DOLPHIN. In this
   context, “single-cell input” refers to the original, unaggregated
   scRNA-seq reads, which were supplied directly to Outrigger without any
   aggregation. This configuration reflects the baseline setting used to
   evaluate the impact of DOLPHIN’s read aggregation strategy. The results
   demonstrate a substantial increase in the number of detected events
   using DOLPHIN, with the median count for ES rising from 183 to 1215,
   and for MXE increasing from 4 to 22, indicating a marked enhancement in
   sensitivity. We next assessed whether DOLPHIN effectively enhances
   single-cell splicing detection by examining AS events jointly detected
   by both approaches (Fig. [171]4a). While Fig. [172]4a summarizes the
   total number of events per cell, it does not capture how consistently
   each shared event is detected across cells by the two methods. To
   address this, we analyzed the cell-level detection patterns of
   overlapping AS events (Supplementary Fig. [173]S16). On one hand,
   DOLPHIN robustly preserves the detection of AS events originally
   identified by the single-cell input. In Supplementary
   Fig. [174]S16a, we present paired heatmaps for the full-length PBMC
   dataset, showing the detection patterns for each event across cells. We
   found that 97.8% of AS events detected by the single-cell input were
   also detected by DOLPHIN, demonstrating strong consistency. On the
   other hand, DOLPHIN identifies substantially more AS events beyond
   those captured by the single-cell input. In Supplementary Fig.
   [175]S16b, we quantify this relationship by plotting the distribution
   of Pearson correlation coefficients between the detection patterns of
   the two methods for each AS event. Across cells, DOLPHIN detected ~4.8
   times more events than the single-cell method alone. Together, these
   results demonstrate that DOLPHIN not only preserves the fidelity of
   single-cell AS detection but also enhances sensitivity by recovering a
   more complete landscape of splicing events across cells.
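
   The three concordance summaries used in this paragraph (the fraction
   of single-cell detections preserved after aggregation, the fold
   increase in detections, and the per-event Pearson correlation of
   detection patterns) can be sketched as below. The binary detection
   matrices are hypothetical toy inputs, and the function names are
   illustrative rather than part of DOLPHIN or Outrigger.

```python
from math import sqrt

# Hypothetical detection patterns: event -> 0/1 flag per cell.
single = {"ev1": [1, 0, 1, 0], "ev2": [0, 0, 1, 0], "ev3": [0, 0, 0, 0]}
aggregated = {"ev1": [1, 1, 1, 0], "ev2": [0, 1, 1, 1], "ev3": [1, 0, 1, 0]}


def preserved_fraction(single, aggregated):
    """Share of single-cell detections that are also present after aggregation."""
    kept = total = 0
    for event, cells in single.items():
        for s, a in zip(cells, aggregated[event]):
            total += s
            kept += s * a
    return kept / total


def fold_increase(single, aggregated):
    """Ratio of total detections with aggregation to detections without it."""
    return sum(map(sum, aggregated.values())) / sum(map(sum, single.values()))


def pearson(x, y):
    """Pearson correlation; None when either pattern is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


per_event_r = {ev: pearson(single[ev], aggregated[ev]) for ev in single}
print(preserved_fraction(single, aggregated), fold_increase(single, aggregated))
print(per_event_r)
```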

Fig. 4. DOLPHIN enhances alternative splicing detection and analysis.

   a–c Detection of alternative splicing (AS) events across three
   datasets: Top: full-length PBMC, Middle: 10X colon, and Bottom: 10X
   rectum. a DOLPHIN identifies significantly more AS events, including
   exon skipping (ES) and mutually exclusive exons (MXE), compared to the
   baseline Outrigger tool, demonstrating superior sensitivity in
   detecting splicing variations. b Scatter plots of Percent Spliced-In
   (PSI) values show that DOLPHIN achieves higher correlation with
   pseudo-bulk data (used as a proxy ground truth), indicating more
   accurate AS quantification than conventional approaches. c UMAP plots
   based on PSI values reveal that DOLPHIN captures distinct
   cell-type-specific splicing patterns with greater clarity and
   biological relevance, improving resolution of splicing events missed by
   baseline methods. d Sashimi plots for the AS event HsaEX0051104 in the
   full-length PBMC dataset show stronger junction read signals after
   DOLPHIN aggregation, enabling detection of splicing events overlooked
   by conventional methods. e Similarly, for the AS event HsaEX0013878 in
   the 10X colon dataset, DOLPHIN enhances junction read signals,
   uncovering AS events missed by the baseline approaches. P values from
   one-sided Student’s t-tests: *P < 0.05, **P < 0.01, ***P < 0.001,
   ****P < 0.0001. Exact P values are provided in the source data. Source
   data are provided as a [178]Source Data file.

   To further demonstrate DOLPHIN’s capability, we compared PSI values
   between pseudo-bulk and single-cell samples (top panel of
   Fig. [179]4b), using pseudo-bulk PSI values as a proxy ground truth, a
   strategy commonly employed for AS validation^[180]65,[181]67,[182]68.
   Each point represents the PSI value for a specific AS event, with a
   higher density of points along the diagonal in DOLPHIN indicating
   stronger concordance with pseudo-bulk data. The Pearson correlation
   increases by 0.06 (P = 6.37 × 10^−242), indicating that the additional
   AS events detected by DOLPHIN exhibit comparable, if not stronger,
   correlation with pseudo-bulk results. This improvement reflects
   DOLPHIN’s enhanced detection capabilities and greater precision in
   capturing splicing patterns. In the scatter plot, we observed a higher
   density of AS events along the diagonal, reflecting a broader
   improvement across the entire PSI spectrum. AS in most cell populations
   predominantly yields near-complete exon inclusion or exclusion, with
   intermediate splicing states being relatively rare and technically
   challenging to detect^[183]22,[184]65. Building on this observation, we
   analyzed the distribution of detected AS events across different PSI
   ranges and assessed the corresponding junction read support to
   characterize DOLPHIN’s aggregation-enhanced detectability. As shown in
   Supplementary Fig. [185]S17a (upper panel), DOLPHIN-enhanced input
   increased the total number of detected exon-skipping events across all
   three PSI categories (PSI = 0, 0 < PSI < 1, and PSI = 1) in the
   full-length PBMC dataset. The most pronounced gain was observed for
   PSI = 1, with 431,406 additional events detected, although noticeable
   improvements were also seen in the other PSI ranges. We further
   examined the junction read support across the full PSI spectrum
   (Supplementary Fig. [186]S17b, upper panel). In the single-cell input,
   events with intermediate PSI values (e.g., between 0.4 and 0.6)
   exhibited substantially lower read counts, with a mean of 66 reads.
   After DOLPHIN enhancement, the mean read count increased to 168, a
   statistically significant difference (one-sided Mann–Whitney U test,
   P < 10^−300). These results demonstrate that DOLPHIN improves the
   detection of AS events across the PSI spectrum, including low-coverage
   events with intermediate splicing levels.

   To evaluate whether the PSI values reflected biologically meaningful
   splicing regulation, we assessed their ability to capture
   cell-type-specific splicing patterns. Specifically, we used PSI values
   as input features for cell representation and clustering analyses. The
   UMAP plots in the top panel of Fig. [187]4c show that DOLPHIN-inferred
   PSIs yield sharper boundaries between cell types compared to
   single-cell PSI values alone. This improvement is quantitatively
   supported by a 0.38 increase in ARI (P = 2.70 × 10^−121). These results
   indicate that DOLPHIN more effectively captures splicing signals that
   distinguish cell types, suggesting higher biological relevance and
   improved splicing quantification accuracy.
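
   For readers unfamiliar with the adjusted Rand index (ARI) reported
   here, a compact standard-library implementation is sketched below. It
   follows the usual contingency-table definition of the ARI; the label
   vectors in the example are hypothetical, and the clustering pipeline
   that produces predicted labels from PSI features is not reproduced.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two labelings of the same cells."""
    n = len(labels_true)
    # Pairs of cells that agree within each contingency-table entry,
    # within each true class, and within each predicted cluster.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical cell-type labels and two candidate clusterings.
cell_types = ["B", "B", "T", "T", "NK", "NK"]
clusters_good = [0, 0, 1, 1, 2, 2]
clusters_poor = [0, 1, 0, 1, 0, 1]
print(adjusted_rand_index(cell_types, clusters_good))
print(adjusted_rand_index(cell_types, clusters_poor))
```

   An ARI of 1 indicates perfect agreement with the reference labels,
   and values near 0 indicate agreement no better than chance.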

   Beyond full-length single-cell data, we extended our evaluation to the
   common tag-based 10X Genomics scRNA-seq data from human colon samples
   to demonstrate DOLPHIN’s general applicability. DOLPHIN showed robust
   performance even with limited transcriptome coverage. The middle panel
   of Fig. [188]4a shows that the distribution of detected events by
   DOLPHIN is shifted towards higher counts compared to single-cell data
   alone (without aggregation), with the median number of detected ES
   increasing from 58 to 224, and the maximum number of MXE detected per
   cell rising from 2 to 8. This underscores DOLPHIN’s sensitivity to data
   with partial coverage. The concordance heatmap shown in the middle part
   of Supplementary Fig. [189]S16a further illustrates that DOLPHIN
   consistently preserves the original single-cell detection signals while
   detecting additional AS events across cells. The scatter plot between
   pseudo-bulk and single-cell PSI values (middle panel of Fig. [190]4b)
   demonstrates an improvement in Pearson correlation by 0.02
   (P = 1.19 × 10^−208) with DOLPHIN, further validating its accuracy. In
   addition to the correlation improvement, we observed a higher density
   of AS events along the diagonal in the scatter plot, reflecting
   DOLPHIN’s broader enhancement across the entire PSI spectrum.
   Specifically, in the lower panel of Supplementary Fig. [191]S17b, AS
   events with PSI = 1 showed the greatest increase, with an additional
   480,987 events detected compared to the original single-cell input. The
   mean junction read count supporting AS events with intermediate PSI
   values (i.e., between 0.4 and 0.6) increased from 65 to 116
   (P = 9.16 × 10^−3). These results confirm that DOLPHIN enhances the
   detection of low-coverage AS events with intermediate PSI values even
   in the sparse 10X dataset. The UMAP plots (middle panel of
   Fig. [192]4c) demonstrate that DOLPHIN achieves clear separation of
   specific cell types, such as TA and enterocyte cells, with an increase
   in the ARI score by 0.06 (P = 3.70 × 10^−31) compared to single-cell
   data, highlighting its broad applicability across various datasets. We
   observed similar improvements with the tag-based 10X rectum data (the
   bottom panels of Fig. [193]4a–c and Supplementary Fig. [194]S16a, b).
   Specifically, the bottom panel of Fig. [195]4a reveals an increase in
   the number of detected ES, with median values rising from 62 to 200,
   and for MXE events, from 1 to 2. Additionally, the bottom panel of
   Fig. [196]4b shows an improved correlation with pseudo-bulk PSI values,
   increasing by 0.01 (P = 1.22 × 10^−187). Notably, the bottom panel of
   Fig. [197]4c shows that the UMAP plot achieves clearer separation of
   Enterocyte cells using DOLPHIN, further validating its robustness.

   To illustrate the detailed insights DOLPHIN provides, we present
   examples of exon and junction read coverage for specific AS events.
   Fig. [198]4d showcases the full-length PBMC splicing event HsaEX0051104
   in the naïve T cell sample “SRR18385965,” comparing single-cell data
   with DOLPHIN-aggregated data. HsaEX0051104 is an exon-skipping event in
   the PTPRC gene that generates the CD45RA isoform, which is critical for
   T cell function^[199]69,[200]70. HsaEX0051104 encompasses three exons (exon 4,
   exon 5, and exon 6), with junction read counts of 13 between exons 4
   and 5, 31 between exons 5 and 6, and 16 between exons 4 and 6. However,
   in single-cell data, this splicing event is not detectable due to the
   absence of junction reads spanning exons 4 and 6, which are critical
   for validating the exon-skipping event. We applied an in silico
   pseudo-bulk validation strategy using CD4 T cells to independently
   confirm the biological existence of the AS event identified by DOLPHIN.
   Specifically, we generated 20 pseudo-bulk BAM files by randomly
   sampling 80% of CD4 T cells per replicate, simulating replicate-level
   coverage. VALERIE^[201]71 was then applied to profile AS events based
   on junction read and coverage signals across these samples. As shown in
   Supplementary Fig. [202]S18a, VALERIE consistently detected the same
   exon-skipping event in PTPRC (HsaEX0051104) identified by DOLPHIN, with
   stable PSI distributions across replicates. We further confirmed this
   event by applying VALERIE to DOLPHIN’s single-cell BAM files for CD4 T
   cells (Supplementary Fig. [203]S18b), providing orthogonal evidence of
   its reproducibility and biological relevance. To further support this
   AS event, we visualized the pseudo-bulk read coverage using
   ggsashimi^[204]72. As shown in Supplementary Fig. [205]S18c, the
   sashimi plot based on full-length PBMC pseudo-bulk alignments clearly
   demonstrates the exon-skipping pattern corresponding to HsaEX0051104.
   In addition, DOLPHIN uncovers another splicing event (HsaEX0051102)
   involving exons 1, 3, and 4. This event is supported by 25 junction
   reads between exons 1 and 3, 11 reads between exons 3 and 4, and 22
   reads connecting exons 1 and 4. Conversely, in this specific single
   cell, this event is not detected due to the lack of junction reads
   bridging exons 1 and 4, which are crucial for identifying this splicing
   pattern. In Fig. [206]4e, we investigated the splicing event
   HsaEX0013878 within the CD47 gene in progenitor cell
   “AAGCCGCCACTACAGT-1” from the 10X colon dataset. CD47 has been
   implicated in colorectal cancer progression^[207]73,[208]74. This event
   involves exons 1, 2, and 3, with 52 junction reads supporting the
   connection between exons 1 and 2, 30 reads between exons 2 and 3, and
   70 reads between exons 1 and 3. However, this specific cell lacks the
   crucial junction reads linking exons 1 and 3, thereby precluding the
   detection of this event in this cell. The presence of this splicing
   event was further supported by pseudo-bulk alignments of progenitor
   cells (Supplementary Fig. [209]S18d). These examples underscore
   DOLPHIN’s capacity to uncover complex AS patterns and demonstrate its
   effectiveness in enhancing single-cell AS analyses through
   junction-read-informed cell aggregation, revealing biologically
   significant insights otherwise missed by standard methods.
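
   The role of the three junction counts in these examples can be
   illustrated with a common convention for exon-skipping PSI, in which
   the two inclusion junctions are weighed against twice the skipping
   junction. This formula and the minimum-read threshold are assumptions
   made for illustration; the text does not state the exact computation
   used, and the resulting values are arithmetic on the junction counts
   quoted above, not PSI values reported by the authors.

```python
def exon_skipping_psi(inc_upstream, inc_downstream, skip, min_reads=10):
    """PSI for an exon-skipping event from junction read counts.

    inc_upstream / inc_downstream: reads on the two inclusion junctions.
    skip: reads on the junction that bypasses the middle exon.
    min_reads: hypothetical coverage threshold below which no PSI is called.
    """
    inclusion = inc_upstream + inc_downstream
    exclusion = 2 * skip  # one skipping junction versus two inclusion junctions
    total = inclusion + exclusion
    if total < min_reads:
        return None  # too few junction reads to quantify the event
    return inclusion / total


# Junction counts quoted in the text for the aggregated data.
events = {
    "HsaEX0051104": (13, 31, 16),
    "HsaEX0051102": (25, 11, 22),
    "HsaEX0013878": (52, 30, 70),
}
for name, counts in events.items():
    print(name, exon_skipping_psi(*counts))

# A hypothetical sparse single cell with very few junction reads.
print(exon_skipping_psi(2, 3, 0))
```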

DOLPHIN reveals biologically relevant alternative splicing events unique to
specific cell types

   We assessed DOLPHIN’s capability to detect cell-type-specific AS events
   by calculating PSI values for each event per cell type, enabling
   differential AS analysis. Genes associated with significantly
   differentially spliced events were identified as differentially spliced
   genes. Fig. [210]5a, b and Supplementary Fig. [211]S19 highlight the
   biological relevance of these cell-type-specific events identified by
   DOLPHIN, underscoring its ability to detect distinct splicing patterns
   not captured by the raw single-cell data without DOLPHIN aggregation
   enhancement. Specifically, Fig. [212]5a displays dot plots of the top
   differentially spliced events across cell types in the full-length PBMC
   and tag-based 10X colon datasets. The labels in the plot
   correspond to differentially spliced genes and event identifiers;
   detailed information for each splicing event is provided in
   Supplementary Table [213]S1 and Supplementary Table [214]S2. In
   contrast, Supplementary Fig. [215]S19 and Supplementary Table [216]S3
   display the top differentially spliced events identified using raw
   single-cell data without aggregation. Without DOLPHIN’s aggregation,
   the analysis based on raw data alone fails to capture the distinct
   splicing patterns, as evidenced by the reduced separation of PSI values
   across cell types. Dot colors in the plots represent the average PSI
   values of an event for cells from each specific cell type, further
   highlighting DOLPHIN’s capability to detect differential splicing
   events that were previously missed. For example, in the full-length
   PBMC dataset, unique splicing events specific to B cells were
   challenging to distinguish from dendritic cells (DCs) and Other cells
   using the raw single-cell data, but are now clearly identifiable after
   DOLPHIN aggregation. Similarly, in the 10X colon dataset, top
   differentially spliced events appear more prominently in paneth-like
   cells compared to the single-cell method, illustrating DOLPHIN’s
   enhanced sensitivity to cell-type-specific splicing.

Fig. 5. DOLPHIN captures biologically relevant alternative splicing events
that signify cell-type-specific differences.

   a Dot plots showing the PSI values of the top differentially spliced
   events identified by DOLPHIN. b GO biological process (GOBP) enrichment
   analysis of biologically significant differentially spliced genes
   identified by DOLPHIN, with alternative splicing-related terms
   highlighted in red. Adjusted P-values for each enrichment term were
   calculated using one-sided hypergeometric tests, followed by multiple
   testing correction using the Benjamini–Hochberg method. c Schematic
   illustration explaining PSI distribution splicing modality
   categorization. d PSI distribution for a single alternative splicing
   event, categorized by splicing modality across cell types,
   demonstrating that DOLPHIN provides clearer distinctions of splicing
   differences that align with cell type identities. e Splicing modality
   composition across single cells shows that DOLPHIN captures more
   distinct and biologically relevant splicing patterns by reducing the
   proportion of multimodal (null) categories, which represent PSI
   distributions without clear features. This demonstrates that DOLPHIN
   reduces ambiguity in alternative splicing event detection, enabling
   more precise analysis. f UMAP plots of cell clusters using PSI modality
   one-hot encoding demonstrate that the PSI splicing modalities
   identified by DOLPHIN retain strong cell-type-specific signals. DOLPHIN
   enhances the resolution of these cell-type-specific splicing patterns,
   providing clearer separation and biologically meaningful clustering
   compared to single-cell data alone. These biologically relevant
   alternative splicing events can contribute to more accurate cell type
   classification and offer insights into cellular diversity and potential
   disease mechanisms. P values from one-sided Student’s t-tests:
   *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; n.s. not
   significant. Exact P values are provided in the source data. Source
   data are provided as a [219]Source Data file.

   Supplementary Fig. [220]S20a highlights the top differentially spliced
   genes that could not be identified using traditional gene count-based
   differential expression methods, alongside their expression values in
   the PBMC and colon datasets. Unlike conventional approaches that
   primarily focus on gene expression differences, DOLPHIN leverages
   PSI-based differences to uncover differentially spliced genes that
   remain undetectable with single-cell gene count data alone. This
   capability is particularly evident in the 10X colon dataset, where
   unique splicing patterns are revealed across cell types, even in the
   absence of significant gene expression changes, underscoring DOLPHIN’s
   distinct advantage in detecting splicing-driven heterogeneity. To
   confirm the biological significance of these findings, we performed
   gene ontology biological process (GOBP) enrichment analysis using
   differentially spliced genes. In the upper panel of Fig. [221]5b, GOBP
   terms enriched in B cells from the PBMC dataset include B cell
   activation and B cell receptor signaling, reinforcing the biological
   relevance of these identified splicing events^[222]75. Additionally,
   GOBP terms associated with AS confirm DOLPHIN’s accuracy in detecting
   spliced genes involved in splicing regulation. In the 10X colon
   dataset, GOBP enrichment analysis (lower panel of Fig. [223]5b)
   revealed terms critical to enterocyte function, such as metabolic
   processes, aerobic respiration, and mitochondrial electron
   transport—biological processes that are essential for maintaining gut
   health^[224]76,[225]77. The identification of AS-related GOBP terms
   reflects the adaptive role of enterocytes in modulating gene expression
   in response to environmental and cellular stressors^[226]78,[227]79.
   GOBP enrichment analysis for other cell types is presented in
   Supplementary Fig. [228]S21, further underscoring the functional
   relevance of splicing events detected by DOLPHIN. Supplementary
   Fig. [229]S20b illustrates the distinction between differentially
   spliced genes identified by DOLPHIN and DEGs detected using
   conventional gene count methods. Supplementary Fig. [230]S20c presents
   the GOBP enrichment analysis for differentially spliced genes uniquely
   detected by DOLPHIN, after excluding those already identified as DEGs
   by conventional gene count-based approaches. The GOBP enrichment
   analysis of these remaining genes reveals critical biological processes
   encoded within PSI values that cannot be detected using gene count data
   alone, highlighting DOLPHIN’s unique ability to uncover
   splicing-specific regulatory mechanisms. To provide a more granular
   view of splicing distributions, we applied the Anchor tool from
   Expedition^[231]22, categorizing PSI distributions into five splicing
   modalities: excluded, bimodal, included, middle, and multimodal (null)
   (Fig. [232]5c). This categorization reveals variations in PSI
   distributions across cell types, facilitating detection of cell
   type-specific splicing patterns. In scRNA-seq data, splicing events
   often exhibit varying degrees of PSI consistency within the same cell
   type^[233]22,[234]65. Some events show concentrated PSI distributions
   corresponding to clear splicing modes, such as inclusion or exclusion,
   whereas others display dispersed or heterogeneous PSI patterns,
   classified as multimodal or null modalities. Multimodal splicing
   patterns can arise from genuine biological heterogeneity, including the
   co-expression of multiple isoforms and dynamic splicing regulation
   across cell types^[235]22,[236]80. However, in sparse single-cell
   datasets, multimodal and null modalities can also result from technical
   factors such as incomplete read coverage, dropout, and measurement
   noise, making the interpretation of such events more challenging. Null
   modalities, in particular, indicate splicing signals lacking sufficient
   consistency across cells, thereby complicating the identification of
   robust, biologically meaningful splicing patterns. DOLPHIN improves
   signal clarity by enhancing read coverage and exon-level resolution,
   which increases the proportion of splicing events that can be
   classified into more interpretable modalities.
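
   To make the idea of a splicing modality concrete, the following
   minimal sketch assigns one event's per-cell PSI values to a modality.
   It is a simplified stand-in loosely modeled on Beta-distribution model
   comparison, not the Anchor implementation. The parameter grids, the
   threshold min_log_bf, and the function names are all illustrative
   assumptions.

```python
import math

def beta_logpdf(x, a, b):
    """Log density of Beta(a, b) at x; x is clipped away from 0 and 1."""
    x = min(max(x, 1e-3), 1 - 1e-3)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)

# Hypothetical parameter grids, one family of Beta shapes per modality.
FAMILIES = {
    "excluded": [(1.0, float(b)) for b in range(2, 21)],        # mass near 0
    "included": [(float(a), 1.0) for a in range(2, 21)],        # mass near 1
    "middle":   [(float(k), float(k)) for k in range(2, 21)],   # peak at 0.5
    "bimodal":  [(1.0 / k, 1.0 / k) for k in range(2, 21)],     # mass at both ends
}

def assign_modality(psi_values, min_log_bf=5.0):
    """Return the best-supported modality, or 'multimodal' if none is.

    Each family is scored by the log of its likelihood averaged over the
    grid. The uniform Beta(1, 1) reference has log-likelihood 0, so this
    score is already a log Bayes factor against the uniform model.
    min_log_bf is an assumed, illustrative threshold.
    """
    best_name, best_score = "multimodal", min_log_bf
    for name, params in FAMILIES.items():
        logliks = [sum(beta_logpdf(x, a, b) for x in psi_values)
                   for a, b in params]
        top = max(logliks)
        # Numerically stable log of the mean likelihood over the grid.
        score = top + math.log(
            sum(math.exp(v - top) for v in logliks) / len(logliks))
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

   Under these assumptions, an event whose PSI values are spread evenly
   across the unit interval supports no single family and falls back to
   the multimodal (null) label, which is the behavior the text describes
   for noisy or sparsely covered events.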

   In the upper panel of Fig. [237]5d, we examine the splicing event
   HsaEX0051104 in the PBMC dataset, comparing PSI distributions from
   single-cell data with DOLPHIN results. DOLPHIN identifies four distinct
   splicing modes across eight cell types, whereas single-cell data alone
   captures only three modes. Notably, DOLPHIN enhances the detection of
   splicing variations in CD8 T cells, shifting the distribution from a
   null mode to a middle mode, thereby providing a clearer and more
   accurate representation of these events. We performed in silico
   validation of this splicing event using a bootstrapped pseudo-bulk
   strategy. Specifically, we randomly sampled 80% of CD8 T cells multiple
   times to construct pseudo-bulk profiles and applied VALERIE to
   visualize splicing signals at the event locus. As shown in
   Supplementary Fig. [238]S22a, the consistent detection of junction
   usage and read coverage patterns across replicates confirms the
   presence of this ES event. In addition, we assigned splicing modality
   based on the PSI values derived from these pseudo-bulk replicates. The
   resulting modality, shown in Supplementary Fig. [239]S22b, consistently
   falls within the middle modality, validating the splicing distribution
   identified by DOLPHIN in CD8 T cells.
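
   The bootstrapped pseudo-bulk strategy described above can be sketched
   as follows. This is a minimal illustration under stated assumptions,
   not the authors' pipeline: each cell is represented by a hypothetical
   pair of (inclusion reads, exclusion reads) for the event, cells are
   subsampled without replacement, and PSI is computed as the pooled
   inclusion fraction. The function names and the default number of
   replicates are illustrative.

```python
import random

def pseudobulk_psi(cells):
    """Pool reads over cells and return inclusion / (inclusion + exclusion).

    `cells` is a list of (inclusion_reads, exclusion_reads) tuples.
    Returns None when the pooled profile has no reads for the event.
    """
    inclusion = sum(c[0] for c in cells)
    exclusion = sum(c[1] for c in cells)
    total = inclusion + exclusion
    return inclusion / total if total > 0 else None

def bootstrap_pseudobulk_psi(cells, fraction=0.8, n_replicates=100, seed=0):
    """Repeatedly subsample a fraction of cells and compute pseudo-bulk PSI."""
    rng = random.Random(seed)
    k = max(1, round(fraction * len(cells)))
    replicates = []
    for _ in range(n_replicates):
        subset = rng.sample(cells, k)  # assumed: sampling without replacement
        psi = pseudobulk_psi(subset)
        if psi is not None:
            replicates.append(psi)
    return replicates
```

   A narrow spread of the replicate PSI values around an intermediate
   level would be consistent with a middle modality, whereas replicates
   scattered across the unit interval would not support a stable
   assignment.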

   The lower panel of Fig. [240]5d shows the PSI distribution of the
   splicing event HsaEX0013878 within the CD47 gene across cell types in
   the 10X colon dataset. The CD47 gene, known for its
   involvement in tumor progression and immune evasion, is typically
   upregulated in colorectal cancer tissues^[241]81. Given its relevance
   to colorectal cancer, we expected multiple CD47 transcripts to be
   detected in the colon dataset. Of the six annotated
   transcripts, three contain the splicing event HsaEX0013878: transcripts
   ENST00000398258 and ENST00000361309 include all three exons, while
   ENST00000517766 exhibits ES. To better contextualize the PSI
   distributions of this event, we incorporated pseudo-bulk transcript
   quantification using kallisto^[242]82 to estimate the exon inclusion
   probability across cell types for exon chr3:108049619-108049651
   (HsaEX0013878). These estimates, shown as red dashed lines in
   Supplementary Fig. [243]S23a, serve as in silico reference values for
   comparing single-cell and DOLPHIN-aggregated results. We then applied
   bootstrapping to quantify the absolute differences between
   method-specific median PSI values and pseudo-bulk estimates across cell
   types (Supplementary Fig. [244]S23b). The DOLPHIN-aggregated PSI
   values showed smaller deviations from the pseudo-bulk references