Abstract

   Nucleotide-containing metabolites, e.g., NAD, can serve as noncanonical
   initiating nucleotides (NCIN) during transcription, yielding
   NCIN-capped RNAs (NCIN-RNAs). Current profiling strategies are limited
   to detecting specific metabolite caps and lack an epitranscriptome-wide
   approach for quantifying the ratio between NCIN- and m^7G-capped forms.
   Here, we develop the CompasSeq analytical platform, which integrates
   experimental and computational frameworks, enabling comprehensive and
   quantitative assessment of NCIN-RNAs at the transcript resolution.
   CompasSeq utilizes carefully devised enzymatic reactions to selectively
   capture NCIN-RNAs. By introducing proper spike-ins, CompasSeq can
   analyze the stoichiometry of NCIN caps. We further design an orthogonal
   method, the quantitative exoribonuclease reduction assay, to validate
   newly identified NCIN-RNAs and their capping ratios. Using CompasSeq,
   we quantify previously unexplored NCIN capping percentages from mouse
   liver and illustrate their age-associated dynamics. Moreover, we
   uncover a dichotomy between RNA expression and NCIN capping in genes
   impinging on age-related pathways. Our study presents both experimental
   and computational solutions for in-depth analysis of NCIN-RNAs, paving
   the road for functional investigations into NCIN-RNAs.

   Subject terms: Transcriptomics, Sequencing, RNA modification
     __________________________________________________________________

   Certain cellular metabolites can cap RNA molecules, influencing their
   function. Here, the authors develop CompasSeq, a quantitative method to
   profile these noncanonical RNA caps across the transcriptome and reveal
   their dynamic changes with age in mouse liver.

Introduction

   The 5′-terminus of eukaryotic mRNA is characterized by a prevalent cap
   structure composed of 5′,5′-triphosphate-linked 7-methylguanosine
   (m^7G). This canonical capping exerts profound impacts on modulating
   various functions of mRNA species, such as stability, splicing,
   polyadenylation, and translation^[36]1,[37]2. Recent studies have
   revealed that nucleotide-containing metabolites, including nicotinamide
   adenine dinucleotide (NAD), flavin adenine dinucleotide (FAD),
   3′-dephospho-coenzyme A (dpCoA), uridine diphosphate glucose (UDP-Glc),
   uridine diphosphate N-acetyl glucosamine (UDP-GlcNAc), and dinucleoside
   polyphosphates, can be incorporated into the RNA 5′-terminus as
   noncanonical initiating nucleotides (NCIN), resulting in
   NCIN-RNA^[38]3–[39]7. Since the first discovery of NAD-capped RNA in
   bacteria^[40]8, NAD-RNAs have been reported in all kingdoms of
   life^[41]9–[42]16. Although studies have indicated the presence of FAD,
   dpCoA, UDP-Glc/GlcNAc, dinucleoside polyphosphate caps across different
   phyla^[43]13,[44]17–[45]20, comprehensive profiling of NCIN-RNAs at the
   epitranscriptome level is only beginning to be revealed.

   The current profiling strategies are limited to detecting specific
   metabolite caps^[46]9,[47]18,[48]21–[49]26. NAD captureSeq^[50]9, along
   with our recently developed methods, ONE-seq^[51]24 and DO-seq^[52]26,
   employ chemo-enzymatic reactions to label NAD-capped RNAs with biotin,
   catalyzed by adenosine diphosphate-ribosyl cyclase. These biotinylated
   RNAs are then captured by streptavidin beads and subjected to
   high-throughput sequencing to identify NAD-RNAs. Yet, these methods
   only reflect the relative enrichment for a specific type of
   metabolite-capped transcript and cannot quantify the ratio between
   metabolite-capped and m^7G-capped forms. Decapping-based strategies,
   such as CapZyme-seq^[53]27, utilize processing enzymes like NudC to
   target NCIN-capped RNAs, yielding RNAs with a 5′-monophosphate.
   Subsequently, barcoded RNA adapters are ligated to the 5′-end of these
   RNAs, followed by adapter-specific PCR enrichment and high-throughput
   sequencing. However, a notable limitation of applying CapZyme-seq in
   eukaryotes arises from contaminating signals of m^7G, the dominant cap
   type^[54]28. Furthermore, currently available methods lack the ability,
   including experimental and computational procedures, for the
   epitranscriptome-wide percentage assessment of NCIN capping at the
   transcript resolution.

   Aging is characterized by physiological declines accompanied with
   metabolic and transcriptional alterations^[55]29,[56]30, but how these
   processes are integrated into the regulation of aging remains elusive.
   Studies in rodents and humans have shown that the levels of NAD and FAD
   decline with age in multiple tissues, including the
   liver^[57]31–[58]33. Additionally, our recent findings reveal prominent
   age-associated alterations in NAD capping in mouse liver and human
   peripheral blood^[59]24,[60]34, highlighting the potential role of
   metabolite-modified RNAs during aging. Cellular metabolism and gene
   expression are fundamental biological processes, making it crucial to
   explore how metabolite-modified RNAs, which coordinate metabolic and
   transcriptional regulation, are modulated during the adult life cycle
   and consequently, their physiological impact on the progression of
   aging. However, the absence of quantitative methods to determine
   epitranscriptome-wide NCIN caps with stoichiometric information impedes
   investigations into the contribution of NCIN capping to a broad
   spectrum of biological processes.

   In the present study, we demonstrate that the CompasSeq analytical
   platform, combined with experimental and computational procedures,
   enables a comprehensive and quantitative assessment of NCIN capping
   levels in eukaryotes. We further devise an orthogonal and independent
   approach, the quantitative exoribonuclease reduction assay, to validate
   newly identified NCIN-RNAs and their capping ratios. Utilizing
   CompasSeq, we uncover previously unknown features of
   metabolite-modified RNAs in adult and aging mouse livers.

Results

The workflow of CompasSeq

   We designed the CompasSeq method for Comprehensive percentage
   assessment of NCIN-RNAs through enzymatic decapping followed by 5′-end
   adapter ligation and high-throughput Sequencing, an analytical platform
   that integrated both experimental and computational frameworks
   (Fig. [61]1).

Fig. 1. The workflow of CompasSeq.

   [62]Fig. 1
   [63]Open in a new tab

   a CompasSeq implements a parallel experimental pipeline to
   simultaneously capture NCIN-capped RNAs and all-capped RNAs (NCIN- and
   m^7G-capped RNAs). Meanwhile, CompasSeq implements a dual spike-in
   strategy to quantify NCIN-capped RNAs. A set of synthetic NCIN-RNAs
   with known capping ratios, is introduced at the beginning to serve as
   internal controls. Adapter-pre-ligated Drosophila RNA, is added at the
   step of cDNA library construction for subsequent computational
   analysis. b Diagram illustrating the workflow of CompassAnalyzer.
   CompassAnalyzer first applies stepwise normalizations to recover the
   intrinsic RNA abundance distribution between NCIN-capped and all-capped
   RNAs from sequencing data. It then employs empirical Bayes shrinkage to
   enable quantification of NCIN capping ratios at the transcript level.

   First, we subjected total RNA transcripts to cap-specific enzymes. To
   selectively capture NCIN-capped RNAs, we began with yDcpS, a decapping
   enzyme that hydrolyzes m^7G-capped mRNA, followed by CIAP treatment to
   eliminate 5′-phosphate residues. We then introduced the bacterial
   RNA-decapping enzyme NudC, which hydrolyzes the diphosphate group in
   the cap structures of NAD, NADH, FAD, and dpCoA, ensuring that only
   NCIN-RNA yields 5′-monophosphate ready for ligation. To capture both
   NCIN- and m^7G-capped RNAs, we initially treated total RNA with CIAP to
   remove pre-existing 5′-phosphate moieties. We then applied RppH, a
   pyrophosphatase, to hydrolyze pyrophosphate from the 5′ end of
   triphosphorylated RNA, together with NudC, to generate 5′-monophosphate
   RNAs from all-capped transcript forms. Second, we performed ligation
   between 5′ oligoribonucleotide adapters and 5′-monophosphate RNAs
   derived from either NCIN- or all-capped transcripts, followed by
   polyA-selected reverse transcription and high-throughput sequencing. Of
   note, we deliberately introduced an exogenous RNA spike-in as an
   internal control essential for subsequent computational analysis
   (Fig. [64]1a). Third, we proceeded with a computational framework
   composed of stepwise normalizations and empirical Bayes shrinkage.
   Stepwise normalizations, including global scaling using exogenous RNA
   spike-ins, linear adjustment, and the RUV strategy^[65]35, were applied
   to recover the intrinsic RNA abundance distribution between NCIN-capped
   and all-capped RNAs from sequencing data. Empirical Bayes shrinkage was
   then employed to enable quantification of NCIN capping ratios at
   transcript resolution (Fig. [66]1b).

Development of CompasSeq experimental procedures

   We generated 38 nt RNA spike-ins via in vitro transcription, using
   adenine exclusively as the initial nucleotide in the DNA template to
   synthesize NAD, dpCoA, and FAD-capped RNAs (Fig. [67]2a). As
   anticipated, capped RNAs (NAD and dpCoA) exhibited higher molecular
   weight than uncapped forms (ppp-RNA) as observed by UREA-PAGE gel
   electrophoresis (Fig. [68]2a). Consistent with previous
   reports^[69]22,[70]36, two RNA species for FAD-RNA were observed, with
   upper band being intact FAD-capped RNA and the lower band resulting
   from the loss of the flavin moiety together with one phosphate group
   (Fig. [71]2a). To corroborate the quality of the RNA spike-ins, we
   employed boronate affinity electrophoresis. Boronyl groups from boric
   acid form relatively stable complexes with vicinal diols, naturally
   occurring in the m^7G of the m^7G cap and the nicotinamide riboside of
   the NAD cap, leading to gel retardation during electrophoresis^[72]37.
   Indeed, only RNA with m^7G and NAD caps, but not ppp-RNA, exhibited
   electrophoretic mobilities, indicating the presence of these specific
   caps in the RNA spike-ins (Fig. [73]2b).

Fig. 2. Development of CompasSeq.

   [74]Fig. 2
   [75]Open in a new tab

   a In vitro transcription with ATP, m^7GpppA, NAD, FAD, or dpCoA as the
   initiating nucleotide. Sequence of the DNA template with the T7
   promoter (Top). The “A” in red was the transcription start site and the
   coding sequence has been underlined. In vitro-transcribed RNAs with
   different 5′ end caps were resolved in an 8% PAGE gel (Bottom). The
   arrows indicated the intact FAD-capped species. b Boronate affinity
   electrophoresis showing in vitro-transcribed 38 nt RNA, including ppp-,
   m^7G-, and NAD-capped RNAs. c UREA-PAGE gels showing products of
   processing of NAD- and m^7G-RNA 5′-ends by NudC, dRai1, RppH, and
   yDcpS. d UREA-PAGE gel showing products of processing of FAD- and
   dpCoA-RNA 5′-ends by NudC. e Analysis of the effect of RNA secondary
   structure on yDcpS decapping. All m^7G-capped RNA hairpins and linear
   30-mers (1 µg each) were treated by 9 µM yDcpS for 1 h and then
   resolved in a 12% PAGE gel. f After NudC treatment, NCIN-RNA, such as
   NAD-RNA (38 nt), could be ligated with the 5′-adapter (27 nt), as shown
   by the shift of an upper band in the UREA-PAGE gel (left). m^7G-capped
   RNA (38 nt) after being processed by yDcpS and CIAP, could not be
   ligated with the 5′-adapter (middle). RppH-processed m^7G-RNA, on the
   other hand, contained a 5′-phosphate group for adapter ligation
   (right). Source data are provided as a [76]Source Data file.

   Assisted by 38 nt RNA spike-ins, we examined the specificity of
   decapping enzymes (Fig. [77]2c). NudC, known for its ability to
   hydrolyze diphosphate but not triphosphate linkages, effectively
   removed the NAD cap but also exhibited non-specific cleavage of the
   m^7G cap upon extended treatment (Fig. [78]2c, d). We also evaluated
   exoribonucleases from the DXO/Rai1 family, known for their decapping
   activity^[79]27,[80]38. We purified the dRai1 protein, a Drosophila
   homolog of Rai1, and found that it selectively degraded NAD-capped RNA
   while leaving m^7G-capped RNA intact (Fig. [81]2c). However, dRai1 also
   caused additional nucleotides removal due to its exoribonuclease
   activity (Fig. [82]2c). In contrast, yDcpS specifically targeted m^7G
   cap, while RppH removed both m^7G and NAD caps, lacking specificity
   (Fig. [83]2c). It has been reported that RNA secondary structure would
   hinder the activity of yDcpS^[84]39. We generated m^7G-capped RNAs with
   polymorphic hairpins and 3′ or 5′ overhangs to test whether excess
   yDcpS (~9 µM) would overcome this issue. Our results showed that excess
   yDcpS completely removed the m^7G cap from RNAs regardless of their
   linear or secondary structures (Fig. [85]2e). Above experiments
   established the careful design of the CompasSeq workflow, including the
   choice and order of decapping enzyme treatments. The workflow started
   with a cascade of enzymatic reactions, followed by adapter ligation
   (Fig. [86]1a). To test the efficacy of adapter ligation, we subjected
   38 nt NAD-RNA and m^7G-RNA to CompasSeq. As evidenced by the
   accumulation of an upper band (Fig. [87]2f), we demonstrated the
   specificity of cap-processing enzymes in combination with adapter
   ligation for NAD- and m^7G-capped forms, respectively.

The feasibility of CompasSeq analytical platform

   To evaluate the feasibility of the CompasSeq analytical platform,
   encompassing both experimental and computational frameworks, we
   undertook several validation steps. First, we synthesized long RNA
   spike-ins (500 nt) with different sequences and cap structures,
   including NAD, FAD, dpCoA, m^7G, and pppA. We mixed 20 µg of total RNA
   from mouse livers with 1 ng of each spike-in and subjected this mixture
   to the CompasSeq experiment, followed by qPCR analysis (Fig. [88]3a).
   Our results indicated that only NCIN, and not m^7G- and ppp-RNAs, were
   enriched in the epitranscriptome-capturing group, whereas both NCIN-
   and m^7G-RNAs were significantly enriched in the
   transcriptome-capturing group (Fig. [89]3b). Second, we aimed to
   measure the percentage of NCIN-capped RNA spike-ins. We excluded
   FAD-RNA spike-in from this and subsequent experiments due to its
   spontaneous decapping, which might confound calculations. Instead, we
   synthesized paired RNA spike-ins with identical sequences (500 nt) but
   different cap structures, including NCIN (NAD and dpCoA) or canonical
   m^7G cap—each followed by a polyadenylated tail. It is likely that
   native transcripts exist in both NCIN- and m^7G-capped versions, with
   proportions varying by gene. This experimental setup was intended to
   mimic endogenous transcripts that have 5% NCIN-capped modification
   (mixed with 95% m^7G-RNA) or 0% NCIN-capped modification (i.e., 100%
   m^7G-RNA) spike-in representing genes with no NCIN capping, thereby
   allowing us to assess the specificity of CompasSeq. We mixed 20 µg of
   total RNA from 2-month mouse livers with 0.014 ng of synthetic spike-in
   RNAs and subjected this mixture to the CompasSeq experiment, followed
   by polyA-selected RNA sequencing.

Fig. 3. The feasibility of CompasSeq.

   [90]Fig. 3
   [91]Open in a new tab

   a Schematic workflow of gene-specific assessment of NCIN modification
   with CompasSeq platform by qPCR. Total RNAs from 2-month-old mouse
   livers were mixed with polyadenylated spike-in RNAs (500 nt), followed
   by enzymatic reactions of CompasSeq for qPCR analysis. ppp-RNA (500 nt)
   was included as a baseline control. b qPCR analysis showed that
   synthetic RNAs (500 nt) containing NCIN caps, including NAD, FAD, and
   dpCoA, could be significantly and selectively enriched by the procedure
   designed for capturing NCIN-RNAs, whereas ppp- and m^7G-RNA did not
   exhibit such enrichment (left). In the procedure designed for capturing
   m^7G- and NCIN-RNAs, all RNAs, except ppp-RNA, were significantly
   enriched (right). Data were shown in mean ± s.e.m. (n = 3 independent
   experiments). Two-tailed Welch′s t-tests were used to infer p values.
   n.s. denoted not significant. c RNA-seq experiment of spike-in RNAs
   determined the specificity of CompasSeq. A total of three different
   spike-in RNAs with different capped forms were mixed with total RNAs
   from 2-month-old mouse liver, and the mixture of RNAs was subjected to
   CompasSeq. For each capped form, two polyadenylated spike-in RNAs with
   identical sequences (500 nt) but have either NCIN (NAD or dpCoA) or
   m^7G cap, were used to mimic endogenous NCIN-capped transcript with
   different capped forms. As shown in the bar chart, NCIN-RNAs, but not
   m^7G-RNA, were significantly and selectively enriched with NCIN capping
   levels close to their expected ratio. Data were shown in mean ± s.e.m.
   (n = 3 independent experiments). d RNA-seq experiment of spike-in RNAs
   determined the sensitivity of CompasSeq. A total of five different
   polyadenylated spike-in RNAs (500 nt) with different proportions of
   NAD-RNA were mixed with total RNAs from 2-month-old mouse livers, and
   the mixture of RNAs was subjected to CompasSeq. Two spike-in RNAs with
   identical sequences but have either NAD or m^7G cap, were used to mimic
   endogenous NCIN-capped transcript with different NCIN capping levels.
   Scatter plot showing the concordance between expected and observed NCIN
   capping levels from spike-in RNAs. The Pearson correlation coefficient
   and the associated two-sided p value were shown in the top right
   corner. Data were shown in mean ± s.e.m. (n = 3 independent
   experiments). Exact p values and source data are provided as a
   [92]Source Data file.

   Importantly, we added pre-treated total RNA from Drosophila at a mass
   ratio of 50:1 for each sample, serving as a baseline control after
   adapter ligation, a critical procedure for downstream computational
   analysis (Supplementary Fig. [93]2a). We reasoned that exogenous
   Drosophila RNA spike-ins could be utilized to reflect and correct
   unwanted variations introduced by experimental procedures. In
   principle, Drosophila spike-ins, when added equally to each sample,
   should maintain a constant proportion across samples, capturing either
   NCIN- or all-capped forms. Given the fact that NCIN-capped RNAs
   constituted only a small proportion of the total RNA pool, read counts
   from the NCIN-capped library should be lower than those from the
   all-capped library. In practice, we observed that the proportion of
   reads from the Drosophila genome was higher in samples capturing
   NCIN-capped than all-capped forms (Supplementary Fig. [94]2b).
   Therefore, we employed a stepwise normalization procedure to address
   these variations, to enable the recovery of the theoretical
   distribution of NCIN-capped RNAs. We found that raw read counts
   displayed similar distributions between NCIN-capped RNA and all-capped
   RNA libraries (Supplementary Fig. [95]2c). Upon normalizations, gene
   measurements from all-capped forms exhibited higher values compared to
   those from NCIN-capped libraries, reflecting the intrinsic distribution
   of NCIN-capped transcripts (Supplementary Fig. [96]2c). Correlation
   analysis revealed that normalized gene measurements exhibited improved
   reproducibility compared to the raw read counts (Supplementary
   Fig. [97]2d). However, NCIN capping estimated from genes with
   relatively low expression exhibited higher variation compared to highly
   expressed genes, an observation potentially caused by genes with low
   read counts that tend to suffer from high variability^[98]40. To
   address this issue, we introduced the empirical Bayes shrinkage
   strategy. As shown in Supplementary Fig. [99]2e, CompassAnalyzer could
   significantly mitigate such variations. Comparatively, exogenous total
   RNA spike-ins added at the beginning step failed to correct PCR
   amplification biases, as shown by the poor performance of NCIN-capped
   RNA proportions (Supplementary Fig. [100]2f).

   Using CompassAnalyzer, we were able to analyze synthetic RNA spike-ins
   with known ratios, revealing that this platform not only captured
   metabolite-modified RNAs but also their relative ratios for the 5% of
   NCIN-capped spike-in (Fig. [101]3c). Background enrichment for the 100%
   m^7G-RNA spike-in was significantly lower (less than 2%) compared to
   the 5% NCIN-capped spike-in (Fig. [102]3c), reflective of the
   specificity. We further employed spike-in RNAs with ascending ratios of
   NAD-capped forms: 0% (i.e., 100% m^7G-capped), 1% (i.e., 99%
   m7G-capped), 5% (i.e., 95% m7G-capped), 10% (i.e., 90% m^7G-capped),
   and 20% (i.e., 80% m7G-capped) (Supplementary Fig. [103]2g). The
   CompasSeq method, with its streamlined experimental and computational
   procedures, could reach coherent results between observed and expected
   NCIN capping levels (Pearson′s r = 0.998, Fig. [104]3d), thus
   illustrating the feasibility of this method in capturing and
   quantitatively measuring endogenous metabolite-modified RNAs.

NCIN-RNA epitranscriptome

   With the CompasSeq analytical platform, we characterized NCIN-RNAs from
   young (2-month) and aged (18-month) mouse liver tissues. Following
   standard quality filtering procedures, each of the three independent
   biological replicates yielded an average of ~21.29 million high-quality
   reads that uniquely aligned to the reference genome (Supplementary
   Fig. [105]3a). PCA showed that the biological replicates of each sample
   type clustered together, indicating high reproducibility (Supplementary
   Fig. [106]3b). Dataset assessments corroborated that sequencing
   saturation has been reached (Supplementary Fig. [107]3c). Based on the
   background enrichment signal detected by 100% m^7G-RNA spike-in, which
   was less than 2% (Supplementary Fig. [108]3d), we chose 2% ratio as the
   cutoff. This criterion led us to identify a total of 5680 highly
   reproducible NCIN-RNAs (Supplementary Data [109]2; Pearson′s r = 0.895
   of the three biological replicates, Supplementary Fig. [110]3e),
   accounting for ~47.2% of the mRNA transcripts expressed in young mouse
   liver tissues. Besides RNA-seq, we also applied RT-qPCR. By analysis of
   three randomly selected genes, Ndufb10, Fkbp8, and Fads1, we calculated
   the relative abundance between NCIN-capped and all-capped groups
   (Supplementary Fig. [111]4a). The results obtained by RT-qPCR were in
   good congruence with those from high-throughput sequencing
   (Supplementary Fig. [112]4b), suggesting that CompasSeq can also be
   used for gene-specific assessment. Furthermore, comparison between the
   NCIN-RNAs, including, but not limited to, NAD-capped RNAs, identified
   in the current study with NAD-capped RNAs identified by our recently
   published ONE-seq method^[113]24 revealed a significant overlap
   (Supplementary Fig. [114]4c). For those NAD-RNAs uniquely identified by
   ONE-seq, their expression levels were significantly low (Supplementary
   Fig. [115]4d).

   We briefly analyzed the NCIN-capped epitranscriptome from young mouse
   livers. By computing NCIN capping levels, we observed that most genes
   producing NCIN caps had modification ratios between 5% and 30%, with
   some genes exhibiting exceptionally high levels, such as Rnr1 (91.1%)
   (Fig. [116]4a). NCIN capping predominantly occurred on protein-coding
   genes, but also extended to pseudogenes and noncoding RNAs
   (Fig. [117]4b). NCIN-RNAs were derived from genes located on autosomes,
   the X chromosome, and the mitochondrial genome, but not from the Y
   chromosome (Fig. [118]4c). Based on the extent of modification, we
   divided NCIN-RNAs into three categories: high (30–100%, e.g., Fads1),
   medium (12.5–30%, e.g., Fkbp8), and modest levels (2–12.5%, e.g.,
   Ndufb10) (Fig. [119]4d and Supplementary Fig. [120]4e). To inspect
   biological function, we performed pathway enrichment analysis and found
   that genes with modest NCIN capping were mainly involved in
   mitochondria-related events and energy metabolism, whereas genes with
   medium and high NICN capping levels were enriched in macroautophagy,
   vesicle-related processes, and cellular responses (Fig. [121]4e and
   Supplementary Data [122]3).

Fig. 4. The quantitative NCIN-capped epitranscriptome from mouse liver.

   [123]Fig. 4
   [124]Open in a new tab

   a The distribution of NCIN capping levels of each gene from 2-month-old
   mouse livers (n = 5462). Box plot represented the median (center line),
   the first and third quartiles (box limits), and the whiskers extended
   to 1.5 times the interquartile range. Data points beyond the whiskers
   were shown as outliers. b Pie chart showing the gene types of
   identified NCIN-RNAs. c Bar chart showing the chromosome distribution
   of identified NCIN-RNAs. d The distribution of NCIN capping levels of
   genes with modest (2–12.5%), medium (12.5–30%), and high (30–100%) NCIN
   capping levels. e Pathway enrichment analysis of genes with modest,
   medium, and high NCIN capping levels. Numbers represented the count of
   NCIN-RNA enriched in each pathway. Multiple testing corrections were
   performed using gprofiler2’s built-in “g_SCS” method. The gray dashed
   line indicated the cutoff of 0.05 adjusted p value. Exact p values and
   source data are provided as a [125]Source Data file.

Validation of CompasSeq by an orthogonal method

   To validate the CompasSeq platform by an orthogonal approach, we
   introduced two 5′−3′ exoribonucleases: dRai1 and Xrn1. Our rationale
   was based on the above observation that dRai1 specifically decaps
   NCIN-, but not m^7G-capped RNAs, leading to 5′-to-3′ cleavage
   (Fig. [126]2b). This effect could be further enhanced by Xrn1 towards
   degradation, whereas m^7G-capped transcripts would largely remain
   intact (Fig. [127]5a). We first validated the decapping activity of
   dRai1 on FAD- and dpCoA-RNAs (see Fig. [128]2b for NAD-RNA) by
   observing the appearance of lower bands on the UREA-PAGE gel
   electrophoresis (Fig. [129]5b). Second, we subjected 106 nt NAD-RNA and
   m^7G-RNA spike-ins to dRai1 and Xrn1 treatment, which showed selective
   degradation of NAD-, but not m^7G-capped, RNA (Fig. [130]5c). Prompted
   by this evidence, we hypothesized that the relative level of
   NCIN-capped transcripts could be determined by the reduction in
   abundance after treatment with dRai1 and Xrn1, compared to a mock
   treatment that includes both NCIN- and m^7G-capped forms. We named this
   method the quantitative exoribonuclease reduction assay (Fig. [131]5d).
   We subjected 1 μg of total RNA from mouse livers mixed with 0.1 ng of
   synthetic RNA spike-ins (500 nt), including m^7G, NAD, FAD, dpCoA, and
   pppA, to the assay. Consistently, only NCIN-RNAs exhibited significant
   degradation, while m^7G-RNA remained unchanged (Fig. [132]5e). To
   assess whether the assay could measure NCIN capping levels, we mixed
   1 μg of total RNA with 500 nt spike-in RNAs containing ascending ratios
   of NAD relative to m^7G-capped forms: 0%, 10%, 30%, 50%, 70%, and 90%.
   We observed that relative capping levels measured by the assay
   corresponded to their expected levels (Fig. [133]5f). Furthermore, we
   successfully validated the NCIN capping levels of five randomly
   selected genes: Idh1, F13b, Ssr3, Rnr2, and Rnr1, as identified and
   quantified by CompasSeq (Fig. [134]5g, h). As a control experiment, we
   confirmed the absence of NCIN capping in five genes not identified by
   CompasSeq (Apom, Cox6a1, Dbi, Ddt, and Lrg1) (Supplementary
   Fig. [135]5). Taken together, our results using quantitative
   exoribonuclease reduction assay strongly supported the reliability of
   CompasSeq for the identification and ratio quantification of NCIN-RNAs.

Fig. 5. Validation of CompasSeq by quantitative exoribonuclease reduction
assay.

   [136]Fig. 5
   [137]Open in a new tab

   a The validation strategy for the treatment with dRai1 and Xrn1. b
   UREA-PAGE gel showing products of processing of 38 nt FAD- and
   dpCoA-RNA 5′-ends by dRai1. c UREA-PAGE gel showing products of
   processing of 106 nt NAD- and m^7G-RNA by dRai1 and Xrn1. d Schematic
   workflow of gene-specific assessment of NCIN modification with
   quantitative exoribonuclease reduction assay. e Synthetic RNAs (500 nt)
   containing NCIN caps (NAD, FAD, and dpCoA) could be significantly and
   selectively degraded by dRai1 and Xrn1, whereas m^7G-RNA did not
   exhibit such degradation. Data were shown in mean ± s.e.m. (n = 3
   independent experiments). Two-tailed Welch′s t-tests were used to infer
   p values. n.s. denoted not significant. f Synthetic RNAs (500 nt) with
   ascending ratios of NAD-capped forms: 0% (i.e., 100% m^7G-capped), 10%,
   30%, 50%, 70%, and 90% were used for the quantitative exoribonuclease
   reduction assay. The red and blue numbers represented the expected and
   observed NCIN capping levels, respectively. The observed NCIN capping
   levels were calculated as the difference between 1 and the relative
   levels from enzymatic treatment measured using the assay. Data were
   shown in mean ± s.e.m. (n = 3 independent experiments). g Five
   endogenous genes from mouse liver were assessed by the quantitative
   exoribonuclease reduction assay. Data were shown in mean ± s.e.m.
   (n = 3 samples). h Comparison between the result obtained by
   quantitative exoribonuclease reduction assay and CompasSeq. The red
   bars represented NCIN capping levels obtained from CompasSeq (left
   axis), while the blue bars illustrated NCIN capping levels calculated
   from the assay (right axis). Data were shown in mean ± s.e.m. (n = 3
   samples). Two-tailed Welch′s t-tests were used to infer p values. n.s.
   denoted not significant. Exact p values and source data are provided as
   a [138]Source Data file.

Characterization of metabolite-capped RNA at the transcript resolution

   We proceeded to characterize newly identified NCIN-RNAs at the
   transcript resolution. Eukaryotic genomes feature transcript isoforms
   that can be derived from common gene loci. In-depth analysis revealed
   that, for most NCIN-RNAs (98.5%), their isoforms, if transcribed, all
   contained NCIN caps, e.g., Serpinf2, although the actual capping level
   varied among individual isoforms. Only 1.5% of NCIN-RNAs had select
   isoforms being modified, e.g., Zfp36 (Supplementary Fig. [139]6a, b and
   Supplementary Data [140]4). This observation prompted us to explore the
   link between NCIN capping and gene transcription, including transcript
   abundance, gene length, number of introns, alternative transcription
   start sites (ATSS)^[141]41, alternative splicing (AS)^[142]42, and
   alternative polyadenylation (APA)^[143]43. First, by dividing NCIN-RNAs
   into 10 deciles based on their NCIN capping levels, we observed that
   transcript abundance was negatively correlated with NCIN capping
   levels, while transcript length and number of introns were positively
   correlated (Supplementary Fig. [144]6c). Second, we found that RNAs
   with high levels of NCIN capping tended to possess ATSS (Supplementary
   Fig. [145]6d). Third, we noted that NCIN capping was negatively
   associated with AS events, including exon skipping, alternative 5′
   donor and 3′ acceptor sites, mutually exclusive exons, and intron
   retention (Supplementary Fig. [146]6e). Fourth, we assessed the extent
   of APA by the percentage of distal poly(A) site usage index (PDUI).
   This analysis illustrated that transcripts containing NCIN caps were
   more likely to have APA compared to RNAs without NCIN modification.
   Consistently, NCIN capping levels positively correlated with distal
   poly(A) site usage (Supplementary Fig. [147]6f). NCIN capping
   represents an evolutionarily conserved RNA
   modification^[148]5,[149]12,[150]27,[151]44,[152]45. To explore the
   potential impact of sequence conservation, we utilized PhastCons
   scores, which measure conservation based on sequence
   similarity^[153]46, to calculate the conservation score of each
   isoform. Our findings indicated that transcripts with low sequence
   conservation exhibited less metabolite capping, whereas transcripts
   with high sequence conservation tended to be highly modified
   (Supplementary Fig. [154]7a). Prior research has indicated that
   proximal RNA secondary structures can affect RNA capping^[155]47.
   Therefore, we analyzed sequence features within 100 bp downstream of
   the transcription start site (TSS), assessing potential structural
   influences based on minimum free energy (MFE)^[156]48. Our analysis
   revealed that RNAs with NCIN caps tended to adopt predicted structures
   with lower MFE than those with m^7G caps. Furthermore, RNAs with higher
   NCIN capping exhibited even lower MFE levels (Supplementary
   Fig. [157]7b, c).

NCIN capping on transcript stability and translation efficiency

   To investigate how NCIN capping affects RNA stability, we generated
   Genome-wide Mapping of Uncapped and Cleaved Transcripts (GMUCT)^[158]49
   datasets to assess mRNA degradation in mouse livers. All GMUCT
   libraries exhibited read enrichment at the 3′ end of mRNAs
   (Supplementary Fig. [159]7d), and GMUCT read abundance per transcript
   was positively correlated with mRNA levels quantified by RNA-seq
   (Supplementary Fig. [160]7e), which aligned with previously reported
   degradome sequencing patterns^[161]50. Using the GMUCT and RNA-seq
   data, we assessed transcript stability using the “proportion uncapped”
   metric (log[2][GMUCT/RNA-seq]), where higher values suggest lower
   stability. We analyzed the stability of RNA transcripts with and
   without NAD^[162]24 and NCIN caps (CompasSeq data from the current
   study), respectively. Our analysis revealed that NAD-capped RNAs in
   mouse liver exhibited a higher proportion of 5′ degradation
   intermediates, suggesting greater instability compared to RNAs without
   NAD modification (Supplementary Fig. [163]7f). These observations were
   in line with findings in Arabidopsis, where NAD capping was found to be
   associated with reduced RNA stability^[164]51. In a sharp contrast,
   NCIN-capped RNAs from CompasSeq data showed a significantly lower
   proportion of degradation intermediates (Supplementary Fig. [165]7g).
   These data suggested that NCIN caps, as the compendium of NAD, FAD,
   dpCoA caps, etc., tended to occur at transcripts with higher stability.
   To examine the impact of NCIN capping on translational efficiency, we
   calculated ribosome occupancy (RO) using available Ribo-seq and RNA-seq
   data from mouse liver (GEO: [166]GSE123981^[167]52). RO, an estimate of
   the translation efficiency, was computed as the log-ratio of Ribo-Seq
   and RNA-Seq read counts. Our analysis indicated that NCIN capping was
   positively correlated with translation efficiency (Supplementary
   Fig. [168]7h). Together, these findings suggested that NCIN caps were
   linked with transcripts exhibiting greater stability and higher
   translation efficiency.

NCIN capping dynamics during aging

   The number of genes harboring NCIN caps and their relative capping
   levels exhibited a significant decrease during aging (Fig. [169]6a).
   Moreover, we quantified the usage of metabolite caps by calculating the
   weighted sum of NCIN capping levels normalized to transcript abundance.
   Additionally, metabolite-capped RNA constituted ~8.3% of the mRNA
   transcriptome in young animals, and this proportion declined to 6.5% in
   aged mice, though the difference was not statistically significant
   (Fig. [170]6b).

Fig. 6. Features of NCIN-capped epitranscriptome.

   [171]Fig. 6
   [172]Open in a new tab

   a Violin plot showing the global dynamics of NCIN-RNAs during aging.
   Box plots represented the median (center line), the mean (red dots),
   the first and third quartiles (box limits), and the whiskers extended
   to 1.5 times the interquartile range. Data points beyond the whiskers
   were shown as outliers. Statistical analysis was performed using the
   nonparametric Mann–Whitney rank test (two-sided). b Pie chart showing
   the percentile usage of RNA caps (NCIN caps and non-NCIN cap) between
   young and old mouse livers. Statistical analysis was performed using a
   chi-squared test (p = 0.83, two-sided). c Scatterplot showing NCIN
   capping levels in young and old mouse livers. Differentially modified
   genes were defined as absolute fold change of NCIN capping levels ≥ 1.5
   in old compared to young samples. d Scatterplot showing
   log[2]-transformed normalized gene expression levels in young and aged
   mouse livers. Differentially expressed genes were defined as absolute
   fold change of normalized read counts ≥ 1.5 and adjusted p
   values < 0.05 in old compared to young animals. e Upset plot showing
   the number of intersections between differentially modified genes and
   differentially expressed genes. The size of each set was shown in the
   left bottom corner. f Heatmap showing the log[2]-transformed normalized
   gene expression levels and NCIN capping levels of selected genes. Genes
   with age-decreased NCIN capping were colored in blue, while the gene
   with age-increased NCIN capping was colored in red. g Circular lollipop
   chart showing the transcriptomic and epitranscriptomic pathway score of
   each pathway. The pathway score was defined as the log[2]-transformed
   average of fold change between normalized read counts or NCIN capping
   levels derived from transcriptome or epitranscriptome profiles from old
   and young animals, respectively. Statistical analyses were performed
   using the nonparametric Mann–Whitney rank test (two-sided). Pathways
   with significant alterations during aging (p < 0.05) were marked with
   red asterisk. h Box plot showing the log[2]-transformed normalized gene
   expression levels and NCIN capping levels of pathways with significant
   alterations during aging (n = 3 samples). Box plots represented the
   median (center line), the first and third quartiles (box limits), and
   the whiskers extended to 1.5 times the interquartile range. Data points
   beyond the whiskers were shown as outliers. Statistical analyses were
   performed using the nonparametric Mann–Whitney rank test (two-sided).
   Exact p values and source data are provided as a [173]Source Data file.

   We further investigated the impact of age on NCIN capping. Our analysis
   revealed that 23.3% of genes producing NCIN-capped isoforms exhibited a
   uniform decrease in capping levels with age (e.g., Bri3), while only
   0.5% showed a consistent increase (e.g., Ghitm; Supplementary
   Fig. [174]8a). Additionally, 5.8% of genes with NCIN-capped isoforms
   conferred discordant changes with age, with some isoforms increasing
   while others decreasing in capping levels (e.g., Spryd7; Supplementary
   Fig. [175]8a, b). Notably, specific isoforms could even lose their
   metabolite caps with age (e.g., Comt; Supplementary Fig. [176]8b).

   We conducted a comparative analysis between the epitranscriptome and
   transcriptome (Fig. [177]6c, d and Supplementary Data [178]5).
   Strikingly, most genes that exhibited altered NCIN capping with age
   remained unchanged at the RNA level (Fig. [179]6e). We further devised
   a scheme to filter NCIN-RNAs based on low variation, high modification,
   and high dynamics (Supplementary Fig. [180]9a). This approach allowed
   us to pinpoint five NCIN-capped genes that had the most significant
   changes during aging. Four genes showed decreased NCIN capping,
   including those involved in diphosphate hydrolysis (Nudt5), the
   cAMP-dependent pathway to Golgi (Pde4dip), lipid transport (Stoml1),
   and cytosolic sulfonation (Sult1c2), while one gene, associated with
   selenoprotein production (Selenot), exhibited an increase in NCIN
   capping (Fig. [181]6f).

   To inspect the potential impact of metabolite capping during aging, we
   performed pathway enrichment analysis and found that genes with
   decreased modification were involved in translation, ribosome-related,
   and mitochondrion-related events (Supplementary Fig. [182]9b and
   Supplementary Data [183]6). Notably, transcriptome analysis showed that
   genes with increased expression during aging were also involved in
   translation and ribosome-related events, while genes with decreased
   expression were enriched in lipid, glucose, steroid, and amino acid
   metabolism (Supplementary Fig. [184]9b). These results reflected that
   genes involved in basic cellular events conferred opposite changes in
   transcriptome and NCIN-capped epitranscriptome during aging. We further
   examined how the transcriptome and NCIN-capped epitranscriptome
   impinged on aging pathways. Hallmarks of aging, including inflammation,
   mitochondrion, proteostasis, cellular senescence, cell communication,
   and transcription regulation, showed discordant trends between the
   transcriptome and epitranscriptome with age (Fig. [185]6g).
   Specifically, genes involved in immune system process, translation,
   peptide metabolism, cell communication, and signal transduction had a
   marked increase in expression during aging, while exhibiting a
   significant decline in NCIN capping (Fig. [186]6h).

   To investigate the impact of NCIN capping on RNA stability during
   aging, we applied GMUCT datasets from young and aged mouse livers and
   grouped the transcripts based on their modification status. Using
   NAD-RNA data, a late-onset increase of NAD capping was significantly
   associated with a larger proportion of uncapped transcripts in old
   compared to young animals (Supplementary Fig. [187]9c). This finding
   suggested a link between elevated NAD capping during aging with
   increased RNA degradation. In contrast, we did not observe a
   significant correlation between age-associated changes in NCIN capping
   and RNA degradation (Supplementary Fig. [188]9d), consistent with RNA
   expression data shown in Fig. [189]6e. Similarly, we analyzed Ribo-seq
   datasets (GEO: [190]GSE123981) with NCIN capping and noted that
   transcripts with altered modification of metabolite in aged mouse
   livers were found to be associated with reduced translation efficiency
   (Supplementary Fig. [191]9e).

   In summary, we introduce CompasSeq, a strategy for the comprehensive
   and quantitative ratio assessment of NCIN-capped RNAs in eukaryotic
   organisms. Using CompasSeq, we elucidate the NCIN-capped
   epitranscriptome, encompassing not only the types of RNAs capped by
   NCIN but also their relative capping levels, from adult aging mouse
   livers. We unveil distinct features between the transcriptome and
   NCIN-capped epitranscriptome, suggesting that metabolite capping may
   serve as a hidden layer of gene regulation that occurs with age.

Discussion

   Nucleotide-containing metabolites, including NAD, NADH, FAD, and dpCoA,
   can be incorporated at the 5′ terminus of RNAs, resulting in
   NCIN-capped RNAs. Currently available methods are limited to the
   identification of specific metabolite-modified RNAs, e.g.,
   NAD-RNA^[192]9,[193]22,[194]24, FAD-RNA^[195]19 and
   dpCoA-RNA^[196]18,[197]53,[198]54, and lack quantitative assessment of
   the ratio between m^7G-capped and metabolite-capped transcripts. The
   CapZyme-seq method has been used to identify NCIN-RNAs in
   prokaryotes^[199]27; however, due to the prevalence of the m^7G cap in
   eukaryotic transcriptomes and the absence of specifically designed
   spike-ins as well as computational solutions, this method and others
   are not applicable for investigating eukaryotic NCIN-RNAs, especially
   their capping ratios. In this study, we present CompasSeq, an
   analytical platform that integrates both experimental and computational
   frameworks to simultaneously capture and quantify NCIN capping levels
   at the epitranscriptome level in eukaryotes.

   For the experimental part, we meticulously design enzymatic reactions,
   selecting cap-specific enzymes and ordering treatments to generate a
   monophosphate moiety at the 5′-end of NCIN-capped and all-capped RNAs,
   respectively, followed by adapter ligation and PCR enrichment. To
   minimize biases introduced during PCR amplification, we devise a
   spike-in normalization strategy. Pre-treated exogenous RNAs are added
   proportionally to adapter-ligated samples, departing from traditional
   methods like ERCC in which spike-ins are added at the beginning of the
   experiment^[200]55. This approach ensures an unbiased representation of
   NCIN-capped RNAs that represent a minor subset of the total RNA pool.
   Adding exogenous RNA spike-ins after enzymatic treatments but prior to
   library construction captures discrepancies in spike-in proportions
   between all-capped and NCIN-capped libraries, revealing PCR-induced
   biases. For the computational part, we develop CompassAnalyzer, an R
   package that applies stepwise normalization and empirical Bayes
   shrinkage to accurately recover NCIN- and all-capped RNA. The CompasSeq
   analytical platform enables quantitative assessment of the NCIN-capped
   epitranscriptome, identifying not only the types of RNAs capped by NCIN
   but also their relative capping levels. Moreover, we devise an
   orthogonal method, the quantitative exoribonuclease reduction assay, to
   independently validate CompasSeq. By harnessing the decapping activity
   of dRai1 and the 5′–3′ exonuclease activity of Xrn1, this assay allows
   gene-specific quantification of NCIN capping levels. However, this
   assay is not yet feasible for sequencing-based epitranscriptome-wide
   assessment, due to the lack of appropriate spike-ins to normalize
   unwanted variation from enzymatic treatment. Altogether, we provide
   compelling evidence to support the notion that CompasSeq is a reliable
   approach for the identification and quantification of NCIN-RNAs at the
   epitranscriptome level.

   With CompasSeq, we reveal insight into NCIN-capped epitranscriptome.
   First, we identify that over 45% of the transcripts from mouse liver
   are modified with metabolite caps, with most displaying capping ratios
   between 5 and 30%. Second, we determine that metabolite caps constitute
   between 6 and 8% of mRNA pool from adult mouse liver tissues,
   representing a significant type of RNA modification. A previous
   analysis using mass spectrometry has estimated that NAD and FAD
   combined make up ~0.8% in the mRNA transcriptome, rising to 3.7% when
   including NAD, FAD, UDP-Glc and UDP-GlcNAc caps^[201]13. Our current
   measurements, based on a weighted sum of a near-complete spectrum of
   NCIN caps normalized to gene abundance, provide an improved overview of
   metabolite usage as caps in the mRNA transcriptome. Third, we observe
   that select genes, such as Rnr1 and Rnr2 with 85.4% and 91.1% of NCIN
   capping, respectively (see Fig. [202]4a), display significant NCIN
   compared to m^7G capping. Particularly, these two genes are ribosomal
   RNAs encoded and localized in mitochondria. It has been reported that
   nucleotide-containing metabolites, e.g., NAD^[203]56 and FAD^[204]57,
   are highly abundant in mitochondria and that RNA polymerase in
   mitochondria exhibits more efficiency for NCIN capping compared to its
   counterpart in cytoplasm^[205]44. Hence, it is tempting to predict that
   such high NCIN modification of Rnr1 and Rnr2 may have a functional
   relevance. Fourth, in young mouse livers, we demonstrate that the
   combined effect of NCIN capping, despite negative regulation of NAD
   cap, would promote RNA stability. Similarly, NCIN capping positively
   correlates with translation efficiency, though the contribution of
   specific metabolite caps warrants further investigation. Fifth, we
   delineate the dynamics of NCIN-capped epitranscriptome during aging,
   revealing a significant decrease at both global and transcript levels.
   Though a majority of transcripts with altered NCIN capping remain
   unchanged at the transcriptome level, we observe a discordant trend
   between RNA expression and NCIN capping on genes linked to aging
   hallmarks, e.g., inflammation, mitochondrial function, proteostasis,
   cellular senescence, and intercellular communication^[206]30. Moreover,
   in aged mouse livers, NCIN capping appears to have confounding effects
   on RNA stability and translation efficiency. It is possible that
   different types of metabolite caps, when combined with their dynamic
   changes with alterations in cellular machineries of RNA transcription
   and protein synthesis that occur with age, may exert distinct outcomes
   on RNA stability and translation. Taken together, CompasSeq reveals
   features of the NCIN-capped epitranscriptome, suggesting a
   yet-to-be-discovered regulatory mechanism.

   Finally, we raise important issues and potential limitations of our
   current study. First, synthetic NCIN-RNAs by in vitro transcription,
   such as NAD-RNA, FAD-RNA, and dpCoA-RNA, are difficult to achieve 100%
   purity, a phenomenon that has been seen in the current study (see
   Fig. [207]2a, b) as well as by others^[208]22,[209]36,[210]58, such
   that these synthetic NCIN-RNA spike-ins can be useful as internal
   quality controls, but they are not appropriate for normalization
   procedures. Consequently, CompassAnalyzer, the computational procedures
   of CompasSeq, solely rely on exogenous Drosophila total RNAs for
   stepwise normalizations. While our strategy effectively corrects PCR
   amplification bias, it does not control for variability introduced at
   the earlier steps. Thus, the CompasSeq platform provides an
   approximate, but not absolute, ratio assessment of NCIN-RNAs. Second,
   we apply stringent filtering criteria for quality data; however, this
   may exclude genes with low expression. Third, CompasSeq can detect
   transcript with very low modification, such as those equal to or lower
   than 1% capping by metabolite, but their measurement of exact ratio
   usually exhibits high variation (see Fig. [211]3d). Fourth, we leverage
   the magnitude of fold change in NCIN capping levels between young and
   aged animals; yet quantitative assessment of age-associated alterations
   in NCIN-capped epitranscriptome is only beginning to be revealed.
   Future studies that focus on developing statistical methods for
   differential analysis of NCIN-RNA sequencing data would facilitate the
   robustness of detecting differential NCIN-RNAs in response to
   physiological and even pathological conditions. Fifth, although the use
   of pyrophosphatase, such as NudC, is effective in identifying bulk
   NCIN-RNAs, it cannot distinguish specific types of metabolite caps.
   Future advancements, such as employing specific RNA decapping enzymes
   like AtNUDX23, which selectively cleaves FAD caps^[212]19, will enable
   ratio analysis for specific metabolite caps. Nevertheless, the
   CompasSeq analytical platform provides a foundational framework for the
   quantification and ratio assessment of metabolite-capped RNAs in
   eukaryotes.

Methods

Ethics statement

   All animal procedures have been reviewed and approved by the
   Institutional Animal Care and Use Committee at the Chinese Academy of
   Sciences and were in accordance with the Guide for the Care and Use of
   Laboratory Animals of the Chinese Academy of Sciences. All efforts were
   made to minimize the suffering of the animals.

Animal experimentation

   C57BL/6 male mice were housed in an environmentally controlled room
   under a 12:12 h light/dark cycle at 23 °C and were provided commercial
   mouse chow food and water ad libitum. To profile the NCIN-capped
   epitranscriptome, livers were dissected from both 2-month and
   18-month-old C57BL/6 male mice and immediately frozen in liquid
   nitrogen. To avoid confounding issues related to sex-specific gene
   expression, we exclusively used male mice throughout the study. For the
   collection of 3-day-old fly samples as exogenous spike-ins, newly
   emerged male Drosophila melanogaster (BDSC #5905) were gathered and
   aged for 72 h, after which all samples were promptly frozen in liquid
   nitrogen.

In vitro transcription of NCIN-RNAs, and m^7G-RNA

   RNA spike-ins were synthesized by T7 polymerase as previously
   described^[213]24. The DNA templates were synthesized by Genewiz
   (Supplementary Data [214]1). Importantly, adenine was used exclusively
   as the initial nucleotide, with no additional adenine present in the
   DNA template sequence. Initiating adenosine allowed the incorporation
   of adenine-containing metabolites, e.g., NAD, FAD, and dpCoA, at the +1
   position to form an NCIN cap. For the synthesis of NCIN- and m7G-capped
   RNAs, ATP was excluded from the transcription reaction cocktail. For
   annealing oligonucleotides, 10 μM DNA oligonucleotides were combined in
   50 μL of 1× Cut Smart buffer (NEB, catalog: B7204S) and annealed by
   heating to 95 °C for 5 min. The reaction mixture was then cooled down
   to 4 °C at a ramp rate of 0.1 °C/s in a thermocycler (Eppendorf,
   Germany). For in vitro transcription,10 μM of double-stranded DNA
   templates were combined in 100 μL 1× transcription buffer (Thermo
   Fisher Scientific, catalog: AM1334), along with 1 mM each of GTP, CTP
   and UTP, and 1 mM ATP for ppp-RNA. For other capped RNA forms, the
   reaction included 4 mM NAD (Macklin, catalog: N814818, for NAD-RNA),
   4 mM FAD (Sigma-Aldrich, catalog: F6625, for FAD-RNA), 4 mM dpCoA
   (Sigma-Aldrich, catalog: D3385, for dpCoA-RNA) or 4 mM m^7GpppA (NEB,
   catalog: S1406S, for m^7GpppA-RNA). Additionally, 10 μL of T7 RNA
   polymerase (Thermo Fisher Scientific, catalog: AM1334), 5% DMSO, 5 mM
   DTT and 20 U of RNaseOUT (Takara Bio, catalog: 2313B) were added to the
   mixture, which was then incubated at 37 °C for 4 h. Following
   transcription, the reaction mixture was treated with 10 U of DNase I
   (Thermo Fisher Scientific, catalog: AM1334) at 37 °C for 30 min to
   remove the DNA template. The RNA was then purified using RNAiso Plus
   (Takara Bio, catalog: 9108) and precipitated with isopropanol and 0.3 M
   sodium acetate (Thermo Fisher Scientific, catalog: AM9740) at −80 °C
   overnight. The RNA pellet was washed twice with 75% ethanol, air-dried,
   and redissolved in DEPC-treated H[2]O. The 500 nt IVT RNA molecules
   were subjected to polyadenylation for poly(A) tail elongation, in
   accordance with the manufacturer′s instructions (E. coli Poly(A)
   Polymerase, TransGen, catalog: LA101-01). The quality of the in
   vitro-transcribed 38 nt RNAs was further analyzed by boronate affinity
   electrophoresis with an 8 M urea, 1× TBE, 12% polyacrylamide gel
   supplemented with 0.3% 3-acrylamidophenylboronic acid^[215]37.

Purification of Drosophila dRai1 protein

   The coding sequence of Drosophila dRai1 gene was inserted into the
   pET28a expression vector, fused with a hexa-His tag at the C-terminus.
   The plasmid was transformed into the E. coli BL21 (DE3) strain
   (TransGen, catalog: CD601-02). Bacteria were cultured at 37 °C until
   the OD[600] reached 0.8, after which protein expression was induced
   with 0.5 mM IPTG and at 18 °C for 18 h. The recombinant proteins were
   purified using a Ni Focurose FF (IMAC) column (Huiyan Bio, catalog:
   HQ060312001L). The purified protein was stored in a buffer containing
   20% (v/v) glycerol and 2 mM DTT at −80 °C (Supplementary Fig. [216]1).

NudC, dRai1, RppH, and yDcpS enzymatic reactions

   For assays shown in Fig. [217]2c–e, 1 µg of either NCIN- or m^7G-RNA
   (38 nt) spike-ins was used. 10 pmol of NudC (NEB, catalog: M0607S) was
   mixed with RNA in 40 μL of NEBuffer r3.1 at 37 °C for 1 h in the
   presence of 20 U of RNaseOUT and 200 mM DTT. For assays shown in
   Figs. [218]2c, [219]5b, 0.5 µL of purified dRai1 (final concentration:
   340 nM) was incubated with RNAs in 20 μL of NEBuffer r3.1 at 37 °C for
   1 h in the presence of 10 U of RNaseOUT. The RppH reaction was
   performed by adding 5 U of RppH (NEB, catalog: M0356) to RNAs,
   resuspended in 50 μL of 1× ThermoPol Reaction Buffer (NEB, catalog:
   B9004S) and incubated at 37 °C for 1 h in the presence of 20 U of
   RNaseOUT. Excess yDcpS enzyme (400 U, NEB, catalog: M0463S) was
   incubated with RNAs in 50 μL of 1× yDcpS reaction buffer at 37 °C for
   1 h in the presence of 20 U of RNaseOUT. The products were purified
   with RNAiso Plus and analyzed by 12% polyacrylamide UREA-PAGE gels.
   Gels were stained by SYBR Gold (Invitrogen, catalog: S11494) and
   fluorescence was detected by the Fusion Solo S imaging system (Vilber).

Enzymatic treatment and adapter ligation of NAD-RNA (38 nt) and m^7G-RNA (38
nt)

   To assess the capturing specificity for assays shown in Fig. [220]2f,
   1 µg of NAD-RNA (38 nt) was decapped with 10 pmol of NudC as described
   above. The products were purified using RNAiso Plus and then ligated
   with 100 pmol of 5′ adapter oligo (27 nt, Supplementary Data [221]1,
   synthesized by Genewiz), in the presence of 20 U of T4 RNA ligase 1
   (NEB, catalog: M0202), 15% of PEG8000, 10% DMSO, 1 mM ATP and 20 U of
   RNaseOUT in 40 μl of 1× T4 RNA ligase buffer at 16 °C for 16 h.
   Decapping of 1 µg of m^7G-RNA (38 nt) was carried out using an excess
   of yDcpS enzyme as described above. Products were purified with RNAiso
   Plus and further incubated with 10 U of CIAP (Thermo Fisher Scientific,
   catalog: 18009019) in the presence of 20 U of RNaseOUT in 1×
   Dephosphorylation Buffer at 37 °C for 1 h to remove the 5′-terminal
   phosphate from yDcpS-treated RNAs. After purification with RNAiso Plus,
   the CIAP-treated RNAs were ligated with 100 pmol of 5′ adapter oligo
   (27 nt) as described above.

   Decapping of 1 µg of m^7G-RNA (38 nt) was performed using 10 U of RppH
   in the presence of 20 U of RNaseOUT in 50 μL of 1× ThermoPol Reaction
   Buffer at 37 °C for 90 min. Following purification with RNAiso Plus,
   the RppH-treated RNAs were ligated with 100 pmol of 5′ adapter oligos
   (27 nt) as described above. The ligation products were then purified
   with RNAiso Plus and analyzed using 8% polyacrylamide UREA-PAGE gels.

qPCR analysis of NCIN-RNAs (500 nt) and m^7G-RNA (500 nt)

   For assays shown in Fig. [222]3a, b, polyadenylated spike-in RNAs were
   generated from different sequence templates, including 1 ng each of
   NAD-, FAD-, dpCoA-, m^7G-, and ppp-RNA (500 nt). These spike-ins were
   mixed with 20 μg of total RNAs from 2-month C57BL/6 mouse livers
   (n = 3) and then subjected to CompasSeq enzymatic reactions. For
   procedures capturing NCIN-RNAs, RNAs were incubated with excess yDcpS
   enzyme (400 U) in the presence of 20 U of RNaseOUT in 50 μL of 1× yDcpS
   reaction buffer at 37 °C for 1 h. The products were purified using
   RNAiso Plus and subsequently incubated with 10 U of CIAP as described
   above. For procedures capturing all-capped RNAs, RNAs were incubated
   with 10 U of CIAP to remove pre-existing 5′-phosphate moieties. The
   purified products were then incubated with 10 U of RppH in the presence
   of 20 U of RNaseOUT in 50 μL of 1× ThermoPol Reaction Buffer at 37 °C
   for 90 min.

   Purified products from two procedures were incubated with 10 pmol NudC
   in the presence of 20 U of RNaseOUT and 200 mM DTT in 40 μL of NEBuffer
   r3.1 at 37 °C for 90 min. Products were purified with RNAiso Plus and
   subsequently ligated with 100 pmol 5′ adapter oligo (38 nt,
   Supplementary Data [223]1, synthesized at Genewiz), in the presence of
   20 U of T4 RNA ligase 1, 15% of PEG8000, 10% DMSO, 1 mM ATP, and 20 U
   of RNaseOUT in 40 μL of 1× T4 RNA ligase buffer at 16 °C for 16 h.
   Following purification, products were re-dissolved in 20 μl of
   DEPC-treated H[2]O. Reverse transcription was performed using designed
   primers listed in Supplementary Data [224]1 with the ABScript II cDNA
   First-Strand Synthesis Kit (ABclonal, catalog: RK20400). 5 μL of the
   products were diluted to 100 μL and served as input. Meanwhile, 5 μL of
   reverse-transcribed products were further amplified by PCR using
   adapter-specific primers listed in Supplementary Data [225]1, in
   accordance with the manufacturer′s instructions (Phanta® Max
   Super-Fidelity DNA Polymerase, Vazyme, catalog: P505-d1). The PCR
   program included an initial denaturation step of 3 min at 95 °C,
   followed by 18 cycles of amplification (denaturation for 15 s at 95 °C,
   annealing for 15 s at 60 °C, and extension for 6 min at 72 °C), and a
   final extension for 5 min at 72 °C. The ppp-RNA (500 nt) was used as
   the baseline control. qPCR was conducted using SYBR Green Fast qPCR Mix
   (ABclonal, catalog: RK21203) in accordance with the manufacturer′s
   protocol with primers listed in Supplementary Data [226]1. To calculate
   the enrichment from qPCR data, the Ct value of the target gene was
   first normalized to the Ct value of the ppp-RNA (baseline). Next, the
   normalized PCR+ fraction value (ΔCt of the target gene normalized to
   the ppp-RNA) was normalized to the background (ΔCt calculation for the
   gene in the input) to yield the ΔΔCt value. The linear conversion of
   this ΔΔCt resulted in the fold enrichment. Statistical significance was
   assessed using Student′s t test.

CompasSeq protocol

RNA extraction and spike-in addition

   For the assays shown in Fig. [227]3c, total RNAs (20 μg) from 2-month
   C57BL/6 mouse liver (n = 3) were mixed with 0.014 ng of 3′-end
   polyadenylated spike-in RNA (500 nt) containing 100% m^7G-RNA, 5%
   NAD-RNA/95% m^7G-RNA, and 5% dpCoA-RNA/95% m^7G-RNA, respectively. For
   the assays shown in Fig. [228]3d, total RNAs (20 μg) from 2-month
   C57BL/6 mouse liver (n = 3) were mixed with 0.014 ng of 3′-end
   polyadenylated spike-in RNA (500 nt) that had 0% NAD-RNA/100% m^7G-RNA,
   5% NAD-RNA/95% m^7G-RNA, 10% NAD-RNA/90% m^7G-RNA, 15% NAD-RNA/85%
   m^7G-RNA, and 20% NAD-RNA/80% m^7G-RNA, respectively. For the assays
   shown in Figs. [229]4 and [230]6, total RNAs were extracted from three
   biological replicates of 2-month and 18-month-old C57BL/6 mouse
   samples. A total of 20 μg of RNA was mixed with 0.014 ng of 3′-end
   polyadenylated spike-in RNA (500 nt) containing 0% NAD-RNA/100%
   m^7G-RNA, 1% NAD-RNA/99% m^7G-RNA, 5% NAD-RNA/95% m^7G-RNA, and 20%
   NAD-RNA/80% m^7G-RNA, respectively.

Enzymatic treatments and adapter ligation

   The mixture of total RNAs and spike-in RNA (500 nt) underwent CompasSeq
   sequential enzymatic reactions following the protocols as described in
   “qPCR analysis of NCIN-RNAs (500 nt) and m^7G-RNA (500 nt)” part. We
   incubated the RNAs at 65 °C for 5 min before ligation to prevent RNA
   secondary structure interference. To capture PCR amplification bias,
   barcoded adapters were first introduced into the 5′-end of spike-in
   total RNA from Drosophila using the enzymatic procedures for capturing
   all-capped RNAs. Subsequently, the adapter-ligated Drosophila spike-in
   RNAs were then added to each sample at a mass ratio of 50:1. The mixed
   RNA was subjected to reverse transcription followed by PCR
   amplification.

Reverse transcription and amplification

   Reverse transcription was performed following the protocols as
   described in “qPCR analysis of NCIN-RNAs (500 nt) and m^7G-RNA (500
   nt)” part. The PCR program included an initial denaturation step at
   95 °C for 3 min, followed by 16 cycles of denaturation at 95 °C for
   15 s, annealing at 60 °C for 15 s, and extension at 72 °C for 6 min,
   with a final extension at 72 °C for 5 min.

Library preparation and sequencing

   The purified PCR products were used for NGS library construction
   following the manufacturer′s instructions (AFTMag NGS DNA Clean Beads,
   Abclonal, catalog: RK20257; TruePrep® DNA Library Prep Kit V2 for
   Illumina, Vazyme, catalog: TD501). Library quality was assessed by the
   Bioanalyzer 2100 (Agilent, United States), and quantification was
   performed by RT-qPCR with reference to a standard library. Libraries
   were pooled in equimolar amounts to a final concentration of 2 nM and
   denatured with 0.1 M NaOH (Sigma, catalog: 72068). Sequencing was
   performed on the Illumina NovaSeq 6000 system (paired-end, 150 bp).

High-throughput sequencing data analysis

   All sequencing reads were processed using Trim Galore (v0.6.6)^[231]59
   with the parameters “--nextseq 30 --paired” to remove the adapter
   sequences (AGATCGGAAGAGC) from NovaSeq platforms. Reads longer than 20
   bp were retained. Quality-controlled reads were then mapped to the
   combined genome of Mus musculus (GRCm39) and Drosophila melanogaster
   (dmel-all-chromosome-r6.36) using STAR (v2.7.6a)^[232]60 with default
   parameters, respectively. Uniquely mapped read pairs were counted
   against merged annotations from Mus musculus (Ensembl: GRCm39.104) and
   Drosophila melanogaster (Flybase: dmel-all-r6.36) using featureCounts
   (v2.0.1)^[233]61 with parameters “-p -B -C” and summarized as
   gene-level counts. Transcript abundance was calculated based on the
   transcript FASTA files downloaded from Ensembl release-104
   (Mus_musculus.GRCm39.cdna.all.fa.gz and
   Drosophila_melanogaster.BDGP6.32.cdna.all.fa.gz) using Salmon
   (v1.10.2)^[234]62 with default parameters. Sequencing saturation was
   assessed by randomly subsampling the original libraries and examining
   the corresponding changes in the number of mouse genome-derived genes
   with more than 10 read counts. Read alignments were converted to bigwig
   files by bamCoverage^[235]63 and visualized at selected genomic regions
   by IGV (v2.16.0)^[236]64. Gene annotation information, such as
   chromosome, gene types, and gene lengths, was retrieved from Ensembl
   annotations (GRCm39.104). Visualization of the data was performed using
   various R packages: violin plots, boxplots, bar plots, line charts, the
   density plot, and scatter plots were generated with ggplot2
   (v3.4.2)^[237]65; the upset plot was created with UpSetR
   (v1.4.0)^[238]66; and the heatmap was plotted using ComplexHeatmap
   (v2.14.0)^[239]67.

CompassAnalyzer

   CompassAnalyzer was implemented in R and is publicly available at
   GitHub ([240]https://github.com/thereallda/compassanalyzer). The
   workflow comprised three main steps: (1) quality control, (2) stepwise
   normalizations, and (3) NCIN capping level quantification. The detailed
   steps were outlined below.

Quality control

   The objective of quality control was to eliminate problematic or noisy
   observations from downstream analysis. In this study, we used gene
   filtering to ensure data quality. To keep well-detected genes across
   samples, mouse genes with more than 29 counts in all sequencing
   libraries were retained for subsequent analysis. Drosophila transcripts
   with more than 29 counts in all sequencing libraries were kept for
   normalization procedures. All ribosomal RNA-encoded genes and TEC genes
   were excluded.

Stepwise normalizations

   CompassAnalyzer implemented stepwise normalizations to adjust for
   unwanted variations, enabling the quantitative assessment of the NCIN
   capping level of individual transcripts. The goal of stepwise
   normalizations was to recover the ratio of RNA abundance between
   samples capturing NCIN-capped and all-capped RNAs. CompassAnalyzer
   first employed global scaling normalization to adjust gene-level read
   counts by library-wise scaling factors. Since equal amounts of
   Drosophila RNAs were added as spike-ins to samples after adapter
   ligation, subsequent amplification bias could be represented by the
   discordance of spike-in counts. Library-wise scaling factors (f) were
   generated based on read counts from spike-in using the trimmed mean of
   M-values (TMM) normalization method from the R package edgeR
   (v3.40.2)^[241]68. Particularly, CompassAnalyzer adjusted the
   library-wise scaling factors based on the relative proportion of
   spike-in read counts
   [MATH:
   <msubsup><mrow><mi>p</mi></mrow><mrow><mi>k</mi></mrow><mrow><mo>*</mo>
   </mrow></msubsup> :MATH]
   . The proportion of spike-in read counts was defined as
   [MATH:
   <msub><mrow><mi>p</mi></mrow><mrow><mi>k</mi></mrow></msub><mo>=</mo><m
   frac><mrow><msub><mrow><mi>Y</mi></mrow><mrow><mi>s</mi><mi>p</mi><mi>i
   </mi><mi>k</mi><mi>e</mi><mo>−</mo><mi>i</mi><mi>n</mi><mo>,</mo><mi>k<
   /mi></mrow></msub></mrow><mrow><msub><mrow><mi>Y</mi></mrow><mrow><mi>s
   </mi><mi>a</mi><mi>m</mi><mi>p</mi><mi>l</mi><mi>e</mi><mo>,</mo><mi>k<
   /mi></mrow></msub></mrow></mfrac> :MATH]
   , and the relative proportion of spike-in read counts of all-capped RNA
   and NCIN-capped RNA libraries was defined as
   [MATH:
   <msubsup><mrow><mi>p</mi></mrow><mrow><mi>i</mi><mi>n</mi><mo>,</mo><mi
   >k</mi></mrow><mrow><mo>*</mo></mrow></msubsup><mo>=</mo><mfrac><mrow><
   msub><mrow><mi>p</mi></mrow><mrow><mi>i</mi><mi>n</mi><mo>,</mo><mi>k</
   mi></mrow></msub></mrow><mrow><msub><mrow><mover
   accent="true"><mrow><mi>p</mi></mrow><mo>¯</mo></mover></mrow><mrow><mi
   >i</mi><mi>n</mi></mrow></msub></mrow></mfrac> :MATH]
   and
   [MATH:
   <msubsup><mrow><mi>p</mi></mrow><mrow><mi>e</mi><mi>n</mi><mo>,</mo><mi
   >k</mi></mrow><mrow><mo>*</mo></mrow></msubsup><mo>=</mo><mfrac><mrow><
   msub><mrow><mi>p</mi></mrow><mrow><mi>e</mi><mi>n</mi><mo>,</mo><mi>k</
   mi></mrow></msub></mrow><mrow><msub><mrow><mover
   accent="true"><mrow><mi>p</mi></mrow><mo>¯</mo></mover></mrow><mrow><mi
   >e</mi><mi>n</mi></mrow></msub></mrow></mfrac> :MATH]
   , respectively. Here,
   [MATH:
   <msub><mrow><mi>Y</mi></mrow><mrow><mi>s</mi><mi>p</mi><mi>i</mi><mi>k<
   /mi><mi>e</mi><mo>−</mo><mi>i</mi><mi>n</mi><mo>,</mo><mi>k</mi></mrow>
   </msub> :MATH]
   was the sum of spike-in read counts in library
   [MATH: <mi>k</mi> :MATH]
   and
   [MATH:
   <msub><mrow><mi>Y</mi></mrow><mrow><mi>s</mi><mi>a</mi><mi>m</mi><mi>p<
   /mi><mi>l</mi><mi>e</mi><mo>,</mo><mi>k</mi></mrow></msub> :MATH]
   was the sum of sample read counts in library
   [MATH: <mi>k</mi> :MATH]
   ,
   [MATH: <msub><mrow><mover
   accent="true"><mrow><mi>p</mi></mrow><mo>¯</mo></mover></mrow><mrow><mi
   >i</mi><mi>n</mi></mrow></msub> :MATH]
   and
   [MATH: <msub><mrow><mover
   accent="true"><mrow><mi>p</mi></mrow><mo>¯</mo></mover></mrow><mrow><mi
   >e</mi><mi>n</mi></mrow></msub> :MATH]
   were the average proportion of spike-in read counts for all-capped RNA
   and NCIN-capped RNA libraries, respectively. Therefore, the scaled read
   counts
   [MATH:
   <msubsup><mrow><mi>Y</mi></mrow><mrow><mi>g</mi><mi>k</mi></mrow><mrow>
   <mo>*</mo></mrow></msubsup> :MATH]
   for gene
   [MATH: <mi>g</mi> :MATH]
   in library
   [MATH: <mi>k</mi> :MATH]
   can be defined by
   [MATH:
   <msubsup><mrow><mi>Y</mi></mrow><mrow><mi>g</mi><mi>k</mi></mrow><mrow>
   <mo>*</mo></mrow></msubsup><mo>=</mo><msub><mrow><mi>Y</mi></mrow><mrow
   ><mi>g</mi><mi>k</mi></mrow></msub><mo>/</mo><mrow><mo>(</mo><mrow><msu
   b><mrow><mi>f</mi></mrow><mrow><mi>k</mi></mrow></msub><mo>*</mo><msubs
   up><mrow><mi>p</mi></mrow><mrow><mi>k</mi></mrow><mrow><mo>*</mo></mrow
   ></msubsup></mrow><mo>)</mo></mrow> :MATH]
   , where
   [MATH:
   <msub><mrow><mi>Y</mi></mrow><mrow><mi>g</mi><mi>k</mi></mrow></msub>
   :MATH]
   was the raw read counts for gene
   [MATH: <mi>g</mi> :MATH]
   in library
   [MATH: <mi>k</mi> :MATH]
   . Next, CompassAnalyzer introduces the adjustment factors to adjust
   gene-level read counts. Genes with
   [MATH:
   <msubsup><mrow><mi>Y</mi></mrow><mrow><mi>e</mi><mi>n</mi><mo>,</mo><mi
   >g</mi><mi>k</mi></mrow><mrow><mo>*</mo></mrow></msubsup><mo>/</mo><msu
   bsup><mrow><mi>Y</mi></mrow><mrow><mi>i</mi><mi>n</mi><mo>,</mo><mi>g</
   mi><msup><mrow><mi>k</mi></mrow><mrow><mo>′</mo></mrow></msup></mrow><m
   row><mo>*</mo></mrow></msubsup> :MATH]
   exceeding 1 in at least one sample were denoted as the gene set
   [MATH: <mi>G</mi><mi>c</mi> :MATH]
   by default. For each sample
   [MATH: <mi>j</mi> :MATH]
   , the adjustment factors can be estimated as
   [MATH:
   <msub><mrow><mi>r</mi></mrow><mrow><mi>j</mi></mrow></msub><mo>=</mo><m
   frac><mrow><mn>1</mn></mrow><mrow><msub><mrow><mi>N</mi></mrow><mrow><m
   i>G</mi><mi>c</mi></mrow></msub></mrow></mfrac><mo>*</mo><mo>∑</mo><mro
   w><mo>(</mo><mrow><msub><mrow><mi>log</mi></mrow><mrow><mn>2</mn></mrow
   ></msub><msubsup><mrow><mi>Y</mi></mrow><mrow><mi>e</mi><mi>n</mi><mo>,
   </mo><mi>G</mi><mi>c</mi><mi>k</mi></mrow><mrow><mo>*</mo></mrow></msub
   sup><mo>−</mo><msub><mrow><mi>log</mi></mrow><mrow><mn>2</mn></mrow></m
   sub><msubsup><mrow><mi>Y</mi></mrow><mrow><mi>i</mi><mi>n</mi><mo>,</mo
   ><mi>G</mi><mi>c</mi><msup><mrow><mi>k</mi></mrow><mrow><mo>′</mo></mro
   w></msup></mrow><mrow><mo>*</mo></mrow></msubsup></mrow><mo>)</mo></mro
   w> :MATH]
   , where
   [MATH:
   <msub><mrow><mi>N</mi></mrow><mrow><mi>G</mi><mi>c</mi></mrow></msub>
   :MATH]
   is the number of genes in gene set Gc,
   [MATH:
   <msubsup><mrow><mi>Y</mi></mrow><mrow><mi>e</mi><mi>n</mi><mo>,</mo><mi
   >k</mi></mrow><mrow><mo>*</mo></mrow></msubsup> :MATH]
   and
   [MATH:
   <msubsup><mrow><mi>Y</mi></mrow><mrow><mi>i</mi><mi>n</mi><mo>,</mo><ms
   pace
   width="0.25em"></mspace><msup><mrow><mi>k</mi></mrow><mrow><mo>′</mo></
   mrow></msup></mrow><mrow><mo>*</mo></mrow></msubsup> :MATH]
   were the scaled read counts from the paired NCIN-capped RNA library
   [MATH: <mi>k</mi> :MATH]
   and all-capped RNA library
   [MATH: <msup><mrow><mi>k</mi></mrow><mrow><mo>′</mo></mrow></msup>
   :MATH]
   , respectively.

   To further reduce the unwanted variation, CompassAnalyzer implemented
   an RUV strategy for unwanted variation estimation. Briefly, genes from
   spike-in that have constant expression between NCIN-capped RNA
   libraries and all-capped RNA libraries were used to derive the unwanted
   variations. RUV was then applied to estimate the unwanted factors
   matrix W and its associated nuisance parameters α as previously
   described^[242]34,[243]35. Finally, for a chosen k (1 ≤ k ≤ j), the
   normalized read counts can be defined by
   [MATH:
   <mi>Z</mi><mo>=</mo><msup><mrow><mi>Y</mi></mrow><mrow><mo>*</mo></mrow
   ></msup><mo>−</mo><msup><mrow><mover
   accent="true"><mrow><mi>W</mi></mrow><mrow><mo>^</mo></mrow></mover></m
   row><mrow><mfenced close=")"
   open="("><mrow><mi>k</mi></mrow></mfenced></mrow></msup><msup><mrow><mo
   ver
   accent="true"><mrow><mi>α</mi></mrow><mrow><mo>^</mo></mrow></mover></m
   row><mrow><mfenced close=")"
   open="("><mrow><mi>k</mi></mrow></mfenced></mrow></msup><mo>−</mo><msup
   ><mrow><mn>2</mn></mrow><mrow><msub><mrow><mi>r</mi></mrow><mrow><mi>j<
   /mi></mrow></msub></mrow></msup> :MATH]
   .

NCIN capping level quantification

   The NCIN capping level (
   [MATH: <mi>Q</mi> :MATH]
   ) of gene
   [MATH: <mi>g</mi> :MATH]
   in sample
   [MATH: <mi>j</mi> :MATH]
   could be calculated as:
   [MATH:
   <msub><mrow><mi>Q</mi></mrow><mrow><mi>g</mi><mi>j</mi></mrow></msub><m
   o>=</mo><mspace
   width="0.25em"></mspace><mfrac><mrow><msub><mrow><mi>Z</mi></mrow><mrow
   ><mi>e</mi><mi>n</mi><mo>,</mo><mi>g</mi><mi>k</mi></mrow></msub></mrow
   ><mrow><msub><mrow><mi>Z</mi></mrow><mrow><mi>e</mi><mi>n</mi><mo>,</mo
   ><mi>g</mi><mi>k</mi><mo>′</mo></mrow></msub></mrow></mfrac> :MATH]
   . Genes with low read counts tend to suffer from a high variability,
   which lead to inflated estimates of NCIN capping ratios. To obtain
   stabilized NCIN capping estimates, CompassAnalyzer incorporated
   empirical Bayes shrinkage of NCIN capping ratio using ashr R
   package^[244]69. For subsequent analysis, only NCIN-RNAs with a
   coefficient of variation below 0.3 were retained.

Gene-specific quantification of NCIN capping level by qPCR

   Following CompasSeq enzymatic reactions, reverse transcription, and PCR
   enrichment, the resulting PCR products were used for gene-specific
   quantification of NCIN capping levels via qPCR. The qPCR was conducted
   using primers listed in Supplementary Data [245]1. GAPDH1 from
   Drosophila was used as the baseline internal control. To calculate the
   relative enrichment from RT-qPCR data, the Ct value of the target gene
   was first normalized to the Ct of the GAPDH1. The normalized value from
   the epitranscriptome capture procedure (ΔCt of the target gene
   normalized to GAPDH1) was then normalized to the background (ΔCt
   calculation for the gene in the transcriptome capturing procedure),
   yielding the ΔΔCt value. The linear conversion of this ΔΔCt resulted in
   the fold enrichment.

Quantitative exoribonuclease reduction assay

   For assays depicted in Fig. [246]5c, reactions with dRai1 and Xrn1 were
   performed by adding 0.5 µL of dRai1 (final concentration: 340 nM) and
   2 U of Xrn1 (NEB, catalog: M0338) to 1 µg of either NAD- or m^7G-RNA
   (106 nt), resuspended in 20 μL of 1× NEBuffer r3.1 and incubated at
   37 °C for 1 h with 10 U of RNaseOUT. Products were analyzed by a 12%
   polyacrylamide UREA-PAGE gel. For assays shown in Fig. [247]5e, 500 nt
   spike-in RNAs generated from different sequence templates, including
   1 ng each of NAD-, FAD-, dpCoA-, m^7G-, and ppp-RNA, were mixed with
   1 μg of total RNAs from 2-month C57BL/6 mouse livers (n = 3), followed
   by treatment with dRai1 and Xrn1 or mock treatment. RT-qPCR was
   performed as described above, using primers listed in Supplementary
   Data [248]1. ppp-RNA was used as the baseline internal control. To
   calculate the relative levels, the Ct value of the target gene was
   first normalized to the Ct of the ppp-RNA. The normalized value from
   the treatment with dRai1 and Xrn1 (ΔCt of the target gene normalized to
   the ppp-RNA) was then normalized to the background (ΔCt calculation for
   mock treatment), yielding the ΔΔCt value. For assays shown in
   Fig. [249]5f, 500 nt spike-in RNAs generated from different sequence
   templates, including 1 ng each of 0% NAD-RNA/100% m^7G-RNA, 10%
   NAD-RNA/90% m^7G-RNA, 30% NAD-RNA/70% m^7G-RNA, 50% NAD-RNA/50%
   m^7G-RNA, 70% NAD-RNA/30% m^7G-RNA, 90% NAD-RNA/10% m^7G-RNA, and
   ppp-RNA, were mixed with 1 μg of total RNAs from 2-month C57BL/6 mouse
   livers (n = 3), followed by dRai1 and Xrn1 or mock treatment.

RNA alternative TSS, splicing, and polyadenylation analysis

   TSS information of RNAs was extracted using the refTSS dataset
   (refTSS_v4.1_mouse_coordinate.mm39.bed.gz)^[250]70. Genes with multiple
   TSSs were classified as exhibiting alternative TSS usage. AS events
   were identified using rMATS (v4.1.2)^[251]71 with BAM files from
   transcriptome-capturing samples. Event counts included reads spanning
   junctions defined by rMATS and reads not crossing an exon boundary.
   Splicing events were filtered based on coverage, retaining only those
   with more than zero read counts. For APS analysis, DaPars2^[252]72 was
   applied to transcriptome-capturing samples to calculate the percentage
   of distal PDUI for individual NCIN-capped RNA. Transcripts with a PDUI
   greater than 0.3 were retained for further analysis.

Analysis of conserved sequences and RNA secondary structure

   Conservation analysis was performed using PhastCons scores
   (phastCons35way.UCSC.mm39) obtained from the R package GenomicScores
   (v2.10.0)^[253]46. The conservation score for each NCIN-capped isoform
   was calculated by averaging the PhastCons scores across the 5′
   untranslated region (UTR) of the transcript. For each transcript with
   5′ UTR, the sequence of 100 bp region downstream of TSS was used to
   calculate RNA MFE by ViennaRNA with default parameters^[254]48. To
   generate a background distribution of MFE, transcripts not containing
   NCIN caps were randomly sampled 1000 times from the whole transcriptome
   (after subtracting NCIN-RNAs) with a similar size (n = 5000) to
   transcripts containing NCIN caps (n = 5462).

GMUCT

   GMUCT libraries were constructed and sequenced for 2-month-old and
   18-month-old mouse livers as described by Endres M et al.^[255]49, with
   the following modifications: In brief, poly(A) + RNA was first isolated
   from total RNA, followed by 5′ adapter ligation to capture uncapped RNA
   molecules. The ligated RNA was then reverse transcribed, and PCR
   amplification was performed using primers complementary to the adapter
   sequences. The resulting cDNA was fragmented and used for final library
   construction. The resulting libraries were sequenced on the Illumina
   NovaSeq 6000 system (paired-end, 150 bp). Raw reads on each gene were
   counted by featureCounts and normalized using the TMM normalization
   method from the edgeR package. “Proportion uncapped” for each gene was
   calculated as the log-ratio of TMM-normalized read counts between GMUCT
   and RNA-seq. To further investigate the impact of NCIN-capping on RNA
   stability during aging, we grouped the transcripts based on age-related
   changes in their modification status.

Ribosome profiling data analysis

   Raw counts from [256]GSE123981^[257]52 were normalized using the TMM
   normalization method implemented in the edgeR package. RO for each gene
   was calculated as the log-ratio of TMM-normalized read counts between
   Ribo-seq and RNA-seq. To investigate the correlation between RO and
   NCIN capping during aging, we calculated the fold change in RO between
   the 20-month-old and 3-month-old samples and grouped the transcripts
   based on age-related changes in their modification status.

Pathway enrichment analysis

   Pathway enrichment analysis was performed using the R package
   gprofiler2 (v0.2.1)^[258]73. Pathways were defined based on the
   molecular pathways of Reactome, the biological process of GO, and the
   Kyoto Encyclopedia of Genes and Genomes database. Enriched gene sets
   were constrained to sizes ranging between 5 and 350 genes. Multiple
   testing corrections were performed using gprofiler2’s built-in “g_SCS”
   method, with terms having an adjusted p value of less than 0.05
   considered significantly enriched. The top terms were ranked by the
   smallest adjusted p values in each set and were selected for
   visualization.

Analyses of differentially expressed genes and differential NCIN capping
genes and isoforms

   Differentially expressed genes were analyzed from datasets of
   all-capped RNAs between 2-month-old (young) and 18-month-old (old)
   mouse livers using the edgeR package. The significance of logarithmic
   fold changes was determined by a likelihood ratio test to approximate p
   values, with genes adjusted for multiple testing using the Bonferroni
   procedure to yield a false discovery rate (FDR). Differential
   expression genes were defined as those with an absolute fold change in
   normalized read count ≥ 1.5 and FDR < 0.05 in old compared to young
   samples (Supplementary Data [259]5). Differential NCIN capping genes
   were defined as those with an absolute fold change in NCIN
   capping ≥ 1.5 in old compared to young samples. Differential
   NCIN-capped isoforms were defined as those with an absolute fold change
   in NCIN capping ≥ 1.25 in old compared to young samples.

Statistics and reproducibility

   No statistical methods were used to predetermine sample size. All data
   were included in the analyses. The experiments were not randomized, and
   investigators were not blinded to group allocation. RNA-seq experiments
   were performed with three biological replicates. Unless otherwise
   specified, p values were calculated using two-sided nonparametric
   Mann–Whitney tests. All gel blots were independently repeated at least
   twice with similar results.

Reporting summary

   Further information on research design is available in the [260]Nature
   Portfolio Reporting Summary linked to this article.

Supplementary information

   [261]Supplementary Information^ (3.3MB, pdf)
   [262]41467_2025_61697_MOESM2_ESM.pdf^ (123.6KB, pdf)

   Description of Additional Supplementary Files
   [263]Supplementary Data 1^ (15.7KB, xlsx)
   [264]Supplementary Data 2^ (1,015.9KB, xlsx)
   [265]Supplementary Data 3^ (23.3KB, xlsx)
   [266]Supplementary Data 4^ (605.6KB, xlsx)
   [267]Supplementary Data 5^ (824.9KB, xlsx)
   [268]Supplementary Data 6^ (32KB, xlsx)
   [269]Reporting Summary^ (2MB, pdf)
   [270]Transparent Peer Review file^ (4.5MB, pdf)

Source data

   [271]Source Data^ (11.1MB, xlsx)

Acknowledgments