Abstract
Nucleotide-containing metabolites, e.g., NAD, can serve as noncanonical
initiating nucleotides (NCIN) during transcription, yielding
NCIN-capped RNAs (NCIN-RNAs). Current profiling strategies are limited
to detecting specific metabolite caps and lack an epitranscriptome-wide
approach for quantifying the ratio between NCIN- and m^7G-capped forms.
Here, we develop the CompasSeq analytical platform, which integrates
experimental and computational frameworks, enabling comprehensive and
quantitative assessment of NCIN-RNAs at the transcript resolution.
CompasSeq utilizes carefully devised enzymatic reactions to selectively
capture NCIN-RNAs. By introducing proper spike-ins, CompasSeq can
analyze the stoichiometry of NCIN caps. We further design an orthogonal
method, the quantitative exoribonuclease reduction assay, to validate
newly identified NCIN-RNAs and their capping ratios. Using CompasSeq,
we quantify previously unexplored NCIN capping percentages from mouse
liver and illustrate their age-associated dynamics. Moreover, we
uncover a dichotomy between RNA expression and NCIN capping in genes
impinging on age-related pathways. Our study presents both experimental
and computational solutions for in-depth analysis of NCIN-RNAs, paving
the road for functional investigations into NCIN-RNAs.
Subject terms: Transcriptomics, Sequencing, RNA modification
__________________________________________________________________
Certain cellular metabolites can cap RNA molecules, influencing their
function. Here, the authors develop CompasSeq, a quantitative method to
profile these noncanonical RNA caps across the transcriptome and reveal
their dynamic changes with age in mouse liver.
Introduction
The 5′-terminus of eukaryotic mRNA is characterized by a prevalent cap
structure composed of 5′,5′-triphosphate-linked 7-methylguanosine
(m^7G). This canonical capping exerts profound impacts on modulating
various functions of mRNA species, such as stability, splicing,
polyadenylation, and translation^[36]1,[37]2. Recent studies have
revealed that nucleotide-containing metabolites, including nicotinamide
adenine dinucleotide (NAD), flavin adenine dinucleotide (FAD),
3′-dephospho-coenzyme A (dpCoA), uridine diphosphate glucose (UDP-Glc),
uridine diphosphate N-acetyl glucosamine (UDP-GlcNAc), and dinucleoside
polyphosphates, can be incorporated into the RNA 5′-terminus as
noncanonical initiating nucleotides (NCIN), resulting in
NCIN-RNA^[38]3–[39]7. Since the first discovery of NAD-capped RNA in
bacteria^[40]8, NAD-RNAs have been reported in all kingdoms of
life^[41]9–[42]16. Although studies have indicated the presence of FAD,
dpCoA, UDP-Glc/GlcNAc, dinucleoside polyphosphate caps across different
phyla^[43]13,[44]17–[45]20, comprehensive profiling of NCIN-RNAs at the
epitranscriptome level is only beginning to be revealed.
The current profiling strategies are limited to detecting specific
metabolite caps^[46]9,[47]18,[48]21–[49]26. NAD captureSeq^[50]9, along
with our recently developed methods, ONE-seq^[51]24 and DO-seq^[52]26,
employ chemo-enzymatic reactions to label NAD-capped RNAs with biotin,
catalyzed by adenosine diphosphate-ribosyl cyclase. These biotinylated
RNAs are then captured by streptavidin beads and subjected to
high-throughput sequencing to identify NAD-RNAs. Yet, these methods
only reflect the relative enrichment for a specific type of
metabolite-capped transcript and cannot quantify the ratio between
metabolite-capped and m^7G-capped forms. Decapping-based strategies,
such as CapZyme-seq^[53]27, utilize processing enzymes like NudC to
target NCIN-capped RNAs, yielding RNAs with a 5′-monophosphate.
Subsequently, barcoded RNA adapters are ligated to the 5′-end of these
RNAs, followed by adapter-specific PCR enrichment and high-throughput
sequencing. However, a notable limitation of applying CapZyme-seq in
eukaryotes arises from contaminating signals of m^7G, the dominant cap
type^[54]28. Furthermore, currently available methods lack the ability,
including experimental and computational procedures, for the
epitranscriptome-wide percentage assessment of NCIN capping at the
transcript resolution.
Aging is characterized by physiological declines accompanied with
metabolic and transcriptional alterations^[55]29,[56]30, but how these
processes are integrated into the regulation of aging remains elusive.
Studies in rodents and humans have shown that the levels of NAD and FAD
decline with age in multiple tissues, including the
liver^[57]31–[58]33. Additionally, our recent findings reveal prominent
age-associated alterations in NAD capping in mouse liver and human
peripheral blood^[59]24,[60]34, highlighting the potential role of
metabolite-modified RNAs during aging. Cellular metabolism and gene
expression are fundamental biological processes, making it crucial to
explore how metabolite-modified RNAs, which coordinate metabolic and
transcriptional regulation, are modulated during the adult life cycle
and consequently, their physiological impact on the progression of
aging. However, the absence of quantitative methods to determine
epitranscriptome-wide NCIN caps with stoichiometric information impedes
investigations into the contribution of NCIN capping to a broad
spectrum of biological processes.
In the present study, we demonstrate that the CompasSeq analytical
platform, combined with experimental and computational procedures,
enables a comprehensive and quantitative assessment of NCIN capping
levels in eukaryotes. We further devise an orthogonal and independent
approach, the quantitative exoribonuclease reduction assay, to validate
newly identified NCIN-RNAs and their capping ratios. Utilizing
CompasSeq, we uncover previously unknown features of
metabolite-modified RNAs in adult and aging mouse livers.
Results
The workflow of CompasSeq
We designed the CompasSeq method for Comprehensive percentage
assessment of NCIN-RNAs through enzymatic decapping followed by 5′-end
adapter ligation and high-throughput Sequencing, an analytical platform
that integrated both experimental and computational frameworks
(Fig. [61]1).
Fig. 1. The workflow of CompasSeq.
[62]Fig. 1
[63]Open in a new tab
a CompasSeq implements a parallel experimental pipeline to
simultaneously capture NCIN-capped RNAs and all-capped RNAs (NCIN- and
m^7G-capped RNAs). Meanwhile, CompasSeq implements a dual spike-in
strategy to quantify NCIN-capped RNAs. A set of synthetic NCIN-RNAs
with known capping ratios, is introduced at the beginning to serve as
internal controls. Adapter-pre-ligated Drosophila RNA, is added at the
step of cDNA library construction for subsequent computational
analysis. b Diagram illustrating the workflow of CompassAnalyzer.
CompassAnalyzer first applies stepwise normalizations to recover the
intrinsic RNA abundance distribution between NCIN-capped and all-capped
RNAs from sequencing data. It then employs empirical Bayes shrinkage to
enable quantification of NCIN capping ratios at the transcript level.
First, we subjected total RNA transcripts to cap-specific enzymes. To
selectively capture NCIN-capped RNAs, we began with yDcpS, a decapping
enzyme that hydrolyzes m^7G-capped mRNA, followed by CIAP treatment to
eliminate 5′-phosphate residues. We then introduced the bacterial
RNA-decapping enzyme NudC, which hydrolyzes the diphosphate group in
the cap structures of NAD, NADH, FAD, and dpCoA, ensuring that only
NCIN-RNA yields 5′-monophosphate ready for ligation. To capture both
NCIN- and m^7G-capped RNAs, we initially treated total RNA with CIAP to
remove pre-existing 5′-phosphate moieties. We then applied RppH, a
pyrophosphatase, to hydrolyze pyrophosphate from the 5′ end of
triphosphorylated RNA, together with NudC, to generate 5′-monophosphate
RNAs from all-capped transcript forms. Second, we performed ligation
between 5′ oligoribonucleotide adapters and 5′-monophosphate RNAs
derived from either NCIN- or all-capped transcripts, followed by
polyA-selected reverse transcription and high-throughput sequencing. Of
note, we deliberately introduced an exogenous RNA spike-in as an
internal control essential for subsequent computational analysis
(Fig. [64]1a). Third, we proceeded with a computational framework
composed of stepwise normalizations and empirical Bayes shrinkage.
Stepwise normalizations, including global scaling using exogenous RNA
spike-ins, linear adjustment, and the RUV strategy^[65]35, were applied
to recover the intrinsic RNA abundance distribution between NCIN-capped
and all-capped RNAs from sequencing data. Empirical Bayes shrinkage was
then employed to enable quantification of NCIN capping ratios at
transcript resolution (Fig. [66]1b).
Development of CompasSeq experimental procedures
We generated 38 nt RNA spike-ins via in vitro transcription, using
adenine exclusively as the initial nucleotide in the DNA template to
synthesize NAD, dpCoA, and FAD-capped RNAs (Fig. [67]2a). As
anticipated, capped RNAs (NAD and dpCoA) exhibited higher molecular
weight than uncapped forms (ppp-RNA) as observed by UREA-PAGE gel
electrophoresis (Fig. [68]2a). Consistent with previous
reports^[69]22,[70]36, two RNA species for FAD-RNA were observed, with
upper band being intact FAD-capped RNA and the lower band resulting
from the loss of the flavin moiety together with one phosphate group
(Fig. [71]2a). To corroborate the quality of the RNA spike-ins, we
employed boronate affinity electrophoresis. Boronyl groups from boric
acid form relatively stable complexes with vicinal diols, naturally
occurring in the m^7G of the m^7G cap and the nicotinamide riboside of
the NAD cap, leading to gel retardation during electrophoresis^[72]37.
Indeed, only RNA with m^7G and NAD caps, but not ppp-RNA, exhibited
electrophoretic mobilities, indicating the presence of these specific
caps in the RNA spike-ins (Fig. [73]2b).
Fig. 2. Development of CompasSeq.
[74]Fig. 2
[75]Open in a new tab
a In vitro transcription with ATP, m^7GpppA, NAD, FAD, or dpCoA as the
initiating nucleotide. Sequence of the DNA template with the T7
promoter (Top). The “A” in red was the transcription start site and the
coding sequence has been underlined. In vitro-transcribed RNAs with
different 5′ end caps were resolved in an 8% PAGE gel (Bottom). The
arrows indicated the intact FAD-capped species. b Boronate affinity
electrophoresis showing in vitro-transcribed 38 nt RNA, including ppp-,
m^7G-, and NAD-capped RNAs. c UREA-PAGE gels showing products of
processing of NAD- and m^7G-RNA 5′-ends by NudC, dRai1, RppH, and
yDcpS. d UREA-PAGE gel showing products of processing of FAD- and
dpCoA-RNA 5′-ends by NudC. e Analysis of the effect of RNA secondary
structure on yDcpS decapping. All m^7G-capped RNA hairpins and linear
30-mers (1 µg each) were treated by 9 µM yDcpS for 1 h and then
resolved in a 12% PAGE gel. f After NudC treatment, NCIN-RNA, such as
NAD-RNA (38 nt), could be ligated with the 5′-adapter (27 nt), as shown
by the shift of an upper band in the UREA-PAGE gel (left). m^7G-capped
RNA (38 nt) after being processed by yDcpS and CIAP, could not be
ligated with the 5′-adapter (middle). RppH-processed m^7G-RNA, on the
other hand, contained a 5′-phosphate group for adapter ligation
(right). Source data are provided as a [76]Source Data file.
Assisted by 38 nt RNA spike-ins, we examined the specificity of
decapping enzymes (Fig. [77]2c). NudC, known for its ability to
hydrolyze diphosphate but not triphosphate linkages, effectively
removed the NAD cap but also exhibited non-specific cleavage of the
m^7G cap upon extended treatment (Fig. [78]2c, d). We also evaluated
exoribonucleases from the DXO/Rai1 family, known for their decapping
activity^[79]27,[80]38. We purified the dRai1 protein, a Drosophila
homolog of Rai1, and found that it selectively degraded NAD-capped RNA
while leaving m^7G-capped RNA intact (Fig. [81]2c). However, dRai1 also
caused additional nucleotides removal due to its exoribonuclease
activity (Fig. [82]2c). In contrast, yDcpS specifically targeted m^7G
cap, while RppH removed both m^7G and NAD caps, lacking specificity
(Fig. [83]2c). It has been reported that RNA secondary structure would
hinder the activity of yDcpS^[84]39. We generated m^7G-capped RNAs with
polymorphic hairpins and 3′ or 5′ overhangs to test whether excess
yDcpS (~9 µM) would overcome this issue. Our results showed that excess
yDcpS completely removed the m^7G cap from RNAs regardless of their
linear or secondary structures (Fig. [85]2e). Above experiments
established the careful design of the CompasSeq workflow, including the
choice and order of decapping enzyme treatments. The workflow started
with a cascade of enzymatic reactions, followed by adapter ligation
(Fig. [86]1a). To test the efficacy of adapter ligation, we subjected
38 nt NAD-RNA and m^7G-RNA to CompasSeq. As evidenced by the
accumulation of an upper band (Fig. [87]2f), we demonstrated the
specificity of cap-processing enzymes in combination with adapter
ligation for NAD- and m^7G-capped forms, respectively.
The feasibility of CompasSeq analytical platform
To evaluate the feasibility of the CompasSeq analytical platform,
encompassing both experimental and computational frameworks, we
undertook several validation steps. First, we synthesized long RNA
spike-ins (500 nt) with different sequences and cap structures,
including NAD, FAD, dpCoA, m^7G, and pppA. We mixed 20 µg of total RNA
from mouse livers with 1 ng of each spike-in and subjected this mixture
to the CompasSeq experiment, followed by qPCR analysis (Fig. [88]3a).
Our results indicated that only NCIN, and not m^7G- and ppp-RNAs, were
enriched in the epitranscriptome-capturing group, whereas both NCIN-
and m^7G-RNAs were significantly enriched in the
transcriptome-capturing group (Fig. [89]3b). Second, we aimed to
measure the percentage of NCIN-capped RNA spike-ins. We excluded
FAD-RNA spike-in from this and subsequent experiments due to its
spontaneous decapping, which might confound calculations. Instead, we
synthesized paired RNA spike-ins with identical sequences (500 nt) but
different cap structures, including NCIN (NAD and dpCoA) or canonical
m^7G cap—each followed by a polyadenylated tail. It is likely that
native transcripts exist in both NCIN- and m^7G-capped versions, with
proportions varying by gene. This experimental setup was intended to
mimic endogenous transcripts that have 5% NCIN-capped modification
(mixed with 95% m^7G-RNA) or 0% NCIN-capped modification (i.e., 100%
m^7G-RNA) spike-in representing genes with no NCIN capping, thereby
allowing us to assess the specificity of CompasSeq. We mixed 20 µg of
total RNA from 2-month mouse livers with 0.014 ng of synthetic spike-in
RNAs and subjected this mixture to the CompasSeq experiment, followed
by polyA-selected RNA sequencing.
Fig. 3. The feasibility of CompasSeq.
[90]Fig. 3
[91]Open in a new tab
a Schematic workflow of gene-specific assessment of NCIN modification
with CompasSeq platform by qPCR. Total RNAs from 2-month-old mouse
livers were mixed with polyadenylated spike-in RNAs (500 nt), followed
by enzymatic reactions of CompasSeq for qPCR analysis. ppp-RNA (500 nt)
was included as a baseline control. b qPCR analysis showed that
synthetic RNAs (500 nt) containing NCIN caps, including NAD, FAD, and
dpCoA, could be significantly and selectively enriched by the procedure
designed for capturing NCIN-RNAs, whereas ppp- and m^7G-RNA did not
exhibit such enrichment (left). In the procedure designed for capturing
m^7G- and NCIN-RNAs, all RNAs, except ppp-RNA, were significantly
enriched (right). Data were shown in mean ± s.e.m. (n = 3 independent
experiments). Two-tailed Welch′s t-tests were used to infer p values.
n.s. denoted not significant. c RNA-seq experiment of spike-in RNAs
determined the specificity of CompasSeq. A total of three different
spike-in RNAs with different capped forms were mixed with total RNAs
from 2-month-old mouse liver, and the mixture of RNAs was subjected to
CompasSeq. For each capped form, two polyadenylated spike-in RNAs with
identical sequences (500 nt) but have either NCIN (NAD or dpCoA) or
m^7G cap, were used to mimic endogenous NCIN-capped transcript with
different capped forms. As shown in the bar chart, NCIN-RNAs, but not
m^7G-RNA, were significantly and selectively enriched with NCIN capping
levels close to their expected ratio. Data were shown in mean ± s.e.m.
(n = 3 independent experiments). d RNA-seq experiment of spike-in RNAs
determined the sensitivity of CompasSeq. A total of five different
polyadenylated spike-in RNAs (500 nt) with different proportions of
NAD-RNA were mixed with total RNAs from 2-month-old mouse livers, and
the mixture of RNAs was subjected to CompasSeq. Two spike-in RNAs with
identical sequences but have either NAD or m^7G cap, were used to mimic
endogenous NCIN-capped transcript with different NCIN capping levels.
Scatter plot showing the concordance between expected and observed NCIN
capping levels from spike-in RNAs. The Pearson correlation coefficient
and the associated two-sided p value were shown in the top right
corner. Data were shown in mean ± s.e.m. (n = 3 independent
experiments). Exact p values and source data are provided as a
[92]Source Data file.
Importantly, we added pre-treated total RNA from Drosophila at a mass
ratio of 50:1 for each sample, serving as a baseline control after
adapter ligation, a critical procedure for downstream computational
analysis (Supplementary Fig. [93]2a). We reasoned that exogenous
Drosophila RNA spike-ins could be utilized to reflect and correct
unwanted variations introduced by experimental procedures. In
principle, Drosophila spike-ins, when added equally to each sample,
should maintain a constant proportion across samples, capturing either
NCIN- or all-capped forms. Given the fact that NCIN-capped RNAs
constituted only a small proportion of the total RNA pool, read counts
from the NCIN-capped library should be lower than those from the
all-capped library. In practice, we observed that the proportion of
reads from the Drosophila genome was higher in samples capturing
NCIN-capped than all-capped forms (Supplementary Fig. [94]2b).
Therefore, we employed a stepwise normalization procedure to address
these variations, to enable the recovery of the theoretical
distribution of NCIN-capped RNAs. We found that raw read counts
displayed similar distributions between NCIN-capped RNA and all-capped
RNA libraries (Supplementary Fig. [95]2c). Upon normalizations, gene
measurements from all-capped forms exhibited higher values compared to
those from NCIN-capped libraries, reflecting the intrinsic distribution
of NCIN-capped transcripts (Supplementary Fig. [96]2c). Correlation
analysis revealed that normalized gene measurements exhibited improved
reproducibility compared to the raw read counts (Supplementary
Fig. [97]2d). However, NCIN capping estimated from genes with
relatively low expression exhibited higher variation compared to highly
expressed genes, an observation potentially caused by genes with low
read counts that tend to suffer from high variability^[98]40. To
address this issue, we introduced the empirical Bayes shrinkage
strategy. As shown in Supplementary Fig. [99]2e, CompassAnalyzer could
significantly mitigate such variations. Comparatively, exogenous total
RNA spike-ins added at the beginning step failed to correct PCR
amplification biases, as shown by the poor performance of NCIN-capped
RNA proportions (Supplementary Fig. [100]2f).
Using CompassAnalyzer, we were able to analyze synthetic RNA spike-ins
with known ratios, revealing that this platform not only captured
metabolite-modified RNAs but also their relative ratios for the 5% of
NCIN-capped spike-in (Fig. [101]3c). Background enrichment for the 100%
m^7G-RNA spike-in was significantly lower (less than 2%) compared to
the 5% NCIN-capped spike-in (Fig. [102]3c), reflective of the
specificity. We further employed spike-in RNAs with ascending ratios of
NAD-capped forms: 0% (i.e., 100% m^7G-capped), 1% (i.e., 99%
m7G-capped), 5% (i.e., 95% m7G-capped), 10% (i.e., 90% m^7G-capped),
and 20% (i.e., 80% m7G-capped) (Supplementary Fig. [103]2g). The
CompasSeq method, with its streamlined experimental and computational
procedures, could reach coherent results between observed and expected
NCIN capping levels (Pearson′s r = 0.998, Fig. [104]3d), thus
illustrating the feasibility of this method in capturing and
quantitatively measuring endogenous metabolite-modified RNAs.
NCIN-RNA epitranscriptome
With the CompasSeq analytical platform, we characterized NCIN-RNAs from
young (2-month) and aged (18-month) mouse liver tissues. Following
standard quality filtering procedures, each of the three independent
biological replicates yielded an average of ~21.29 million high-quality
reads that uniquely aligned to the reference genome (Supplementary
Fig. [105]3a). PCA showed that the biological replicates of each sample
type clustered together, indicating high reproducibility (Supplementary
Fig. [106]3b). Dataset assessments corroborated that sequencing
saturation has been reached (Supplementary Fig. [107]3c). Based on the
background enrichment signal detected by 100% m^7G-RNA spike-in, which
was less than 2% (Supplementary Fig. [108]3d), we chose 2% ratio as the
cutoff. This criterion led us to identify a total of 5680 highly
reproducible NCIN-RNAs (Supplementary Data [109]2; Pearson′s r = 0.895
of the three biological replicates, Supplementary Fig. [110]3e),
accounting for ~47.2% of the mRNA transcripts expressed in young mouse
liver tissues. Besides RNA-seq, we also applied RT-qPCR. By analysis of
three randomly selected genes, Ndufb10, Fkbp8, and Fads1, we calculated
the relative abundance between NCIN-capped and all-capped groups
(Supplementary Fig. [111]4a). The results obtained by RT-qPCR were in
good congruence with those from high-throughput sequencing
(Supplementary Fig. [112]4b), suggesting that CompasSeq can also be
used for gene-specific assessment. Furthermore, comparison between the
NCIN-RNAs, including, but not limited to, NAD-capped RNAs, identified
in the current study with NAD-capped RNAs identified by our recently
published ONE-seq method^[113]24 revealed a significant overlap
(Supplementary Fig. [114]4c). For those NAD-RNAs uniquely identified by
ONE-seq, their expression levels were significantly low (Supplementary
Fig. [115]4d).
We briefly analyzed the NCIN-capped epitranscriptome from young mouse
livers. By computing NCIN capping levels, we observed that most genes
producing NCIN caps had modification ratios between 5% and 30%, with
some genes exhibiting exceptionally high levels, such as Rnr1 (91.1%)
(Fig. [116]4a). NCIN capping predominantly occurred on protein-coding
genes, but also extended to pseudogenes and noncoding RNAs
(Fig. [117]4b). NCIN-RNAs were derived from genes located on autosomes,
the X chromosome, and the mitochondrial genome, but not from the Y
chromosome (Fig. [118]4c). Based on the extent of modification, we
divided NCIN-RNAs into three categories: high (30–100%, e.g., Fads1),
medium (12.5–30%, e.g., Fkbp8), and modest levels (2–12.5%, e.g.,
Ndufb10) (Fig. [119]4d and Supplementary Fig. [120]4e). To inspect
biological function, we performed pathway enrichment analysis and found
that genes with modest NCIN capping were mainly involved in
mitochondria-related events and energy metabolism, whereas genes with
medium and high NICN capping levels were enriched in macroautophagy,
vesicle-related processes, and cellular responses (Fig. [121]4e and
Supplementary Data [122]3).
Fig. 4. The quantitative NCIN-capped epitranscriptome from mouse liver.
[123]Fig. 4
[124]Open in a new tab
a The distribution of NCIN capping levels of each gene from 2-month-old
mouse livers (n = 5462). Box plot represented the median (center line),
the first and third quartiles (box limits), and the whiskers extended
to 1.5 times the interquartile range. Data points beyond the whiskers
were shown as outliers. b Pie chart showing the gene types of
identified NCIN-RNAs. c Bar chart showing the chromosome distribution
of identified NCIN-RNAs. d The distribution of NCIN capping levels of
genes with modest (2–12.5%), medium (12.5–30%), and high (30–100%) NCIN
capping levels. e Pathway enrichment analysis of genes with modest,
medium, and high NCIN capping levels. Numbers represented the count of
NCIN-RNA enriched in each pathway. Multiple testing corrections were
performed using gprofiler2’s built-in “g_SCS” method. The gray dashed
line indicated the cutoff of 0.05 adjusted p value. Exact p values and
source data are provided as a [125]Source Data file.
Validation of CompasSeq by an orthogonal method
To validate the CompasSeq platform by an orthogonal approach, we
introduced two 5′−3′ exoribonucleases: dRai1 and Xrn1. Our rationale
was based on the above observation that dRai1 specifically decaps
NCIN-, but not m^7G-capped RNAs, leading to 5′-to-3′ cleavage
(Fig. [126]2b). This effect could be further enhanced by Xrn1 towards
degradation, whereas m^7G-capped transcripts would largely remain
intact (Fig. [127]5a). We first validated the decapping activity of
dRai1 on FAD- and dpCoA-RNAs (see Fig. [128]2b for NAD-RNA) by
observing the appearance of lower bands on the UREA-PAGE gel
electrophoresis (Fig. [129]5b). Second, we subjected 106 nt NAD-RNA and
m^7G-RNA spike-ins to dRai1 and Xrn1 treatment, which showed selective
degradation of NAD-, but not m^7G-capped, RNA (Fig. [130]5c). Prompted
by this evidence, we hypothesized that the relative level of
NCIN-capped transcripts could be determined by the reduction in
abundance after treatment with dRai1 and Xrn1, compared to a mock
treatment that includes both NCIN- and m^7G-capped forms. We named this
method the quantitative exoribonuclease reduction assay (Fig. [131]5d).
We subjected 1 μg of total RNA from mouse livers mixed with 0.1 ng of
synthetic RNA spike-ins (500 nt), including m^7G, NAD, FAD, dpCoA, and
pppA, to the assay. Consistently, only NCIN-RNAs exhibited significant
degradation, while m^7G-RNA remained unchanged (Fig. [132]5e). To
assess whether the assay could measure NCIN capping levels, we mixed
1 μg of total RNA with 500 nt spike-in RNAs containing ascending ratios
of NAD relative to m^7G-capped forms: 0%, 10%, 30%, 50%, 70%, and 90%.
We observed that relative capping levels measured by the assay
corresponded to their expected levels (Fig. [133]5f). Furthermore, we
successfully validated the NCIN capping levels of five randomly
selected genes: Idh1, F13b, Ssr3, Rnr2, and Rnr1, as identified and
quantified by CompasSeq (Fig. [134]5g, h). As a control experiment, we
confirmed the absence of NCIN capping in five genes not identified by
CompasSeq (Apom, Cox6a1, Dbi, Ddt, and Lrg1) (Supplementary
Fig. [135]5). Taken together, our results using quantitative
exoribonuclease reduction assay strongly supported the reliability of
CompasSeq for the identification and ratio quantification of NCIN-RNAs.
Fig. 5. Validation of CompasSeq by quantitative exoribonuclease reduction
assay.
[136]Fig. 5
[137]Open in a new tab
a The validation strategy for the treatment with dRai1 and Xrn1. b
UREA-PAGE gel showing products of processing of 38 nt FAD- and
dpCoA-RNA 5′-ends by dRai1. c UREA-PAGE gel showing products of
processing of 106 nt NAD- and m^7G-RNA by dRai1 and Xrn1. d Schematic
workflow of gene-specific assessment of NCIN modification with
quantitative exoribonuclease reduction assay. e Synthetic RNAs (500 nt)
containing NCIN caps (NAD, FAD, and dpCoA) could be significantly and
selectively degraded by dRai1 and Xrn1, whereas m^7G-RNA did not
exhibit such degradation. Data were shown in mean ± s.e.m. (n = 3
independent experiments). Two-tailed Welch′s t-tests were used to infer
p values. n.s. denoted not significant. f Synthetic RNAs (500 nt) with
ascending ratios of NAD-capped forms: 0% (i.e., 100% m^7G-capped), 10%,
30%, 50%, 70%, and 90% were used for the quantitative exoribonuclease
reduction assay. The red and blue numbers represented the expected and
observed NCIN capping levels, respectively. The observed NCIN capping
levels were calculated as the difference between 1 and the relative
levels from enzymatic treatment measured using the assay. Data were
shown in mean ± s.e.m. (n = 3 independent experiments). g Five
endogenous genes from mouse liver were assessed by the quantitative
exoribonuclease reduction assay. Data were shown in mean ± s.e.m.
(n = 3 samples). h Comparison between the result obtained by
quantitative exoribonuclease reduction assay and CompasSeq. The red
bars represented NCIN capping levels obtained from CompasSeq (left
axis), while the blue bars illustrated NCIN capping levels calculated
from the assay (right axis). Data were shown in mean ± s.e.m. (n = 3
samples). Two-tailed Welch′s t-tests were used to infer p values. n.s.
denoted not significant. Exact p values and source data are provided as
a [138]Source Data file.
Characterization of metabolite-capped RNA at the transcript resolution
We proceeded to characterize newly identified NCIN-RNAs at the
transcript resolution. Eukaryotic genomes feature transcript isoforms
that can be derived from common gene loci. In-depth analysis revealed
that, for most NCIN-RNAs (98.5%), their isoforms, if transcribed, all
contained NCIN caps, e.g., Serpinf2, although the actual capping level
varied among individual isoforms. Only 1.5% of NCIN-RNAs had select
isoforms being modified, e.g., Zfp36 (Supplementary Fig. [139]6a, b and
Supplementary Data [140]4). This observation prompted us to explore the
link between NCIN capping and gene transcription, including transcript
abundance, gene length, number of introns, alternative transcription
start sites (ATSS)^[141]41, alternative splicing (AS)^[142]42, and
alternative polyadenylation (APA)^[143]43. First, by dividing NCIN-RNAs
into 10 deciles based on their NCIN capping levels, we observed that
transcript abundance was negatively correlated with NCIN capping
levels, while transcript length and number of introns were positively
correlated (Supplementary Fig. [144]6c). Second, we found that RNAs
with high levels of NCIN capping tended to possess ATSS (Supplementary
Fig. [145]6d). Third, we noted that NCIN capping was negatively
associated with AS events, including exon skipping, alternative 5′
donor and 3′ acceptor sites, mutually exclusive exons, and intron
retention (Supplementary Fig. [146]6e). Fourth, we assessed the extent
of APA by the percentage of distal poly(A) site usage index (PDUI).
This analysis illustrated that transcripts containing NCIN caps were
more likely to have APA compared to RNAs without NCIN modification.
Consistently, NCIN capping levels positively correlated with distal
poly(A) site usage (Supplementary Fig. [147]6f). NCIN capping
represents an evolutionarily conserved RNA
modification^[148]5,[149]12,[150]27,[151]44,[152]45. To explore the
potential impact of sequence conservation, we utilized PhastCons
scores, which measure conservation based on sequence
similarity^[153]46, to calculate the conservation score of each
isoform. Our findings indicated that transcripts with low sequence
conservation exhibited less metabolite capping, whereas transcripts
with high sequence conservation tended to be highly modified
(Supplementary Fig. [154]7a). Prior research has indicated that
proximal RNA secondary structures can affect RNA capping^[155]47.
Therefore, we analyzed sequence features within 100 bp downstream of
the transcription start site (TSS), assessing potential structural
influences based on minimum free energy (MFE)^[156]48. Our analysis
revealed that RNAs with NCIN caps tended to adopt predicted structures
with lower MFE than those with m^7G caps. Furthermore, RNAs with higher
NCIN capping exhibited even lower MFE levels (Supplementary
Fig. [157]7b, c).
NCIN capping on transcript stability and translation efficiency
To investigate how NCIN capping affects RNA stability, we generated
Genome-wide Mapping of Uncapped and Cleaved Transcripts (GMUCT)^[158]49
datasets to assess mRNA degradation in mouse livers. All GMUCT
libraries exhibited read enrichment at the 3′ end of mRNAs
(Supplementary Fig. [159]7d), and GMUCT read abundance per transcript
was positively correlated with mRNA levels quantified by RNA-seq
(Supplementary Fig. [160]7e), which aligned with previously reported
degradome sequencing patterns^[161]50. Using the GMUCT and RNA-seq
data, we assessed transcript stability using the “proportion uncapped”
metric (log[2][GMUCT/RNA-seq]), where higher values suggest lower
stability. We analyzed the stability of RNA transcripts with and
without NAD^[162]24 and NCIN caps (CompasSeq data from the current
study), respectively. Our analysis revealed that NAD-capped RNAs in
mouse liver exhibited a higher proportion of 5′ degradation
intermediates, suggesting greater instability compared to RNAs without
NAD modification (Supplementary Fig. [163]7f). These observations were
in line with findings in Arabidopsis, where NAD capping was found to be
associated with reduced RNA stability^[164]51. In a sharp contrast,
NCIN-capped RNAs from CompasSeq data showed a significantly lower
proportion of degradation intermediates (Supplementary Fig. [165]7g).
These data suggested that NCIN caps, as the compendium of NAD, FAD,
dpCoA caps, etc., tended to occur at transcripts with higher stability.
To examine the impact of NCIN capping on translational efficiency, we
calculated ribosome occupancy (RO) using available Ribo-seq and RNA-seq
data from mouse liver (GEO: [166]GSE123981^[167]52). RO, an estimate of
the translation efficiency, was computed as the log-ratio of Ribo-Seq
and RNA-Seq read counts. Our analysis indicated that NCIN capping was
positively correlated with translation efficiency (Supplementary
Fig. [168]7h). Together, these findings suggested that NCIN caps were
linked with transcripts exhibiting greater stability and higher
translation efficiency.
NCIN capping dynamics during aging
The number of genes harboring NCIN caps and their relative capping
levels exhibited a significant decrease during aging (Fig. [169]6a).
Moreover, we quantified the usage of metabolite caps by calculating the
weighted sum of NCIN capping levels normalized to transcript abundance.
Additionally, metabolite-capped RNA constituted ~8.3% of the mRNA
transcriptome in young animals, and this proportion declined to 6.5% in
aged mice, though the difference was not statistically significant
(Fig. [170]6b).
Fig. 6. Features of NCIN-capped epitranscriptome.
[171]Fig. 6
[172]Open in a new tab
a Violin plot showing the global dynamics of NCIN-RNAs during aging.
Box plots represented the median (center line), the mean (red dots),
the first and third quartiles (box limits), and the whiskers extended
to 1.5 times the interquartile range. Data points beyond the whiskers
were shown as outliers. Statistical analysis was performed using the
nonparametric Mann–Whitney rank test (two-sided). b Pie chart showing
the percentile usage of RNA caps (NCIN caps and non-NCIN cap) between
young and old mouse livers. Statistical analysis was performed using a
chi-squared test (p = 0.83, two-sided). c Scatterplot showing NCIN
capping levels in young and old mouse livers. Differentially modified
genes were defined as absolute fold change of NCIN capping levels ≥ 1.5
in old compared to young samples. d Scatterplot showing
log[2]-transformed normalized gene expression levels in young and aged
mouse livers. Differentially expressed genes were defined as absolute
fold change of normalized read counts ≥ 1.5 and adjusted p
values < 0.05 in old compared to young animals. e Upset plot showing
the number of intersections between differentially modified genes and
differentially expressed genes. The size of each set was shown in the
left bottom corner. f Heatmap showing the log[2]-transformed normalized
gene expression levels and NCIN capping levels of selected genes. Genes
with age-decreased NCIN capping were colored in blue, while the gene
with age-increased NCIN capping was colored in red. g Circular lollipop
chart showing the transcriptomic and epitranscriptomic pathway score of
each pathway. The pathway score was defined as the log[2]-transformed
average of fold change between normalized read counts or NCIN capping
levels derived from transcriptome or epitranscriptome profiles from old
and young animals, respectively. Statistical analyses were performed
using the nonparametric Mann–Whitney rank test (two-sided). Pathways
with significant alterations during aging (p < 0.05) were marked with
red asterisk. h Box plot showing the log[2]-transformed normalized gene
expression levels and NCIN capping levels of pathways with significant
alterations during aging (n = 3 samples). Box plots represented the
median (center line), the first and third quartiles (box limits), and
the whiskers extended to 1.5 times the interquartile range. Data points
beyond the whiskers were shown as outliers. Statistical analyses were
performed using the nonparametric Mann–Whitney rank test (two-sided).
Exact p values and source data are provided as a [173]Source Data file.
We further investigated the impact of age on NCIN capping. Our analysis
revealed that 23.3% of genes producing NCIN-capped isoforms exhibited a
uniform decrease in capping levels with age (e.g., Bri3), while only
0.5% showed a consistent increase (e.g., Ghitm; Supplementary
Fig. [174]8a). Additionally, 5.8% of genes with NCIN-capped isoforms
conferred discordant changes with age, with some isoforms increasing
while others decreasing in capping levels (e.g., Spryd7; Supplementary
Fig. [175]8a, b). Notably, specific isoforms could even lose their
metabolite caps with age (e.g., Comt; Supplementary Fig. [176]8b).
We conducted a comparative analysis between the epitranscriptome and
transcriptome (Fig. [177]6c, d and Supplementary Data [178]5).
Strikingly, most genes that exhibited altered NCIN capping with age
remained unchanged at the RNA level (Fig. [179]6e). We further devised
a scheme to filter NCIN-RNAs based on low variation, high modification,
and high dynamics (Supplementary Fig. [180]9a). This approach allowed
us to pinpoint five NCIN-capped genes that had the most significant
changes during aging. Four genes showed decreased NCIN capping,
including those involved in diphosphate hydrolysis (Nudt5), the
cAMP-dependent pathway to Golgi (Pde4dip), lipid transport (Stoml1),
and cytosolic sulfonation (Sult1c2), while one gene, associated with
selenoprotein production (Selenot), exhibited an increase in NCIN
capping (Fig. [181]6f).
To inspect the potential impact of metabolite capping during aging, we
performed pathway enrichment analysis and found that genes with
decreased modification were involved in translation, ribosome-related,
and mitochondrion-related events (Supplementary Fig. [182]9b and
Supplementary Data [183]6). Notably, transcriptome analysis showed that
genes with increased expression during aging were also involved in
translation and ribosome-related events, while genes with decreased
expression were enriched in lipid, glucose, steroid, and amino acid
metabolism (Supplementary Fig. [184]9b). These results reflected that
genes involved in basic cellular events conferred opposite changes in
transcriptome and NCIN-capped epitranscriptome during aging. We further
examined how the transcriptome and NCIN-capped epitranscriptome
impinged on aging pathways. Hallmarks of aging, including inflammation,
mitochondrion, proteostasis, cellular senescence, cell communication,
and transcription regulation, showed discordant trends between the
transcriptome and epitranscriptome with age (Fig. [185]6g).
Specifically, genes involved in immune system process, translation,
peptide metabolism, cell communication, and signal transduction had a
marked increase in expression during aging, while exhibiting a
significant decline in NCIN capping (Fig. [186]6h).
To investigate the impact of NCIN capping on RNA stability during
aging, we applied GMUCT datasets from young and aged mouse livers and
grouped the transcripts based on their modification status. Using
NAD-RNA data, a late-onset increase of NAD capping was significantly
associated with a larger proportion of uncapped transcripts in old
compared to young animals (Supplementary Fig. [187]9c). This finding
suggested a link between elevated NAD capping during aging with
increased RNA degradation. In contrast, we did not observe a
significant correlation between age-associated changes in NCIN capping
and RNA degradation (Supplementary Fig. [188]9d), consistent with RNA
expression data shown in Fig. [189]6e. Similarly, we analyzed Ribo-seq
datasets (GEO: [190]GSE123981) with NCIN capping and noted that
transcripts with altered modification of metabolite in aged mouse
livers were found to be associated with reduced translation efficiency
(Supplementary Fig. [191]9e).
In summary, we introduce CompasSeq, a strategy for the comprehensive
and quantitative ratio assessment of NCIN-capped RNAs in eukaryotic
organisms. Using CompasSeq, we elucidate the NCIN-capped
epitranscriptome, encompassing not only the types of RNAs capped by
NCIN but also their relative capping levels, from adult aging mouse
livers. We unveil distinct features between the transcriptome and
NCIN-capped epitranscriptome, suggesting that metabolite capping may
serve as a hidden layer of gene regulation that occurs with age.
Discussion
Nucleotide-containing metabolites, including NAD, NADH, FAD, and dpCoA,
can be incorporated at the 5′ terminus of RNAs, resulting in
NCIN-capped RNAs. Currently available methods are limited to the
identification of specific metabolite-modified RNAs, e.g.,
NAD-RNA^[192]9,[193]22,[194]24, FAD-RNA^[195]19 and
dpCoA-RNA^[196]18,[197]53,[198]54, and lack quantitative assessment of
the ratio between m^7G-capped and metabolite-capped transcripts. The
CapZyme-seq method has been used to identify NCIN-RNAs in
prokaryotes^[199]27; however, due to the prevalence of the m^7G cap in
eukaryotic transcriptomes and the absence of specifically designed
spike-ins as well as computational solutions, this method and others
are not applicable for investigating eukaryotic NCIN-RNAs, especially
their capping ratios. In this study, we present CompasSeq, an
analytical platform that integrates both experimental and computational
frameworks to simultaneously capture and quantify NCIN capping levels
at the epitranscriptome level in eukaryotes.
For the experimental part, we meticulously design enzymatic reactions,
selecting cap-specific enzymes and ordering treatments to generate a
monophosphate moiety at the 5′-end of NCIN-capped and all-capped RNAs,
respectively, followed by adapter ligation and PCR enrichment. To
minimize biases introduced during PCR amplification, we devise a
spike-in normalization strategy. Pre-treated exogenous RNAs are added
proportionally to adapter-ligated samples, departing from traditional
methods like ERCC in which spike-ins are added at the beginning of the
experiment^[200]55. This approach ensures an unbiased representation of
NCIN-capped RNAs that represent a minor subset of the total RNA pool.
Adding exogenous RNA spike-ins after enzymatic treatments but prior to
library construction captures discrepancies in spike-in proportions
between all-capped and NCIN-capped libraries, revealing PCR-induced
biases. For the computational part, we develop CompassAnalyzer, an R
package that applies stepwise normalization and empirical Bayes
shrinkage to accurately recover NCIN- and all-capped RNA. The CompasSeq
analytical platform enables quantitative assessment of the NCIN-capped
epitranscriptome, identifying not only the types of RNAs capped by NCIN
but also their relative capping levels. Moreover, we devise an
orthogonal method, the quantitative exoribonuclease reduction assay, to
independently validate CompasSeq. By harnessing the decapping activity
of dRai1 and the 5′–3′ exonuclease activity of Xrn1, this assay allows
gene-specific quantification of NCIN capping levels. However, this
assay is not yet feasible for sequencing-based epitranscriptome-wide
assessment, due to the lack of appropriate spike-ins to normalize
unwanted variation from enzymatic treatment. Altogether, we provide
compelling evidence to support the notion that CompasSeq is a reliable
approach for the identification and quantification of NCIN-RNAs at the
epitranscriptome level.
With CompasSeq, we reveal insight into NCIN-capped epitranscriptome.
First, we identify that over 45% of the transcripts from mouse liver
are modified with metabolite caps, with most displaying capping ratios
between 5 and 30%. Second, we determine that metabolite caps constitute
between 6 and 8% of mRNA pool from adult mouse liver tissues,
representing a significant type of RNA modification. A previous
analysis using mass spectrometry has estimated that NAD and FAD
combined make up ~0.8% in the mRNA transcriptome, rising to 3.7% when
including NAD, FAD, UDP-Glc and UDP-GlcNAc caps^[201]13. Our current
measurements, based on a weighted sum of a near-complete spectrum of
NCIN caps normalized to gene abundance, provide an improved overview of
metabolite usage as caps in the mRNA transcriptome. Third, we observe
that select genes, such as Rnr1 and Rnr2 with 85.4% and 91.1% of NCIN
capping, respectively (see Fig. [202]4a), display significant NCIN
compared to m^7G capping. Particularly, these two genes are ribosomal
RNAs encoded and localized in mitochondria. It has been reported that
nucleotide-containing metabolites, e.g., NAD^[203]56 and FAD^[204]57,
are highly abundant in mitochondria and that RNA polymerase in
mitochondria exhibits more efficiency for NCIN capping compared to its
counterpart in cytoplasm^[205]44. Hence, it is tempting to predict that
such high NCIN modification of Rnr1 and Rnr2 may have a functional
relevance. Fourth, in young mouse livers, we demonstrate that the
combined effect of NCIN capping, despite negative regulation of NAD
cap, would promote RNA stability. Similarly, NCIN capping positively
correlates with translation efficiency, though the contribution of
specific metabolite caps warrants further investigation. Fifth, we
delineate the dynamics of NCIN-capped epitranscriptome during aging,
revealing a significant decrease at both global and transcript levels.
Though a majority of transcripts with altered NCIN capping remain
unchanged at the transcriptome level, we observe a discordant trend
between RNA expression and NCIN capping on genes linked to aging
hallmarks, e.g., inflammation, mitochondrial function, proteostasis,
cellular senescence, and intercellular communication^[206]30. Moreover,
in aged mouse livers, NCIN capping appears to have confounding effects
on RNA stability and translation efficiency. It is possible that
different types of metabolite caps, when combined with their dynamic
changes with alterations in cellular machineries of RNA transcription
and protein synthesis that occur with age, may exert distinct outcomes
on RNA stability and translation. Taken together, CompasSeq reveals
features of the NCIN-capped epitranscriptome, suggesting a
yet-to-be-discovered regulatory mechanism.
Finally, we raise important issues and potential limitations of our
current study. First, synthetic NCIN-RNAs by in vitro transcription,
such as NAD-RNA, FAD-RNA, and dpCoA-RNA, are difficult to achieve 100%
purity, a phenomenon that has been seen in the current study (see
Fig. [207]2a, b) as well as by others^[208]22,[209]36,[210]58, such
that these synthetic NCIN-RNA spike-ins can be useful as internal
quality controls, but they are not appropriate for normalization
procedures. Consequently, CompassAnalyzer, the computational procedures
of CompasSeq, solely rely on exogenous Drosophila total RNAs for
stepwise normalizations. While our strategy effectively corrects PCR
amplification bias, it does not control for variability introduced at
the earlier steps. Thus, the CompasSeq platform provides an
approximate, but not absolute, ratio assessment of NCIN-RNAs. Second,
we apply stringent filtering criteria for quality data; however, this
may exclude genes with low expression. Third, CompasSeq can detect
transcript with very low modification, such as those equal to or lower
than 1% capping by metabolite, but their measurement of exact ratio
usually exhibits high variation (see Fig. [211]3d). Fourth, we leverage
the magnitude of fold change in NCIN capping levels between young and
aged animals; yet quantitative assessment of age-associated alterations
in NCIN-capped epitranscriptome is only beginning to be revealed.
Future studies that focus on developing statistical methods for
differential analysis of NCIN-RNA sequencing data would facilitate the
robustness of detecting differential NCIN-RNAs in response to
physiological and even pathological conditions. Fifth, although the use
of pyrophosphatase, such as NudC, is effective in identifying bulk
NCIN-RNAs, it cannot distinguish specific types of metabolite caps.
Future advancements, such as employing specific RNA decapping enzymes
like AtNUDX23, which selectively cleaves FAD caps^[212]19, will enable
ratio analysis for specific metabolite caps. Nevertheless, the
CompasSeq analytical platform provides a foundational framework for the
quantification and ratio assessment of metabolite-capped RNAs in
eukaryotes.
Methods
Ethics statement
All animal procedures have been reviewed and approved by the
Institutional Animal Care and Use Committee at the Chinese Academy of
Sciences and were in accordance with the Guide for the Care and Use of
Laboratory Animals of the Chinese Academy of Sciences. All efforts were
made to minimize the suffering of the animals.
Animal experimentation
C57BL/6 male mice were housed in an environmentally controlled room
under a 12:12 h light/dark cycle at 23 °C and were provided commercial
mouse chow food and water ad libitum. To profile the NCIN-capped
epitranscriptome, livers were dissected from both 2-month and
18-month-old C57BL/6 male mice and immediately frozen in liquid
nitrogen. To avoid confounding issues related to sex-specific gene
expression, we exclusively used male mice throughout the study. For the
collection of 3-day-old fly samples as exogenous spike-ins, newly
emerged male Drosophila melanogaster (BDSC #5905) were gathered and
aged for 72 h, after which all samples were promptly frozen in liquid
nitrogen.
In vitro transcription of NCIN-RNAs, and m^7G-RNA
RNA spike-ins were synthesized by T7 polymerase as previously
described^[213]24. The DNA templates were synthesized by Genewiz
(Supplementary Data [214]1). Importantly, adenine was used exclusively
as the initial nucleotide, with no additional adenine present in the
DNA template sequence. Initiating adenosine allowed the incorporation
of adenine-containing metabolites, e.g., NAD, FAD, and dpCoA, at the +1
position to form an NCIN cap. For the synthesis of NCIN- and m7G-capped
RNAs, ATP was excluded from the transcription reaction cocktail. For
annealing oligonucleotides, 10 μM DNA oligonucleotides were combined in
50 μL of 1× Cut Smart buffer (NEB, catalog: B7204S) and annealed by
heating to 95 °C for 5 min. The reaction mixture was then cooled down
to 4 °C at a ramp rate of 0.1 °C/s in a thermocycler (Eppendorf,
Germany). For in vitro transcription,10 μM of double-stranded DNA
templates were combined in 100 μL 1× transcription buffer (Thermo
Fisher Scientific, catalog: AM1334), along with 1 mM each of GTP, CTP
and UTP, and 1 mM ATP for ppp-RNA. For other capped RNA forms, the
reaction included 4 mM NAD (Macklin, catalog: N814818, for NAD-RNA),
4 mM FAD (Sigma-Aldrich, catalog: F6625, for FAD-RNA), 4 mM dpCoA
(Sigma-Aldrich, catalog: D3385, for dpCoA-RNA) or 4 mM m^7GpppA (NEB,
catalog: S1406S, for m^7GpppA-RNA). Additionally, 10 μL of T7 RNA
polymerase (Thermo Fisher Scientific, catalog: AM1334), 5% DMSO, 5 mM
DTT and 20 U of RNaseOUT (Takara Bio, catalog: 2313B) were added to the
mixture, which was then incubated at 37 °C for 4 h. Following
transcription, the reaction mixture was treated with 10 U of DNase I
(Thermo Fisher Scientific, catalog: AM1334) at 37 °C for 30 min to
remove the DNA template. The RNA was then purified using RNAiso Plus
(Takara Bio, catalog: 9108) and precipitated with isopropanol and 0.3 M
sodium acetate (Thermo Fisher Scientific, catalog: AM9740) at −80 °C
overnight. The RNA pellet was washed twice with 75% ethanol, air-dried,
and redissolved in DEPC-treated H[2]O. The 500 nt IVT RNA molecules
were subjected to polyadenylation for poly(A) tail elongation, in
accordance with the manufacturer′s instructions (E. coli Poly(A)
Polymerase, TransGen, catalog: LA101-01). The quality of the in
vitro-transcribed 38 nt RNAs was further analyzed by boronate affinity
electrophoresis with an 8 M urea, 1× TBE, 12% polyacrylamide gel
supplemented with 0.3% 3-acrylamidophenylboronic acid^[215]37.
Purification of Drosophila dRai1 protein
The coding sequence of Drosophila dRai1 gene was inserted into the
pET28a expression vector, fused with a hexa-His tag at the C-terminus.
The plasmid was transformed into the E. coli BL21 (DE3) strain
(TransGen, catalog: CD601-02). Bacteria were cultured at 37 °C until
the OD[600] reached 0.8, after which protein expression was induced
with 0.5 mM IPTG and at 18 °C for 18 h. The recombinant proteins were
purified using a Ni Focurose FF (IMAC) column (Huiyan Bio, catalog:
HQ060312001L). The purified protein was stored in a buffer containing
20% (v/v) glycerol and 2 mM DTT at −80 °C (Supplementary Fig. [216]1).
NudC, dRai1, RppH, and yDcpS enzymatic reactions
For assays shown in Fig. [217]2c–e, 1 µg of either NCIN- or m^7G-RNA
(38 nt) spike-ins was used. 10 pmol of NudC (NEB, catalog: M0607S) was
mixed with RNA in 40 μL of NEBuffer r3.1 at 37 °C for 1 h in the
presence of 20 U of RNaseOUT and 200 mM DTT. For assays shown in
Figs. [218]2c, [219]5b, 0.5 µL of purified dRai1 (final concentration:
340 nM) was incubated with RNAs in 20 μL of NEBuffer r3.1 at 37 °C for
1 h in the presence of 10 U of RNaseOUT. The RppH reaction was
performed by adding 5 U of RppH (NEB, catalog: M0356) to RNAs,
resuspended in 50 μL of 1× ThermoPol Reaction Buffer (NEB, catalog:
B9004S) and incubated at 37 °C for 1 h in the presence of 20 U of
RNaseOUT. Excess yDcpS enzyme (400 U, NEB, catalog: M0463S) was
incubated with RNAs in 50 μL of 1× yDcpS reaction buffer at 37 °C for
1 h in the presence of 20 U of RNaseOUT. The products were purified
with RNAiso Plus and analyzed by 12% polyacrylamide UREA-PAGE gels.
Gels were stained by SYBR Gold (Invitrogen, catalog: S11494) and
fluorescence was detected by the Fusion Solo S imaging system (Vilber).
Enzymatic treatment and adapter ligation of NAD-RNA (38 nt) and m^7G-RNA (38
nt)
To assess the capturing specificity for assays shown in Fig. [220]2f,
1 µg of NAD-RNA (38 nt) was decapped with 10 pmol of NudC as described
above. The products were purified using RNAiso Plus and then ligated
with 100 pmol of 5′ adapter oligo (27 nt, Supplementary Data [221]1,
synthesized by Genewiz), in the presence of 20 U of T4 RNA ligase 1
(NEB, catalog: M0202), 15% of PEG8000, 10% DMSO, 1 mM ATP and 20 U of
RNaseOUT in 40 μl of 1× T4 RNA ligase buffer at 16 °C for 16 h.
Decapping of 1 µg of m^7G-RNA (38 nt) was carried out using an excess
of yDcpS enzyme as described above. Products were purified with RNAiso
Plus and further incubated with 10 U of CIAP (Thermo Fisher Scientific,
catalog: 18009019) in the presence of 20 U of RNaseOUT in 1×
Dephosphorylation Buffer at 37 °C for 1 h to remove the 5′-terminal
phosphate from yDcpS-treated RNAs. After purification with RNAiso Plus,
the CIAP-treated RNAs were ligated with 100 pmol of 5′ adapter oligo
(27 nt) as described above.
Decapping of 1 µg of m^7G-RNA (38 nt) was performed using 10 U of RppH
in the presence of 20 U of RNaseOUT in 50 μL of 1× ThermoPol Reaction
Buffer at 37 °C for 90 min. Following purification with RNAiso Plus,
the RppH-treated RNAs were ligated with 100 pmol of 5′ adapter oligos
(27 nt) as described above. The ligation products were then purified
with RNAiso Plus and analyzed using 8% polyacrylamide UREA-PAGE gels.
qPCR analysis of NCIN-RNAs (500 nt) and m^7G-RNA (500 nt)
For assays shown in Fig. [222]3a, b, polyadenylated spike-in RNAs were
generated from different sequence templates, including 1 ng each of
NAD-, FAD-, dpCoA-, m^7G-, and ppp-RNA (500 nt). These spike-ins were
mixed with 20 μg of total RNAs from 2-month C57BL/6 mouse livers
(n = 3) and then subjected to CompasSeq enzymatic reactions. For
procedures capturing NCIN-RNAs, RNAs were incubated with excess yDcpS
enzyme (400 U) in the presence of 20 U of RNaseOUT in 50 μL of 1× yDcpS
reaction buffer at 37 °C for 1 h. The products were purified using
RNAiso Plus and subsequently incubated with 10 U of CIAP as described
above. For procedures capturing all-capped RNAs, RNAs were incubated
with 10 U of CIAP to remove pre-existing 5′-phosphate moieties. The
purified products were then incubated with 10 U of RppH in the presence
of 20 U of RNaseOUT in 50 μL of 1× ThermoPol Reaction Buffer at 37 °C
for 90 min.
Purified products from two procedures were incubated with 10 pmol NudC
in the presence of 20 U of RNaseOUT and 200 mM DTT in 40 μL of NEBuffer
r3.1 at 37 °C for 90 min. Products were purified with RNAiso Plus and
subsequently ligated with 100 pmol 5′ adapter oligo (38 nt,
Supplementary Data [223]1, synthesized at Genewiz), in the presence of
20 U of T4 RNA ligase 1, 15% of PEG8000, 10% DMSO, 1 mM ATP, and 20 U
of RNaseOUT in 40 μL of 1× T4 RNA ligase buffer at 16 °C for 16 h.
Following purification, products were re-dissolved in 20 μl of
DEPC-treated H[2]O. Reverse transcription was performed using designed
primers listed in Supplementary Data [224]1 with the ABScript II cDNA
First-Strand Synthesis Kit (ABclonal, catalog: RK20400). 5 μL of the
products were diluted to 100 μL and served as input. Meanwhile, 5 μL of
reverse-transcribed products were further amplified by PCR using
adapter-specific primers listed in Supplementary Data [225]1, in
accordance with the manufacturer′s instructions (Phanta® Max
Super-Fidelity DNA Polymerase, Vazyme, catalog: P505-d1). The PCR
program included an initial denaturation step of 3 min at 95 °C,
followed by 18 cycles of amplification (denaturation for 15 s at 95 °C,
annealing for 15 s at 60 °C, and extension for 6 min at 72 °C), and a
final extension for 5 min at 72 °C. The ppp-RNA (500 nt) was used as
the baseline control. qPCR was conducted using SYBR Green Fast qPCR Mix
(ABclonal, catalog: RK21203) in accordance with the manufacturer′s
protocol with primers listed in Supplementary Data [226]1. To calculate
the enrichment from qPCR data, the Ct value of the target gene was
first normalized to the Ct value of the ppp-RNA (baseline). Next, the
normalized PCR+ fraction value (ΔCt of the target gene normalized to
the ppp-RNA) was normalized to the background (ΔCt calculation for the
gene in the input) to yield the ΔΔCt value. The linear conversion of
this ΔΔCt resulted in the fold enrichment. Statistical significance was
assessed using Student′s t test.
CompasSeq protocol
RNA extraction and spike-in addition
For the assays shown in Fig. [227]3c, total RNAs (20 μg) from 2-month
C57BL/6 mouse liver (n = 3) were mixed with 0.014 ng of 3′-end
polyadenylated spike-in RNA (500 nt) containing 100% m^7G-RNA, 5%
NAD-RNA/95% m^7G-RNA, and 5% dpCoA-RNA/95% m^7G-RNA, respectively. For
the assays shown in Fig. [228]3d, total RNAs (20 μg) from 2-month
C57BL/6 mouse liver (n = 3) were mixed with 0.014 ng of 3′-end
polyadenylated spike-in RNA (500 nt) that had 0% NAD-RNA/100% m^7G-RNA,
5% NAD-RNA/95% m^7G-RNA, 10% NAD-RNA/90% m^7G-RNA, 15% NAD-RNA/85%
m^7G-RNA, and 20% NAD-RNA/80% m^7G-RNA, respectively. For the assays
shown in Figs. [229]4 and [230]6, total RNAs were extracted from three
biological replicates of 2-month and 18-month-old C57BL/6 mouse
samples. A total of 20 μg of RNA was mixed with 0.014 ng of 3′-end
polyadenylated spike-in RNA (500 nt) containing 0% NAD-RNA/100%
m^7G-RNA, 1% NAD-RNA/99% m^7G-RNA, 5% NAD-RNA/95% m^7G-RNA, and 20%
NAD-RNA/80% m^7G-RNA, respectively.
Enzymatic treatments and adapter ligation
The mixture of total RNAs and spike-in RNA (500 nt) underwent CompasSeq
sequential enzymatic reactions following the protocols as described in
“qPCR analysis of NCIN-RNAs (500 nt) and m^7G-RNA (500 nt)” part. We
incubated the RNAs at 65 °C for 5 min before ligation to prevent RNA
secondary structure interference. To capture PCR amplification bias,
barcoded adapters were first introduced into the 5′-end of spike-in
total RNA from Drosophila using the enzymatic procedures for capturing
all-capped RNAs. Subsequently, the adapter-ligated Drosophila spike-in
RNAs were then added to each sample at a mass ratio of 50:1. The mixed
RNA was subjected to reverse transcription followed by PCR
amplification.
Reverse transcription and amplification
Reverse transcription was performed following the protocols as
described in “qPCR analysis of NCIN-RNAs (500 nt) and m^7G-RNA (500
nt)” part. The PCR program included an initial denaturation step at
95 °C for 3 min, followed by 16 cycles of denaturation at 95 °C for
15 s, annealing at 60 °C for 15 s, and extension at 72 °C for 6 min,
with a final extension at 72 °C for 5 min.
Library preparation and sequencing
The purified PCR products were used for NGS library construction
following the manufacturer′s instructions (AFTMag NGS DNA Clean Beads,
Abclonal, catalog: RK20257; TruePrep® DNA Library Prep Kit V2 for
Illumina, Vazyme, catalog: TD501). Library quality was assessed by the
Bioanalyzer 2100 (Agilent, United States), and quantification was
performed by RT-qPCR with reference to a standard library. Libraries
were pooled in equimolar amounts to a final concentration of 2 nM and
denatured with 0.1 M NaOH (Sigma, catalog: 72068). Sequencing was
performed on the Illumina NovaSeq 6000 system (paired-end, 150 bp).
High-throughput sequencing data analysis
All sequencing reads were processed using Trim Galore (v0.6.6)^[231]59
with the parameters “--nextseq 30 --paired” to remove the adapter
sequences (AGATCGGAAGAGC) from NovaSeq platforms. Reads longer than 20
bp were retained. Quality-controlled reads were then mapped to the
combined genome of Mus musculus (GRCm39) and Drosophila melanogaster
(dmel-all-chromosome-r6.36) using STAR (v2.7.6a)^[232]60 with default
parameters, respectively. Uniquely mapped read pairs were counted
against merged annotations from Mus musculus (Ensembl: GRCm39.104) and
Drosophila melanogaster (Flybase: dmel-all-r6.36) using featureCounts
(v2.0.1)^[233]61 with parameters “-p -B -C” and summarized as
gene-level counts. Transcript abundance was calculated based on the
transcript FASTA files downloaded from Ensembl release-104
(Mus_musculus.GRCm39.cdna.all.fa.gz and
Drosophila_melanogaster.BDGP6.32.cdna.all.fa.gz) using Salmon
(v1.10.2)^[234]62 with default parameters. Sequencing saturation was
assessed by randomly subsampling the original libraries and examining
the corresponding changes in the number of mouse genome-derived genes
with more than 10 read counts. Read alignments were converted to bigwig
files by bamCoverage^[235]63 and visualized at selected genomic regions
by IGV (v2.16.0)^[236]64. Gene annotation information, such as
chromosome, gene types, and gene lengths, was retrieved from Ensembl
annotations (GRCm39.104). Visualization of the data was performed using
various R packages: violin plots, boxplots, bar plots, line charts, the
density plot, and scatter plots were generated with ggplot2
(v3.4.2)^[237]65; the upset plot was created with UpSetR
(v1.4.0)^[238]66; and the heatmap was plotted using ComplexHeatmap
(v2.14.0)^[239]67.
CompassAnalyzer
CompassAnalyzer was implemented in R and is publicly available at
GitHub ([240]https://github.com/thereallda/compassanalyzer). The
workflow comprised three main steps: (1) quality control, (2) stepwise
normalizations, and (3) NCIN capping level quantification. The detailed
steps were outlined below.
Quality control
The objective of quality control was to eliminate problematic or noisy
observations from downstream analysis. In this study, we used gene
filtering to ensure data quality. To keep well-detected genes across
samples, mouse genes with more than 29 counts in all sequencing
libraries were retained for subsequent analysis. Drosophila transcripts
with more than 29 counts in all sequencing libraries were kept for
normalization procedures. All ribosomal RNA-encoded genes and TEC genes
were excluded.
Stepwise normalizations
CompassAnalyzer implemented stepwise normalizations to adjust for
unwanted variations, enabling the quantitative assessment of the NCIN
capping level of individual transcripts. The goal of stepwise
normalizations was to recover the ratio of RNA abundance between
samples capturing NCIN-capped and all-capped RNAs. CompassAnalyzer
first employed global scaling normalization to adjust gene-level read
counts by library-wise scaling factors. Since equal amounts of
Drosophila RNAs were added as spike-ins to samples after adapter
ligation, subsequent amplification bias could be represented by the
discordance of spike-in counts. Library-wise scaling factors (f) were
generated based on read counts from spike-in using the trimmed mean of
M-values (TMM) normalization method from the R package edgeR
(v3.40.2)^[241]68. Particularly, CompassAnalyzer adjusted the
library-wise scaling factors based on the relative proportion of
spike-in read counts
[MATH:
pk*
:MATH]
. The proportion of spike-in read counts was defined as
[MATH:
pk=Yspi
ke−in,k<
/mi>Ys
ample,k<
/mi> :MATH]
, and the relative proportion of spike-in read counts of all-capped RNA
and NCIN-capped RNA libraries was defined as
[MATH:
pin,k*=<
msub>pin,k
mi>p¯in :MATH]
and
[MATH:
pen,k*=<
msub>pen,k
mi>p¯en :MATH]
, respectively. Here,
[MATH:
Yspik<
/mi>e−in,k
:MATH]
was the sum of spike-in read counts in library
[MATH: k :MATH]
and
[MATH:
Ysamp<
/mi>le,k :MATH]
was the sum of sample read counts in library
[MATH: k :MATH]
,
[MATH: p¯in :MATH]
and
[MATH: p¯en :MATH]
were the average proportion of spike-in read counts for all-capped RNA
and NCIN-capped RNA libraries, respectively. Therefore, the scaled read
counts
[MATH:
Ygk
* :MATH]
for gene
[MATH: g :MATH]
in library
[MATH: k :MATH]
can be defined by
[MATH:
Ygk
*=Ygk/(fk*pk*) :MATH]
, where
[MATH:
Ygk
:MATH]
was the raw read counts for gene
[MATH: g :MATH]
in library
[MATH: k :MATH]
. Next, CompassAnalyzer introduces the adjustment factors to adjust
gene-level read counts. Genes with
[MATH:
Yen,gk*/Yin,g
mi>k′* :MATH]
exceeding 1 in at least one sample were denoted as the gene set
[MATH: Gc :MATH]
by default. For each sample
[MATH: j :MATH]
, the adjustment factors can be estimated as
[MATH:
rj=1NGc*∑(log2Yen,
Gck*−log2Yin,Gck′*) :MATH]
, where
[MATH:
NGc
:MATH]
is the number of genes in gene set Gc,
[MATH:
Yen,k* :MATH]
and
[MATH:
Yin,k′
mrow>* :MATH]
were the scaled read counts from the paired NCIN-capped RNA library
[MATH: k :MATH]
and all-capped RNA library
[MATH: k′
:MATH]
, respectively.
To further reduce the unwanted variation, CompassAnalyzer implemented
an RUV strategy for unwanted variation estimation. Briefly, genes from
spike-in that have constant expression between NCIN-capped RNA
libraries and all-capped RNA libraries were used to derive the unwanted
variations. RUV was then applied to estimate the unwanted factors
matrix W and its associated nuisance parameters α as previously
described^[242]34,[243]35. Finally, for a chosen k (1 ≤ k ≤ j), the
normalized read counts can be defined by
[MATH:
Z=Y*−W^kα^k−2rj<
/mi> :MATH]
.
NCIN capping level quantification
The NCIN capping level (
[MATH: Q :MATH]
) of gene
[MATH: g :MATH]
in sample
[MATH: j :MATH]
could be calculated as:
[MATH:
Qgj=Zen,gkZen,gk′ :MATH]
. Genes with low read counts tend to suffer from a high variability,
which lead to inflated estimates of NCIN capping ratios. To obtain
stabilized NCIN capping estimates, CompassAnalyzer incorporated
empirical Bayes shrinkage of NCIN capping ratio using ashr R
package^[244]69. For subsequent analysis, only NCIN-RNAs with a
coefficient of variation below 0.3 were retained.
Gene-specific quantification of NCIN capping level by qPCR
Following CompasSeq enzymatic reactions, reverse transcription, and PCR
enrichment, the resulting PCR products were used for gene-specific
quantification of NCIN capping levels via qPCR. The qPCR was conducted
using primers listed in Supplementary Data [245]1. GAPDH1 from
Drosophila was used as the baseline internal control. To calculate the
relative enrichment from RT-qPCR data, the Ct value of the target gene
was first normalized to the Ct of the GAPDH1. The normalized value from
the epitranscriptome capture procedure (ΔCt of the target gene
normalized to GAPDH1) was then normalized to the background (ΔCt
calculation for the gene in the transcriptome capturing procedure),
yielding the ΔΔCt value. The linear conversion of this ΔΔCt resulted in
the fold enrichment.
Quantitative exoribonuclease reduction assay
For assays depicted in Fig. [246]5c, reactions with dRai1 and Xrn1 were
performed by adding 0.5 µL of dRai1 (final concentration: 340 nM) and
2 U of Xrn1 (NEB, catalog: M0338) to 1 µg of either NAD- or m^7G-RNA
(106 nt), resuspended in 20 μL of 1× NEBuffer r3.1 and incubated at
37 °C for 1 h with 10 U of RNaseOUT. Products were analyzed by a 12%
polyacrylamide UREA-PAGE gel. For assays shown in Fig. [247]5e, 500 nt
spike-in RNAs generated from different sequence templates, including
1 ng each of NAD-, FAD-, dpCoA-, m^7G-, and ppp-RNA, were mixed with
1 μg of total RNAs from 2-month C57BL/6 mouse livers (n = 3), followed
by treatment with dRai1 and Xrn1 or mock treatment. RT-qPCR was
performed as described above, using primers listed in Supplementary
Data [248]1. ppp-RNA was used as the baseline internal control. To
calculate the relative levels, the Ct value of the target gene was
first normalized to the Ct of the ppp-RNA. The normalized value from
the treatment with dRai1 and Xrn1 (ΔCt of the target gene normalized to
the ppp-RNA) was then normalized to the background (ΔCt calculation for
mock treatment), yielding the ΔΔCt value. For assays shown in
Fig. [249]5f, 500 nt spike-in RNAs generated from different sequence
templates, including 1 ng each of 0% NAD-RNA/100% m^7G-RNA, 10%
NAD-RNA/90% m^7G-RNA, 30% NAD-RNA/70% m^7G-RNA, 50% NAD-RNA/50%
m^7G-RNA, 70% NAD-RNA/30% m^7G-RNA, 90% NAD-RNA/10% m^7G-RNA, and
ppp-RNA, were mixed with 1 μg of total RNAs from 2-month C57BL/6 mouse
livers (n = 3), followed by dRai1 and Xrn1 or mock treatment.
RNA alternative TSS, splicing, and polyadenylation analysis
TSS information of RNAs was extracted using the refTSS dataset
(refTSS_v4.1_mouse_coordinate.mm39.bed.gz)^[250]70. Genes with multiple
TSSs were classified as exhibiting alternative TSS usage. AS events
were identified using rMATS (v4.1.2)^[251]71 with BAM files from
transcriptome-capturing samples. Event counts included reads spanning
junctions defined by rMATS and reads not crossing an exon boundary.
Splicing events were filtered based on coverage, retaining only those
with more than zero read counts. For APS analysis, DaPars2^[252]72 was
applied to transcriptome-capturing samples to calculate the percentage
of distal PDUI for individual NCIN-capped RNA. Transcripts with a PDUI
greater than 0.3 were retained for further analysis.
Analysis of conserved sequences and RNA secondary structure
Conservation analysis was performed using PhastCons scores
(phastCons35way.UCSC.mm39) obtained from the R package GenomicScores
(v2.10.0)^[253]46. The conservation score for each NCIN-capped isoform
was calculated by averaging the PhastCons scores across the 5′
untranslated region (UTR) of the transcript. For each transcript with
5′ UTR, the sequence of 100 bp region downstream of TSS was used to
calculate RNA MFE by ViennaRNA with default parameters^[254]48. To
generate a background distribution of MFE, transcripts not containing
NCIN caps were randomly sampled 1000 times from the whole transcriptome
(after subtracting NCIN-RNAs) with a similar size (n = 5000) to
transcripts containing NCIN caps (n = 5462).
GMUCT
GMUCT libraries were constructed and sequenced for 2-month-old and
18-month-old mouse livers as described by Endres M et al.^[255]49, with
the following modifications: In brief, poly(A) + RNA was first isolated
from total RNA, followed by 5′ adapter ligation to capture uncapped RNA
molecules. The ligated RNA was then reverse transcribed, and PCR
amplification was performed using primers complementary to the adapter
sequences. The resulting cDNA was fragmented and used for final library
construction. The resulting libraries were sequenced on the Illumina
NovaSeq 6000 system (paired-end, 150 bp). Raw reads on each gene were
counted by featureCounts and normalized using the TMM normalization
method from the edgeR package. “Proportion uncapped” for each gene was
calculated as the log-ratio of TMM-normalized read counts between GMUCT
and RNA-seq. To further investigate the impact of NCIN-capping on RNA
stability during aging, we grouped the transcripts based on age-related
changes in their modification status.
Ribosome profiling data analysis
Raw counts from [256]GSE123981^[257]52 were normalized using the TMM
normalization method implemented in the edgeR package. RO for each gene
was calculated as the log-ratio of TMM-normalized read counts between
Ribo-seq and RNA-seq. To investigate the correlation between RO and
NCIN capping during aging, we calculated the fold change in RO between
the 20-month-old and 3-month-old samples and grouped the transcripts
based on age-related changes in their modification status.
Pathway enrichment analysis
Pathway enrichment analysis was performed using the R package
gprofiler2 (v0.2.1)^[258]73. Pathways were defined based on the
molecular pathways of Reactome, the biological process of GO, and the
Kyoto Encyclopedia of Genes and Genomes database. Enriched gene sets
were constrained to sizes ranging between 5 and 350 genes. Multiple
testing corrections were performed using gprofiler2’s built-in “g_SCS”
method, with terms having an adjusted p value of less than 0.05
considered significantly enriched. The top terms were ranked by the
smallest adjusted p values in each set and were selected for
visualization.
Analyses of differentially expressed genes and differential NCIN capping
genes and isoforms
Differentially expressed genes were analyzed from datasets of
all-capped RNAs between 2-month-old (young) and 18-month-old (old)
mouse livers using the edgeR package. The significance of logarithmic
fold changes was determined by a likelihood ratio test to approximate p
values, with genes adjusted for multiple testing using the Bonferroni
procedure to yield a false discovery rate (FDR). Differential
expression genes were defined as those with an absolute fold change in
normalized read count ≥ 1.5 and FDR < 0.05 in old compared to young
samples (Supplementary Data [259]5). Differential NCIN capping genes
were defined as those with an absolute fold change in NCIN
capping ≥ 1.5 in old compared to young samples. Differential
NCIN-capped isoforms were defined as those with an absolute fold change
in NCIN capping ≥ 1.25 in old compared to young samples.
Statistics and reproducibility
No statistical methods were used to predetermine sample size. All data
were included in the analyses. The experiments were not randomized, and
investigators were not blinded to group allocation. RNA-seq experiments
were performed with three biological replicates. Unless otherwise
specified, p values were calculated using two-sided nonparametric
Mann–Whitney tests. All gel blots were independently repeated at least
twice with similar results.
Reporting summary
Further information on research design is available in the [260]Nature
Portfolio Reporting Summary linked to this article.
Supplementary information
[261]Supplementary Information^ (3.3MB, pdf)
[262]41467_2025_61697_MOESM2_ESM.pdf^ (123.6KB, pdf)
Description of Additional Supplementary Files
[263]Supplementary Data 1^ (15.7KB, xlsx)
[264]Supplementary Data 2^ (1,015.9KB, xlsx)
[265]Supplementary Data 3^ (23.3KB, xlsx)
[266]Supplementary Data 4^ (605.6KB, xlsx)
[267]Supplementary Data 5^ (824.9KB, xlsx)
[268]Supplementary Data 6^ (32KB, xlsx)
[269]Reporting Summary^ (2MB, pdf)
[270]Transparent Peer Review file^ (4.5MB, pdf)
Source data
[271]Source Data^ (11.1MB, xlsx)
Acknowledgments