Abstract
Terminally differentiated somatic cells can be reprogrammed into a
totipotent state through somatic cell nuclear transfer (SCNT). The
incomplete reprogramming is the major reason for developmental arrest
of SCNT embryos at early stages. In our studies, we found that pathways
for autophagy, endocytosis, and apoptosis were incompletely activated
in nuclear transfer (NT) 2-cell arrest embryos, whereas extensively
inhibited pathways for stem cell pluripotency maintenance, DNA repair,
cell cycle, and autophagy may result in NT 4-cell embryos arrest. As
for NT normal embryos, a significant shift in expression of
developmental transcription factors (TFs) Id1, Pou6f1, Cited1, and
Zscan4c was observed. Compared with pluripotent gene Ascl2 being
activated only in NT 2-cell, Nanog, Dppa2, and Sall4 had major
expression waves in normal development of both NT 2-cell and 4-cell
embryos. Additionally, Kdm4b/4d and Kdm5b had been confirmed as key
markers in NT 2-cell and 4-cell embryos, respectively. Histone
acetylases Kat8, Elp6, and Eid1 were co-activated in NT 2-cell and
4-cell embryos to facilitate normal development. Gadd45a as a key
driver functions with Tet1 and Tet2 to improve the efficiency of NT
reprogramming. Taken together, our findings provided an important
theoretical basis for elucidating the potential molecular mechanisms
and identified reprogramming driver factor to improve the efficiency of
SCNT reprogramming.
Introduction
Somatic cell nuclear transfer (SCNT) is a method to artificially
reprogram the terminally differentiated somatic cells into totipotency
embryos in addition to normal fertilization.[33]^1 Dr. John Gurdon
first demonstrated that animals can be cloned from the differentiated
frog somatic cells by SCNT;[34]^2 after 30 years of effort, the first
cloned mammal Dolly was also born,[35]^3 and more than 20 mammals were
successfully cloned using various types of somatic cells by SCNT.[36]^4
In addition to animal cloning, SCNT also has great potential in human
therapy. For instance, nuclear transfer embryonic stem cells (ntESCs)
derived from elderly and patient donor cells may serve as useful cell
sources for disease modeling in vitro and cell or tissue replacement
therapy.[37]^5^,[38]^6 Overall, these studies indicated that the future
development prospect for SCNT is attractive.
Despite the huge potential, several technical hurdles limit the
practical application of SCNT. One of the limitations is that the
survival rate of cloned animals is extremely low. In addition to cattle
and sheep with relatively high reproductive cloning efficiency
(5%–20%), the overall reproductive cloning efficiency of all other
species is very low (1%–5%);[39]^7 even with the successful generation
of SCNT blastocyst in rhesus monkeys, the blastocyst rate remained at
about 16%.[40]^8 Moreover, in almost all of the cloned mammalian
species, abnormalities in extra-embryonic tissue also have been
frequently observed.[41]^6
Many studies have shown that SCNT embryo developmental arrest may be
related to zygotic genome activation (ZGA) failure.[42]^5^,[43]^9
Moreover, epigenetic abnormalities also are one of the main causes for
abnormal development of SCNT embryos,[44]^10^,[45]^11 in which H3K9me3
prevents the development of SCNT embryos at the pre-implantation stage
in mice and humans by inhibiting the transcriptional activation of
ZGA-related regions.[46]^1^,[47]^12 Importantly, the heterotopic
expression of H3K9me3 demethylase Kdm4a/4b/4d helps to eliminate the
barrier of ZGA[48]^10 and to greatly promote the development of SCNT
embryos.[49]^1^,[50]^12^,[51]^13 Furthermore, the loss of the H3K27me3
imprinting genes may cause abnormal activation of Xist, which leads to
heterochromatin of the X chromosome,[52]^14 eventually hindering the
development of SCNT embryos after implantation.[53]^5^,[54]^15
Additionally, the correct regulation of some key genes in SCNT embryo
development is also crucial for SCNT embryo development. Previous
studies have identified a number of maladjusted genes in 2-cell SCNT
embryos of mouse,[55]^16 and the numbers and roles of the these genes
determine the fate of each cloned embryo.[56]^4 Therefore, comparative
transcriptome analysis based on single-cell sequencing[57]17, [58]18,
[59]19, [60]20, [61]21, [62]22, [63]23, [64]24 identifies reprogramming
driver factors that may be a feasible method to improve the efficiency
of cloning.
In our study, we first systematically analyzed the molecular
differences of embryos with diverse developmental fates; the results
show that the developmental fate of the embryos can be directly
reflected in the transcriptome level. Next, we found some abnormal
functional pathways in NT 2- and 4-cell arrest embryos, respectively.
Finally, we comprehensively identified the reprogramming driver factor
of SCNT embryos from transcription factors (TFs), pluripotent genes,
and epigenetic modification factors. The results of this study will
provide new insights into the molecules’ mechanisms of SCNT
reprogramming.
Results
Global Profile and Pattern Analysis
To verify the reliability of transcriptome data of in vitro
fertilization (wild-type [WT]) and NT embryos with different
developmental fates, we analyzed the correlation of expression levels
among the repeated samples within the group after quality control. The
correlation between each repeat sample reached 0.60 ([65]Figure 1B),
showing the biological replicates of each sample are highly
reproducible.
Figure 1.
[66]Figure 1
[67]Open in a new tab
The Dynamic Changes of the Global Transcription Profile
(A) Flowchart of this study. (B) The intra-class correlation
coefficients of expressed genes in normal and NT embryos. (C) The
cluster analysis of the embryos samples with the different
developmental fate. (D) The correlation of the embryo samples with a
different developmental fate.
Based on the cluster relationship ([68]Figure 1C) and correlation
([69]Figure 1D) of the samples with different developmental fates, we
found that the cells with the same development fate and stages can be
clustered together, among which WT Metaphase II (MII) Oocyte had the
closest relationship with WT Zygote ([70]Figure 1C), and the
correlation between them was also highest (r = 0.96) ([71]Figure 1D).
Interestingly, NT 2-cell Kdm4b/4d was close to WT 2-cell but far away
from NT 2-cell arrest. The same result was also observed in NT 4-cell,
indicating that histone demethylase can alter the developmental fate of
NT embryos. Moreover, WT Morula was greatly related to WT Blastocyst
(r = 0.79) ([72]Figure 1D), and both can form an independent
branch, showing that a unique developmental fate exists in Morula and
blastocyst. These results displayed that the developmental fate of the
embryos can be directly reflected in the transcriptome level, which can
be used for further screening of molecular markers.
To further explore the dynamics of gene expression for NT or WT embryos
with different developmental fate at 2-cell and 4-cell stages, we
performed 28 pairwise comparisons on a total of eight embryos to obtain
the differentially expressed genes (DEGs; q < 0.05, log2 (fold change)
[log2FC] > 1; [73]Figure S1A). Compared with MII Oocyte, the number of
DEGs in 2-cell and 4-cell had an increased wave, which may be related
to maternal-zygotic transition. There were fewer DEGs (290) between NT
Kdm4b 2-cell and NT Kdm4d 2-cell, and the NT Kdm4d 2-cell was closer to
WT 2-cell, which exhibited that both Kdm4b and Kdm4d functions have
crucial roles to overcome the H3K9me3 barrier during SCNT embryo
development.
ZGA Genes Identification
ZGAs in 2-cell embryos are believed to be important for embryonic
development;[74]^10 we identified 1,634 upregulated genes (p < 0.05,
FPKM [fragments per kilobase of exon model per million mapped reads] ≥
5; [75]Figure 2A) in WT 2-cell compared with the MII Oocyte.[76]^25
These genes showed a stage-specific expression pattern and could be
classified into six different groups ([77]Figure 2B). There was a total
of 492 genes in groups 1 and 2, and these genes were highly expressed
in 2-cell samples. We found these genes significantly enriched in
transcription, cell cycle, and cell-division-related biological
processes, implying that transcriptional mechanisms are gradually
established with the ZGA.
Figure 2.
[78]Figure 2
[79]Open in a new tab
Enrichment Analysis of Key Functional Pathways in NT and WT Two-Cell
Embryos
(A) Volcano plot comparing the gene expression profiles of WT 2-cell
embryo and MII Oocyte. Upregulated and downregulated genes are colored
red and green, respectively. (B) Analysis of pre-implantation embryos
transcript profiles and pathway enrichment. (C) Analysis of expression
profile correlation between NT 2-cell arrest and normal embryos. (D) GO
enrichment analysis of ZGA genes in 2-cell embryo with different
developmental fates. GO pathway enrichment degree using rich factor to
measure: the higher the rich factor value, the greater the enrichment
degree of this pathway. The boxplot shows the expression level of the
enrichment genes of four representative pathways.
We further explored whether ZGA occurred correctly in mouse SCNT
embryos; the result showed that the 2-cell embryos and MII Oocyte were
obviously distinguished, as expected. Surprisingly, the samples of NT
2-cell arrest and NT 2-cell to blast had similar expression patterns
([80]Figure 2C). Therefore, we identified upregulated genes in NT
2-cell arrest and NT 2-cell to blast embryos to dissect the potential
molecular differences among different embryos. These genes activated in
WT or NT 2-cell embryos were evidently enriched in the same five
pathways ([81]Figure 2D), that is, translation, positive regulation of
telomerase activity, ribosome biogenesis, blastocyst formation, and RNA
metabolic process, which was consistent in other studies.[82]26,
[83]27, [84]28, [85]29 The expression levels of genes related to four
of these pathways (except translation) ([86]Figure 2D) in WT or NT
2-cell embryos were significantly higher than MII Oocytes. Among which,
the gene expression level in WT 2-cell embryos was the highest and in
NT 2-cell arrested embryos was relatively low, suggesting that the
incomplete activation of these genes may lead to the abnormal
development of NT 2-cell embryos.
Molecular Markers Screening in 2- and 4-Cell Arrest Embryos
To identify key molecular markers of SCNT embryos with distinct
developmental potentials, we screened a total of 3,708 significantly
upregulated genes (fold change [FC] > 5, q < 0.05) in 2-cell samples
(WT 2-cell, NT 2-cell arrest, and NT 2-cell to blast). These genes
showed stage-specific expression patterns and could be clustered into
seven different groups ([87]Figure 3A). We identified 595 (group 1) and
421 genes (group 3) that were activated in only NT 2-cell to blast and
WT 2-cell embryos, respectively. Gene Ontology (GO) analysis showed
that these genes were related in apoptotic signaling pathways, positive
regulation of transcription metabolic processes, positive regulation of
telomerase activity, etc., which indicated that the cells have
undergone a great transition, in which a large number of maternally
stored RNAs are degraded rapidly and replaced by newly synthesized ZGA
RNAs.[88]^5
Figure 3.
[89]Figure 3
[90]Open in a new tab
Cluster Analysis and Molecular Markers Screening
(A) Cluster and GO annotation analysis of ZGA genes. (B) KEGG analysis
of NT 4-cell normal and WT 4-cell embryo upregulated genes; red markers
for both shared functional pathways. (C) NT 4-cell to blast embryos and
WT 4-cell embryos compared with NT 4-cell arrest embryo differentially
upregulated genes and the pathways were shown in the Venn diagram. (D )
GO analysis of upregulated genes in NT 4-cell to blast embryos; both
shared pathways in NT 4-cell to blast and WT 4-cell embryos were marked
in red. (E) GO analysis of upregulated genes in WT 4-cell embryos; both
shared pathways in NT 4-cell to blast and WT 4-cell embryos were marked
in red. (F) Expression level of genes enriched in NT 4-cell embryo
representative signaling pathway.
Based on the analysis above, we identified four key pathways of
apoptosis, autophagy, endocytosis, and DNA repair that were
significantly activated in NT 2-cell to blast compared with NT 2-cell
arrest embryos ([91]Figure 3A). The overall expression levels of these
pathway-related genes in NT normal embryos were significantly higher
than that in NT arrest embryos. Notably, the genes enriched in
autophagy,[92]^30 endocytosis, and apoptosis pathways were mainly
activated in NT 2-cell embryos, but the genes involved in DNA repair
were primarily activated in NT 4-cell embryos, indicating that the
temporal activation of these pathways is important in adapting the
transition of cell identity during reprogramming.
In addition to 2-cell arrest, 4-cell arrest is also observed in SCNT
embryos. We analyzed DEGs in 4-cell embryos with different
developmental fate (WT 4-cell, NT 4-cell arrest, and NT 4-cell to
blast). Moreover, 246 upregulated genes, 7 KEGG pathways, and 34 GO
biological processes were shared in WT or NT 4-cell to blast
([93]Figure 3C), in which some significant GO terms and representative
genes for these pathways were shown in [94]Figures 3D and 3E. We
further observed the overall expression levels of genes enriched in the
four key pathways, including stem cell pluripotency maintenance, DNA
repair, cell cycle, and autophagy ([95]Figure 3F). The expression
levels of NT 4-cell to blast and WT 4-cell embryos were significantly
higher (p < 0.001) than those of NT 4-cell arrest embryos, which showed
that the DEGs in these key pathways in NT embryos might be vital for
the normal development of NT 4-cell embryos. Besides, the seven most
prominent pathways contain ubiquitin-mediated proteolysis and
nucleotide excision repair and are more prominent in NT 4-cell to blast
than in WT 4-cell embryos ([96]Figure 3B), which may mean that NT
embryo reprogramming requires excessive activation of genes involved in
protein degradation and nucleotide repair.
Identification and Functional Annotation of TFs and Pluripotent Genes in NT
Embryos
In the process of SCNT, embryo-specific TFs play important roles in
activating embryo-specific expressed genes and maintaining the
expression of genes that normally develop embryos.[97]^31 Therefore, we
screened 97 and 34 TFs with differential expression in NT 2-cell and
4-cell normal embryos, respectively, and found these TFs to be involved
in four pathways both in 2-cell and 4-cell ([98]Figure 4A). Next, we
observed the specific expression levels of these TFs and found that
Id1, Pou6f1, Cited1, and Zscan4c as the representative TFs were highly
expressed in NT normal embryos compared with the NT arrest embryos
([99]Figure 4A), which was compatible with previous studies in which
these TFs performed an essential function in cell growth,
differentiation, and embryonic development.[100]32, [101]33, [102]34,
[103]35, [104]36 Moreover, the peaks of these three key TFs (except
Cited1) had significant differences between NT arrest and normal
embryos ([105]Figure 4E), indicating that they may be partially
responsible for the developmental arrest of NT embryos.
Figure 4.
[106]Figure 4
[107]Open in a new tab
The Dynamic Changes of Representative TFs
(A) The bar graph is the GO annotation pathway for upregulated TFs of
NT 2-cell normal embryos (top) and NT 4-cell normal embryos,
respectively (bottom). The same GO pathways of both were marked with
the same color. (B) The expression levels of pluripotency maintenance
genes in different fate embryos. (C) Genomic views of the transcript
level of representative pluripotency maintenance genes. (D)
Co-expression network analysis of pluripotent genes. The red line
indicates a strong correlation of gene expression levels, the blue line
indicates a weak correlation, and the size of the point represents the
normalized expression levels of genes. The boxplot on the right shows
the overall expression level of each group of genes. (E) Genome browser
view of three representative TFs, Zscan4c, Id1, and Pou6f1.
We next screened 17 pluripotency maintenance genes highly expressed in
NT 2-cell or 4-cell embryos ([108]Figure 4B); the expression level of
these pluripotent genes in NT normal embryos was higher than arrest
embryos, indicating that pluripotency maintenance genes may function as
positive factors in facilitating the developmental progress of NT
embryos. Furthermore, we analyzed the relative expression levels of
these genes in NT embryos and found that the key pluripotent gene Ascl2
was activated only in NT 2-cell, but some key genes, including Nanog,
Dppa2, and Sall4, were first activated in NT 2-cell and activated again
in NT 4-cell, which was in agreement with the peak distribution of
these genes at the transcriptional level ([109]Figure 4C).
Surprisingly, Dppa2 as a positive regulator of transcription of ZGA
genes has been proved.[110]^37 These results suggest that abnormal
expression of pluripotency maintenance genes in NT embryos is also
likely to be one of the important reasons for their developmental
arrest. The co-expression network analysis of these genes
([111]Figure 4D) showed that the different cooperative regulatory
relationships of pluripotent genes existed in different embryos,
in which Ascl2, Bmp7, and Ctnnb1 had high expression levels but weak
correlation in NT 2-cell to blast; but in NT 4-cell to blast, both the
expression levels and correlation of Nanog and Sall4 were high, and
Bmp7 and Ctr9 had high expression but weak correlation.
Comprehensive Analysis of the Key Epigenetic Modification Factors
To evaluate the effects of epigenetic modification factors on NT embryo
development, including DNA/RNA demethylase, histone demethylase, and
histone acetylase, we found that Tet1 and Tet2 were significantly
activated in NT 4-cell to blast and NT 2-cell to blast embryos,
respectively ([112]Figure 5A), suggesting that the activation of both
enzymes is a positive factor for the development of NT embryos.[113]38,
[114]39, [115]40 Intriguingly, the differential expression of these
genes was closely related to the silence of individual exons at the
transcription ([116]Figure 5A), in which the peaks on the last exon
differed greatly between NT 2- and 4-cell arrested and normal embryos.
Remarkably, Gadd45a as a key player in DNA demethylase activation was
found to have high expression in NT 2-cell and 4-cell, further
suggesting that it may function with Tet1 and Tet2 to promote NT embryo
development.[117]^41
Figure 5.
[118]Figure 5
[119]Open in a new tab
Identification and Excavation of Epigenetic Markers of Different Types
of Embryos
(A) The expression levels and genome browser view of genes related to
DNA demethylation. (B) The expression levels and genome browser view of
histone acetylases Kat8 and Kat2b. (C) The expression levels and genome
browser view of genes related to RNA methylation. (D) Co-expression
network analysis of representative epigenetic markers.
Histone methylation is a very important epigenetic modification in the
development of SCNT embryos.[120]^42 We screened a total of
eight histone demethylases, in which Kdm3a, Kdm7a, Kdm4b, and Kdm4d had
prominent high expression levels in normal NT 2-cell embryos. Previous
experiments had confirmed that SCNT embryos injected with Kdm4b/4d mRNA
developed to the blastocyst stage with high efficiency,[121]^25 which
was consistent with our results that SCNT 2-cell embryos injected with
Kdm4b/4d are closer to WT 2-cell embryos ([122]Figure 6A).
Interestingly, Kdm4d mRNA injection can greatly improve the blastocyst
rate of SCNT embryos to above 80% when cumulus cells, Sertoli cells, or
MEF cells were used as donor cells[123]^10 ([124]Figure 6F), so we
speculate that Kdm3a and Kdm7a may be candidate genes of NT 2-cell
development embryos. On the contrary, Kdm3b, Kdm5a, Kdm5b, and Ctnnb1
were significantly higher in NT 4-cell to blast, in which NT Kdm5b
4-cell embryos were closer to WT 4-cell embryos ([125]Figure 6A). The
result indicated that Kdm3b, Kdm5a, Kdm5b, and Ctnnb1 may be key
factors for NT 4-cell embryonic development. Next, we classed some
genes activated in NT 2-cell embryos into five groups ([126]Figure 6B)
and performed GO analysis of shared genes between different embryos (NT
2-cell and WT 2-cell, WT 4-cell, and NT Kdm5b 4-cell) ([127]Figures 6C
and 6E); the results further showed that Kdm4b/4d and Kdm5b as key
genes can properly rescue 2-cell and 4-cell arrest, respectively.
Figure 6.
[128]Figure 6
[129]Open in a new tab
Histone Methylation Is Important to the Development of SCNT Embryos
(A) The T-SNE clusters of 2-cell and 4-cell embryos with different
development potentials. Data are from Liu et al.[130]^25 (B) Clustering
and GO annotation analysis of NT Kdm4b/4d 2-cell embryos upregulated
genes. (C) Shared genes of group 1 in (B) and WT 2-cell and GO
analysis. (D) Shared gene of group 2 and groups 3–5 in (B) and WT
2-cell and GO analysis. (E) NT Kdm5b 4-cell and WT 4-cell shared genes
and GO analysis. (F) Kdm4d mRNA injection improves the development of
SCNT embryos when cumulus cells, Sertoli cells, and MEF are donor
cells. The percentage of embryos that reaches the indicated stages was
shown in figures (data are reanalyzed from Matoba et al.[131]^10).
Histone acetylation is also a crucial epigenetic modification during
development, in which co-knockdown of Hdac1 and Hdac2 (but not
individually) can result in the failure of preimplantation
development.[132]^43 We found 10 genes ([133]Figures 5B and 5D) with
the relevance of histone modification, in which Srcap and Kat2b
specifically activated in NT 2-cell to blast; Hdac4, Yeats4, Atxn7,
Arid4b, and Brca2 had a significant wave in NT 4-cell to blast; and
Elp6, Eid1, and Kat8 in both NT 2-cell and 4-cell normal embryos had
significantly high expression. The result indicated that the abnormal
expression of genes related to histone modification is one of the
important factors for the developmental failure of NT embryos.
Chemical modification of RNA is also an important epigenetic
modification that directly and rapidly manipulates the transcriptome to
regulate gene expression.[134]^44 In [135]Figure 5C, Zfp217 was highly
expressed in NT 2-cell normal embryos, but Mettl3 showed significantly
low expression, which was consistent with a previous study in which
Zfp217 can inactivate the N6-methyladenosine [m(6)A] methyltransferase
Mettl3 by arresting the methylation of these mRNAs to increase the
efficiency of reprogramming.[136]^45 The result showed that the low
expression of Zfp217 in NT 2-cell arrested embryos may be one of the
reasons for developmental arrest.
We further performed co-expression networks analysis of the above
identified key analytical markers ([137]Figure 5D) and found that
histone acetylation genes Elp6 and histone demethylation enzymes Kdm5a
both have a high expression level and strong correlation in WT embryos,
but in NT 2-cell to blast, Srcap and Eid1 related to histone
acetylation play the main roles in the normal development of NT 2-cell.
Moreover, the expression levels of Kdm5b and Kdm7a in NT 4-cell to
blast were significantly higher than in others, showing that Kdm7a may
function with Kdm5b to promote NT 4-cell embryo development.[138]^25 In
short, the results revealed that coordinate regulation of different
epigenetic modification factors is necessary for the normal development
of embryos.
Discussion
To elucidate the potential molecular mechanism of embryo development,
we systematically characterized the global transcriptomes profiles of
NT embryos and WT embryos with different developmental fates based on
the SCNT embryo biopsy system of a previous study,[139]^25 which
collectively suggests that the developmental fate of the embryos can be
directly reflected in the differences of transcriptome level.
Furthermore, compared with NT arrest embryos, some key functional
pathways activated in NT 2- and 4-cell normal embryos were
comprehensively investigated. As shown in [140]Figure 7, the genes
enriched in autophagy, endocytosis, and apoptosis pathways were
dramatically activated in NT 2-cell, but stem cell pluripotency
maintenance, DNA repair, cell cycle, and autophagy key pathways may
play important roles in the normal development of NT 4-cell embryos.
Figure 7.
[141]Figure 7
[142]Open in a new tab
The Probable Timing Activation Pathways and Reprogramming Signatures in
SCNT Embryos Identified by Comparative Single-Cell Transcriptomes
Genes marked red have been proved experimentally in previous studies.
In previous studies, Kdm4b/4d and Kdm5b have been confirmed to be key
genes in NT 2-cell and 4-cell embryos, respectively.[143]^10^,[144]^25
Strikingly, Kdm4d mRNA injection can significantly improve the
developmental efficiency of SCNT embryos, regardless of the donor cell
types.[145]^10 Numerous previous studies also have shown that the SCNT
reprogramming process requires coordinated regulation among multiple
factors.[146]^5^,[147]^46 Consistently, we identified some potential
molecular markers in SCNT embryos, in which Id1, Pou6f1, Cited1, and
Zscan4c as the representative TFs were highly expressed in NT normal
embryos compared with the NT arrested embryos, showing that these
factors act as important drivers in NT normal embryos. Histone
acetylases Kat8, Elp6, and Eid1 were co-activated in both NT 2-cell and
NT 4-cell to facilitate embryo normal development. As in the previous
study, Zfp217 can inactivate the m(6)A methyltransferase Mettl3 to
increase the efficiency of reprogramming.[148]^45 Moreover, Gadd45a as
a key player functions in DNA demethylase activation and coordinately
regulates with Tet1 and Tet2 to improve the efficiency of NT
reprogramming.
Taken together, our studies identified some reprogramming driver
factors by comparative analysis of single-cell transcriptome, hoping to
provide a new insight into deciphering the potential molecular
mechanisms and improving the efficiency of SCNT reprogramming.[149]^47
In future work, we will further analyze the effects of epigenetic
disorders on reprogramming by integrating other multi-omics data, such
as DNA methylation and histone modification data.[150]^36^,[151]48,
[152]49, [153]50
Materials and Methods
Dataset Collection
The single-cell RNA sequencing (RNA-seq) data of this study were
downloaded from Gene Expression Omnibus (GEO) database under accession
number GEO: [154]GSE70605.[155]^25 There are two embryonic types,
including SCNT embryos and in vitro fertilization (WT) embryos as a
control. WT samples include cumulus cell, MII Oocyte, Zygote, 2-cell,
4-cell, 8-cell, Morula, Blastocyst, Inner cell mass (ICM), and
Trophectoderm (TE), and each stage has three to six replicates. For
SCNT or mRNA injection SCNT samples, three to eight replicates were
performed for each stage, these stages can be divided into NT arrest
and NT develop to blastocyst (NT to blast).
Data Processing
All RNA-seq data used FastQC
([156]http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and
FASTX-Toolkit (version 0.0.13) software evaluation to remove
low-quality samples, and the remaining samples in each class were
statistically enough. Filtered clean reads were mapped to the mouse
genome (mm9)[157]^10 with HISAT2[158]^51 (version 2.1.0). Read counts
of gene expression were performed using HTseq-count (Python package).
Transcriptome assembly was performed using Stringtie (version
1.3.3)[159]^51^,[160]^52 and Ballgown[161]^51 (R package), and
expression levels of each gene were quantified with FPKM[162]^10 to
eliminate the effects of sequencing depth and transcript length.
Analysis of DEGs
The genes with FPKM ≥ 0.1 in each embryo sample were regarded as
expressed genes.[163]^53 R package DEseq2[164]^54 was used to identify
DEGs. The p value adjusted for multiple tests and the absolute value of
log2FC were obtained to identify DEGs, in which the genes with log2FC >
1 and adjusted p < 0.05 were considered upregulated genes. Otherwise,
they were (log2FC < −1, adjusted p < 0.05) regarded as downregulated
genes.[165]^55
Gene Clustering and Gene Set Enrichment Analysis
After normalization of DEGs, we used the hierarchical clustering
algorithm[166]^23^,[167]56, [168]57, [169]58 hclust(d, method =
“complete,” members = NULL) to identify cluster-specific genes based on
their expression profile similarity. GO and Kyoto Encyclopedia of Genes
and Genomes (KEGG)[170]^59 pathway enrichment analysis of DEGs for each
group were performed using the Database for Annotation, Visualization
and Integrated Discovery (DAVID) Bioinformatics Resource
([171]https://david.ncifcrf.gov/home.jsp).[172]^60 GO terms for each
functional cluster were a summary of representative terms, and p values
(p < 0.05) were plotted to show the significance. The KEGG
systematically analyzed the gene function and genomic information
database, and studied the gene and expression information as a whole
network. In our studies, GO and KEGG analysis were visualized by bar
plot and bubble graph, respectively.
Statistical Analysis
In this study, the normalization of expression values was conducted by
Z score standardization. The formula is as follows:
[MATH:
Z=(x<
mo
linebreak="badbreak">−μ)/σ. :MATH]
The Pearson’s correlation coefficients[173]^61^,[174]^62 (denoted as ρ)
were applied in statistics to estimate the relationship between two
samples. The formula is as follows:
[MATH: ρ=∑i=1n(Xi−X¯)(<
msub>Yi−Y¯)∑i=1n(X
i−X¯)2∑i=1n(Y
i−Y¯)2, :MATH]
where X and
[MATH: X‾
:MATH]
indicate the true expression value and mean in group 1, respectively,
and expression value Y and mean
[MATH: Y‾
:MATH]
are in group 2.
The t test[175]61, [176]62, [177]63 was used as a common statistical
test to compare the difference between the two means relative to the
variation.
Data Visualization
In this study, R/Bioconductor ([178]https://www.bioconductor.org/)
software packages were used for data visualization. For example, the
t-Distributed Stochastic Neighbor Embedding (t-SNE) plot and Venn plot
were produced using R packet Scater and VennDiagram, respectively, and
the density graph, (Principal Component Analysis) PCA, bubble graph,
and so on were generated with the R packet ggplot2
[179]http://ggplot2.org/. The genome browser view was obtained by using
the Integrative Genomics Viewer (IGV).[180]^64 The R package for
weighted gene co-expression network analysis (WGCNA)[181]^65 was used
to construct weighted gene co-expression networks. The flowchart of
this study is shown in [182]Figure 1A.
Author Contributions
Y.Z. conceived and designed the study. M.S. did most of the
bioinformatics analysis. P.C. and H.L. were helpful for materials
collection. L.Z. participated in some statistical analyses. Y.Z. and
H.L. wrote the paper. All authors read and approved the manuscript.
Conflicts of Interest
The authors declare no competing interests.
Acknowledgments