Abstract

Background

   Guinea pigs exhibit numerous physiological similarities to humans, yet
   the details of their preimplantation embryonic development remain
   largely unexplored.

Results

   To address this, we conducted single-cell sequencing on the
   transcriptomes of cells isolated from the zygote stage through
   preimplantation stages in guinea pigs. This study identified seven
   distinct cell types within guinea pig preimplantation embryos and
   pinpointed the timing of zygotic gene activation (ZGA). Trajectory
   analysis revealed a bifurcation into two lineage-specific branches,
   accompanied by alterations in specific pathways, including oxidative
   phosphorylation and vascular endothelial growth factor (VEGF).
   Additionally, co-expressed gene network analysis highlighted the most
   enriched functional modules for the epiblast (EPI), primitive endoderm
   (PrE), and inner cell mass (ICM). Finally, we compared the similarities
   and differences between human and guinea pig epiblasts (EPIs).

Conclusion

   This study systematically constructs a cell atlas of guinea pig
   preimplantation embryonic development, offering fresh insights into
   mammalian embryonic development and providing alternative experimental
   models for studying human embryonic development.

Supplementary Information

   The online version contains supplementary material available at
   10.1186/s12864-024-10815-z.

   Keywords: Guinea pigs, Single-cell sequencing, Early embryonic
   development

Introduction

   Early embryonic development, especially mammalian embryonic development
   (including human), has been studied mainly using mouse and non-human
   primate embryos as models [[38]1]. Mice are extensively used due to
   their quick reproductive cycle and the advancement of technologies such
   as gene editing and embryo manipulation. The preimplantation
   development in mice spans approximately five days, progressing from a
   single fertilized egg through multiple cell divisions, zygotic gene
   activation, compaction, polarization, and cavitation to form the
   blastocyst cavity. The initial lineage separation occurs during the
   transition from the morula to the early blastocyst stage [[39]2].
   Subsequently, the totipotent inner cell mass (ICM) differentiates into
   the primitive endoderm (PrE) and the pluripotent epiblast (EPI)
   [[40]3–[41]5]. Extensive research has focused on cell fate
   determination in mouse preimplantation embryos. Although human and
   mouse preimplantation embryos exhibit morphological similarities, mice
   develop early blastocysts at embryonic day 3.25 (E3.25) whereas humans
   do so at day post fertilization 5 (d.p.f 5). Late blastocysts form at
   E3.75 in mice and d.p.f 6–7 in humans. The first lineage separation
   begins in the morula of mice and the expanding blastocyst of humans,
   with the second lineage separation occurring in the late blastocyst
   [[42]6, [43]7]. Human embryonic development research has been limited
   in part due to ethical and other constraints, and the maturity of
   technologies such as micromanipulation and gene editing has facilitated
   the use of mice and other alternative models for research. However, as
   mentioned above, mouse embryo development still exhibits differences
   from human embryo development in certain aspects. Therefore, it is
   essential to study embryos from various species to gain a comprehensive
   understanding of mammalian embryonic development.

   Guinea pigs, also known as Dutch pigs, share several physiological
   similarities with humans. Firstly, their immune systems closely
   resemble those of humans, making guinea pigs excellent models for
   immunization studies [[44]8–[45]10]. Neuroanatomically mature, guinea
   pigs have well-characterized brain and neuroendocrine development, with
   earlier brain maturation than rats [[46]11, [47]12]. They are among the
   few laboratory rodents that exhibit a complete estrous cycle, including
   follicular and luteal phases similar to humans, unlike most rodents
   such as mice, rats, and hamsters, which have distinct and incomplete
   estrous cycles lacking a functional luteal phase [[48]13]. Crucially,
   guinea pigs have a placental pattern similar to humans [[49]14–[50]16]
   and possess unique developmental advantages over mice and rats
   [[51]17–[52]21]. Additionally, like humans, guinea pigs cannot
   synthesize their own vitamin C, and research indicates that vitamin C
   deficiency in these animals can negatively impact fertility and
   pregnancy [[53]22, [54]23].

   Due to the numerous similarities between guinea pigs and humans, the
   use of guinea pigs was prevalent in the early 1970s, surpassing that of
   mice, rats, and hamsters. However, this usage has since declined
   [[55]24]. The limitations in genome assembly and the immaturity of gene
   editing technologies have contributed to this decrease. Given these
   similarities and limitations, further investigation into guinea pig
   embryonic development is essential to bridge gaps in our understanding
   of mammalian development.

   In this study, we hypothesize that guinea pig embryos undergo zygotic
   genome activation, followed by the first and second cell fate
   determinations, all occurring in a relatively orderly manner before
   implantation under physiological conditions. To address this, we
   propose to employ single-cell transcriptome sequencing technology to
   explore the lineage decisions and gene expression patterns during
   guinea pig preimplantation embryo development. Our objectives include
   characterizing cell types in these embryos, pinpointing the timing of
   zygotic genome activation (ZGA) in guinea pigs, and delineating two
   critical cell fate-determining branches with specific pathway
   alterations across developmental stages. Additionally, we aim to
   identify functional modules predominantly enriched during the second
   lineage segregation involving three distinct cell types. Lastly, we
   will investigate interspecies differences in the preimplantation
   epiblast (EPI) population between humans and guinea pigs.

Methods

Collection of guinea pig preimplantation embryos and isolation of single cell
samples

   Pregnant guinea pigs were euthanized using carbon dioxide, and the
   reproductive organs (ovaries, oviducts, and uterus) were excised and
   immersed in pre-warmed (37 °C) phosphate-buffered saline (PBS). To
   harvest zygotes, two-cell, four-cell, and eight-cell stage embryos, the
   fallopian tube was incised at the uterine horn junction, and 500 μl of
   pre-warmed embryo collection solution (DMEM/F12 supplemented with 2%
   fetal bovine serum, FBS) was injected using a 1 ml syringe. The embryos
   were then flushed out from the fallopian tube’s fimbriae. For the
   collection of morulae and blastocysts, the uterus was cut near the
   cervix, and a syringe needle was inserted near the tubal-uterine horn
   junction to inject 1 to 2 ml of preheated embryo collection solution,
   facilitating the flushing of embryos from the uterus. The zona
   pellucida was subsequently removed by treating the embryos with Acidic
   Tyrode’s Solution for 15–30 s. Embryos without zona pellucida were
   treated with Accutase alone or a mixture of Accutase and TrypLE (1:1)
   for 3–30 min until the cells were loosened. Single cells were then
   dispersed in DPBS containing 0.2% BSA using a mouth pipette for
   sampling. To separately sample mural trophectoderm (mTE) cells, some
   blastocyst-stage embryos were microdissected along the inner cell mass
   (ICM) using a laser membrane rupture device on a micromanipulation
   platform. The mTE and the ICM plus polar trophectoderm (pTE) groups
   were then digested and sampled individually.

Construction and sequencing of single-cell RNA-seq libraries

   Cells were lysed, and RNA with polyA tails, primarily mRNA, was reverse
   transcribed using an oligo(dT) primer to synthesize the first strand of
   cDNA. A splice sequence was appended to the 3’ end of the cDNA
   utilizing the template-switching activity of the reverse transcriptase.
   The second strand of cDNA was synthesized at this junction sequence
   using a TSO (template-switching oligo) primer, which displaced the RNA
   complementary to the first strand of cDNA. PCR amplification followed
   to achieve full-length cDNA products, effectively mitigating 3’ bias
   and rRNA contamination during cDNA synthesis. The resultant single-cell
   cDNAs underwent quality control using a high-sensitivity chip, with
   fragment lengths ranging from 400 bp to 10,000 bp, and the peak of the
   library distribution was around 2,000 bp, indicating qualified cDNA
   libraries. These qualified single-cell cDNA libraries were fragmented
   using Tn5 transposase, which also added junction sequences to the cDNA
   ends. A final round of PCR amplification of the labeled cDNAs was
   conducted using sequencing primers, and the amplified cDNAs were
   purified with magnetic beads to yield on-board libraries of 200–600 bp.
   All steps of the cDNA library construction, including lysis, reverse
   transcription, amplification, and purification, were carried out
   in-house by our lab.

   The sequencing platforms chosen for this study were Illumina’s
   NovaSeq6000 and BGI’s MGISEQ2000. The data volume generated from a
   single-cell library on the paired-end PE150/100 machines was
   approximately 1.5G-2G, sufficient for comprehensive analysis. The
   library construction method employed was Smart-seq2 [[56]25].

Bioinformatics workflow

   Upon receiving raw FASTQ files, we conducted quality control with
   FastQC and MultiQC to generate comprehensive quality reports. The
   subsequent adapter trimming was performed using Trim Galore.
   Post-quality check, the data were aligned with HISAT2, and expression
   quantification was achieved using featureCounts. The resulting
   expression matrix was then processed through the “Seurat” package for
   standard single-cell workflow analysis. Annotated cells underwent
   differential expression analysis, Gene Ontology (GO) and Kyoto
   Encyclopedia of Genes and Genomes (KEGG) enrichment analysis,
   developmental trajectory construction with Monocle2, and
   high-dimensional gene co-expression network analysis using hdWGCNA to
   examine lineage-specific gene expression characteristics. Additionally,
   we conducted interspecies comparisons. Detailed descriptions of the
   algorithms used and their corresponding parameters are provided in the
   following sections.

Preprocessing of scRNA-seq data

   The quality of the initial sequencing data, represented by the FASTQ
   file, is assessed, encompassing the GC content, base quality at each
   position, base distribution, presence of junctions, and junction
   content. Sequencing files that exhibit no evident quality issues,
   contamination from adapter sequences or other species, or sequences of
   identical length are retained. Adapter sequences are automatically
   detected and removed using “trim_galore,” which also eliminates
   low-quality bases scoring below q20 (--quality 20). The sequenced reads
   are aligned to the guinea pig reference genome (Cavpor3.0) using the
   Hisat2 (v2.2.1) software package [[57]26]. Unmatched data is filtered
   out using “samtools” (v1.17) [[58]27], resulting in bam files. The gene
   expression matrix is subsequently quantified by counting the number of
   molecules per gene, guided by the annotation file, through
   “featureCounts” (Liao et al., 2014).

Dimensionality reduction clustering

   The expression matrix derived from the upstream analysis was imported
   into R, and the Seurat package(v4.4.0) [[59]28] was employed to create
   Seurat objects. Following a simple merge, three batches of data were
   filtered based on QC-specific criteria: genes expressed in at least 3
   cells, and thresholds for nFeature_RNA (> 5000), nCount_RNA
   (> 1,000,000), and nCount_RNA (< 10,000,000). The standard Seurat
   workflow was then applied, including “NormalizeData”,
   “FindVariableFeatures”, and “ScaleData”. Principal component analysis
   (PCA) was conducted using the “RunPCA” function. Clustering was
   achieved through “FindNeighbors” and “FindClusters”, and the results
   were visualized using Uniform Manifold Approximation and Projection
   (UMAP) for dimensionality reduction. The “FindAllMarkers” function,
   with default parameters, was used for identifying characteristic genes
   of each cluster. The default parameters used in “FindAllMarkers” for
   this study are “only.pos = FALSE, min.pct = 0.25, thresh.use = 0.25”.
   Cell type annotation was performed manually. Default parameters were
   used throughout, with dimensions selected from 1 to 15 (dims = 1:15)
   and a resolution of 0.8.

Identification of the period of occurrence of ZGA

   To detect changes in gene expression during successive developmental
   stages, we conducted a differential analysis comparing each time point
   to its preceding one. Genes exhibiting significant differences were
   identified using the criteria of absolute average log2 fold change
   (avg_log2FC) greater than 1 and a p-value less than 0.05. Visualization
   of these differential genes was achieved through scatter plots and
   volcano plots, both of which were generated using the “ggplot2” package
   [[60]29].

DEG analysis

   The analysis was conducted using the built-in “FindMarkers” function
   within the Seurat package, which employs the “wilcox” test for
   identifying differentially expressed genes. The specific parameters are
   “verbose = FALSE, test.use = ‘wilcox’, min.pct = 0.1,” with all other
   parameters set to default.

GO and KEGG enrichment analysis

   The initial step involved utilizing the “AnnotationHub” package
   [[61]30] to retrieve annotations specific to guinea pigs. Following
   this, GO and KEGG enrichment analyses were conducted on the outcomes of
   the differential expression analysis, employing the “clusterProfiler”
   package(v4.10.1) [[62]31]. Visualization of the results was
   accomplished through the “GOplot” package [[63]32].

Pseudotime analysis

   To investigate the two genealogical separations, single-cell trajectory
   analysis was conducted using the R Monocle2 package with the DDRTree
   method and default parameters. The top 3000 genes were selected as the
   input for this trajectory analysis. Subsequently, heatmaps were
   generated to display significantly changed genes, which were clustered
   based on their pseudo-temporal expression patterns. Pathway enrichment
   analysis was then performed using gene sets from KEGG. In a similar
   vein, the “AddModuleScore_UCell” function [[64]33] was employed to
   integrate the gene set scoring results into the corresponding matrix of
   the Monocle object, and the “plot_cell_trajectory” function was used to
   visualize the pathways.

High dimensional weighted gene co-expression network analysis (hdWGCNA)

   To pinpoint genes linked to the second lineage segregation in guinea
   pigs, a co-expression gene network analysis was executed using the
   “hdWGCNA” package [[65]34]. The process involved constructing metacells
   individually for each sample and cell cluster using the hdWGCNA
   function, with 50 cells amalgamated per metacell. For each cell
   population, a subset of Seurat objects was established for the targeted
   cell population, followed by the application of the standard hdWGCNA
   pipeline. This pipeline was implemented sequentially through the
   execution of the following functions with default parameters:
   TestSoftPowers, ConstructNetwork, ModuleEigengenes, ModuleConnectivity,
   and RunModuleUMAP.

Species comparison

   The dataset utilized in this study, sourced from [66]GSE136447
   [[67]35], comprised scRNA-seq profiles of human preimplantation EPI
   cells aligned with the human genome. The dataset included FPKM values
   for 13,915 homologous genes, derived from a combination of 106 human
   ICM and EPI single cells and 315 guinea pig ICM and EPI scRNA-seq
   profiles. To refine the dataset, filtering was applied to retain 10,698
   genes common to both human and guinea pig, based on the criterion that
   these genes had a log2(FPKM + 1) value of at least 4 in at least one of
   the 106 human or 315 guinea pig cells. The raw FPKM values for these
   10,698 genes from both species were subsequently normalized and scaled
   using the Seurat package in R. Principal Component Analysis (PCA) was
   conducted on these normalized and scaled FPKM values via the prcomp
   function in R. The PCA results for the first three major components
   were visualized using MatLab (MathWorks) [[68]36]. Notably, 952
   homologous genes in human and guinea pig significantly contributed to
   PC1, with absolute PC1 values greater than 0.8 standard deviations.
   Clustering and visualization of 12,475 genes were performed using the
   heatmap function on the R platform. Further analysis focused on genes
   with PC1 values greater than 0.8 and less than − 1.75 standard
   deviations, encompassing 250 and 702 genes respectively, for KEGG
   enrichment using the clusterProfiler package.

Animals

   Females and males representing the outbred Hartley strain of guinea
   pigs (Cavia porcellus) were purchased from Beijing Vital River
   Laboratory.

Results

Acquisition of high-quality guinea pig preimplantation embryos

   To acquire high-quality single-cell data from guinea pig
   preimplantation embryos, we employed morphological observation to stage
   these embryos based on post-fertilization time and microscopic
   morphological features. Embryos were harvested from guinea pigs at
   various embryonic days ranging from E0.5 to E5.5, encompassing stages
   from Zygote to Blastocyst, including 2-cell, 4-cell, 8-cell, early
   morula, late morula, early blastocyst, and blastocysts stages.

   Some blastocysts were further dissected into Inner Cell Mass (ICM) plus
   polar Trophoblast Cells (pTE) and mural Trophoblast Cells (mTE) groups
   using microdissection. These embryos were then enzymatically
   dissociated into single cells. We collected a total of 599 samples from
   39 pregnant guinea pigs (Fig. [69]1A, B). Following quality control
   (QC), we obtained final data from 551 high-quality single-cell samples
   derived from guinea pig preimplantation embryos. The morphology of each
   stage of embryos and the dissociated single cells was documented
   (Fig. [70]1C, Fig.[71]S1A, B).

Fig. 1.

   [72]Fig. 1
   [73]Open in a new tab

   Preimplantation Embryo Sampling in Guinea Pigs (A) Embryo sampling
   conducted at different preimplantation stages. (B) Summary of the
   number of embryos and cells collected at each time point. (C)
   Representative bright-field microscopy images of in vivo acquired
   embryos. Scale bar: 50 μm

Single-cell analysis of guinea pig preimplantation embryos

   The post-QC single-cell data were subjected to UMAP dimensionality
   reduction and classified into a total of 9 clusters. Given the limited
   research on guinea pig preimplantation embryos, we relied on cell
   markers from human and mouse preimplantation embryos to characterize
   the single-cell data of guinea pig embryos (Fig. [74]2B; Fig. [75]S1D)
   [[76]37–[77]40]. The joint morphological pre-staging of sample
   information facilitated accurate cell annotation of clusters (Fig.
   [78]S1C). We defined Totipotency (Toti) to include Zygote, 2-cell, and
   4-cell stages, as well as 8-cell, morula, Trophoblastic Ectoderm (TE),
   and Primitive Endoderm (PrE) based on morphological observations and
   corresponding markers (Fig. [79]2A). However, the definitions of the
   Inner Cell Mass (ICM) and Epiblast (EPI) were more ambiguous. The
   regulation of the pre-implantation Epiblast (EPI) in various species
   has been shown to be regulated by Nanog [[80]41–[81]43]. Consequently,
   we aimed to use Nanog as a specific marker for EPI in cell cluster
   characterization. However, unexpectedly, Nanog expression was not
   detected in the available single-cell data (Fig. [82]2B). Given that
   guinea pig embryos initiate implantation at E6 [[83]21], and
   theoretically, Nanog should be expressed in both ICM and EPI at E5.5,
   which was not observed. Further investigation revealed a gene named
   LOC100731016, which, upon inquiry in the NCBI database, was identified
   as the Nanog2 gene of the guinea pig. Nanog belongs to a gene family
   that includes NANOG1, its tandem duplication gene NANOG2, and the
   pseudogenes NANOGP2-P11. NANOG and NANOG2 have been demonstrated to be
   functionally equivalent and to activate specific stem cell regulatory
   pathways [[84]44]. Thus, we adopted LOC100731016 (Nanog2) as an
   EPI-specific marker for cell cluster characterization (Fig. [85]2B).

Fig. 2.

   [86]Fig. 2
   [87]Open in a new tab

   Transcriptomic analysis of preimplantation embryos in guinea pigs (A)
   UMAP visualization of 551 single-cell transcriptomes from guinea pig
   preimplantation embryos. Points are colored by cell types. (B) Dot plot
   of selected marker genes for various cell types. (C) Visualization of
   the top 5 upregulated and downregulated genes in each cluster. (D)
   Heatmap showing the top 25 differentially expressed genes (DEGs) in
   each cluster. (E) UMAP plot of all cells with projected expression
   patterns of the following marker genes for each cluster: Zscan4 (8 C),
   Nanog2 (EPI), Cdx2, Gata3 (TE), and Sox17, Gata6 (PrE)

   Additionally, Dnmt3b was highly expressed in EPI, indicating a
   hypermethylation state, akin to findings in human embryo studies
   [[88]45]. By subgrouping based on sampling information and marker
   expressions such as Tfcp2l1 and LOC100731016, the ICM group was also
   defined. The final classification comprised 7 cell classes: Totipotency
   (Toti), 8 C, Morula, ICM, TE, EPI, and PrE (Fig. [89]2A). Multiple
   marker gene enrichment and specific up- and down-regulated genes were
   also clearly differentiated (Fig. [90]2C-E).

Guinea pig ZGA occurs at the 4–8 cell stage

   Mammalian embryonic development commences with the formation of a
   fertilized egg following sperm-egg fusion. The initial stage of
   embryonic development is predominantly governed by a series of maternal
   factors supplied by the egg cell’s cytoplasm. During this phase, the
   zygotic genome remains transcriptionally silent, enabling the zygote to
   maintain a state of totipotency. As the maternally deposited mRNA
   degrades, the zygotic genome starts to be transcribed, leading to the
   activation of the genome, the mobilization of embryonic gene products,
   and the clearance of maternal factors. This process, known as the
   maternal-to-zygotic transition (MZT), involves two key events: the
   degradation of maternal mRNA and the production of nascent mRNA, which
   signifies the activation of the zygotic genome (ZGA) [[91]46, [92]47].
   The coordination of these events is crucial. MZT is a critical phase in
   developmental studies, as the transitions that occur during this period
   are essential for embryonic patterning, which is heavily influenced by
   the actions of transcriptional activators [[93]48–[94]50].

   Advances in histological sequencing have enabled the development of a
   sensitive and accurate method for sequencing in continuous time to
   assess the activation of the zygotic genome. We analyzed differences in
   gene expression between adjacent time periods for samples ranging from
   the zygote to the morula stage. The onset of ZGA is marked by
   significant changes in gene expression patterns and transcript levels,
   transitioning the embryo from a state of minimal transcription to one
   where thousands of genes are actively transcribed. This shift triggers
   a cascade of gene expression changes that influence subsequent cell
   fates [[95]51].

   Analysis of the differentially expressed genes (DEGs) revealed that the
   transcriptomic differences in guinea pig embryos at the zygote, 2 C,
   and 4 C stages are minimal, suggesting a transcriptionally quiescent
   state. In stark contrast, the transition from 4 C to 8 C stages
   exhibited significant transcriptomic alterations (Fig. [96]3A),
   aligning with the anticipated dynamics associated with the onset of
   ZGA. We delved into the gene expression patterns during the 4 C to 8 C
   developmental stages (Fig. [97]3B; Fig. [98]S2,B). Gene Ontology (GO)
   enrichment analysis indicated that the up-regulated genes were
   predominantly involved in protein synthesis and transport
   (Fig. [99]3C). Conversely, Kyoto Encyclopedia of Genes and Genomes
   (KEGG) pathway analysis highlighted the pentose phosphate pathway
   (Fig. [100]3D). which is crucial for providing precursors for nucleic
   acid synthesis. The enrichment of this pathway during the 4 C to 8 C
   stages suggests vigorous DNA replication and synthesis. Additionally,
   Gene Set Enrichment Analysis (GSEA) and GO confirmed that numerous
   synthesis reactions are actively occurring at this stage. (Fig.
   [101]S2).

Fig. 3.

   [102]Fig. 3
   [103]Open in a new tab

   The zygotic genome activation (ZGA) of guinea pigs (A) Percentage
   difference (Δ percent of cells) and log-fold change based on the
   Wilcoxon rank-sum test results for differential gene expression
   comparing 2 C versus zygote(Left), 4 C versus 2 C(Middle), and 8 C
   versus 4 C(Right). Red: upregulated genes; Blue: downregulated genes.
   (B) Volcano plots of differential gene expression comparing 4 C versus
   zygote (Left) and 8 C versus zygote (Right). (C) Gene Ontology (GO)
   circle plot displaying gene-annotation enrichment analysis. Red
   indicates upregulated gene-associated GO terms, relative to the z-score
   of the analysis. (D) KEGG Chord plot illustrating the relationship
   between the list of selected KEGG terms (upregulated genes in 8 C
   versus 4 C) and their corresponding genes, along with the log fold
   change (logFC) of the genes. The left half of the KEGG Chord plot
   displays whether the gene expression is upregulated, and the right half
   represents different KEGG terms with various colors

   Pseudotime analysis of preimplantation embryonic development in guinea
   pigs provides insights into lineage specification, a process
   extensively investigated in humans and mice. In mice, lineage
   specification typically unfolds in two stages. The initial stage occurs
   at the morula stage, delineating the inner cell mass (ICM) and
   trophectoderm (TE) cell fates, with TE later giving rise to
   extraembryonic tissues like the placenta. The subsequent stage
   transpires in the blastocyst phase, where the ICM differentiates into
   epiblast (EPI) and primitive endoderm (PrE) lineages, with EPI
   eventually forming the individual and PrE contributing to the yolk sac
   [[104]3, [105]52]. In human studies, however, two contrasting views
   have emerged. One perspective posits that TE, EPI, and PrE are
   specified concurrently during human preimplantation development,
   suggesting a one-step process [[106]41]. The alternative view advocates
   for an intermediate ICM state during lineage specification, akin to the
   two-step model observed in mice [[107]53–[108]55].

   To investigate the lineage specification of preimplantation embryonic
   development in guinea pigs, we analyzed embryonic data spanning from
   the 8 C to the blastocyst stages. The pseudotime analysis revealed a
   bifurcating trajectory with two primary branches emerging from an
   initial branch at the first branching point. Subsequently, one of these
   branches further diverged into two additional branches at the second
   branching point. Mapping cell types onto these branches indicated that
   the initial branch was predominantly composed of 8 C and 16 C morula
   cells. The first branching point delineated the trophectoderm (TE) and
   inner cell mass (ICM) branches. Following the proposed developmental
   timeline, the ICM branch subsequently differentiated into epiblast
   (EPI) and primitive endoderm (PrE) branches at the second branching
   point (Fig. [109]4A). Additionally, we overlaid the sampling time and
   pseudotime values onto the branches, identifying the first lineage
   separation in guinea pig preimplantation embryonic development around
   day E4.5, with the second lineage separation occurring around day E5.0
   (Fig. [110]4B). To elucidate the key drivers of lineage specification,
   we visualized the top five genes with the highest contribution to the
   trajectory construction. Among these, Argfx, a gene homologous to a
   frame gene frequently observed in mammalian embryonic development
   studies from 8 C to morula stages and known to influence developmental
   regulation [[111]56], exhibited a comparable expression pattern in
   guinea pigs. The projection of specific marker genes further validated
   the accuracy of the trajectory construction (Fig. [112]S3; Fig.
   [113]S4). Notably, the ICM branch in guinea pigs was found to contain a
   mixture of EPI cells, differing from mice where distinct ICM cell
   clusters can be identified based on significant molecular features.
   Similar to humans, guinea pigs exhibit a high degree of similarity
   between ICM and EPI cell clusters. Overall, the pseudotime analysis of
   our data offers a comprehensive view of the three-lineage specification
   in guinea pig preimplantation embryos. Furthermore, we depicted the
   cell proportion dynamics at various time points, aligning with the
   developmental trajectory (Fig. [114]4C).

Fig. 4.

   [115]Fig. 4
   [116]Open in a new tab

   Two rounds of lineage separation in guinea pig preimplantation embryos
   (A) UMAP plot showing the developmental trajectory across different
   sampling times. (B) UMAP plot illustrating the developmental trajectory
   of 8-cell stage (8 C), morula, trophectoderm (TE), inner cell mass
   (ICM), epiblast (EPI), and primitive endoderm (PrE) cells. (C) Bar plot
   representing the proportion of each cell type at different sampling
   times. (D) Heatmap displaying trends of differentially expressed genes
   as a function of inferred pseudotime. Genes in each row are grouped
   into six clusters based on their expression patterns. (E-G) These UMAP
   plots represent the distribution of pathway scores across different
   pseudotime points. The color gradient on the plot indicates the pathway
   scores, with dark blue representing low scores and light yellow
   representing high scores.

   To explore the gene expression dynamics during lineage specification,
   we ranked the top 1000 genes by ascending q-values and generated a
   heatmap depicting differential gene expression across pseudotime,
   aiming to pinpoint genes with significant expression shifts over time.
   These genes were grouped into six clusters based on their expression
   patterns (Fig. [117]4D). The heatmap clusters illustrated distinct gene
   expression profiles that evolved with pseudotime. To delve deeper into
   pathway expression trends, we utilized Ucell to score relevant
   pathways, focusing on key processes such as oxidative phosphorylation,
   purine metabolism, and the VEGF pathway (Fig. [118]4E-G). Notably, the
   oxidative phosphorylation pathway exhibited elevated scores in ICM and
   EPI cells, aligning with observations of enrichment in preimplantation
   EPI cells across various species [[119]57, [120]58]. These findings are
   congruent with prior research. Furthermore, the VEGF pathway showed
   higher scores in TE and PrE cells, which is consistent with its role in
   angiogenesis, as TE and PrE later contribute to the vascular-rich
   placenta and yolk sac, respectively. The observed expression patterns
   were congruent with the pathway enrichment analysis, underscoring the
   relevance of these pathways in early embryonic development.

Functional characteristics of ICM, EPI and PrE

   To delineate the molecular characteristics of each lineage, we employed
   hdWGCNA (high-dimensional Weighted Gene Co-expression Network Analysis)
   [[121]34] to uncover gene expression patterns associated with lineage
   specification (Fig. [122]5A; Fig. [123]S5B). We selected a soft
   threshold power of 7 to construct the co-expression network (Fig.
   [124]S5A). Following high-dimensional gene co-expression analysis, we
   identified 19 functional modules (Fig. [125]S5A-B; Fig. [126]S5D), with
   particular attention to those associated with the second lineage
   segregation. Utilizing correlation-assisted enrichment module
   screening, we identified the most significant modules enriched for EPI,
   ICM, and PrE as 3, 6, and 13 modules, respectively (Fig. [127]5B; Fig.
   [128]S5C). The hub genes of these modules were determined by selecting
   the top 25 hub genes for each. Visualization revealed that Nanog2 was
   the central regulator in the 3 modules most significantly enriched for
   EPI, whereas Sox17 was the predominant core regulator enriched in PrE
   (Fig. [129]5C), aligning with previous findings [[130]59].

Fig. 5.

   [131]Fig. 5
   [132]Open in a new tab

   Identification of co-expressed gene modules associated with ICM, EPI,
   and PrE (A) Schematic representation of the second cell fate decision
   for preimplantation embryos. (B) Module activities in different
   clusters estimated using the hdWGCNA algorithm. (C) Co-expression plots
   for modules M3 (EPI), M6 (ICM), and M13 (PrE). (D) Differential gene
   heatmap and KEGG enrichment analysis results for EPI, ICM, and PrE

   To understand the gene expression patterns associated with the second
   cell fate decision, we identified 1209 genes by intersecting the
   markers identified by hdWGCNA and the findmarker function. We then
   analyzed these genes using KEGG to plot and examine a heatmap. The
   analysis revealed that EPI cells primarily involve pathways such as
   Hippo, Notch, oxidative phosphorylation, embryonic development, and
   cell adhesion. In contrast, ICM cells were associated with embryonic
   development and oxidative phosphorylation, while PrE cells were
   enriched in pathways like Ras, Rap1, and VEGF. These cell types
   exhibited partially overlapping yet distinct pathway expression
   profiles (Fig. [133]5D).

Comparison of human and guinea pig EPI characteristics

   It is now generally accepted that preimplantation EPI is characterized
   by a naïve state, with the potential to produce a complete individual
   (but not to develop extra-embryonic tissues) and the ability to form
   chimeras [[134]60, [135]61]. Also similar to humans, after implantation
   into the endometrium, both guinea pig and human EPIs show developmental
   events of cavitation and formation of a double-layered intervertebral
   disc [[136]62]. EPI plays an important role in understanding the
   developmental mechanisms of pluripotent stem cells as well as the
   development of regenerative medicine. To compare the characteristics of
   EPI in human and guinea pig preimplantation embryos, we combined
   previously published human data with our guinea pig data. We focused on
   the dynamics of naïve genes and the EPI pluripotency transition process
   (EPST), which have been documented in humans [[137]35]. Our analysis
   revealed that, compared to human EPST, guinea pigs lacked expression of
   Il6r and Nr0b1, showed reduced expression of Utf1, and exhibited
   increased expression of Lifr (Fig. [138]6C -D).

Fig. 6.

   [139]Fig. 6
   [140]Open in a new tab

   Comparison of preimplantation EPI in humans and guinea pigs (A)
   Three-dimensional Principal Component Analysis (PCA) of the EPI
   lineage, determined by the expressed genes among all groups of EPIs
   during development in humans (hu) and guinea pigs (GP). In total,
   10,698 out of 13,915 annotated genes expressed in human and guinea pig
   cells (human, 106 cells; guinea pig, 315 cells) were used. (B) Heatmap
   of 952 genes that highly contributed to PC1 (greater than 0.8 standard
   deviation of PC1). (C) Violin plots showing expression levels of
   pluripotency genes during the EPI pluripotency transition in human
   embryos. (D) Violin plots showing expression levels of selected genes
   from (C) in guinea pig embryos

   Furthermore, we performed PCA on the integrated data. The analysis
   indicated a.

   clear separation of human and guinea pig cell data along the PC1 axis,
   highlighting species differences as the most significant factor
   (Fig. [141]6A). We identified 13,915 genes expressed in both species.
   Along the PC1 axis, we classified these genes into significantly
   positive and significantly negative categories. KEGG enrichment
   analysis revealed notable differences in pathways such as
   phosphatidylinositol metabolism, pyrimidine and purine metabolism, and
   oxidative phosphorylation (Fig. [142]6B). Along the PC2 and PC3 axes,
   the cellular projections of human and guinea pig overlapped, suggesting
   similar developmental trajectories during the EPI pluripotency
   transition process (Fig. [143]6A). These findings suggest both
   conservation and some differences in the development of human and
   guinea pig preimplantation EPI.

Discussion

   In our study, we systematically analyzed single-cell data from 62
   guinea pig preimplantation embryos, encompassing 551 single cells,
   using single-cell sequencing technology to explore their developmental
   characteristics. Our analysis covered key developmental events from the
   zygote stage through to embryo implantation, including zygotic gene
   activation (ZGA), ICM and TE cell fate decisions, EPI and PrE cell fate
   decisions. We identified specific time windows for these critical
   events and observed corresponding gene expression changes.
   Additionally, we compared the similarities and differences between
   human and guinea pig preimplantation EPIs by integrating previously
   reported human embryonic data. Our findings offer fresh perspectives on
   guinea pig embryonic development and expand the range of species
   available for studying mammalian embryogenesis, thereby providing new
   options for comparative studies.

   We employed single-cell RNA-seq for lineage tracing in guinea pig
   preimplantation embryos. Following QC, the data were subjected to
   dimensionality reduction and clustered into seven distinct cell types.
   The preimplantation EPI population in guinea pigs exhibits a unique
   core marker compared to other species, with high expression of the
   Nanog2 gene, which is functionally similar to Nanog. Additionally, we
   observed that Totipotency (Toti) cells highly express Sox2, 8 C cells
   express Zscan4, and ICM cells express Tfcp2l1. In contrast, the TE
   population is characterized by high enrichment of Krt8, Krt18, Gata3,
   and Cdx2. The PrE signature marker Sox17, commonly found in human and
   mouse, is also highly enriched in the guinea pig PrE population. These
   findings resulted in high-resolution single-cell transcriptome
   profiles, providing detailed insights into the molecular signatures of
   different cell types during early embryonic development.

   Embryonic development commences with a fertilized egg, initially
   governed primarily by maternal factors. Subsequently, there is a
   gradual decline in maternal deposition and the activation of the
   zygotic genome. The occurrence of ZGA is a conserved process across
   species; however, the timing varies significantly. For instance, human
   ZGA primarily occurs at the 8-cell stage, while in mice, it happens at
   the 2-cell stage, and in cattle, it is at the 8-16-cell stage
   [[144]63]. Our findings indicate that guinea pig ZGA predominantly
   takes place during the 4-cell to 8-cell phase, aligning closely with
   the timing in humans. The enrichment of up-regulated genes during this
   ZGA window is associated with processes such as protein synthesis and
   transport, as well as ribose synthesis. The number of activated genes
   and the enriched functions collectively suggest that zygotic gene
   activation in guinea pig embryos occurs from the 4 C to 8 C stages.

   We conducted pseudotime analysis of transcriptome developmental
   trajectories in guinea pig embryos, which revealed a clear bifurcation
   into two distinct branches. Analysis of the embryo sampling points
   indicated that the ICM/TE cell fate decision occurs around E4.5, while
   the EPI/PrE cell fate decision takes place earlier, at E5.0. Despite
   the clear branching, we observed a high degree of similarity between
   early ICM and EPI, with significant overlap in trajectory branching.
   This challenge in distinguishing EPI/PrE lineages is also present in
   human preimplantation embryos. Our findings suggest that EPI may have
   emerged first, followed by PrE fate. In mice, the presence of Fgf4
   (secreted by EPI cells) is crucial for PrE fate decision [[145]64]. A
   hypothesis regarding human EPI/PrE cell fate decision posits that ICM
   is closer to EPI at the transcriptome level, potentially leading to
   confusion between ICM and EPI [[146]6]. Our data indicate that guinea
   pigs exhibit a similar pattern to humans in this aspect.

   Furthermore, we examined the gene expression characteristics of the
   second lineage decision. Employing hdWGCNA, we identified functional
   modules most pertinent to ICM, EPI, and PrE. For instance, at the core
   of regulation for EPI and PrE are Nanog2 and Sox17. Nanog and Sox17 are
   key regulators in the cell fate determination of the EPI and PrE
   lineages in mice. During the specialization of the EPI lineage, Nanog,
   Sox2, and Oct4 work synergistically, promoting the expression of FGF4
   while suppressing Gata6 and Fgf signaling, thus driving the
   specification of the ICM lineage. In contrast, Sox17 plays a crucial
   role in the specification of the PrE lineage. FGF signaling facilitates
   PrE formation by promoting Gata6 expression and inhibiting Nanog, with
   Sox17 acting as a downstream regulatory factor.

   Given that Nanog2 is functionally equivalent to Nanog, it is intriguing
   to explore whether the regulatory mechanism in guinea pig EPI/PrE
   follows a similar pathway. This question remains open and warrants
   further investigation.

   Recognizing that the analyzed functional modules could be expressed
   across multiple cell populations, we intersected the most enriched
   module genes of ICM, EPI, and PrE with genes identified by the
   Findmarker function to pinpoint specific genes for these three cell
   types. We then enriched these for further analysis. The pathways
   enriched in the respective cells are similar and conserved compared to
   those in human and mouse, suggesting that the guinea pig offers certain
   advantages and is a rational choice as an experimental animal for
   studying human embryonic development.

   Trophoblast (TE) cells play a crucial role in implantation.
   Traditionally, it was thought that all TE cells possess equal potential
   to interact with and invade the endometrium. However, recent research
   has revealed that polar TEs (pTE) and mural TEs (mTE), which have
   distinct localizations, exhibit different potentials [[147]6, [148]41,
   [149]65]. In species such as mice, rats, pigs, and cows, embryos attach
   to the endometrium through mTE, while in humans and non-human primates,
   it is the pTE that facilitates attachment. In guinea pigs, similar to
   mice and rats, it is the mTE that is oriented towards attachment and
   invasion of the endometrium [[150]66–[151]69]. Despite these findings,
   the specific mechanisms by which mTE and pTE influence embryo
   attachment to the endometrium remain elusive, making this an intriguing
   area of inquiry.

   The guinea pig genome available on NCBI have not been assembled to the
   level of individual chromosomes, which limits the resolution of genome
   assembly. Consequently, research on guinea pig chromatin accessibility,
   epigenetic modifications of transcription factor binding, and related
   aspects is constrained. The current study is predominantly confined to
   the transcriptome level, yet transcription factors and epigenetic
   modifications are crucial regulators during embryonic development. This
   represents a limitation of the study. Therefore, constructing a more
   comprehensive and refined reference genome for the guinea pig is a
   critical step for advancing future research in this species.

Conclusions

   To summarize, the current research is the first to systematically
   establish a single-cell global transcriptional profile of guinea pig
   preimplantation embryos, which is characterized by a wide array of
   similarities with human embryos. Embryonic development is a pivotal
   topic in developmental biology and broader biological research. Ethical
   constraints limit the study of human embryos, and while mice have
   served as model organisms for embryonic development research, they
   exhibit differences from humans in certain aspects. The investigation
   of guinea pig preimplantation embryos offers a novel perspective and
   method for comprehending human preimplantation embryonic development.
   As a consequence, deciphering molecular mechanisms underlying
   inter-transcriptomic hallmarks correlated with early embryogenesis in
   guinea pigs might open up new possibilities for elaborating an
   excellent ex vivo embryological rodent-based model. The latter seems to
   be a highly suitable for exploration of both maternal-to-embryonic
   transition for controlling gene expression and developmental timeline
   arising from embryonic cell commitment/differentiation in humans.
   Moreover, this guinea pig model aimed at investigating experimental and
   applied embryology in humans might provide a powerful tool for
   transgenic and biomedical research with the use of such assisted
   reproductive technologies as cloning by somatic cell nuclear transfer
   and in vitro fertilization of oocytes either by their coincubation with
   capacitated spermatozoa or by their intracytoplasmic microinjection
   with single sperm cells.

Electronic supplementary material

   Below is the link to the electronic supplementary material.
   [152]Supplementary Material 1^ (3.2MB, jpg)

Acknowledgements