Abstract The regulatory mechanisms governing cell fate determination, particularly lineage diversification during mammalian embryonic development, remain poorly understood with in‐depth regulatory paradigms yet to be fully elucidated. Here, leveraging the epigenetic landscape of mouse gastrula, p‐Enh is identified, an enhancer located within the first intron of Cdx2 and epigenetically pre‐marked in the primitive streak region, as a pivotal regulator for posterior tissue development in mouse embryos. Morphological and single‐cell transcriptomic analyses confirmed embryonic lethality phenotype with disrupted posterior tissue development trajectories in p‐Enh‐KO embryos. Molecularly, apart from regulating the neighboring coding‐gene Cdx2 in cis, the findings suggest that p‐Enh also modulates the global transcriptome and epigenomic landscape, which might through the transient production of eRNA in trans. Further investigation revealed p‐Enh‐derived eRNAs participate in the regulatory cascades of TGF‐β signaling by directly interacting with SMAD4 protein. Combinatorial modulation of TGF‐β signaling and p‐Enh‐eRNA abundance can largely rescue the posterior development deficiency in in vitro gastruloids through a Cdx2‐independent mechanism. Thus, a potential model is proposed in which the broadly distributed p‐Enh transcripts within the nucleus can serve as essential cross‐modular coordinators, priming the posterior development of mouse embryo. Keywords: enhancer RNA, posterior tissue development, Pre‐marked enhancer, TGF‐β signaling __________________________________________________________________ A gastrula‐premarked enhancer within the first intron of Cdx2 governs posterior embryonic development via both regulating Cdx2 expression in cis and coordinating TGF‐β signaling in trans to ensure proper posterior body development. Functional rescue in gastruloids reveals a Cdx2‐independent pathway, highlighting p‐Enh‐eRNA as key epigenetic coordinators in posterior tissue development. graphic file with name ADVS-12-e00895-g007.jpg 1. Introduction Gastrulation stands as a pivotal developmental process to set up the embryonic body plan, characterized by the differentiation of the epiblast into three germ layers.^[ [56]^1 , [57]^2 ^] Studies reported that the posteriorly positioned primitive streak (PS) and adjacent posterior epiblast region can integrate the intricate developmental signals, and then serve as a progenitor reservoir with the competence to give rise to a diverse range of tissues, spanning from mesodermal tissues such as the anlage of somitic, limb and cardiac mesoderm to the posterior neural lineage of the spinal cord.^[ [58]^3 , [59]^4 , [60]^5 ^] Nonetheless, how cells reside in the PS region coordinate the extrinsic signals and intrinsic molecular features and sustain the competence for prospective cell fate differentiation remains largely elusive. Vital developmental morphogens lay the foundation for tissue patterning and homeostasis across various organs.^[ [61]^6 ^] The cross‐talk among different pathways forms a “signal cocktail” with an optimal concentration niche, which facilitates diverse cell differentiation programs. For instance, during gastrulation, Wnt and TGF‐β signaling pathways are activated in the PS and mesodermal cells, then diffuse from the organized sources and form morphogen gradients, thus instructing the cell fate of surrounding tissues.^[ [62]^7 , [63]^8 ^] However, β‐catenin and SMAD4, the downstream effectors of Wnt and TGF‐β signaling, are broadly expressed throughout the entire embryo during gastrulation.^[ [64]^9 ^] Distinct sets of downstream developmental genes are then activated to promote cell fate diversification in different regions of the embryo.^[ [65]^1 ^] The specific underlying mechanisms for how these signals anchor precisely to a specific set of developmental genes remain as mysteries. The selective chromatin responsiveness, manifested as various chromatin accessibilities, chromatin modifications, or chromatin topologies at regulatory loci, to signal effectors and co‐factors, have been reported as an important dimension in orchestrating the regulative forces of extrinsic signals.^[ [66]^10 ^] Notably, the regulatory elements are frequently docking platforms for transcription factors (TFs) and are usually annotated as enhancers or promoters.^[ [67]^11 ^] Dynamics of regulatory elements contribute to the regulation of lineage‐specific transcriptional programs under distinct developmental conditions.^[ [68]^12 ^] Mechanistic studies indicate that regulatory elements can not only regulate the expression of neighboring genes but also hold the proper cell fate plasticity through direct enhancer‐promoter chromatin interactions or self‐transcribed enhancer RNAs (eRNAs).^[ [69]^13 , [70]^14 ^] In particular, the discovery of eRNAs has expanded the understanding of enhancer functions from modulating local gene targets in cis to even genome‐wide chromatin remodeling and cross‐modular interactions with the transcriptional machinery in trans.^[ [71]^15 , [72]^16 , [73]^17 ^] Recently, we and others have reported that enhancers crucial for prospective tissues development are frequently pre‐marked by epigenetic modifications or TFs’ binding prior to the corresponding gene expression,^[ [74]^13 , [75]^18 , [76]^19 , [77]^20 , [78]^21 , [79]^22 ^] a phenomena termed “chromatin priming”. However, limited knowledge has been acquired concerning the coordination between chromatin regulation and the extrinsic signals in the establishment of cellular developmental competence and the subsequent lineage diversification of the PS region. In this study, through combined bioinformatic analysis and experimental screening of the H3K27ac pre‐marked regulatory elements in the PS regions of the mouse gastrula, we identified a regulatory element, p‐Enh, which is located in the first intron of Cdx2. Transgenic enhancer reporter assay revealed that p‐Enh is pre‐activated in the PS region at the late gastrulation stage (E7.5) and consistently activated in the tailbud region during early organogenesis. Phenotypic analyses of the p‐Enh knockout embryo mutants revealed embryo lethality at around E10.5 stage with severe disruptions of posterior tissue development. Single‐cell RNA‐seq confirmed that the cellular abundances of posterior tissues, such as neuromesodermal progenitors (NMP), limb mesenchyme, and presomitic mesoderm, were heavily decreased. Mechanistically, besides regulating its neighboring coding‐gene Cdx2, p‐Enh could transiently express abundant eRNAs, in the nuclei of the PS cells at E7.5 but not in later‐stage embryos. The genetic removal of p‐Enh in in vitro differentiated NMP leads to dramatic remodeling of epigenomic landscape specifically at SMAD4 binding hotspots, thus reducing SMAD4 binding. RNA pull‐down assays demonstrated that eRNAs derived from p‐Enh directly interact with SMAD4 protein. Furthermore, functional assays demonstrate that combinatorial activation of TGF‐β signaling and eRNAs enrichment can largely rescue the p‐Enh‐KO defects. Thus, we propose that the transient expressed p‐Enh‐eRNAs might act as trans‐modular coordinator, which bridges the specific interactions with morphogen effectors and specific genomic loci of master regulators, therefore contributing to the priming of posterior tissue development. 2. Results 2.1. Screening and Identification of Pre‐Marked Distal Regulatory Elements within PS Cells in the Mouse Gastrula During mouse gastrulation, the PS cells first emerge at the E6.5 stage, then gradually form a morphologically distinct structure at the posterior embryonic region, and finally establish a diverse cell fate‐specified progenitor pool at the late gastrulation stage (E7.0 to E7.5).^[ [80]^4 , [81]^23 ^] PS cells possess a remarkable capacity to give rise to a diverse range of tissues, encompassing mesodermal cells such as somites,^[ [82]^24 ^] limb mesenchyme,^[ [83]^25 ^] and cardiac mesoderm (Figure [84]1a).^[ [85]^26 ^] Moreover, it has been recognized that the PS region of the mouse gastrula also harbors a newly identified bi‐potent NMP, which can contribute to both the posterior spinal cord and mesoderm (Figure [86]1a).^[ [87]^27 , [88]^28 , [89]^29 ^] Thus, the mechanisms governing PS cell development are fundamental for subsequent patterning of diverse tissues and organs. Figure 1. Figure 1 [90]Open in a new tab Screening and identification of pre‐marked distal regulatory elements within PS cells in the mouse gastrula. a) The diagram illustrating the lineages of posterior tissue development during mouse embryonic development. The precursor cells contributing to the posterior tissue are denoted in red shade. b) The ranking of distal regulatory elements (DREs) and corresponding heatmap based on quantified H3K27ac ChIP‐seq data.^[ [91]^18 , [92]^31 ^] The top‐30 significant DREs are shown based on calculated ‐log[10](p‐value). p‐values are calculated by using a modified two‐sided t‐test (two‐sided) in MAnorm2 (see Statistical analysis). Heatmap is constructed based on log[2](normalized count). DREs with retained abundant H3K27ac at E9.5 are marked with asterisk. c) The heatmap visualizing the eRNA qPCR results correlated with Figure [93]1b. Multiple tissue samples are obtained from embryos at E7.0, E7.5, and E9.5. Tissue samples are stratified into three groups based on developmental trajectories. For each DRE locus, two independent primer sets with distinct targeting segments are used for qPCR test, all tests are replicated in pairs to ensure reliability. The DREs highlighted in bold indicating DREs subjected to transgenic assay. d) The LacZ staining revealing spatiotemporal activities of distinct DREs at key developmental stages (E7.0, E7.5, E9.5). (Some transgenic embryos with no detectable positive signals are labeled with “no signal”) Scale bar: 500 µm. e) The genome browser snapshots showing chromatin status near p‐Enh (DRE‐1). The tracks including published H3K27ac ChIP‐seq,^[ [94]^18 , [95]^31 ^] ATAC‐seq,^[ [96]^122 ^] ChIP‐seq,^[ [97]^38 , [98]^39 , [99]^40 , [100]^41 , [101]^42 , [102]^43 , [103]^44 ^] GRO‐seq,^[ [104]^45 ^] and RNA‐seq^[ [105]^123 ^] datasets. f) RNAscope targeting p‐Enh‐eRNA during gastrulation (E6.5, E7.0, E7.5). The schematic of the mouse gastrula is presented on the left side. Zoomed‐in view showing the enrichment of p‐Enh‐eRNA in the E7.5 PS. The labeling scheme employs “A” for anterior region, “P” for posterior region, “Epi” for epiblast, “Mes” for mesoderm, and “End” for endoderm. Scale bar: 50 µm. Previously, we identified the epigenetic pre‐mark at regulatory elements associated with developmental genes during gastrulation. This mechanism may serve as a widely available mechanism that primes prospective cell fate for following organogenesis.^[ [106]^18 ^] Here, to pinpoint the specific regulatory elements responsible for the prospective developmental competence of the PS cells, we first employed MAnorm2^[ [107]^30 ^] to search for pre‐marked elements with low H3K27ac at E7.0 stage, abundant H3K27ac enrichment at E7.5 stage, sustainable H3K27ac signal at E9.5 tailbud region (Figure [108]1b). In total, 1828 distal regulatory elements (DREs) were identified with significant upregulation (twofold, p‐value < 0.01) of H3K27ac enrichment within the posterior tissue from E7.0 to E7.5 stage (Figure [109]S1a, Supporting Information). Genomic Regions Enrichment of Annotations Tool (GREAT) analysis of the top‐100 DREs, ranked by p‐value, revealed that these DREs are related to genes involved in anterior‐posterior (A‐P) patterning, animal organ morphogenesis, neural tube development, as well as mesenchymal cell differentiation, which further indicate that the pre‐mark of H3K27ac at DREs are functionally related to the posterior tissue development (Figure [110]S1b, Supporting Information; Table [111]S1, Supporting Information). To determine the enrichment of H3K27ac during the following organogenesis, we checked the H3K27ac distribution at the DREs in E9.5 tailbud tissues by incorporating the published dataset.^[ [112]^31 ^] As shown in Figure [113]1b, among the top‐30 ranked DREs, 10 DREs (marked with asterisk) maintain abundant H3K27ac enrichment in E9.5 tailbud. Enhancer RNAs, which are transcribed from corresponding enhancers and often pre‐loaded with RNA polymerase II and H3K27ac, could be used to represent the enhancer activity and also be involved in the regulation of gene expression and cell fate determination.^[ [114]^32 , [115]^33 , [116]^34 , [117]^35 , [118]^36 ^] To further refine the scope of potential functional DREs, we determined the abundance of eRNA transcripts for the top‐30 DREs across various tissues of embryos at different stages (E7.0, E7.5, E9.5) using RT‐qPCR (Figure [119]1c). Based on the expression pattern of these eRNAs, we clustered the eRNAs into three groups: posterior early‐activated group with relatively high and specific eRNA expression in the PS cells of E7.5 stage; posterior late‐activated group with obvious eRNA expression in the E9.5 trunk and tailbud; and posterior no‐enriched group (Figure [120]1c). Next, to determine the innate regulatory activity during mouse embryogenesis, we used in vivo transgenic enhancer reporter system to validate the DREs from the above three groups (Figure [121]S1c, Supporting Information).^[ [122]^18 , [123]^37 ^] We performed transgenic reporter assays with the top‐10 DREs, spanning the three eRNA groups (highlighted in bold in Figure [124]1c) and found that 6 out of 10 DREs could drive LacZ expression in the tested developmental stages (E7.0 to E9.5). Notably, the majority (5 out of 6) of DREs with LacZ signals belonged to the early‐activated group (DRE‐1, 2, 3, 5, and 8). Only one DRE (DRE‐6) from the late‐activated group showed LacZ signal (1 out of 3), and no positive results were observed in the posterior no‐enriched group (Figure [125]1d; Figure [126]S1d, Supporting Information). Within these six DREs, DRE‐1 exhibits the most pronounced caudal tissue specificity of LacZ signals. DRE‐1 is de novo activated in the E7.5 PS cells with extraembryonic activity in E7.0 embryos, and its posterior region‐specific enhancer activity is sustained in the tailbud region till later organogenetic stages (E8.5, E9.5, and E11.5) (Figure [127]1d; Figure [128]S1e, Supporting Information). Further exploration of the genomic features of DRE‐1 revealed that the evolutionarily conserved DRE‐1 element is located in the first intron of the coding‐gene, Cdx2, and harbors pervasive binding sites for epigenetic activators (MED1 and p300),^[ [129]^38 , [130]^39 ^] signal effectors (β‐catenin, TCF‐1, and SMAD4),^[ [131]^40 , [132]^41 , [133]^42 ^] and master regulators (T and SOX2) (Figure [134]1e; Figure [135]S1f, Supporting Information).^[ [136]^43 , [137]^44 ^] Motif enrichment and sequence composition analyses revealed that DRE‐1 is GC‐rich and enriched for binding motifs of key developmental transcription factors (Figure [138]S1g,h, Supporting Information). Published gene run‐on sequencing (GRO‐seq)^[ [139]^45 ^] using in vitro PS counterpart also demonstrated the existence of nascent transcripts from the DRE‐1 locus (Figure [140]1e). Given the high tissue‐specificity of DRE‐1 in the posterior tissue, we named the element as posterior enhancer (p‐Enh) in the following studies. Consistent with the GRO‐seq and eRNA quantification results (Figure [141]1c,e), RNA in situ visualization using RNAscope with strand‐specific probes revealed that the p‐Enh region gives rise to bi‐directional transcripts, with the sense‐oriented transcript being more abundant and thus referred to as “p‐Enh‐eRNA” in the following text. Specifically, the abundant p‐Enh‐eRNA signals were detected in the PS and adjacent epiblast region of E7.5 embryos (Figure [142]1f, Figure [143]S2a, Supporting Information). There were no eRNA signals could be detected in PS cells of earlier embryonic stages (E6.5 and E7.0), or in the tailbud of the organogenetic embryos (E9.5) (Figure [144]1f; Figure [145]S2b,c, Supporting Information). Additionally, despite robust expression of the Cdx2 in the ExE region, we found no signals of eRNA in these cells (Figure [146]S2d, Supporting Information). This result persuasively eliminated the possibility that p‐Enh‐eRNA act as a byproduct of unspliced intronic fragments of newly transcribed Cdx2. Taken together, we identified a DRE, p‐Enh, which is pre‐marked by H3K27ac specifically in posterior epiblast at E7.5, possessing the posterior tissue‐specific enhancer activity from late gastrulation stage to organogenesis stage. Notably, p‐Enh shows transient eRNA transcription with strict spatial‐temporal specificity, occurring only in the E7.5 posterior epiblast. 2.2. Genetic Removal of p‐Enh Causes Severe Posterior Tissue Developmental Defects and Embryonic Lethality at the Organogenetic Stage To investigate the potential functional role of p‐Enh during mouse embryo development, we genetically knocked‐out the p‐Enh (≈1.3 kb) in mouse embryo using CRISPR/Cas9 system (Figure [147]S3a, Supporting Information). No live mouse individual with homozygous p‐Enh removal (p‐Enh‐KO) could be acquired (Figure [148]S3b, Supporting Information). Further analyses identified that p‐Enh‐KO embryos are deceased around the E10.5‐E11.5 stage with noticeable posterior tailbud developmental abnormalities and hindlimb absence (Figure [149]2a; Figure [150]S3c,d, Supporting Information). TUNEL assay further confirmed that the apoptotic signals become prominent in the E10.5 p‐Enh‐KO embryos, predominantly enriched at the tailbud and the brain regions (Figure [151]S3e, Supporting Information). Figure 2. Figure 2 [152]Open in a new tab Genetic removal of p‐Enh causes severe posterior tissue developmental defects and embryonic lethality at the organogenetic stage. a) Bright field images of WT and p‐Enh‐KO mouse embryos at different development stages (E8.5, E9.5, E10.5, and E11.5). The white dotted lines representing the top baseline of the head. Arrowheads indicating the tailbud region. The tailbud and limb are outlined with dotted lines, with tailbud in white, forelimb in black, and hindlimb in yellow. FL: forelimb; HL: hindlimb; TB: tailbud. Scale bar: 500 µm. b) Top left: the schematic diagram describing the sampling strategy of single‐cell RNA‐seq. Right: UMAP plots of single‐cell RNA‐seq data from WT and p‐Enh‐KO embryo from E8.5 to E10.5, color‐coded by cluster identities. Bottom left: the percent of cluster cells in WT embryos displaying progressive cell‐type complexity changes. c) Differential abundance of cell clusters with pronounced changes in p‐Enh‐KO compared with WT based on log[2](fold change). d) Enrichment score illustrating expression of top‐100 marker genes in each cell clusters. The enrichment score of upregulated genes are labeled with red color, and the downregulated genes in blue. p‐values for enrichment analysis are calculated using Fisher's exact test (one‐sided) and adjusted for multiple testing using Benjamini‐Hochberg (BH) method. e) Boxplot illustrating normalized read count of marker genes with significant expression changes between WT and p‐Enh‐KO embryos at E8.5 in indicated clusters. p‐values are calculated by using a two‐sided Wilcoxon rank sum test in Seurat FindMarkers function, with significantly differentially expressed genes defined as genes with log2(fold change) > 0.25, BH‐adjusted p‐values < 0.05 and fraction of detected cells > 0.25. To systematically characterize the molecular phenotypes of p‐Enh‐KO embryos, we performed single‐cell transcriptomic analyses in p‐Enh‐KO and wildtype counterparts from the same litter covering successive developmental stages from E8.5 to E10.5 (Figure [153]2b). In total, 180930 cells across three sequential stages were sequenced with a median coverage of 23783 unique molecular identifiers (UMIs) per cell. 127069 cells were retained for further analysis after rigorous quality filters (Figure [154]S3f, Supporting Information; Experimental Section). By referencing the published embryo atlas dataset,^[ [155]^46 , [156]^47 ^] 21 cell clusters of the WT embryos with precise annotation were identified (Figure [157]2b; Figure [158]S3g, Supporting Information). Generally, our annotation unveiled diverse clusters with unique set of markers spanning all three germ layer derivatives. For example, the neural tube cells were defined by the expression of Sox2, Pax6, and Fabp7, while NMPs were distinguished by Epha5, T, and Fgf8 (Figure [159]S3g, Supporting Information). Additionally, the dynamics of cell type proportions across developmental stages also reflect the sequential cell fate differentiation during organogenesis (Figure [160]2b). Thus, the single‐cell transcriptome profile of early organogenesis ranging from E8.5 to E10.5 faithfully captures the in vivo developmental events. Next, we projected the single‐cell transcriptome of p‐Enh‐KO embryos onto the WT reference, and found that the major cell type compositions between the WT and p‐Enh‐KO embryos were comparable (Figure [161]2b). The prominent apoptotic signals identified in E10.5 p‐Enh‐KO embryos were further confirmed by the assessment of apoptosis and hypoxia activity hallmarks expression (Figure [162]S4a, Supporting Information). To mitigate the potential bias caused by apoptotic cells, we focused our analyses on the E8.5 and E9.5 embryos in the following sections. Differential cellular abundance analysis highlighted a substantial decrease in posterior derived tissues of p‐Enh‐KO embryos, such as NMP, presomitic mesoderm, and limb mesenchyme (Figure [163]2c; Figure [164]S4b, Supporting Information). Following this, we calculated the enrichment scores of marker gene sets within each cluster. Remarkably, while the most significant cell density changes were observed at E9.5 (Figure [165]2c), the marker gene expression patterns, especially for NMP and presomitic mesoderm cells, already showed severe abnormalities at E8.5 in p‐Enh‐KO embryos (Figure [166]2d,e; Figure [167]S4c, Supporting Information). The alterations of gene expression prior to those of cellular phenotypes indicates that p‐Enh may function at an earlier stage when the posterior tissues were differentiated from the PS region in the gastrula. 2.3. Developmental Anomalies in both A‐P Axis Patterning and Limb Development in p‐Enh‐KO Embryos Given the phenotypes of p‐Enh‐KO were mainly associated with the posterior tissue development, we focused on cell types comprising the embryonic A‐P axis from the single‐cell reference atlas, encompassing forebrain/midbrain/hindbrain, neural tube, NMP, presomitic mesoderm, and paraxial mesoderm. Sub‐clustering of these single cells by UMAP further divided the cells into 7 clusters (Figure [168]3a). RNA velocity and pseudo‐time transcriptional ordering analyses showed a clear bi‐directional trajectory originated from NMP toward the neural and mesodermal lineages, respectively (Figure [169]3a). This result also aligns with the model that two distinct developmental trajectories exist in neural tube cells: one originating from the anterior brain cells and the other from the bipotential NMP in the posterior region.^[ [170]^27 , [171]^28 ^] The developmental signal gradients (RA, Wnt, and FGF signaling) and A‐P axis‐related marker genes alignment^[ [172]^48 , [173]^49 ^] signify the robust correlation between the plotted single‐cell transcriptome and the A‐P axis patterning process in vivo (Figure [174]S5a–d, Supporting Information), suggesting that the reconstructed UMAP plot effectively reflects cellular distribution along the in vivo A‐P axis, and highlight NMP as the origin for posterior neural tube cells (Figure [175]3a). Figure 3. Figure 3 [176]Open in a new tab Developmental anomalies in both A‐P axis patterning and limb development in p‐Enh‐KO embryos. a) Left: cartoon diagram demonstrating the structure and cell types in tailbud region. RA, Wnt, and FGF signaling show gradient activities along the A‐P axis. Right: UMAP visualizing the subclusters and RNA velocity projection field. The dashed lines delineate diverse developmental trajectories within the neural tube. b) Cell density distribution of the A‐P axis related cells along 1D transcriptional axis. The dashed line indicating boundary between anterior‐derived and NMP‐derived neural tube cells. The black arrows indicating noticeable cell density decrease in p‐Enh‐KO embryos compared with WT. c) Heatmap showing scaled expression of trajectory‐based expressed genes based on NMP‐derived tissues (NMP, posterior neural tube, presomitic mesoderm, paraxial mesoderm‐1, and paraxial mesoderm‐2). Examples of key TFs are labeled. Scaled expression is calculated based on the output of the NormalizeData function from the Seurat package, and further normalized to a [0, 1] range per gene using min‐max scaling. d) In situ hybridization results showing the expression patterns of Hoxb9 in WT and p‐Enh‐KO embryos at different developmental stages. Dashed lines outlining the Hoxb9‐positive regions in posterior neural tube and somites. Similar results are obtained from embryo replicates. Scale bar: 500 µm. e) Expression patterns of Tcf15 in WT and p‐Enh‐KO embryos at different developmental stages indicating shrinkage of the undifferentiated progenitor of aPSM. Yellow dashed lines indicating Tcf15‐negative tailbud region. Similar results are obtained from embryo replicates. Scale bar: 500 µm. f) Lightsheet imaging of whole‐mount RNAscope targeting Uncx4.1 in WT and p‐Enh‐KO embryo samples at E8.5 and E9.5. The clear strip‐like pattern of Uncx4.1 in WT embryos disturbs in E9.5 p‐Enh‐KO embryos. Scale bar: 500 µm. g) UMAP visualizing the projection field of limb related cell clusters with RNA velocity. Colors indicating different cell types. h) Density plots illustrating the distribution of limb related cells in WT and p‐Enh‐KO embryos at different developmental stages. i) Boxplot showing the expression (log[2](Normalized Read Count +1)) of limb marker genes in WT and p‐Enh‐KO samples. p‐values are calculated by using a two‐sided Wilcoxon rank sum test in Seurat FindMarkers Function. j) In situ hybridization of forelimb marker Tbx5 and hindlimb marker Pitx1 in WT and p‐Enh‐KO embryos. Scale bar: 500 µm. Subsequently, we compared the cellular distribution between wildtype and p‐Enh‐KO embryos along with the reconstructed digital A‐P map. By linearly plotting the single cells along the reconstructed A‐P axis, the p‐Enh‐KO embryo exhibited a noticeable reduction in cell density in NMP and their immediate developmental progenies (Figure [177]3b). Trajectory‐based gene expression analyses revealed significant disruptions in the developmental programs during both posterior neural tube and mesodermal lineage development in p‐Enh‐KO embryos (Figure [178]3c). Experimental validation showed that the expression territory and abundance of Hoxb9, a marker for posterior neural tube and mesodermal cells with the M‐shaped distribution, was restricted in the tailbud tip of the p‐Enh‐KO embryos (Figure [179]3d). Additionally, Tcf15, marker for anterior presomitic mesoderm (aPSM) and somite,^[ [180]^50 ^] exhibited a substantial decrease and shrinkage in the undifferentiated progenitor pool for aPSM (Tcf15‐negative region) in E9.5 p‐Enh‐KO embryos (Figure [181]3e), reflecting a depletion of posterior progenitors and/or precocious differentiation toward anterior somitic fates. Besides, the highly organized stripe‐like pattern as marked by of Uncx4.1 and Tbx18 in WT embryos^[ [182]^51 , [183]^52 ^] was severely disrupted in p‐Enh‐KO embryos, especially for embryos at E9.5 (Figure [184]3f; Figure [185]S5e, Supporting Information). Interestingly, the anterior somite structure appears to be normal in the E8.5 p‐Enh‐KO embryos (Figure [186]3f). Given the different trajectory routes for the early‐anterior (early PS‐derived) and late‐posterior somitogenesis (NMP‐derived) during mouse embryonic development,^[ [187]^24 ^] the disrupted Uncx4.1 distribution indicated that p‐Enh may specifically regulate the NMP‐derived late‐posterior somitogenesis process in the mouse embryo. Additionally, we also observed a notable decrease in the posterior‐derived limb mesenchyme cell population in E9.5 p‐Enh‐KO embryos (Figure [188]2c). To delve into these abnormalities, we specifically analyzed cells associated with limb mesenchyme and further categorized them into three sub‐clusters representing forelimb progenitors, hindlimb progenitors, and other mesenchyme based on marker gene expression. RNA velocity analysis confirmed that both forelimb and hindlimb progenitors seem to share one common developmental origin (Figure [189]3g). Cell density analyses reveal comparable patterns between WT and p‐Enh‐KO embryos at E8.5, whereas there is a pronounced reduction in hindlimb progenitors in E9.5 p‐Enh‐KO embryos (Figure [190]3h). Molecular analyses identified that genes related to hindlimb development, such as Hoxc10, Hoxd11, Pitx1, and Tmem119, were significantly down‐regulated in p‐Enh‐KO embryos, while the markers of forelimb showed limited expression alterations (Figure [191]3i). Further experimental validation of the typical hindlimb marker Pitx1 confirmed the absence of hindlimb structure in the p‐Enh‐KO embryos with the normal distribution of forelimb cells as marked by Tbx5 (Figure [192]3j). Moreover, to assess whether additional developmental defects beyond the A‐P axis and hindlimb development were present in p‐Enh‐KO embryos, we examined the development of other major embryonic structures, including the brain, branchial arch, and heart. The representative markers such as Sox2 for the neural tube, Msx2 for the branchial arch, and Tbx5 and Myh6 for the heart remain largely normal expression patterns in p‐Enh‐KO embryos, indicating no major disruption in these tissues (Figure [193]S6a‐f, Supporting Information). Together, these results indicate that the deletion of p‐Enh specifically leads to severe developmental abnormalities in posterior tissues, manifesting as disrupted aPSM differentiation, paralyzed somitogenesis, and absent hindlimbs, which finally cause embryonic lethality around E10.5 stage. Single‐cell transcriptomics analysis reveals that p‐Enh can regulate the gene expression of NMP‐derived clusters as early as E8.5, p‐Enh‐KO results in a pronounced decrease in cell density in posterior cell clusters at E9.5. 2.4. Two Distinct Mechanisms with Differential CDX2 Dependencies Underlying p‐Enh Function A substantial body of work on various enhancers have proposed mechanisms of enhancer function by regulating adjacent genes.^[ [194]^53 , [195]^54 , [196]^55 , [197]^56 , [198]^57 ^] Considering that p‐Enh is located in the first intron of Cdx2 and the crucial role of Cdx2 in body axis development,^[ [199]^31 , [200]^58 , [201]^59 , [202]^60 ^] we examined whether the expression of Cdx2 was affected in p‐Enh‐KO embryos. As shown, there was no change in Cdx2 expression in the extraembryonic cells, with partial down‐regulation in the posterior regions at E7.5 (Figure [203]4a). Consistently, single‐cell transcriptome data also showed reduced Cdx2 expression in posterior tissues such as NMP and presomitic mesoderm (Figure [204]S7a, Supporting Information). To probe the downstream molecular mechanisms for p‐Enh function and dissect the potential relevance between p‐Enh and its neighboring coding gene‐Cdx2, we prepared the p‐Enh‐KO and Cdx2‐KO mouse embryonic stem cell lines with CRISPR/Cas9 system, respectively (Figure [205]S7b, Supporting Information).^[ [206]^61 ^] It is noteworthy that the Cdx2‐KO cell line features a specific 35 bp deletion in the first exon, causing a frameshift mutation and eliminating functional CDX2 protein without interrupting the p‐Enh element genetically (Figure [207]S7c,d, Supporting Information). Generally, the morphologies of stem cell clones and distributions of pluripotent markers NANOG and OCT4 were comparable among WT, Cdx2‐KO, and p‐Enh‐KO ESCs (Figure [208]S7e, Supporting Information), indicating that the removal of CDX2 protein or p‐Enh element did not affect stem cell pluripotency. Next, we applied the WT, Cdx2‐KO and p‐Enh‐KO cell lines to the newly developed in vitro gastruloids^[ [209]^62 ^] and Trunk‐like structures (TLSs)^[ [210]^63 ^] differentiation systems (Figure [211]4b), which can be used to model the anterior‐posterior axis formation and later somitogenesis in the embryo. We found that the WT embryoids could generate tailbud‐like structures that resemble the caudal region of embryos in both gastruloids and TLSs systems (Figure [212]4c; Figure [213]S7f, Supporting Information). However, the elongation of tailbud regions in both the gastruloids and TLSs for Cdx2‐KO and the p‐Enh‐KO cells was largely compromised in comparison to the WT group. Moreover, in contrast to the WT and Cdx2‐KO gastruloids and TLSs, the embryoids derived from p‐Enh‐KO ESCs remain as large spherical aggregates with no morphological evidence of A‐P axis elongation (Figure [214]4c; Figure [215]S7f, Supporting Information). Geometric analysis of the acquired gastruloids and TLSs further confirmed the consistent and significant defects (p<0.0001) in A‐P axis organization and elongation in both Cdx2‐KO and p‐Enh‐KO embryoids. Moreover, the phenotypic abnormalities were more pronounced in p‐Enh‐KO embryoids compared to Cdx2‐KO counterparts (Figure [216]4d; Figure [217]S7g, Supporting Information). Figure 4. Figure 4 [218]Open in a new tab a) The dynamic expression pattern of Cdx2 in WT and p‐Enh‐KO embryos at different development stages. Similar results are obtained from embryo replicates. Scale bar: 500 µm. b) Schematic overview of gastruloid and TLS differentiation system. The length, width, and tail are indicated. c) Representative bright‐field images of gastruloids at D5 from WT, Cdx2‐KO, and p‐Enh‐KO groups. The length, width, and tail are labeled and measured for morphological assessment. Scale bar: 500 µm. d) The geometric results summarizing the morphological features for gastruloids of each genotype. WT (n = 35); Cdx2‐KO (n = 26); p‐Enh‐KO (n = 38). The p‐values are calculated based on one‐way ANOVA. e) Principal component analysis (PCA) of transcriptome of WT, Cdx2‐KO and p‐Enh‐KO gastruloid. Three biological replicates are used. f) Venn plot showing a modest percentage of shared genes between Cdx2‐KO affecting genes and p‐Enh‐KO affecting genes (adjusted p‐value<0.01; fold change >2). Differential expression analysis was performed by using DESeq2. p‐values are calculated by using a Wald test (two‐tailed) in DESeq2 and corrected for multiple testing by using Benjamini‐Hochberg (BH) method. g) Examples of Cdx2‐KO and p‐Enh‐KO overlapped DEGs. Statistical significances are determined by one‐way ANOVA. h) Heatmap showing differentially expressed genes in WT, Cdx2‐KO, and p‐Enh‐KO groups. Significant terms of Gene Ontology (GO) in Group3 and Group5 are shown. The adjusted p‐values are calculated using hypergeometric test (one‐sided) and adjusted for multiple testing using Benjamini‐Hochberg (BH) method. i) Transcriptome projection of genes regulated by Cdx2 and/or p‐Enh onto in vivo single cell dataset. The cluster enrichment scores are listed below, which are calculated based on AddModuleScore function in Seurat package. j) The schematic diagram illustrating the experimental design of Dox‐induced iCdx2 in gastruloid system. The time window for adding Dox is highlighted in green. k) RNAscope results targeting Tbx6 and Uncx4.1 in gastruloids with different genotypes. Strip‐like patterns of Uncx4.1 only appear in WT gastruloids and cannot be rescued by iCdx2 overexpression. Scale bar: 100 µm. l) Heatmap showing differential expressed genes in WT, p‐Enh‐KO and p‐Enh‐KO/iCdx2 groups. Gene examples are listed. Boxplot showing the average expression of genes across different genotypes. The Gene Ontology (Biological Process) of Group 4‐B and Group 5‐B are shown on the right. The adjusted p‐values are calculated using hypergeometric test (one‐sided) and adjusted for multiple testing using BH method. To comprehensively determine the molecular architectures of gastruloids derived from Cdx2‐KO and p‐Enh‐KO cells, we performed transcriptomic profiling of the gastruloids with three biological replicates for each group (Figure [219]S8a, Supporting Information). Our analysis revealed a significant decrease in Cdx2 expression in both Cdx2‐KO and p‐Enh‐KO samples, which was further experimentally confirmed (Figure [220]S8b,c, Supporting Information). Principal component analyses (PCA) revealed distinct transcriptomic features for Cdx2‐KO and p‐Enh‐KO samples (Figure [221]4e). Additionally, p‐Enh‐KO exhibited a greater number of differentially expressed genes (DEGs) compared to Cdx2‐KO (p‐Enh‐KO: 608 up and 1451 down, Cdx2‐KO: 819 up and 455 down) (Figure [222]S8d, Supporting Information). Remarkably, only a small portion of DEGs were overlapped between Cdx2‐KO and p‐Enh‐KO group (330/1274, 25.9% of Cdx2‐KO DEGs; 330/2059, 16% of p‐Enh‐KO DEGs) (Figure [223]4f), suggesting both shared and distinct regulatory roles of Cdx2 and p‐Enh during A‐P axis formation and elongation in embryoids. Despite constituting a small fraction, the shared target genes among the overlapping DEGs, including regulators of A‐P axis patterning such as Hox genes and T,^[ [224]^31 , [225]^64 , [226]^65 ^] exhibited substantial downregulation in both Cdx2‐KO and p‐Enh‐KO gastruloids (Figure [227]4g). qPCR results of representative genes from different groups confirmed expression changes (Figure [228]S8e, Supporting Information). Next, we categorized these DEGs (Figure [229]4f) into 5 clusters based on gene expression patterns, including the shared DEGs (Group 1), Cdx2‐KO specific up (Group 2), Cdx2‐KO specific down (Group 3), p‐Enh‐KO specific up (Group 4), and p‐Enh‐KO specific down (Group 5) (Figure [230]4h). Gene ontology (GO) analysis revealed that genes in Group 3 were enriched in the biological process related to anterior/posterior patterning and somitogenesis process, consistent with the well‐known functions of CDX2 protein.^[ [231]^58 ^] As for p‐Enh‐KO specific down (Group 5) genes, the pathways related to a broad spectrum of mesodermal development, such as gland development, mesenchyme development, and heart morphogenesis, were enriched (Figure [232]4h; Figure [233]S8f, Supporting Information). Following this, we conducted an enrichment analysis of the DEGs of Group 1, 3, 5 with the constructed in vivo single‐cell A‐P atlas (Figure [234]3a), and found that Group 1 and Group 3 exhibited a clear posteriorized enrichment, displaying high enrichment scores with NMP, presomitic mesoderm, and posterior neural tube clusters. However, genes in the p‐Enh‐KO specific down group (Group 5) showed a strong correlation with NMP and the mesodermal clusters (Figure [235]4i), indicating p‐Enh may participate into posterior mesoderm development through a CDX2‐independent mechanism. To further validate this notion, we established the inducible Cdx2 overexpression cell line on the background of p‐Enh‐KO (p‐Enh‐KO/iCdx2) and subjected this cell line into gastruloid differentiation system (Figure [236]4j; Figure [237]S8g, Supporting Information). We observed that the expression of the presomitic mesoderm marker Tbx6, a downstream target of T and Cdx2 network,^[ [238]^60 ^] was reactivated in the p‐Enh‐KO/iCdx2 gastruloids (Figure [239]4k). However, the A‐P elongated morphology as well as the stripe‐like structure marked by Uncx4.1 transcripts remained compromised in p‐Enh‐KO/iCdx2 (Figure [240]4k). Transcriptome analyses showed that more than 50% of p‐Enh‐KO specific down genes (651/1224, 53% of Group 5) failed to be fully rescued by the inducible Cdx2 overexpression in comparison with the WT counterpart (Figure [241]4l; Figure [242]S8h, Supporting Information), which enriched organ formation‐related genes. In particular, genes involved in the response to the transforming growth factor beta (TGF‐β) pathway remains abnormal even with the restoration of CDX2 (Figure [243]4l). Regulon analyses summarizing the upstream regulators for the un‐rescuable genes revealed that major effectors involved in TGF‐β signaling, such as SMAD1 and SMAD9, were specifically enriched (Figure [244]S8i, Supporting Information). In summary, in vitro embryoid data showed that p‐Enh can regulate posterior tissue development partially through the neighboring gene Cdx2. Moreover, these results also suggest that a previously unrecognized CDX2‐independent mechanism may contribute to p‐Enh function, possibly involving cellular responsiveness to TGF‐β signaling pathway. 2.5. Genome‐Wide Epigenomic Remodeling Suggests a Potential Functional Relevance between p‐Enh and TGF‐β Signaling To assess whether p‐Enh regulates specific target genes via direct chromatin contacts, we performed 4C‐seq using the p‐Enh locus as the viewpoint in NMP cells. In consistent with the published HiC dataset,^[ [245]^66 ^] we found that p‐Enh exhibits interactions with neighboring chromatin regions, within a 3 Mb window surrounding its genomic locus (Figure [246]S9a, Supporting Information). By integrating 4C‐seq data with transcriptomic changes upon p‐Enh deletion, we identified three candidate genes—Cdx2, Rasl11a, and Slc46a3—that showed both physical interaction and differential expression (fold change > 2) (Figure [247]S9a, Supporting Information). However, functional analysis through shRNA‐mediated knockdown of Slc46a3 in p‐Enh‐KO cells failed to rescue the developmental defects (Figure [248]S9b,c, Supporting Information), indicating that in cis chromatin interactions with distal genes, such as Slc46a3, is not responsible for p‐Enh functions. To systematically determine the downstream molecular cascades of p‐Enh, we profiled the H3K27ac modification enrichment from both WT and p‐Enh‐KO NMP cells acquired from directed NMP differentiation system (Figure [249]5a).^[ [250]^67 ^] The comparative analysis of chromatin activities marked by H3K27ac revealed the occurrence of pervasive epigenomic remodeling in p‐Enh‐KO NMP cells, especially at distal intergenic chromatin regions (Figure [251]5b; Figure [252]S9d–f, Supporting Information). GREAT analyses identified that peaks with weakened enrichment of H3K27ac in p‐Enh‐KO cells are strongly enriched with SMAD binding capabilities (Figure [253]5b). Following this, we also incorporated published dataset of SMADs ChIP‐seq data,^[ [254]^68 , [255]^69 ^] as well as CDX2 ChIP‐seq^[ [256]^31 ^] data from in vitro sample counterpart, and checked the distribution of these signaling effectors around peak sets with changed H3K27ac level in p‐Enh‐KO cells (Figure [257]5c). We observed enrichment of TGF‐β signal effectors (SMADs) binding with these peaks, but not CDX2 binding (Figure [258]5c). These results demonstrate that the loss of p‐Enh element in NMP cells lead to significant decrement of H3K27ac at TGF‐β signal effectors chromatin binding sites. To test whether epigenetic remodeling of H3K27ac at these regions coincides with the change of chromatin binding for SMAD proteins, we performed ChIP‐qPCR analyses for the co‐SMAD effector, SMAD4, in both WT and p‐Enh‐KO NMP cells. As shown, the bindings of SMAD4 are significantly down‐regulated in p‐Enh‐KO cells (Figure [259]S9g, Supporting Information). These results, together with previous transcriptome data (Figure [260]4f,l), indicate that p‐Enh could play roles in mediating the chromatin responsiveness to TGF‐β signal effector, in a CDX2‐independent mechanism. Figure 5. Figure 5 [261]Open in a new tab Functional interplay between p‐Enh and the TGF‐β signaling pathway. a) Schematic of 2D NMP differentiation system. H3K27ac ChIP‐seq data are employed to assess the chromatin activity of WT and p‐Enh‐KO NMPs. b) Heatmap showing differential H3K27ac peaks in WT and p‐Enh‐KO NMPs. GO terms of top‐ranked GREAT analysis from differential H3K27ac peaks are shown. No GO terms enriched in p‐Enh‐KO gain group under default parameters. c) Heatmaps illustrating chromatin binding patterns of multiple TGF‐β signaling downstream effectors and CDX2 at chromatin regions with altered H3K27ac enrichment in Figure [262]5b. Published datasets for SMAD2, SMAD3, SMAD4 in TGF‐β‐activated condition,^[ [263]^68 ^] SMAD2/3 in TGF‐β‐activated condition,^[ [264]^69 ^] and CDX2 ChIP‐seq^[ [265]^31 ^] are used. d) Co‐staining of Cdx2 mRNA and p‐Enh‐eRNA using RNAscope in E7.5 mouse embryos. Both transverse and sagittal sections are shown. The labeling scheme is consistent with Figure [266]1f, and the nuclear‐cytoplasmic boundary is outlined in a vibrant yellow color. Scale bar: 50 µm. e) TGF‐β signaling activity determined by gene set (GESA: MM9770) using AUCell (v1.20.2) in E7.5 mouse embryo. The schematic diagram is adapted and modified from the spatial transcriptome of gastrulation.^[ [267]^9 ^]f) RNAscope targeting p‐Enh‐eRNA and co‐staining with SMAD4 protein in E7.5 mouse embryo sections. Zoomed‐in view showing the detailed signals at posterior region. The labeling scheme is consistent with Figure [268]1f. The co‐localization spots of p‐Enh‐eRNA with SMAD4 proteins are further magnified for the clear presentation. The co‐localization analysis is performed using ImageJ plugin ComDet 0.5.5. Scale bar: 10 µm. g) Diagram showing the RNA pull‐down assay workflow. h) Western blot (WES) results. The bands corresponding to SMAD4 protein (≈64 kDa) are indicated. Two biological replicates are included in each condition to ensure robustness. i) Diagram showing modulation of TGF‐β signaling and eRNA in gastruloid differentiation system. Activin A (20 ng mL^−1) or SB421542 (10 µM) are added into differentiation system from the end of D1. j) Bright‐field images of D5 gastruloids in different conditions. Scale bar: 200 µm. k) Geometric features (Axial length, width, tail length, and length/width ratio) of D5 gastruloids in Figure [269]5j. Each group includes multiple biological replicates (n = 16–31) to ensure statistical robustness. Mean values for selected bars are indicated in blue text. The p‐value are calculated based on one‐way ANOVA. l) Expression pattern of Cdx2 in WT, p‐Enh‐KO and p‐Enh‐KO (Activin A) gastruloids. The p‐values are calculated based on one‐way ANOVA. m) Hierarchical clustering of transcriptome from WT, p‐Enh‐KO and p‐Enh‐KO (Activin A) gastruloids. n) Heatmap illustrating expression pattern of p‐Enh specific regulated genes (defined in Figure [270]4h,l) in different groups. Subclusters A and B are delineated according to the rescue efficacy of iCdx2 overexpression. Boxplot showing the average expression of genes in each subclusters. p‐values are calculated by using Wilcoxon rank sum test (two‐sided). Recent studies have found that a broad spectrum of TFs bind with RNA molecules, highlighting the conserved and essential nature of interactions between TFs and RNA in vertebrate development.^[ [271]^70 ^] We found that p‐Enh can transcribe eRNAs, which are largely distributed in the posterior regions at the late‐gastrulation stage (E7.5) (Figure [272]1f). Co‐detection of p‐Enh transcripts and Cdx2 mRNA in the E7.5 mouse gastrula revealed that both transcripts were specifically enriched in the posterior region (Figure [273]5d). Moreover, sub‐cellular distribution analyses showed that the p‐Enh‐eRNAs were prominently enriched in the nuclei, forming puncta‐like structure at multiple loci within a single nucleus. In contrast, the transcripts of the neighboring coding‐gene, Cdx2, were universally located in the cytoplasm (Figure [274]5d). Meanwhile, the signaling activity of TGF‐β superfamily was also enriched in the posterior embryonic region at E7.5, represented by the spatial gene expression enrichment of typical TGF‐β signaling target genes (Figure [275]5e). To investigate the spatial association between p‐Enh‐eRNA and TGF‐β signaling, we performed co‐staining of p‐Enh‐eRNA with SMAD proteins in E7.5 embryos. Over 40% of nuclear p‐Enh‐eRNA puncta co‐localized with SMAD4 (Figure [276]5f), and ≈30% overlapped with phosphorylated SMAD1/5‐positive foci (Figure [277]S10a, Supporting Information). The spatial co‐occurrence of p‐Enh transcripts and TGF‐β signaling effectors hints an intriguing involvement role of p‐Enh‐eRNA in TGF‐β signaling transduction. To further validate this possibility, we performed RNA pull‐down followed by Western blot to directly examine the physical interaction between p‐Enh‐eRNA and SMAD4 (Figure [278]5g). Using biotin‐labeled antisense probes targeting p‐Enh‐eRNA, we found that SMAD4 protein was specifically enriched in the probe (+) group purified from WT NMP cells. Moreover, the abundance of SMAD4 was increased upon p‐Enh‐eRNA overexpression (Figure [279]5h). These results provide direct biochemical evidence supporting a physical interaction between p‐Enh‐eRNA and SMAD4, reinforcing that p‐Enh‐eRNA may participate in TGF‐β signaling by engaging with SMAD4. To further validate the involvement of eRNA and TGF‐β in p‐Enh's functions, we performed gastruloid differentiation experiments under multiple conditions (Figure [280]5i). First, to test whether p‐Enh‐eRNA could compensate for the loss of p‐Enh, we generated p‐Enh‐KO cells with nuclear‐enriched overexpression of eRNA (p‐Enh‐KO/eRNA‐OE)^[ [281]^71 , [282]^72 ^] (Figure [283]S10b, Supporting Information). However, p‐Enh‐KO/eRNA‐OE gastruloids did not exhibit obviously morphological change at D5 compared to the p‐Enh‐KO group, suggesting that eRNA alone is insufficient to restore axial elongation. Next, we activated TGF‐β signaling using Activin A in gastruloid system (Figure [284]5i). Notably, Activin A treatment markedly ameliorated the differentiation defects in p‐Enh‐KO gastruloids (Figure [285]5j,k). Conversely, inhibition of TGF‐β signaling with SB431542 induced extensive cell death, indicating the essential roles of TGF‐β signaling in A‐P axis patterning (Figure [286]S10c, Supporting Information). To further assess the potential interplay between p‐Enh‐eRNA and TGF‐β signaling, we performed a combinatorial treatment in which p‐Enh‐KO gastruloids were subjected to both eRNA overexpression and Activin A treatment. Compared to Activin A alone, the p‐Enh‐KO/eRNA‐OE (Activin A) group exhibited further improvement in morphological features, including axial length, tail length, and the length/width ratio. Specially, the length and length/width ratio, which reflect the anterior‐posterior polarity, of the p‐Enh‐KO gastruloids exhibit gradual alleviation of the defects and finally reach a comparable level to the WT group (Figure [287]5k). These results suggest a cooperative effect between p‐Enh‐eRNA and TGF‐β signaling in modulating posterior developmental programs. As the phenotypic improvement mainly occurred upon TGF‐β activation, we next examined the associated transcriptomic profiles (Figure [288]S10d, Supporting Information). Notably, Cdx2 expression remained low even after Activin A treatment (Figure [289]5l), indicating that the rescue effect was not mediated through this neighboring gene. This supports the notion that the differentiation defects could be attributed to the interactions between p‐Enh and TGF‐β signaling in a Cdx2‐independent way, as shown in Figure [290]4l. In contrast to the expression pattern of Cdx2, a list of genes crucial for A‐P axis patterning and mesoderm development, including Hes7, Tgfb2 and Mixl1, exhibited marked upregulation upon TGF‐β activation. This underscores the ability of TGF‐β to effectively ameliorate the phenotype associated with p‐Enh‐KO (Figure [291]S10e, Supporting Information). Hierarchical clustering result also revealed that Activin A treatment effectively rescued genes expression abnormalities caused by p‐Enh depletion (Figure [292]5m). Specifically, p‐Enh regulated genes that failed to be rescued by Cdx2 re‐expression (subcluster B of Group 4 and Group 5 in Figure [293]4l) were largely restored with Activin A treatment (Figure [294]5n). Together, these findings suggest that the direct crosstalk between p‐Enh‐eRNA and the TGF‐β signaling pathway, which facilitates the chromatin binding of signaling effectors, is responsible for p‐Enh function, thus holds substantial biological importance for the development of posterior tissues. 3. Discussion In this study, we have identified a pre‐marked regulatory element, p‐Enh, exhibiting highly restricted spatial‐temporal activity and significant biological importance during mouse early embryonic development. The pivotal role of p‐Enh in posterior tissue development is underscored by the embryonic lethality observed in p‐Enh‐KO embryos, accompanied by severe posterior tissue growth failure, disruption of A‐P axis gene expression pattern, and dysfunctions across multiple cell types (Figures [295]2 and [296]3). Mechanism analyses through transcriptomic dissection pinpointed the co‐occurrence of two distinct mechanisms underlying the p‐Enh function: (i) regulating the expression level of its neighboring gene Cdx2 in cis, thereby affecting the downstream cascades of CDX2 (Figure [297]4); (ii) transient production of nuclei‐distributed eRNAs, potentially involved in coordinating the nuclear responsiveness to the extrinsic developmental signals, such as TGF‐β signaling pathway (Figure [298]5), through direct interaction with signal effectors. Thus, we propose a molecular model for H3K27ac pre‐marked developmental enhancers, wherein genetic sequence‐based in cis local regulation and RNA transcripts‐based in trans genome‐wide regulation co‐exists to facilitate tight spatiotemporal control of global molecular architecture and thus set up the molecular blueprint for future tissue development (Figure [299]6 ). Figure 6. Figure 6 [300]Open in a new tab Model of the regulation of posterior developmental genes through p‐Enh with distinct in cis and in trans mechanisms. The gastrula pre‐marked p‐Enh plays dual roles in governing posterior tissue development: (i) regulating the neighboring gene Cdx2 (in cis), which in turn affects their shared downstream targets, such as Hox and T genes; (ii) modulating the global transcriptome and epigenome via transient expression of broadly nuclei‐distributed eRNA (in trans), by mediating the interplay between p‐Enh‐eRNA and signaling effector such as SMAD4, which may contribute to the modulation of TGF‐β signaling. p‐Enh‐KO embryos exhibit embryonic lethality, aberrant posterior tissue development trajectories, and failure to initiate hindlimb formation. Recently, there has been a growing interest in dissecting the molecular hierarchy of gene regulatory networks (GRNs) that govern rapid cell fate commitment.^[ [301]^9 , [302]^18 , [303]^73 , [304]^74 ^] Previous studies have primarily focused on identifying specific pioneer transcription factors capable of steering the developmental competence toward specific developmental lineages.^[ [305]^75 , [306]^76 , [307]^77 ^] In line with the pioneering factors, emerging evidences also indicated that the epigenetic priming or pre‐mark at regulatory elements could offer a notable feature prior to cell fate decision.^[ [308]^13 , [309]^78 ^] However, it remains unclear whether the “chromatin priming” of regulatory elements holds potential functional roles in biological process beyond merely serving as a marker. Here, we determined one H3K27ac pre‐marked regulatory element, p‐Enh, which is specifically activated in the PS and adjacent epiblast at late gastrulation stage, and sustains the epigenetic activity till late organogenesis. Phenotypic and molecular analysis demonstrated that genetic loss of p‐Enh leads to dramatic developmental defects when the posterior lineage diversifies. Moreover, the precedence of transcriptomic disruptions over phenotypic anomalies strengthens the evidence for the importance of the priming effects. Thus, future evaluation of pre‐marked regulatory elements and further phenotyping through incorporating multi‐dimensional data may better elucidate the function of novel crucial regulatory elements. A considerable amount of research has put forth on how enhancers exert their functions, particularly through the regulation of neighboring genes. Here, through in‐depth transcriptomic dissection, chromatin interaction capture and genetic analyses, as reflected by the transcriptomic comparison among multiple genetic edited cell lines (WT, p‐Enh‐KO, Cdx2‐KO, p‐Enh‐KO/iCdx2), we found that only 16% genes were regulated through the p‐Enh‐Cdx2 axis (Figure [310]4f). Meanwhile, TGF‐β signaling responsive genes were highly enriched in the cluster of p‐Enh specifically regulated genes (Figure [311]4l; Figure [312]S8i, Supporting Information). Further, through profiling H3K27ac pattern in both WT and p‐Enh‐KO cells, we found that the global distribution of H3K27ac was pervasively remodeled in p‐Enh‐KO cells (Figure [313]5b). Interestingly, the genomic regions with changed H3K27ac level are enriched with TGF‐β signaling effectors but not CDX2 binding (Figure [314]5c, Figure [315]S9g, Supporting Information). Thus, the regulatory mechanisms underlying p‐Enh function are intricately linked with TGF‐β signaling, suggesting a complex interplay rather than a straightforward linear molecular cascade acting through its neighboring gene Cdx2 or functioning solely through in cis chromatin interactions with distal targets such as Slc46a3 (Figure [316]4; Figure [317]S9a–c, Supporting Information). p‐Enh possesses both in cis and in trans regulatory mechanisms in the developmental control of posterior tissues. The discovery of enhancer‐derived transcripts largely broaden the current scope of the molecular mechanisms governing chromatin structure organization and gene expression regulation.^[ [318]^79 , [319]^80 ^] The abundant expression and nuclear localization of p‐Enh‐eRNAs at multiple loci within individual nuclei in the PS and adjacent epiblast region suggest their potential roles in global GRNs instead of localized functions. Their broad distribution within nuclei may contribute to the observed pervasive chromatin activity changes in p‐Enh‐KO cells (Figure [320]5a–d). Furthermore, a recent study has underscored the extensive interaction between TFs and RNA molecules.^[ [321]^70 ^] Here, we demonstrated that p‐Enh‐eRNA not only physically interacts with TGF‐β effectors, SMAD4 and pSMAD1/5, but also regulates the chromatin binding of the interacting TF partners. Importantly, the increment of TGFβ signaling activity and restore of eRNA abundance synergistically rescue the differentiation defects in gastruloid. Drawing from the existing research and our findings, we hypothesize that the widespread eRNAs within individual nuclei can function as trans‐modular coordinators to interact with signaling downstream effectors, potentially serving as “regulatory units”, then facilitate the chromatin binding of interacting TF partners to the designated chromatin regions, thereby influencing the long‐range modulation of global GRNs for proper developmental programming (Figure [322]6). Even though the comprehensive nature of our data, we do acknowledge that future in‐depth characterization of the structure of p‐Enh‐eRNA and systematic identification of its potential interacting partners, of this transcript will finally clarify the complex regulatory network and promote a global understanding of mouse embryogenesis. In summary, to delve into the molecular mechanisms underlying epigenetic pre‐marked enhancers, we systematically dissect both the in cis and in trans paths for p‐Enh, proposing this as a novel strategy of developmental enhancer to coordinate multiple downstream modules genome‐widely, and then establish a responsive cellular state for forthcoming lineage diversification. 4. Experimental Section Animals Use, Embryo Staging, and Collection Animal experiments were performed at the Animal Core Facility, under program project SIBCB‐S308‐1807‐025 approved by the Institutional Animal Care and Use Committee in Center for Excellence in Molecular Cell Science, CAS, and the program project GZLAB‐AUCP‐2022‐10‐A04 approved by Institutional Animal Care and Use Committee of Guangzhou National Laboratory. Timed matings were set up between sexually mature (older than six weeks of age) of wildtype C57BL/6 or p‐Enh‐KO. Plugged female mice were picked after mating and marked as embryonic day 0.5 (E0.5). Female mice were sacrificed for embryos collection at specific gestational stages. Embryos were isolated from the uterus and carefully transferred into PBS in petri dishes, and extra‐embryonic tissues were removed using needles under Olympus stereoscope. The developmental stage and morphology of embryos were reconfirmed according to the Theiler stage methodology. The acquired embryos were ready for further experiments including whole‐mount in situ hybridization, iDISCO, RNAscope, or single cell transcriptome profiling. qPCR Quantification of eRNA Expression To ensure sufficient RNA input for reliable detection of low‐abundance enhancer RNAs, microdissected tissues were pooled from ≈50 mouse gastrula‐stage embryos for each sample. For E9.5 stage samples, tissues from 5 embryos were microdissected and combined per sample. The pooling strategy was designed to enrich for the target regions and provide adequate RNA yield for downstream analyses. Total RNA was extracted using the TRIzol reagent (Invitrogen, 15596018) following the manufacturer's standard protocol. Briefly, tissues were homogenized in TRIzol, followed by phase separation with chloroform and RNA precipitation with isopropanol. After washing with 75% ethanol, RNA pellets were air‐dried and resuspended in RNase‐free water. RNA concentration and purity were assessed using a NanoDrop spectrophotometer. Reverse transcription was carried out using random primers and reverse transcriptase (Tiangen, KR108) to synthesize complementary DNA (cDNA) from total RNA, enabling unbiased detection of eRNAs. The resulting cDNA was used as the template for quantitative PCR (qPCR) with region‐specific primers to measure eRNA expression levels. mESC Maintenance and Differentiation mESC lines were maintained in standard feeder‐free culture conditions which were supplemented with 3 µM CHIR99021 (Selleck Chemicals, S1263), 1 µM PD0325901(Selleck Chemicals, S1036) and 10 ng mL^−1 mouse leukemia inhibitory factor (Millipore, ESG1107) as previously described.^[ [323]^81 ^] The NMP differentiation was performed as previously reported.^[ [324]^67 ^] In brief, 50000 cells were seeding in N2B27 medium containing 10 ng mL^−1 bFGF (Shanghai Pufei Biotechnology, 1106‐010) in Matrigel pre‐coated 35 mm dish and cultured for two days. The N2B27 medium contained a 1:1 ratio of DMEM/F12:Neurobasal medium (GIBCO) supplemented with 1xN2 (GIBCO), 1xB27 (GIBCO), 1x GlutaMax (GIBCO), 40 mg mL^−1 BSA (Roche), penicillin/streptomycin (GIBCO) and 0.1 mM 2‐mercaptoethanol. Cells were pulsed with 5 µM CHIR99021 during 24 h to 36 h. At the end of D3, the NMPs were ready for transcriptome profiling, chromatin conformation capture assays, and ChIP‐seq analysis. Gastruloids and Trunk‐Like Structure (TLS) Generation Gastruloids and TLSs were generated using established protocols with minor adjustments.^[ [325]^63 , [326]^82 ^] The basic N2B27 medium contained a 1:1 ratio of DMEM/F12: Neurobasal medium (GIBCO) supplemented with 1xN2 (GIBCO), 1xB27 (GIBCO), 1x GlutaMax (GIBCO), penicillin/streptomycin (GIBCO) and 0.1 mM 2‐mercaptoethanol. The initial 4 days of differentiation for both gastruloids and TLSs were consistent. In brief, 200–250 single mESC were seeded in 40 µL N2B27 medium into wells of the 96‐well V bottom Ultra‐low Attachment Plates (S‐BIO, MS‐9096VZ) and allowed to aggregated for 48 h. Subsequently, the spheres were treated with 3 µM CHIR99021 in 150 µL N2B27 medium for 24 h. After this, medium was refreshed every 24 h with the same volume of basic N2B27 medium. On the final day of differentiation, gastruloids were cultured with basic N2B27, while TLSs were embedded in 5% Growth‐Factor‐Reduced Matrigel (Corning, 356231) in N2B27 medium. At the end of day 5, the gastruloids and TLSs were subjected for morphology analysis, transcriptome sample collection, or immunofluorescence. In the TGF‐β signaling modulation condition, Activin A (20 ng mL^−1) or SB421542 (10 µM) are added into differentiation system from the end of D1. Transgenic Embryo Enhancer Activity Screening Transgenic embryo enhancer activity screen was performed following methods as previously reported.^[ [327]^37 ^] In brief, enhancer candidates with posterior epigenetic activity were selected from our published dataset and ranked according to H3K27ac increasing enrichment at posterior region from E7.0 to E7.5 mouse gastrula.^[ [328]^18 ^] Selected fragments were cloned from mouse genomic DNA and ligated into PB‐pHsp68‐LacZ plasmid using seamless cloning following the manufacturer's instructions (Beyotime, D7010M). Then, the acquired PB‐DREs‐Hsp68‐LacZ plasmids and PBase mRNA were microinjected into the cytoplasm of a fertilized egg with the FemtoJet Microinjection System (Eppendorf). The injected embryos were then cultured to the 2‐cell stage in KSOM medium at 37 °C, 5% CO2 in standard incubator, and the 2‐cell embryos were transferred to the oviduct of pseudo‐pregnant ICR females and marked as 0.5 dpc. Finally, embryos were collected at appropriate stages and subjected to further genotyping and LacZ staining following the protocol previously reported.^[ [329]^37 ^] Whole‐Mount In Situ Hybridization The mRNA expression in E6.5‐E10.5 embryos was assessed by whole‐mount in situ hybridization using the digoxigenin‐labelled antisense RNA probes as described previously.^[ [330]^83 ^] Primers for amplifying probe templates are listed in Table [331]S4 (Supporting Information). In brief, embryos at different stages were collected and fixed with 4%PFA overnight and dehydrated stepwise in graded series of methanol. Embryos can be stored in methanol for a minimum overnight and up to 1 week at −20 °C. Embryos were rehydrated through 75%, 50%, and 25% methanol at room temperature, washed three times with DPBS and treated with 10 µg mL^−1 proteinase K (Invitrogen, AM2548) in PTW for suitable time. After post‐fixation for 30 min, ≈500 ng of digoxigenin‐labelled RNA probe was incubated with the embryo at 70 °C overnight. The embryos were then washed with TBST and incubated with 1:2000 diluted anti‐digoxigenin AP antibody (Roche, 11093274910) in blocking buffer. Incubating on a rocker platform at 4 °C overnight. The next day, embryos were further washed with TBST and stained with NBT/BCIP solution. Images were obtained using Olympus stereoscope. Haematoxylin and Eosin (HE) Staining Embryo samples were fixed in 4% paraformaldehyde/PBS at 4 °C overnight and then processed for paraffin wax embedding. 7‐µm thick sections were cut, dewaxed in xylene, rehydrated through an ethanol series into PBS. HE counterstaining was performed using staining kit (Beyotime, C0105) according to manufacturer's instructions. Images were taken on Olympus VS120 microscope. Single‐Cell Transcriptome Profiling of Mouse Embryos Sexually mature p‐Enh^+/− mice were inter‐crossed and pregnant females at the indicated gestational stages (E8.5, E9.5 and E10.5) were sacrificed for embryo collection. After dissection from the decidua, embryos with the same genotype were combined together and incubated in 200 uL TrypLE (Gibco, 12604013) at 37 °C with shaking (300 rpm) for ≈ 15 min. The tissue pellets were manually triturated with 2 min interval. Immediately following digestion, single cells were washed twice in 1% BSA/PBS and filtered through Falcon 40 µm Cell Strainer (Corning, 352340). Finally, cells were spun at 500 g for 5 min at 4 °C and resuspended in 0.04% BSA/PBS for further quality check and loading onto the 10× Chromium Controller. Processing of the samples was performed using the Chromium Single Cell 3′ library & Gel Bead Kit version 3 (10× Genomics) according to the manufacturer's instructions. Pair‐end 150 bp sequencing was performed on NovaSeq platform. The Generation of Specific Fragments Genomic Knockout ESC Line and Mouse The generation of enhancer knockout or Cdx2 knockout mESCs were performed following protocol published previously with slightly modify.^[ [332]^61 ^] Briefly, small guided RNAs (sgRNAs) were designed according to the instruction from Chop‐chop website ([333]http://chopchop.cbu.uib.no/). sgRNAs were annealed and cloned into linearized px330‐mCherry plasmid (Addgene, Plasmid #98 750). Px330‐mCherry‐sgRNA was transfected into E14TG2a by Lipofectamine 3000 Transfection Reagent (Invitrogen, L3000008) following the manufacturer's instructions. The mCherry positive cells were isolated and collected by BD FACS Aria SORP for further culture for 4 to 5 days until clone formation. Single clones were picked up manually for genotyping and sanger sequencing. The potential off‐target sites were tested based on the Chop‐chop website's indication. Primers used for genotyping and sgRNA sequences targeting p‐Enh and Cdx2 were listed in Table [334]S4 (Supporting Information). For the generation of p‐Enh‐KO mice, we employed the same sgRNAs utilized for p‐Enh knockout in mESCs. Cas9 mRNA and sgRNAs with scaffolds were obtained through in vitro transcription using the MEGAscript Kit (Invitrogen). Subsequently, these Cas9 mRNAs and sgRNAs were microinjected into the cytoplasm of fertilized ova utilizing the FemtoJet Microinjection System (Eppendorf). Following microinjection, the embryos were subjected to culture under standard culture conditions (37 °C, 5% CO[2]) in KSOM medium until reaching the 2‐cell stage. These 2‐cell embryos were then surgically transferred to the oviducts of pseudo‐pregnant females. Upon reaching adulthood, the resulting offspring were subjected to genotyping analysis. RNA Extraction and Quantitative PCR (qPCR) and RNA‐seq Library Preparation and Sequencing For the in vitro culture cells and embryo samples, total RNA was extracted using TRIzol reagent (Invitrogen, 15596018). cDNA was reverse‐transcribed with FastQuant RT Super Mix (Tiangen, KR108), following the manufacturer's instructions. Quantitative PCR analysis of mRNA levels for different markers was performed using diluted acquired cDNA with Stormstar SYBR green qPCR master mix (DBI Bioscience, DBI‐143) with specific primers, and all primers used in qPCR analysis were listed in Table [335]S4 (Supporting Information). RNA‐seq libraries were prepared from 300 ng pure total RNA using NEBNext Ultra II RNA Library Prep kit (NEB, E7775L). Two highly reproducible biological replicates (each with two technical replicates) were performed. Pair‐end 150 bp sequencing was performed on NovaSeq platform. Immunofluorescence For the cultured cells, protocol for immunofluorescence was performed as described previously.^[ [336]^84 ^] Briefly, cultured cells were washed twice with PBS to remove detached cells, and fixed with 4% paraformaldehyde in PBS (pH 7.3) for ≈ 30 min at room temperature, followed by permeabilized and blocked with blocking buffer (0.5% Triton X‐100/5%BSA in PBS) for 1 h. Next, primary antibodies targeting NANOG (Abcam, ab80892. 1:200) and OCT4 (Santa Cruz, sc‐5279. 1:200) were diluted in blocking buffer and incubated with samples overnight at 4 °C. Then, samples were washed three times with PBS containing 0.3% Triton‐X100 (PBS‐Tri), followed by incubation with secondary antibodies (1:400 diluted) in PBS‐Tri for 2 h at room temperature. After wash steps and DAPI staining, samples were imaged using Leica TCS SP8 STED confocal microscope. For the gastruloids and embryo samples, immunolabeling‐enabled 3D imaging of solvent‐cleared organs (iDISCO) was used for the whole‐mount immunolabeling as previously reported.^[ [337]^85 ^] In brief, samples were dehydrated with methanol, bleach in chilled fresh 5% H[2]O[2] in methanol overnight at 4 °C. After washing and rehydration, samples were incubated in blocking buffer which containing 0.2% TritonX‐100/20% DMSO/0.3 M glycine in PBS at 37 °C overnight. Samples were washed in PBS/0.2% Tween‐20 with 10 µg mL^−1 heparin (PTwH) for 1 h twice. Next, the primary antibodies (CDX2 (Abcam, ab76541), SOX2 (Abcam, ab97959), MyHC (R&D Systems, MAB4470)) targeting specific protein were diluted in PTwH/5% DMSO/3% donkey serum (1:200) and incubated with samples at 37 °C overnight. Then, samples were washed with PTwH for 1 day and incubated with specific secondary antibodies (1:400 diluted). Finally, after washing steps and nuclear labeling, solvent‐based tissue clearing was carried out which involving dichloromethane (Sigma, 270997‐12×100ML) and dibenzyl ether (DBE) (Sigma, 108014‐1KG). OLYMPUS IXplore SpinSR and LiTone XL Light‐sheet Microscope were used for the imaging. RNAscope Fresh mouse embryo samples were fixed in 4% PFA/PBS at 4 °C overnight. The fixed samples were further embedded in OCT and cryo‐sectioned at 20 µm thickness. The RNAscope procedure was then performed using RNAscope Multiplex Fluorescent Reagent Kit v2 (Advanced Cell Diagnostics, 323100) according to the RNAscope Multiplex Fluorescent Reagent Kit v2 Assay User Manual (Document Number 323100‐USM). Probes targeting from Advanced Cell Diagnostics Uncx (Cat No. 521431), Tbx6 (Cat No. 498251‐C2), p‐Enh‐eRNA (Cat No. 1199081‐C2), p‐Enh‐eRNA (antisense) (Cat No. 1198881), Cdx2 (Cat No. 438921‐C3) were used. For the simultaneous detection of RNA and protein in embryo samples, the procedural guidelines outlined in the RNAscope Multiplex Fluorescent v2 Assay were followed and combined with Immunofluorescence – Integrated Co‐Detection Workflow (ICW) (Document Number MK 51–150). The sample pre‐treatment step using protease digestion (as applied in RNA‐only detection) was replaced with 1% Triton X‐100 treatment to preserve both RNA and protein signals. SMAD4 polyclonal antibody (Invitrogen, PA5‐34806, diluted at 1:100), Phospho‐Smad1/5 (Ser463/465) (41D10) (Cell Signaling Technology, 9516T, diluted at 1:200) were used. Images were acquired using OLYMPUS IXplore SpinSR, Zeiss LSM 980 confocal microscope, and LiTone XL Light‐sheet Microscope. Western Blotting The cultured cells were washed 3 times with PBS, lysed with lysis buffer and heated at 100 °C for 10 mins. Protein solution was run on SDS‐PAGE and the gels were transferred to nitrocellulose membranes (GE). After blocking in 5% skimmed milk at room temperature for 1 h, membranes were then incubated with primary antibodies targeting CDX2 (Abcam, ab76541) overnight at 4 °C. Next, membranes were washed 3 times with TBST and incubated with HRP‐conjugated secondary antibody at room temperature for 1 h. immunoreactive bands were visualized with SperSignal West Pico Chemiluminescent Substrate (Thermo Scientific, 34579) and detected using Tanon‐5500 imaging system. Automated western blotting analysis were performed on WES (ProteinSimple) system using a Size Separation Master Kit with Split Buffer (12–230 kDa) in accordance with the manufacturer's instructions. Capillary‐based immunoassays were performed using the Wes Simple Western method (SMAD4, Invitrogen, PA5‐34806. diluted at 1:50) and an anti‐rabbit detection module (ProteinSimple). Compass for Simple Western program (ProteinSimple) was used for further illustration. eRNA Overexpression To achieve stable nuclear expression of p‐Enh‐eRNA, we employed the previously reported sno‐vector system.^[ [338]^71 , [339]^72 ^] In brief, p‐Enh‐eRNA sequence was cloned into pZW1‐snoVector with flanking SNORD116‐13 and SNORD116‐14. Subsequently, mESCs were transfected with the pZW1‐snoVector construct and selected using 1 mg mL^−1 G418 for Kan/Neo resistance. Chromatin Immunoprecipitation (ChIP) and ChIP‐seq Libraries Preparation The protocol for chromatin immunoprecipitation (ChIP) was described previously.^[ [340]^18 ^] Cross‐linked cells were lysed in Solution I (10 mm HEPES, pH 7.9, 0.5% NP40, 1.5 mm MgCl2, 10 mm KCl, 0.5 mm DTT) and SDS lysis buffer (1% SDS, 10 mm EDTA, 50 mm Tris‐HCl, pH 8.0), and the samples were sheared for 14 cycles (30 s on/off) in a Bioruptor Pico (Diagenode, Belgium) to achieve an average fragment size of 200–300 bp. Solubilized fragmented chromatin was immunoprecipitated with antibody against H3K27ac (Active Motif, 39133) or SMAD4 polyclonal antibody (Invitrogen, PA5‐34806). Antibody‐chromatin complexes were pulled down using Dynabeads Protein G (Invitrogen, 10004D) on a magnetic separation rack, washed several times, and then eluted from the magnetic beads. Reverse crosslink was performed subsequently under 65 °C for at least 4 h. ChIP‐DNA was treated with RNase A and Proteinase K, and precipitated with ethanol. ChIP‐DNA was finally solved in nuclease‐free water and quantified using Qubit. Sequencing libraries were generated by using NEBNext Ultra DNA library preparation kit (NEB, E7370). Libraries were quality‐controlled and quantified using a Qubit 2.0 Fluorometer (Life Technologies) and Agilent 2100 Bioanalyzer (Agilent Technologies). High‐throughput sequencing was performed on a NovaSeq instrument. Circular Chromosome Conformation Capture (4C) Libraries Construction The protocol for 4C and 4C‐seq libraries construction were reported previously with slight modifications.^[ [341]^86 ^] Briefly, ≈5 million NMP cells were cross‐linked with 1% formaldehyde/PBS for 12 min, followed by quenching with 0.125 M glycine for 5 min at room temperature. The cells were then centrifuged at 500 g at 4 °C for 5 min. After removing the supernatant, the cell pellets were frozen with liquid nitrogen and stored at −80 °C. The next day, the pellets were resuspended in 1 mL cold lysis buffer (50 mm Tris–HCl pH 7.5, 150 mm NaCl, 5 mM EDTA, 0.5% NP‐40, 1% Triton X‐100) containing protease inhibitors and incubated on ice for 30 min. After a 5 min centrifugation at 2,500 g at 4 °C, the supernatant was carefully removed. The nuclei pellets were then resuspended in 50 µL 0.5% SDS, incubated on ice for 30 min, heated up to 62 °C, and incubated for 10 min. Subsequently, 160 µL 1.56% Triton X‐100 was added to the reaction tube, and the mixture was incubated at 37 °C for 30 min. For the first round of digestion, 200 U Dpn II was used, and the reaction was incubated at 37 °C for 3 h. Following complete digestion, the reaction tube was incubated at 65 °C for 20 min. After heat‐inactivation, the ligation process was carried out with the T4 DNA ligase system at 37 °C for 4 h. Subsequently, reverse cross‐linking and precipitation were performed, and the DNA was dissolved in 450 µL H2O. In the next step, the second‐round digestion with Csp6l was carried out at 37 °C for 3 h. After heat‐inactivation and ligation, DNA was purified with AMPure beads (Beckman Coulter, A63881). Finally, the DNA product was amplified with reading primer and reverse primer, and Illumina sequencing indexes were added following the manufacturer's instructions. High‐throughput sequencing was performed on a NovaSeq instrument. RNA Pull‐Down Assay The RNA pull‐down assay was performed as previously described with minor modifications.^[ [342]^87 , [343]^88 ^] Briefly, a set of biotinylated antisense DNA probes targeting the p‐Enh region were synthesized with a Biotin‐TEG modification at the 3′ end (probe sequences listed in Table [344]S4, Supporting Information). NMP cells were subjected to UV crosslinking using a UV crosslinker (Analytik Jena, CL‐3000) at 254 nm, with an energy setting of 120 mJoules cm^− ^2, repeated three times to ensure efficient crosslinking. Cells were then pelleted by centrifugation, washed with ice‐cold PBS, snap‐frozen in liquid nitrogen, and stored at −80 °C until further use. Approximately 20–30 million cells were used for one reaction. For lysis, the frozen cell pellets were resuspended in lysis buffer (50 mm Tris‐Cl pH 7.0, 10 mm EDTA, 1% SDS) supplemented with protease inhibitors, PMSF, and Superase‐In (Invitrogen), followed by sonication using a Bioruptor Pico (Diagenode, Belgium). The lysates were centrifuged at 14000 × g for 10 min at 4 °C, and the supernatant was collected for hybridization. A small aliquot of the supernatant was retained as the total nuclear lysate (TNL) control. For hybridization, the lysates were incubated with biotinylated probes (1 µL of 100 µM probes mix per 1 mL of lysate), supplemented with one‐fourth volume of 5× hybridization buffer (50 mm Tris‐HCl pH 7.5, 10 mm EDTA, 1.5 m NaCl, 50% formamide). The mixture was rotated at 39 °C for 3 h, followed by incubation with 100 µL of pre‐washed streptavidin magnetic beads (Dynabeads Streptavidin C1, Invitrogen, 65001) for an additional 3 h at the same temperature. Beads were then washed five times with 2× SSC containing 0.5% SDS at 37 °C to remove nonspecifically bound material. Protein complexes bound to the probes were eluted by incubating the beads in 100 µL biotin elution buffer (12.5 mm D‐biotin, 7.5 mm HEPES pH 7.5, 75 mm NaCl, 1.5 mm EDTA, 0.15% SDS, 0.075% sarkosyl, 0.02% sodium deoxycholate) with rotation at room temperature for 20 min, followed by heating at 65 °C for 10 min. The supernatant was transferred to a new tube, and the beads were eluted once more with an additional 100 µL of elution buffer. The two eluates were combined, and proteins were precipitated by adding four volumes of pre‐chilled acetone and incubating at −80 °C overnight. The precipitated proteins were collected by centrifugation at 15000 × g for 30 min at 4 °C and used for subsequent Western blot analysis. Quantification and Statistical Analysis–Enhancer Screening Based on H3K27ac ChIP‐seq Data—Data Processing H3K27ac ChIP‐seq data for E7.0 and E7.5 mouse embryo posterior tissues were obtained from the NCBI Gene Expression Omnibus (GEO) under accession numbers [345]GSE98101.^[ [346]^18 ^] Additionally, H3K27ac ChIP‐seq data for E9.5 mouse embryo tailbud were obtained from [347]GSE84899.^[ [348]^31 ^] Processing of all ChIP‐seq samples started with raw sequencing reads. First, Trim Galore (version 0.4.4_dev) was used to remove adapter and low‐quality sequences by trimming 3′ ends of reads.^[ [349]^89 ^] The resulting reads were then aligned to mouse reference genome mm10 using Bowtie (version 1.2.2) with the parameters “–chunkmbs = 512 ‐I = 0 ‐X = 1000 –best ‐m = 1”.^[ [350]^90 ^] After removing duplicates, we performed peak calling using MACS (version 1.4.2) with the parameters “–shiftsize = 100 –nomodel –keep‐dup = all”.^[ [351]^91 ^] Subsequently, we merged all resulting peaks of samples into a consensus list of genomic regions and counted the reads within those regions using MAnorm2_utils (version 1.0.0) with the parameters “–min‐peak‐gap = 150 –typical‐bin‐size = 2000 –shiftsize = 100 –filter = blacklist”.^[ [352]^30 ^] The blacklist regions of mm10 were obtained from Amemiya et al.^[ [353]^92 ^] Screening Distal Regulatory Elements Activated at E7.5 Mouse Embryo Posterior Tissues We used MAnorm2 (version 1.2.2) to identify differential H3K27ac regions between E7.0 and E7.5 embryo posterior tissue samples with the cutoffs p‐value < 0.01 and fold change > 2.^[ [354]^30 ^] Distal regulatory elements activated in E7.5 mouse embryo posterior tissues were defined as the differential H3K27ac regions which both exhibited increased H3K27ac levels in E7.5 samples and located greater than 1.5 kb from transcription start sites (TSSs) of genes. The top‐30 most significant elements are shown in Figure [355]1B. The gene annotation file of mm10 was obtained from the GENCODE project.^[ [356]^93 ^] TF Motif Enrichment Analysis Position Frequency Matrices (PFMs) of vertebrate transcription factors were acquired from the JASPAR CORE database^[ [357]^94 ^] (JASPAR2024_CORE_vertebrates_non‐redundant_pfms_jaspar.txt). For TF motif enrichment analysis, we employed two tools: FIMO^[ [358]^95 ^] (version 5.5.5) and MotifScan^[ [359]^96 ^] (version 1.3.0). FIMO identified 189 significantly enriched TF motifs with the significance threshold p‐value < 0.0001. Motifscan detected 100 TF motifs in the p‐Enh regions under the threshold p‐value < 0.0001. Single Cell RNA Sequencing Data Analysis—Data Processing For each dataset, raw sequencing data was preprocessed using cellranger (10x Genomics Cell Ranger 5.0.0) with default settings, including alignment to mouse reference genome mm10, filtering, barcode counting and unique molecular identifier (UMI) counting.^[ [360]^97 ^] The output files (matrix.tsv.gz, barcordes.tsv.gz and features.tsv.gz) generated by cellranger were subsequently used for downstream analysis. Cell doublets were estimated by scrublet pipeline (version 0.2.3) and discarded.^[ [361]^98 ^] Cells with fewer than 1000 detectable genes or with higher than 5% UMIs from mitochondrial genes were excluded. Genes detected in more than 1% total number of cells were retained for following analysis. The detailed quality control information was listed in Table [362]S2 (Supporting Information). Construction of Mouse Embryo Single‐Cell Transcriptome Reference Map scRNA‐seq datasets were utilized from E8.5, E9.5, and E10.5 WT embryos to establish a reference map with the R package Seurat (version 4.0.1).^[ [363]^99 ^] First, we normalized gene expression using the NormalizeData function and identified variable genes using the FindVariableFeatures function for each dataset. Specifically, gene count matrices were normalized by the total number of UMIs per cell and then multiplied by a scale factor (10000), followed by log‐transformation. After using variance‐stabilizing transformation (VST), top 2000 genes sorted by variance were selected as variable genes. Next, the datasets were integrated by identifying cell pairwise correspondences, referred to as “anchors”, between datasets based on the consistently variable genes using the FindIntegrationAnchors function.^[ [364]^100 ^] Batch correction was performed with the identified anchors using the IntegrateData function.^[ [365]^100 ^] Subsequently, principal component analysis was performed using the RunPCA function and top 30 principal components were chosen for cell clustering. Cell clusters were identified using the FindClusters function with a resolution parameter of 0.5, which employed a K‐nearest neighbor (KNN) graph‐based clustering approach to iteratively group cells together.^[ [366]^101 ^] For visualization, UMAP dimensionality reduction was performed with top 30 principal components using the RunUMAP function.^[ [367]^102 ^] For cell type annotation, we detected marker genes of cell clusters by comparing each cell cluster with all other cell clusters using Wilcoxon rank sum test, with fraction of expressed cells > 0.25 in ether of the two cell populations, adjusted p‐value < 0.05 and log2(fold change) > 0.25. Based on the marker genes of cell clusters, 22 major cell types were identified. Subsequently, the marker genes of cell types were detected again using the same method as described above. The complete list of marker genes for all cell types can be found in Table [368]S3 (Supporting Information). Projecting p‐Enh‐KO Embryo Cells onto Reference Map p‐Enh‐KO cells were projected onto reference map by using the Seurat package. For scRNA‐seq datasets from E8.5, E9.5, and E10.5 p‐Enh KO embryos, gene expression normalization and variable gene identification were first performed using NormalizeData and FindVariableFeatures functions respectively. Next, the anchors between p‐Enh‐KO datasets and the reference map were identified using the FindTransferAnchors function. Finally, p‐Enh‐KO cells were classified and annotated using the TransferData function, as well as projected onto the reference UMAP using the MapQuery function. Differential Expression Analysis Between WT and KO Embryos Comparisons between WT and p‐Enh‐KO were done for each cell type and at each stage. By using Wilcoxon rank sum test, differential expression genes (DEGs) were detected with fraction of expressed cells > 0.25 in ether WT or p‐Enh‐KO cells, adjusted p‐value < 0.05 and log2(fold change) > 0.25. Enrichment analysis between DEGs and the top 100 significant marker genes of cell types were performed using Fisher's exact test. Calculating Module Scores of a Gene Set at Single Cell Levels For a gene set of interest, the module scores in single cells were computed using the AddModuleScore function of the R package Seurat. Specially, all expressed genes were first divided into bins based on their averaged expression across all cells, and then control genes were randomly selected from each bin. Module scores were calculated as the difference between the average expression of the gene set of interest and the average expression of the control gene set.^[ [369]^103 ^] Analysis of Anterior‐Posterior Axis Pattering and Limb Development For analysis of anterior‐posterior axis pattering, the cells of Forebrain/Midbrain/Hindbrain, NMP, Neural tube, Paraxial mesoderm‐1, Paraxial mesoderm‐2, and presomitic mesoderm were selected for reclustering. For analysis of limb development, the cells of Limb mesenchyme and Mesenchyme were selected for reclustering. Consistent with the previous analysis strategy, we first constructed a WT reference and then projected cells of corresponding cell types from p‐Enh‐KO embryos onto the reference. For RNA velocity analysis, the count matrices of pre‐mature and mature mRNA abundances were first generated by velocyto (version 0.17.17) counting pipeline, and then used as input for scVelo (version 0.2.4) to estimate RNA velocity.^[ [370]^104 , [371]^105 ^] Trajectory analysis was performed using monocle3 (version 1.0.0) and pseudotime of cells was calculated based on the trajectory.^[ [372]^46 ^] The density plots of WT or p‐Enh‐KO cells along the pseudotime axis were implemented using the function kdeplot function of the python seaborn package, where a Gaussian kernel was used to produce a continuous density estimate and the density was scaled by the number of observations so as to make the total area under the density equal to 1. For posterior neural tube development and the posterior mesodermal lineages development, we selected the top 500 most variable genes, detected by using the FindVariableFeatures function of the Seurat package, as trajectory‐based expressed genes. Gastruloids Transcriptome Data Processing Raw sequencing reads were first trimmed by Trim Galore (version 0.4.4_dev) and then aligned to mm10 reference genome by STAR (version 2.5.2b) with default parameters.^[ [373]^106 ^] After removal of duplicates, featureCounts (version 1.6.5) was used to count reads for genes.^[ [374]^107 ^] The gene annotation file (gencode.vM10.annotation.gtf) was obtained from the GENCODE project. Batch effects were corrected using the ComBat function of the sva package, where the first batch was selected as the reference batch.^[ [375]^108 , [376]^109 ^] Differential expression analysis was performed using the R package DESeq2 (version 1.40.2) and significantly differential expression genes (DEGs) were defined as the genes with adjusted p‐value less than 0.01 and fold change greater than 2.^[ [377]^110 ^] GO terms enriched by DEGs was identified using the R package clusterProfiler (version 4.8.2).^[ [378]^111 ^] The table containing the list of genes specifically regulated by CDX2 or p‐Enh is provided in Table [379]S5 (Supporting Information). H3K27ac ChIP‐seq Data Processing Paired‐end ChIP‐seq reads were trimmed by TrimGalore v0.6.6 with parameters ’‐q 20′, then aligned to the mouse genome assembly (mm10) with Bowtie 2 v2.2.5^[ [380]^112 ^] and sorted with SAMtools v1.13.^[ [381]^113 ^] All reads from mitochondria and chromosome Y were removed. Duplicated reads and reads with mapping quality below 30 were removed. Peaks were called by MACS2 v2.2.7.1^[ [382]^91 ^] with parameters ‘–extsize 200 –nomodel –nolambda –keep‐dup all ‐p 0.01 ‐g mm’. Differential peak analysis was performed with the DESeq2 method in DiffBind 3.8.4.^[ [383]^114 ^] Significant differential peaks were defined with FDR <0.05 and log2(FoldChange)≥1. Bigwig files were normalized in RPKM under 50 bp bin and generated with deepTools v3.5.1 bamCoverage.^[ [384]^115 ^] Annotations of peaks were performed with R package ChIPSeeker v1.5.1.^[ [385]^116 ^] Table containing detailed information about differential H3K27ac ChIP‐seq peaks is provided in Table [386]S6 (Supporting Information). TFs ChIP‐seq Data Processing TFs ChIP‐seq includes SMAD2/3/4, CDX2 from different public datasets with paired‐end or single‐end sequencing. Paired‐end and single‐end reads were trimmed by TrimGalore v0.6.6 with parameters ‘‐q 20′ relatively, then aligned to the mouse genome assembly (mm10) with Bowtie 2 v2.2.5^[ [387]^112 ^] and sorted with SAMtools v1.13.^[ [388]^113 ^] All reads from mitochondria and chromosome Y were removed. Duplicated reads and reads with mapping quality below 30 were removed. Peaks were called by MACS2 v2.2.7.1^[ [389]^91 ^] with parameters ‘–extsize 200 –nomodel –keep‐dup all ‐p 1e‐5 ‐f BAM ‐g mm’. Bigwig files were normalized in RPKM under 50 bp/10 bp bin and generated with deepTools v3.5.1^[ [390]^115 ^] bamCoverage. 4C‐seq Data Processing and Peak Calling Raw sequencing data was processed using pipe4C^[ [391]^86 ^] (version 1.0.0). Subsequently, all resulting RDS files were used to identify reproducible peaks using the PeakC^[ [392]^117 ^] R package (version 0.2). Significant peaks were identified based on a p‐value threshold of p < 0.05. Signaling Pathway Enrichment Analysis GEO‐seq data for E7.5 mouse embryos were obtained from the NCBI GEO under accession numbers [393]GSE120963.^[ [394]^9 ^] Processing of GEO‐seq data samples started with raw sequencing reads. Quality control metrics such as sequence quality scores, GC content, and adapter content were assessed using FastQC (v0.11.9). Low‐quality reads and adapters were trimmed or removed. Then, the high‐quality reads were aligned to a reference genome (mm10) with Hisat2 (v2.2.1).^[ [395]^118 ^] Gene expression quantification was performed using featureCounts (v1.5.3).^[ [396]^107 ^] The count matrix generated representing the number of reads mapped to each gene in each sample. The count matrix was normalized to account for differences in sequencing depth and gene length as Transcripts Per Kilobase of exon model per Million mapped reads (TPM). To evaluate the enrichment of genes response to TGF‐β, AUCell (v1.20.2) was used to calculate the gene‐set activity score (gene set from GESA: MM9770).^[ [397]^119 ^] The score was calculated by the top 10% genes of all genes in expression matrix. Statistical Analysis Analysis of H3K27ac ChIP‐seq data for E7.0 and E7.5 mouse embryo posterior tissues, each condition with two biological replicates, was performed by MAnorm2. For normalization of ChIP‐seq signals, MAnorm2 first identified common peaks between samples as baseline regions. These common peaks are then subjected to MA‐plot (log‐ratio versus average signal intensity), where a linear regression model is fitted to calculate sample‐specific scaling factors. Finally, these scaling factors are applied to adjust the ChIP‐seq signals across all peaks, including sample‐specific peaks. For differential analysis, MAnorm2 employs a modified t‐test based on both the empirical Bayes framework to improve variance estimation for small sample sizes and the mean‐variance modeling to stabilize variance estimates by borrowing information across genomic regions. p‐values were corrected for multiple testing by using Benjamini‐Hochberg (BH) method. Single cell RNA sequencing data analysis was performed by Seurat. Differential gene expression analysis was performed by using the FindMarkers function in Seurat tested by two‐sided Wilcoxon rank sum test followed by BH correction. Analysis of gastruloids transcriptome data, each condition with three biological replicates, was performed by DESeq2. For normalization, DESeq2 calculated sample‐specific size factors as the median of gene expression ratios to correct for sequencing depth variations. For differential expression analysis, DESeq2 employs a negative binomial generalized linear model (NB‐GLM) framework, which involves three key computational steps: 1) estimation of gene‐specific dispersion parameters with empirical Bayes shrinkage to account for biological variability, 2) fitting of the generalized linear model to quantify expression changes, and 3) statistical hypothesis testing using Wald test to assess significant differences between treatment and control groups. p‐values were corrected for multiple testing by using Benjamini‐Hochberg (BH) method. GO enrichment analysis was performed by clusterProfiler where statistical significance is tested by one‐sided hypergeometric test followed by Benjamini‐Hochberg (BH) correction. For the ChIP‐seq data, the differential peak analysis was performed with the DESeq2 method in DiffBind 3.8.4.^[ [398]^114 ^] Significant differential peaks were defined with FDR <0.05 and log2(FoldChange) ≥ 1. For the 4C‐seq data analysis, significant peaks were identified based on a p‐value threshold of p < 0.05 in the PeakC^[ [399]^117 ^] R package (version 0.2). For comparisons involving three or more groups, such as gastruloid morphological measurements and qPCR‐based gene expression analyses, statistical significance was assessed using one‐way ANOVA. For ChIP‐qPCR data comparing two groups, unpaired t‐tests were performed. Statistical analyses were conducted using GraphPad Prism. Bar plots represent mean ± standard deviation (SD), and sample sizes (n) are indicated in the corresponding figure legends. Detailed statistical tests and exact p‐values are labeled in the figures. Conflict of Interest The authors declare no conflict of interest. Author Contributions Y.C., F.T., Q.F., and L.Z. contributed equally to this work. X.Y., N.J., and Y.C. performed Conceptualization: Y.C., F.T., X.Y., L.Z., Q.F., J.L., P.S., M.W., Y.Q., R.S., Y.F., H.J.X., and R.W. performed Investigation: Y.C., X.Y., F.T., Q.F., R.S., and Y.F. performed visualization: X.Y. and N.J. performed funding acquisition: X.Y., N.J., J.L., Z.S., and C.L. supervision: Y.C., X.Y., F.T., Q.F., and N.J. wrote original draft: X.Y., N.J., Y.C., F.T., Q.F., and C.L. wrote, reviewed, and edited. Data and Materials Availability All raw and processed sequence data reported in this paper have been deposited in the Genome Sequence Archive^[ [400]^120 ^] in National Genomics Data Center,^[ [401]^121 ^] China National Center for Bioinformation / Beijing Institute of Genomics, Chinese Academy of Sciences (GSA: CRA014616) that are publicly accessible at https://ngdc.cncb.ac.cn/gsa. Supporting information Supporting Information [402]ADVS-12-e00895-s006.pdf^ (37.4MB, pdf) Supporting Information [403]ADVS-12-e00895-s003.xlsx^ (21KB, xlsx) Supplemental Table 1 [404]ADVS-12-e00895-s004.xlsx^ (11.4KB, xlsx) Supplemental Table 2 [405]ADVS-12-e00895-s002.xlsx^ (36.3KB, xlsx) Supplemental Table 3 [406]ADVS-12-e00895-s005.xlsx^ (13.9KB, xlsx) Supplemental Table 4 [407]ADVS-12-e00895-s007.xlsx^ (20KB, xlsx) Supplemental Table 5 [408]ADVS-12-e00895-s001.xlsx^ (195.7KB, xlsx) Acknowledgements