Abstract Polycomb repressive complexes (PRCs) are important histone modifiers, which silence gene expression; yet, there exists a subset of PRC-bound genes actively transcribed by RNA polymerase II (RNAPII). It is likely that the role of Polycomb repressive complex is to dampen expression of these PRC-active genes. However, it is unclear how this flipping between chromatin states alters the kinetics of transcription. Here, we integrate histone modifications and RNAPII states derived from bulk ChIP-seq data with single-cell RNA-sequencing data. We find that Polycomb repressive complex-active genes have greater cell-to-cell variation in expression than active genes, and these results are validated by knockout experiments. We also show that PRC-active genes are clustered on chromosomes in both two and three dimensions, and interactions with active enhancers promote a stabilization of gene expression noise. These findings provide new insights into how chromatin regulation modulates stochastic gene expression and transcriptional bursting, with implications for regulation of pluripotency and development. __________________________________________________________________ Polycomb repressive complexes modify histones but it is unclear how changes in chromatin states alter kinetics of transcription. Here, the authors use single-cell RNAseq and ChIPseq to find that actively transcribed genes with Polycomb marks have greater cell-to-cell variation in expression. Introduction Embryonic stem cells (ESCs) are capable of self-renewing and differentiating into all somatic cell types^[42]1, [43]2, and their homeostasis is maintained by epigenetic regulators^[44]3. In this context, Polycomb repressive complexes (PRCs) are important histone modifiers, which play a fundamental role in maintaining the pluripotent state of ESCs by silencing important developmental regulators^[45]4. There are two major PRCs: PRC1, which monoubiquitinylates histone 2 A lysine 119 (H2Aub1) via the ubiquitin ligase RING1A/B; and PRC2, which catalyzes dimethylation and trimethylation of H3K27 (H3K27me2/3) via the histone methyltransferase (HMT) EZH1/2. Recently, we discovered that a group of important signaling genes coexists in active and Polycomb-repressed states in mouse ESCs (mESCs)^[46]5. During the transcription cycle, recruitment of histone modifiers or RNA-processing factors is achieved through changing patterns of post-translational modifications of the carboxy-terminal domain of RNAPII^[47]6. Phosphorylation of S5 residues (S5p) correlates with initiation, capping, and H3K4 HMT recruitment. S2 phosphorylation (S2p) correlates with elongation, splicing, polyadenylation, and H3K36 HMT recruitment. Phosphorylation of RNAPII on S5, but not on S2, is associated with Polycomb repression and poised transcription factories, while active factories are associated with phosphorylation on both residues^[48]5, [49]7, [50]8. S7 phosphorylation (S7p) marks the transition between S5p and S2p^[51]9, but its mechanistic role is unclear presently. Our genome-wide analyses of RNAPII and Polycomb occupancy in mESCs identified two major groups of PRC targets: (1) repressed genes associated with PRCs and unproductive RNAPII (phosphorylated at S5 but lacking S2p; PRC-repressed) and (2) expressed genes bound by PRCs and active RNAPII (both S5p and S2p; PRC-active)^[52]5. Both types of genes are marked by H3K4me3 and H3K27me3, a state termed bivalency^[53]1, [54]10. H3K4me3 correlates tightly with RNAPII-S5p^[55]5, a mark that does not distinguish PRC-active and Polycomb-repressed states. The role of PRCs in modulating the expression of PRC-active genes was shown by PRC1 conditional knockout (KO). Sequential ChIP and single-cell imaging showed mutual exclusion of S2p and PRCs at PRC-active genes^[56]5, although PRCs were found to co-associate with S5p. This indicates that PRC-active genes acquire separate active and PRC-repressed chromatin states. It remains unclear whether these two states occur in different cells within a cell population, or within different alleles in the same cell^[57]5. This pattern of two distinct chromatin states could imply a digital switch between actively transcribing and repressed promoters within a population of cells, thereby introducing more cell-to-cell variation in gene expression compared to genes with both alleles in active chromatin states. Motivated by this hypothesis, here, we integrate states of histone and RNAPII modification from a published classification of ChIP-seq data^[58]5 with single-cell RNA-sequencing (RNA-seq) data generated for this analysis. The matched chromatin and scRNA-seq data sets allow us to decipher, on a genome-wide scale, how differences in the chromatin state can affect transcriptional kinetics. A schematic overview of our analysis strategy is shown in Fig. [59]1. We focus on active PRC-target genes that are marked by PRCs (H3K27me3 modification or both H3K27me3 and H2Aub1) and active RNAPII (S5pS7pS2p), and compare these with “active” genes (marked by S5p, S7p, S2p without H3K27me3 and H2Aub1 marks). We quantify variation in gene expression and transcriptional kinetics statistically and by mathematical modeling (Fig. [60]1). In addition, we map the functions of PRC-active genes in the context of pluripotency signaling and homeostasis networks. Further, we analyze the linear ordering and three-dimensional contacts of PRC-active genes on the mouse chromosomes. Finally, we investigate the effect of Polycomb on regulating transcriptional heterogeneity by deletion of Ring1A/B, followed by single-cell profiling. Fig. 1. Fig. 1 [61]Open in a new tab Summary of methodology. OS25 mESCs were cultured and characterized by single-cell RNA-seq using the Fluidigm C1 system, applying the SMARTer kit to obtain cDNA and the Nextera XT kit for Illumina library preparation. OS25 cells are grown in conditions that select for undifferentiated cells (high Oct4-expressing). Libraries from 96 cells were pooled and sequenced on four lanes of a HiSeq. After quality-control analysis of cells, 90 cells out of 96 remained for further analysis. We first unraveled contributions of components of gene expression variation using the scLVM method^[62]13. Removing cell cycle variation and technical noise allowed us to focus on stochastic gene expression. Gene expression variation can be quantified by CV^2 or DM, which is a measure of noise independent of gene expression levels and gene length. To explore the transcriptional kinetics of OS25 ES cells, poisson-beta model^[63]16 was fitted to single-cell gene expression data, leading to estimates of burst frequency and size. Next, histone and RNAPII promoter modifications were obtained from Brookes et al.^[64]5 and integrated with single-cell RNA-seq to investigate relationship between stochastic gene expression and epigenetics. Active genes with no PRC marks are usually in the “on” state with high burst frequencies (k [on]), PRCr genes are mostly “off” and PRC-active genes switch between “on” and “off” states very frequently. Considering the allele-level possibilities, at active genes, both alleles would be in an actively transcribing state. For PRCa genes, both alleles would be in an actively transcribing state, or both alleles would be in a silent PRC-marked state, or only one allele is in PRC-marked state, which, subsequently, would result in noisier gene expression. For PRC-repressed genes, both alleles are expected to be in a silent PRC-marked state Results Single-cell RNA-seq and data processing To investigate how Polycomb repression relates to stochasticity in gene expression, we profiled single-cell transcriptomes of mouse OS25 ESCs cultured in serum and leukemia-inhibitory factor (LIF), previously used to map RNAPII phosphorylation and H2Aub1^[65]5. Single-cell RNA-seq was performed using the Fluidigm C1 system, applying the SMARTer kit to obtain cDNA and the Nextera XT kit for Illumina library preparation. Libraries from 96 cells were pooled and sequenced on four lanes of an Illumina HiSeq2000 (Fig. [66]1; please refer to Methods for details). Next, we performed quality-control analysis for each individual cell data set and removed poor-quality data based on two criteria (as described before in ref. ^[67]11). Cells were removed if: (1) the total number of reads mapping to exons for the cell was lower than half a million and (2) the percentage of reads mapping to mitochondrial-encoded RNAs was higher than 10%. We also compared normalized read counts of genes between cells and found many genes abnormally amplified for three cells. Therefore, we removed these cells, resulting in 90 cells that could be used for further analysis. For these 90 cells, over 80% of reads were mapped to the Mus musculus genome (GRCm38) and over 60% to exons (Supplementary Fig. [68]1A–C). OS25 ES cells are grown under Oct4 selection and do not express early-differentiation markers such as Gata4 and Gata6^[69]5, having the expected features of pluripotency. They are ideal for studying Polycomb repression and its impact on transcriptional cell-to-cell variation as compared to other culture conditions such as 2i (serum-free). ESCs grown in 2i show decreased Polycomb repression and RNAPII poising at well-characterized early-developmental genes^[70]12, therefore making 2i conditions the least ideal conditions to study mechanisms of Polycomb regulation in the pluripotent state. As previously shown^[71]5, we do not observe distinct subpopulations of cells based on key pluripotency factors and differentiation markers in our OS25 single-cell data sets (Supplementary Fig. [72]1D). In addition, we compared single-cell expression profiles of the OS25 ESCs grown under Oct4 with recently published scRNAseq data sets from mESCs cultured in serum + LIF and 2i^[73]11; Principal component analysis using pluripotency genes and differentiation markers shows that OS25 cells are more similar to the subpopulation of pluripotent serum cells, rather than the subpopulation of serum cells that are either “primed for differentiation” or are “on the differentiation path” (Supplementary Fig. [74]1E). Defining chromatin state and gene expression noise We integrated our new single-cell RNA-seq data with a previous classification of gene promoters according to the presence of histone and RNAPII modifications^[75]5 (Fig. [76]1). Comparison of our average single-cell expression profiles with the bulk gene expression (mRNA-seq) profiles from Brookes et al.^[77]5 yields a high correlation (Spearman’s rho = 0.87, Supplementary Fig. [78]1F), suggesting that the chromatin and RNAPII data reflect cells in the same biological state as the single-cell RNA-seq data. Next, we analyzed gene expression variation within the single-cell data. First, we quantified cell-to-cell variation at each mean expression level using the coefficient of variation (Supplementary Fig. [79]2A). Cell-to-cell variation can arise either due to stochastic gene expression itself, or technical noise or confounding expression heterogeneity due to biological processes such as the cell cycle. To isolate pure stochastic gene expression from cell cycle variation in gene expression, we applied a latent variable model^[80]13. This is a two-step approach, which reconstructs cell cycle state before using this information to obtain “corrected” gene expression levels. The method reveals that the cell cycle contribution to variation is 1.2% on average (Supplementary Fig. [81]2B). While this effect is small, when clustering all cells based on G2/M stage markers, we found that cells separate into two groups: one with high expression of G2 and M genes and the other with low expression of these genes (Supplementary Fig. [82]2C). Applying the cell cycle correction removes this effect, leading to a more homogeneous expression distribution of these genes across the cells (Supplementary Fig. [83]2D). To account for the technical noise present in single-cell RNA-seq data, we removed lowly expressed (LE) genes that are most likely to display high technical variability^[84]14, [85]15. Here, a gene is considered as LE if the average normalized read count is less than 10. This results in a set of 11,861 genes with moderate to high mRNA abundance. Subsequently, we use the DM (distance to median) to quantify gene expression variation in mRNA expression^[86]11, since it accounts for confounding effects of expression level and gene length on variation (described in detail in the Methods; Fig. [87]1). Among the 11,861 expressed genes, 7175 have categorized ChIP-seq profiles as defined by Brookes et al.^[88]5; genes excluded have transcription start site (TSS) regions that overlap with other genes, and therefore cannot be unequivocally classified. We defined two major sets of genes based on their PRC marks and RNAPII states: (1) “Active” genes (n = 4483) without PRC marks (H3K27me3 or H2Aub1) but with active RNAPII (S5pS7pS2p), (2) “PRC-active” genes (labeled as “PRCa”; n = 945) with PRC marks (H3K27me3 or H3K27me3 plus H2Aub1), and active RNAPII. To explore the transcriptional kinetics of these genes and describe stochastic gene expression in OS25 ES cells, we estimated their kinetic transcription parameters using a Poisson-beta model described previously^[89]16 (see also in the Methods). PRCa genes have distinct transcriptional kinetics Using the DM measure to quantify gene expression variation in single cells, we observe that histone modifications mediated by PRCs (H3K27me3 or H3K27me3 and H2Aub1) correlate with high levels of variability compared to active genes (those without PRC marks; P < 2.2 × 10^−16 by the two-tailed Wilcoxon rank sum test, Fig. [90]2a). Furthermore, the inferred kinetic parameters provide insight into the expression behavior of genes, showing that active genes have significantly higher burst frequencies than PRCa genes (Fig. [91]2a and Supplementary Fig. [92]3A). This suggests that PRCa genes are more frequently in the “off” state, i.e., more alleles are in the off state at any given point in time, potentially due to the PRC repression of a subset of alleles. Fig. 2. Fig. 2 [93]Open in a new tab Stochastic gene expression of PRCa and active genes. a Comparison of PRCa and active genes reveals that PRCa genes are more variable with lower burst frequency levels than active genes (P < 2.2 × 10^−16 by the two-tailed Wilcoxon rank sum test). Gene expression variation is represented by DM values. b Expression profiles of PRCa genes show bimodal patterns. The distribution of a gene with bimodal expression is assumed to be expressed as a mixture of two normal distributions (LE and HE states; upper panel). PRCa genes have mixed cell states (on average 49% in HE and 51% in LE) indicating they are either in active state (i.e., active RNAPII and no PRC marks) or in repressed state (unproductive RNAPII and with PRC marks) consistent with cellular heterogeneity, suggested in Brookes et al.^[94]5 Error bars represent s.e.m. To ensure that differences between the kinetic parameters are not driven by changes in gene expression levels between the active and PRCa groups, we extracted expression-matched genes of active and PRCa groups (please refer to Methods). These analyses confirmed that PRCa genes have lower burst frequency and higher noise levels than active genes (Supplementary Fig. [95]3B and C). Consequently, the greater cell-to-cell variability for PRCa compared to Active genes is not driven by difference in the mean expression level, but potentially linked to the presence of PRC marks themselves. To explore whether H3K9me3 could contribute to the transcriptional heterogeneity identified at PRCa genes, we analyzed H3K9me3 ChIP-seq data of Mikkelsen et al.^[96]17, and found that only a few expressed PRCa genes (n = 5) are marked by H3K9me3 at their promoter region (2 kb centered on the TSS), making further analysis statistically impossible. Although the literature shows that the DNA of mESCs is hypomethylated, and genes that are marked by Polycomb are usually devoid of DNA methylation^[97]18, [98]19, we checked the extent of DNA methylation at the PRCa gene list considered. We extracted the DNA methylation patterns at proximal promoter regions in mESCs reported in Fouse et al.^[99]19. Only a small proportion of genes (n = 110) has DNA methylation according to this definition. Owing to the small sample size, a statistical assessment will be weak, but comparison of gene expression variation profiles of these genes with the same number of PRCa genes (and same expression levels) that are unmethylated showed that the differences are not significant (Wilcoxon rank sum test, 0.1). This suggests no detectable effect of DNA methylation on transcriptional heterogeneity of PRCa genes (Supplementary Fig. [100]3D). A decrease in the frequency of transcriptional bursting can manifest itself as a more bimodal pattern of gene expression across a cell population. Indeed, we observe that PRCa genes have significantly more bimodal expression profiles compared to active genes (see Methods for bimodality index calculation; Supplementary Fig. [101]3E and Fig. [102]2b). Assuming that the distribution of a gene with bimodal expression can be expressed as a mixture of two log-normal distributions^[103]20 (LE and highly expressed (HE) states), we observe that PRCa genes have mixed cell states (on average 49% of cells in HE and 51% in LE). In contrast, active genes are mostly in the active state as expected (on average 70% in HE and 30% in LE). PRC-repressed genes with unproductive RNAPII and PRC marks, labeled as “PRCr”) are 24% in HE and 76% in LE (Fig. [104]2b). Therefore, expression patterns of PRCa are in between Active and PRCr, suggesting a composite of these two states. We should note that in our kinetic models, decay rates are set to 1 to normalize kinetic parameters so that they are independent of time^[105]16. To investigate whether decay rates have profound effects on kinetic parameters, we integrated published mRNA decay rates in mESCs^[106]21 into our kinetic model. The subtle differences in decay rates across genes did not result in major changes in the inferred kinetic parameters, leaving all major trends unaffected (Supplementary Fig. [107]3F). PRCa genes are important regulators in signaling pathways To investigate potential functions of the cell-to-cell variation in gene expression in PRCa genes, we carried out KEGG pathway enrichment analysis for PRCa genes in our OS25 mESCs (see also Brookes et al.^[108]5). While active genes are enriched in pathways related to housekeeping functions, such as RNA transport, consistent with their uniform and stable expression across cells, PRCa genes are enriched in signaling pathways such as PI(3)K-Akt, Ras signaling, and TGF-beta signaling (Supplementary Table [109]1). These signaling pathways show high levels of cell-to-cell variation compared to pathways related to housekeeping functions (Supplementary Fig. [110]3G). This may be due to transcriptomic fluctuations introduced by cytokine LIF signalling via two signaling pathways: Jak-Stat3 and PI(3)K-Akt^[111]22 (Fig. [112]3). Fig. 3. Fig. 3 [113]Open in a new tab Signaling pathways that are key regulators of pluripotency in mESCs. In OS25 cells there is a selection for undifferentiated cells (high Oct4-expressing). LIF integrates signals into the core regulatory circuitry of pluripotency (Sox2, Oct4, and Nanog) via two signaling pathways: Jak-Stat and PI3K-Akt^[114]22. The Jak-Stat pathway activates Klf4, and the PI3K-Akt pathway stimulates the transcription of Tbx3, which are both PRCa genes. The MAPK pathway antagonizes the nuclear localization of Tbx3. PRCa genes are enriched in Jak-Stat and PI3K-Akt pathways, which show high cell-to-cell variation, suggesting a crucial role of PRCs in modulating fluctuations in signaling pathways that integrate LIF signals into core transcription factor network (Figure adapted from ref. ^[115]22) The Jak-Stat3 pathway activates Klf4, and the PI(3)K-Akt pathway stimulates the transcription of Tbx3 ^[116]22. The expression levels of Klf4 and Tbx3, which are PRCa genes, are noisier than the pluripotency factors Nanog, Sox2, and Oct4. This pattern of noise propagation from the signaling pathways through the downstream transcriptional regulatory network is interesting, as it might indicate the role of PRCs in modulating transcriptomic fluctuations. Chromosomal position effects and stochastic gene expression It is known that neighboring genes on chromosomes exhibit significant correlations in gene expression abundance and regulation, partly due to two-dimensional (2D) chromatin domains^[117]23–[118]26. Is there a similar effect of clustering by chromatin marks and noise in gene expression? To address this, we investigated the positional effects of noise in mRNA expression using the DM values (Methods). If genes cluster together based upon their transcriptional noise, we would expect that the DM values of genes adjacent to noisy genes would be higher than those of genes adjacent to stable genes. Indeed, the noise levels of genes in the neighborhood of noisy genes are significantly higher than those of genes that flank stable genes (P = 1.3 × 10^−4 by the one-tailed Wilcoxon rank sum test, ±50 kb of TSS, Supplementary Fig. [119]4A). This suggests that the genomic neighborhood might influence the frequency of transcriptional bursting. In Fig. [120]4a, we show the association between chromosomal position and gene expression noise. The difference between the mean expression levels of flanking genes between noisy and stable genes is not significant (P = 0.7311 by the two-tailed Wilcoxon rank sum test, ±50 kb of TSS), suggesting that the clusters of genes are not driven by their expression levels. The association between chromosomal position and gene expression noise was most significant at the window size of 50 kb, but weaker at a neighborhood size of 0.5 Mb (Fig. [121]4a). (Please refer to Methods for P value calculation.) Thus, genes tend to be clustered into neighborhood domains by their noise levels, ranging in size up to 0.5 Mb. Fig. 4. Fig. 4 [122]Open in a new tab Chromosomal position effects and stochastic gene expression. a Maps of genes belonging to noisy and stable clusters. Chromosomal positions of genes marked by PRCs and RNAPII in the noisiest clusters. One of the noisy clusters is visualized, DM levels correlate with bimodal expression patterns. In lower panel, association between chromosomal position and gene expression noise is shown; the noise levels of genes in the neighborhood of noisy genes are significantly higher than those of flanking genes of stable genes. As a control, we constructed 100 randomized genomes in which the positions of genes were fixed but the DM value of each gene was assigned randomly without replacement, and the same analysis was performed on each randomized genome to obtain random P values. In all, 2.5% quantile of random P value, and 97.5% quantile of random P values are shaded in gray. All data are shown on a −log(p) scale. b Enrichments of PRC marks and RNAPII states in noisy and stable clusters; two-tailed Fisher’s exact test; *P < 0.1, **P < 0.05 To identify the clusters of noisy or stable genes, we performed a sliding-window analysis on the mouse genome (Methods). We found 129 noisy clusters ranging in size from 4 to 11 genes, spanning a total number of 669 genes. Similarly, 112 stable clusters (between 4 and 13 genes) with a total number of 556 genes were found (Fig. [123]4a). The noise levels of genes in noisy clusters are significantly higher than that of genes in stable clusters (P < 2.2 × 10^−16 by two-tailed Wilcoxon rank sum test, Supplementary Fig. [124]4B) independent of the mean expression levels and gene lengths (Supplementary Fig. [125]4C–D). In addition, we found that DM levels correlate with bimodal expression patterns within the noisy clusters. One example is visualized in Fig. [126]4a; one of the noisy clusters on chromosome 1 consists of three PRCa and two active genes. Lefty1 and Lefty2 PRCa genes, which are important in controlling the balance between self-renewal and pluripotent differentiation in mESCs, are highly variable, and also highly correlated in their gene expression. An active gene, Pycr2, Pyrroline-5-carboxylate reductase 2, is in close proximity to both Lefty1 and Lefty2, and is more variable than the Sde2 gene that lies in proximity of Lefty2 only (density profiles are shown in Supplementary Fig. [127]4E). Indeed, within the clusters, gene expression variation levels of active genes increase with the increasing number of flanking variable genes (Supplementary Fig. [128]4F). Another PRCa gene is Tmem63a, which is a transmembrane protein implicated in maintenance of pluripotency, lies near Lefty1 and has high cell-to-cell variation in gene expression. Interestingly, PRCs characterize the noisy clusters, i.e., PRC marks are enriched in noisy clusters rather than in stable ones. In particular, genes with H3K27me3 are enriched at noisy clusters (P = 1.1 × 10^−2 by the two-tailed Fisher’s exact test), but depleted at stable clusters (P = 5.9 × 10^−2 by the two-tailed Fisher’s exact test, Fig. [129]4b). Since PRCs are tightly associated with RNAPII states, we examined differences between the RNAPII state of genes between noisy and stable clusters. We found that genes marked by active elongating RNAPII (S5pS7pS2p) are depleted at noisy clusters (P = 1.3 × 10^−3 by the two-tailed Fisher’s exact test, Fig. [130]4b), supporting the view that elongating RNAPII modifications promote stable gene expression. Together, noisy clusters are characterized by the presence of PRC marks and the absence of active elongating RNAPII, while stable clusters are characterized by the absence of PRCs. Gene and enhancer clustering in 2D and 3D Next, we analyzed whether PRCa genes are proximal to fully repressed Polycomb genes, which could eventually increase their sensitivity to Polycomb repression. Linear spatial proximity between PRCa genes and PRCr genes is significantly closer than the median distance between randomly chosen genes and PRCr genes (empirical P = 2 × 10^−2, Fig. [131]5a; Methods). Interestingly, PRCa genes are also in close proximity to active genes (empirical P < 0.0001, Supplementary Fig. [132]4G), while active genes are distal from PRCr genes (empirical P = 5 × 10^−3, Supplementary Fig. [133]4H), suggesting a 2D spatial arrangement of these genes as Active-PRCa-PRCr (as visualized in Fig. [134]5a). Fig. 5. Fig. 5 [135]Open in a new tab Effects of 2D and 3D neighborhood on transcriptional kinetics. a Histogram of simulated median distances under a null model assuming no positional preference in the neighborhood of PRCr genes. The observed median distance of PRCa genes to their nearest neighbor in the PRCr group, depicted by vertical dashed red line, are significantly less than expected by chance (empirical P = 2 × 10^−2). b Analyzing mESC Promoter Capture Hi-C data reveals that PRCa genes have a strong enrichment for long-range contacts between promoters with levels in between PRCr and active genes. Error bars represent s.e.m. c PRCa genes have significantly more interactions with active enhancers compared to PRCr genes (P < 2.2 × 10^−16). In contrast, interactions with poised enhancers are mainly observed for PRCr genes rather than PRCa (P < 2.2 × 10^−16). d Occurrence of interactions with active enhancers decrease noise of PRCa genes independent of the mean expression levels We next asked whether the linear genomic position effects of PRCs are reflected in the three-dimensional (3D) genome organization in ESCs. Recently, Schoenfelder et al.^[136]27 found that PRC1 acts as a major regulator of ESC genome architecture by organizing genes into 3D interaction networks. They generated mESC Promoter Capture Hi-C (CHi-C) data^[137]28, and analyzed it using the GOTHiC (Genome Organization Through Hi-C) Bioconductor package ([138]http://www.bioconductor.org/packages/release/bioc/html/GOTHiC.htm l). This yielded a strong enrichment for long-range contacts between promoters bound by PRCs. We applied the same approach to this data set using our gene list. We found that there is a strong enrichment for long-range promoter–promoter contacts for both PRCa and PRCr genes (Fig. [139]5b). Interestingly, PRCr genes have significantly stronger contact enrichment than PRCa genes in mESCs (one-tailed t-test P = 6.3 × 10^−6). PRCa genes are in between PRCr and active genes; they have stronger contact enrichment than active genes (one-tailed t-test P = 1 × 10^−4; Fig. [140]5b). In Fig. [141]5b, the promoter contacts of the aforementioned noisy cluster PRCa gene Lefty2 are visualized. It is in contact with the other PRCa genes Lefty1 and Tmem63a, and it has a strong connectivity with the active Pycr2 gene. These contacts may affect Pycr2’s frequency of transcriptional bursting, and thereby tune expression noise. In terms of the promoter preferences of gene sets, it is interesting to