Abstract Functional perturbation and action mechanism studies have shown that the transcription factor Zeb2 controls cell fate decisions, differentiation, and/or maturation in multiple cell lineages in embryos and after birth. In cultured embryonic stem cells (ESCs), Zeb2’s mRNA/protein upregulation is necessary for the exit from primed pluripotency and for entering general and neural differentiation. We edited mouse ESCs to produce Flag-V5 epitope-tagged Zeb2 protein from one endogenous allele. Using chromatin immunoprecipitation coupled with sequencing (ChIP-seq), we mapped 2432 DNA-binding sites for this tagged Zeb2 in ESC-derived neuroprogenitor cells (NPCs). A new, major binding site maps promoter-proximal to Zeb2 itself. The homozygous deletion of this site demonstrates that autoregulation of Zeb2 is necessary to elicit the appropriate Zeb2-dependent effects in ESC-to-NPC differentiation. We have also cross-referenced all the mapped Zeb2 binding sites with previously obtained transcriptome data from Zeb2 perturbations in ESC-derived NPCs, GABAergic interneurons from the ventral forebrain of mouse embryos, and stem/progenitor cells from the post-natal ventricular-subventricular zone (V-SVZ) in mouse forebrain, respectively. Despite the different characteristics of each of these neurogenic systems, we found interesting target gene overlaps. In addition, our study also contributes to explaining developmental disorders, including Mowat-Wilson syndrome caused by ZEB2 deficiency, and also other monogenic syndromes. Keywords: chromatin immunoprecipitation sequencing, embryonic stem cells, Mowat-Wilson syndrome, neural differentiation, neurodevelopmental disorder, syndromes, target genes, transcription factor, transcriptomics, Zeb2 1. Introduction Zeb2 (also named Sip1/Zfhx1b) and Zeb1 (δEF1/Zfhx1a), the two members of the small family of Zeb transcription factors (TFs) in vertebrates, bind to DNA to two separated E-box like sequences, as determined by in vitro binding to double-stranded oligonucleotides and/or Zeb-mediated repression of transfected reporter DNA constructs [[42]1,[43]2,[44]3,[45]4]. Spaced bi-partite CACCT sequences, often present as CACCTG E2-box, and sometimes CACANNT(G) sequences [[46]1,[47]2], are bound via two (between Zeb1 and Zeb2) highly conserved, separated clusters of zinc fingers [[48]3]. Mutations in ZEB2 cause Mowat-Wilson Syndrome (MOWS, OMIM#235730) [[49]5,[50]6,[51]7], a rare congenital disease. MOWS patients display intellectual disability, epilepsy/seizures, typical facial dimorphism, and often Hirschsprung disease (HSCR), as well as multiple other defects [[52]8,[53]9,[54]10]. Typical are also the delay in developmental milestones such as motoric development, and anomalies of eyes and teeth. Other features include specific craniofacial malformation and sensorineural deafness, which together with HSCR originate from defects in the ZEB2-positive (+) cells of the embryonic neural crest cell lineage. Meanwhile, mutant ZEB2 alleles have been determined for about 350 patients [[55]11,[56]12,[57]13,[58]14,[59]15,[60]16]. Recent reports have described malformations in the central nervous system of MOWS patients over a broad age range, including defects of the corpus callosum and/or hippocampus, and can be seen by neuroimaging. These reports also followed up on electro-clinical defects, such as focal seizures, with MOWS patients [[61]16,[62]17,[63]18]. Zeb2’s multiple functions, its action mechanisms and many partner proteins, the still few proven or candidate direct target genes, and lists of genes whose normal expression depends on intact Zeb2 levels, have been documented in various cell types [[64]19,[65]20,[66]21,[67]22,[68]23,[69]24,[70]25,[71]26,[72]27,[73]28, [74]29]. For a recent review, see [[75]30]). Such combinations of studies have allowed for explanations of specific phenotypes caused by Zeb2 perturbation. Zeb2 DNA-binding to candidate target genes helped to explain Zeb2 loss-of-function phenotypes in embryonic stem cells (ESCs) and initial cells of early and late embryos, later followed by post-natal and adult mice. These intact-Zeb2 dependent and/or Zeb2 target genes are involved in pluripotency (Nanog, Sox2), cell differentiation (Id1, Smad7), and embryonic brain cortical and adult neurogenesis (Ntf3, Sox6), as well as in epithelial-to-mesenchymal transition (EMT) (Cdh1) [[76]19,[77]20,[78]21,[79]22,[80]23,[81]24,[82]25,[83]26]. In reverse, subtle mutagenesis of Zeb2 DNA-binding sites in demonstrated target genes also confirmed Zeb DNA-binding, including its repressive activity on mesodermal XBra in Xenopus embryos [[84]27] and Cdh1 in epithelial cells [[85]19]. Despite its critical functions in the precise spatial-temporal regulation of expression of many system/process-specific relevant genes during embryogenesis and post-natal development, and more recently adult tissue homeostasis, stem cell-based repair, and acute and chronic disease [[86]28,[87]29,[88]30], data from chromatin immunoprecipitation followed by sequencing (ChIP-seq) for Zeb2 has been obtained in very few cases only. A major reason is that ChIP-seq grade antibodies specific for Zeb2 are not readily available. ChIP-seq data have been published for high-Zeb2 hepatocellular carcinoma and leukemia cell lines, or cultured cells that overproduce an epitope-tagged Zeb2 (tag-Zeb2) from cDNA-containing episomal vectors or the safe-harbor Rosa26 locus [[89]25,[90]31,[91]32]. Neither of these represent normal endogenous Zeb2 levels and dynamics. Furthermore, many anti-Zeb2 antibodies cross-react with Zeb1, so do not discriminate between both proteins when their presence overlaps or succeeds one another in a given cell type. However, these TFs compete for the same target genes, which for the individual proteins in any case, depends on cell identity/state, extrinsic stimulation of the cells, or cellular context (e.g., as demonstrated in somitogenesis [[92]33] and melanoma [[93]34]). In undifferentiated mouse (m) ESCs, Zeb2 mRNA/protein is undetectable, whereas during neural differentiation (ND) of ESCs its strong upregulation accompanies efficient conversion of naïve ESCs into epiblast stem cell like cells (EpiLSCs). This is essential for the subsequent exit of ESCs from primed pluripotency and the onset of ESC differentiation, including progression to neuroprogenitor cells (NPCs) [[94]25]. Here, we have edited one Zeb2 allele of mESCs by inserting a Flag-V5 epitope tag just before the Zeb2 stop codon, in-frame with the last exon (ex9 of mouse Zeb2 [[95]35]). These Zeb2-V5 mESCs were then differentiated into NPCs and the Zeb2 DNA-binding sites were determined by V5-tag-based ChIP-seq. Doing so, we identified 2432 binding sites for Zeb2 in NPCs, of which 2294 map to 1952 protein-encoding genes. We then cross-referenced these ChIP-positive (+) target genes with RNA-sequencing data of differentially expressed genes (DEGs) in cell-type specific, neurodevelopment-relevant Zeb2 perturbations [[96]23,[97]36,[98]37]. Although we compare non-identical systems, the overall approach still revealed a number of interesting overlaps of target genes, as well as Zeb2’s role in regulating critical targets in neurodevelopment, including its own gene promoter. Taken together, for the first time we report the identification of Zeb2’s genome-wide binding sites (GWBS) in ESC-derived NPCs at normal Zeb2 level. 2. Materials and Methods 2.1. ESC Culture Conditions and Differentiation CGR8 (strain 129) wild-type (WT) and Zeb2-Flag-V5+ (in brief, Zeb2-V5) mouse (m) ESCs were cultured and differentiated towards the neural lineage [[99]38], with few modifications). Briefly, the mESCs were cultured on 0.1% gelatin-coated plates in ESC-medium (DMEM supplemented with 15% heat-inactivated (HI) foetal bovine serum (FBS), 2 mM L-Glutamine, 1× Non-Essential Amino Acids (NEAA), 143 µM β-mercapto-ethanol (β-EtSH) (all ThermoFisher Scientific, TFS, Waltham, MA, USA) and leukemia inhibitory factor (LIF) at 10^3 U/mL). For neural differentiation (ND), 4 × 10^6 cells were plated on non-adherent 10-cm dishes (Greiner, Kremsmünster, Austria) and allowed to form cellular aggregates (CAs) in 10 mL CA-medium (DMEM, 10% HI-FBS, 2 mM L-Glutamine, 1× NEAA, and 143 µM β-EtSH). From day (D) 4 of ND, cells were grown in CA-medium supplemented with 5 µM retinoic acid (RA). During the aggregation stages of ND, the medium was changed every other day by carefully collecting the aggregates with a 10-mL pipet and transferring them to a 15-mL conical tube. The CAs were allowed to sink to the bottom of the tube where, after the previous medium was carefully discarded, the CAs were then gently resuspended in fresh medium and transferred back to the dishes. At D8 of ND, the aggregates were harvested and dissociated by resuspension in 1 mL Accutase (TFS) and pipetting them up and down using a 1-mL pipet, after shaking them in a 37 °C water bath for 5 min. The Accutase was deactivated by adding 9 mL of fresh N2-medium (DMEM with 2 mM L-Glutamine, 50 µg BSA/mL, and 1× N2-supplement) to the dissociated cells, and pelleting the cells gently for 5 min at 200 g. The cells were then resuspended in fresh N2-medium. To ensure single-cell suspension, the cells were filtered by passing them through a 40-μm nylon cell strainer (Corning, Corning, NY, USA); 2.5 × 10^5 cells/cm^2 were then plated on poly-DL-ornithine hydrobromide/laminin (both from Sigma-Aldrich, St. Louis, MI, USA) coated plates. Cells were harvested at D8 or D10 of ND. 2.2. Western Blots To check for Zeb2-V5 protein, the 2BE3-clone ESCs were subjected to ND till D8. Cytoplasmic and nuclear fractions were split using the NePer-kit^® (TFS). Protein concentrations were measured using the Bradford BCA (TFS), and equal quantities of protein lysates were loaded on 6% polyacrylamide gels (with SDS) and thereafter cut, according to relative molecular mass. Gels were then transferred onto nitrocellulose membranes (Amersham Bioscience, Amersham, UK), which were incubated overnight with anti-Zeb2 [[100]20] and anti-V5 (Life Technologies, Carlsbad, CA, USA) antibodies, followed by incubation at room temperature with horse radish peroxidase (HRP) conjugated secondary anti-rabbit and anti-mouse antibodies (Jackson ImmunoResearch, West Grove, PA, USA). Protein bands corresponding to Zeb2 or Zeb2-V5 were visualized on an AI-600 digital imager (Amersham Bioscience). As loading control, we used Valosin-containing Protein (VCP) and anti-VCP antibody (Santa Cruz, Dallas, TX, USA, sc-57492, mouse). 2.3. RNA Extraction and RT-qPCR Analysis Total RNA was extracted from ESCs using TRI Reagent (Sigma), and used for cDNA synthesis with RevertAid RT Kit (from TFS) with oligodT-primers. RT-qPCR was performed using SybrGreen dye (BioRad, Hercules, CA, USA) on a CFX96 T1000 thermal cycler (BioRad). All data shown are averages of three independent biological replicates and three technical replicates, normalized to β-Actin mRNA levels. Primers are listed in [101]Table 1. Analysis and data visualization was performed in R environment for statistical computing version 3.5.3, implemented with the tidyverse v1.3 package ([102]https://github.com/tidyverse, accessed on 1 November 2019). Table 1. List of primers used in the study. Primer Name Sense/Antisense Sequence (5′ → 3′) Application FlV5mZeb2Ex9_Fwd Sense GGCTTACCTGCAGAGCATCA genotyping FlV5mZeb2Ex9_Rev Antisense CTCCATCTAACTCTGTCTTGGC genotyping FlV5_Fwd Sense CTACTCGCAGCACATGAATC genotyping FlV5_Rev Antisense GAGAGGGTTAGGGATAGGC genotyping ΔZP_P1_Fwd Sense GTCAGTCCGTCCCCAGGTTT genotyping ΔZP_P2_Rev Antisense GGCATGCTAGCTGGGCTGGT genotyping LN249_Fwd Sense GGAGCAAACTGAACAAAACCTCGCC genotyping LN249_Rev Antisense GGCGAGGTTTTGTTCAGTTTGCTCC genotyping LN209_Fwd Sense AGCGGATCAGATGGCAGTTCGCATG genotyping LN209_Rev Antisense CATGCGAACTGCCATCTGATCCGCT genotyping Zeb2_Fwd Sense CAATGCAGCACTTAGGTGTA qPCR Zeb2_Rev Antisense TTGCCTAGAAACCGTATTGT qPCR Zeb2V5_Fwd Sense GAAACGATACGGGATGAGGA qPCR Zeb2V5_Rev Antisense AGGAGAGGGTTAGGGATAGG qPCR Nanog_Fwd Sense TCTTCCTGGTCCCCACAGTTT qPCR Nanog_Rev Antisense GCAAGAATAGTTCTCGGGATGAA qPCR Pou5f1_Fwd Sense AGAGGATCACCTTGGGGTACA qPCR Pou5f1_Rev Antisense CGAAGCGACAGATGGTGG TC qPCR Sox2_Fwd Sense GCGGAGTGGAAACTTTTGTCC qPCR Sox2_Rev Antisense CGGGAAGCGTGTACTTATCCTT qPCR Pax6_Fwd Sense ACATCTTTTACCCAAGAGCA qPCR Pax6_Rev Antisense GGCAAACACATCTGGATAAT qPCR Acrv1b_Fwd Sense CTGCCTACAGACCAACTACACC qPCR Acrv1b_Rev Antisense CCACGCCATCCAGGTTAAAGA qPCR Lhx5_Fwd Sense AGAACCGAAGGTCCAAAGAA qPCR Lhx5_Rev Antisense TCACTTTGGTAGTCTCCGTA qPCR Ntng2_Fwd Sense CAAGGACTCTACGCTTTTCG qPCR Ntng2_Rev Antisense AGCACTCGCAGTCTTGAAAT qPCR Sema3f_Fwd Sense CTACACAGCATCCTCCAAGA qPCR Sema3f_Rev Antisense ACGGCATTCTTGTTTGCATT qPCR Smad1_Fwd Sense TACTATGAGCTCAACAACCG qPCR Smad1_Rev Antisense GAAGCGGTTCTTATTGTTGG qPCR Smad3_Fwd Sense CACGCAGAACGTGAACACC qPCR Smad3_Rev Antisense GGCAGTAGATAACGTGAGGGA qPCR Sox13_Fwd Sense CTTACAGGAGGTTGTGCCA qPCR Sox13_Rev Antisense TCCTTAGCTTCCACATTGCT qPCR Stat3_Fwd Sense CAATACCATTGACCTGCCGAT qPCR Stat3_Rev Antisense GAGCGACTCAAACTGCCCT qPCR Tcf4_Fwd Sense TTGAAGATGTTTTCGCCTCC qPCR Tcf4_Rev Antisense CCTGCTAGTCATGTGGTCAT qPCR Tgfbr2_Fwd Sense GAAGGAAAAGAAAAGGGCGG qPCR Tgfbr2_Rev Antisense TGCTGGTGGTGTATTCTTCC qPCR Amylase_Fwd Sense GGCTGAGTGTTCTGGGAT ChIP-qPCR Amylase_Rev Antisense CACGGTGCTCTGGTAGAT ChIP-qPCR Cdh1_R1_Fwd Sense GCTAGGCTAGGATTCGAACGAC ChIP-qPCR Cdh1_R1_Rev Antisense TGCAGGGCCCTCAACTT ChIP-qPCR [103]Open in a new tab 2.4. Tag-Zeb2 Mouse ESCs gRNAs ([104]Table 2) targeting Zeb2-ex9, and tracrRNA (Integrated DNA Technologies, IDT, Coralville, IA, USA), were diluted to 125 ng/µL in duplex buffer (from IDT). gRNAs were annealed to tracrRNA at a 1:1 ratio at 95 °C for 5 min and cooling the samples to room temperature. 250 ng of these annealed gRNAs were transfected in 350,000 mESCs together with 2 µg pX459-Cas9-puro vector and 1 µg ssDNA oligo of the donor template containing the FlagV5-tag sequence ([105]Table 2). Transfection was done in a gelatin-coated 6-well plate using DNA: Lipofectamine-2000 (ratio of 1:2). Six hours after transfection the medium was refreshed, and at 24 h the cells were selected in puromycin (2 µg/mL). After two days, the remaining cells were transferred to gelatin-coated 10-cm dishes and given fresh ESC medium (see below). Per dish 1000; 1500; or 2000 cells were plated and allowed to form colonies. The medium was changed every other day. Colonies were picked, expanded, and genotyped by PCR (both outer and inner primer sets were used ([106]Table 1; [107]Figure S1). All candidate clones were validated by Sanger-sequencing; correct clones were expanded and validated by western blot. Mouse ESCs genome-editing was performed under the GGO (genetically modified organisms) institutional licenses 95-053 and 99-164 assigned to the Erasmus University Medical Center. Table 2. gRNAs and donor template used for CRISPR/Cas9-mediated Zeb2 editing. Name Sequence CRISPR/Cas9 Flag-V5 donor template ^1 aaaatggaaaccaaatcagaccacgaagaagacaatatggaagatggcatcgaaGACTACAAAGACGATGA CGACAAGgatatcGGTAAGCCTATCCCTAACCCTCTCCTCGGTCTCGATTCTACGTAAactactgcatttt aagcttcctattttttttttccagtagtattgtt in-frame knock-in of Flag-V5 tag gRNA_ex9_1 GGAAACCAAATCAGACCACGAGG gRNA_ΔZP1 CCCGCGCGCGTTTCAATGGGCGC Zeb2 ΔP deletion gRNA_ΔZP2 CCCTCGCGAGTGCAACACACCAA gRNA_ΔZP3 GGGCTCGGAGCGCTGCCGATCGG gRNA_ΔZP4 CCGCTGGACCGGGGGGGAGTTGA [108]Open in a new tab ^1 Flag-V5 donor template (from top left to bottom right): in lowercase: homology arms located in exon9 and 3′ UTR of Zeb2, respectively in underlined lowercase: mutated PAM sequences; in uppercase: Flag-encoding sequence; in lowercase italics and bold: EcoRI restriction site; in underlined uppercase: V5-encoding sequence; in bold uppercase: STOP codon. 2.5. CRISPR/Cas9-Mediated Deletion of the Zeb2 Binding Site Located at chr2:45109746-45110421 Oligonucleotides for gRNAs ([109]Table 2) with target outside of this chr2-region were cloned into BbsI-digested pX330-hspCas9-T2A-eGFP plasmid. All resulting plasmids used further were sequenced. 4 µg of gRNA-plasmids (1 µg each) were transfected in 350,000 mESCs and selected (see above). After 24 h these cells were sorted as GFP+ cells (LSR Fortessa, Becton-Dickinson (BD), Franklin Lakes, NJ, USA). Per well of a 6-well plate 1000; 1500; or 2000 GFP+ cells were plated and colonies were allowed to form, picked (see above), and genotyped by PCR using primers flanking this deletion, and within and outside of it. Clones showing a possible heterozygous or homozygous deletion, as concluded from the PCR analysis, were subjected to ND. At D8, they were harvested, RNA was isolated and cDNA synthesized (see below), and amplified (for the primers, see [110]Table 1). All candidate clones were validated by Sanger-sequencing. 2.6. Chromatin Immunoprecipitation (ChIP) ChIP 2 × 10^8 cells were harvested at ND-D8 in 10 mL of PBS and cross-linked using 1% formaldehyde (Sigma Aldrich) for 15 min, rotating at room temperature. Quenching followed with 125 mM glycine for 5 min, again rotating at room temperature. Cross-linked cells were washed twice with ice-cold PBS (5 min; 1500 rpm (240 rcf), 4 °C) before the pelleted cells were snap-frozen and stored at −80 °C. For sonication, the cell pellets were thawed on ice, resuspended in 1 mL sonication buffer (10 mM Tris-HCl pH 8.0, 1 mM EDTA, 0.5 mM EGTA) supplemented with protease and phosphatase inhibitors (PPI, from Roche, Basel, Switzerland), and incubated on ice for 10 min. DNA was sheared by sonicating the cells using a probe sonicator (32 cycles, 30 sec-on amplitude 9, and 30 sec-off). These samples were centrifuged at 13,200 rpm (17,000 rcf) for 10 min at 4 °C. Chromatin pellets were snap-frozen and stored at −80 °C. To check sonication efficiency, 50 µL of sample was de-crosslinked overnight by adding NaCl (final concentration 5 mM) at 65 °C and constantly shaken (950 rpm, ThermoMixer C, Eppendorf, Hamburg, Germany). The next morning 5, 10, and 20 µL of sample were loaded on a 2% agarose gel, revealing ideally a DNA-smear around 300 bp. A 50-µL sample was used as a control input. For immunoprecipitation, the chromatin of 10^7 cells was diluted in ChIP-dilution buffer (17 mM Tris-HCl pH 8.0, 170 mM NaCl, 1.2 mM EDTA, 0.01% SDS, 1.1% Triton X-100, with 1× PPI) to a final volume of 1 mL. Samples were pre-cleared by adding pre-washed Protein A/G agarose beads (Santa Cruz) and further incubation for 1 h, rotating at 4 °C. Then, samples were centrifuged for 1 min (1000 rpm; 106 rcf) at 4 °C, and the pre-cleared chromatin (supernatant) was transferred to a new low-binding 1.5-mL tube and incubated with 50 µL of pelleted V5-Agarose beads (Sigma Aldrich), rotating at 4 °C overnight. Before the addition of V5-agarose beads, the beads were washed 5 times (5 min each) in PBS by rotating them. As a negative control, half of the sample was incubated with Protein A/G beads (Santa Cruz, sc-2003). The following day the beads were pelleted (1000 rpm; 1 min) and washed as follows: once with lower-salt buffer (20 mM Tris-HCl pH 8.0, 150 mM NaCl, 2 mM EDTA, 0.1% SDS, 1% Triton X-100), transferred to non-stick low-binding 1.5-mL tubes and then washed once with high-salt buffer (i.e., lower-salt buffer, but now 500 mM NaCl), once washed with LiCl buffer (which is 10 mM Tris-HCl pH 8.0, 250 mM LiCl, 1 mM EDTA, 1% NP-40, 1% sodium deoxycholate (DOC), and twice washed with 10 mM Tris-HCl pH 8.0, 1 mM EDTA (each incubation for 5 min, rotating at 4 °C, followed by gently spinning down. The protein-chromatin was then eluted from the beads by adding 250 µL of elution buffer (1% SDS, 100 mM NaHCO[3]), rotating for 1 h at room temperature twice, and combining the eluates from both steps. To the input sample, 450 µL of elution buffer was also added, and all samples were de-crosslinked through the addition of 5 mM NaCl at 65 °C overnight, shaking at 950 rpm. The day after, 2 µL of proteinase-K (from 10 mg/mL stock), 20 mM (final concentration) Tris-HCl pH 6.5, 5 mM (final concentration) EDTA pH 8.0 and 10 mg/mL RNase-A (Sigma-Aldrich) were added to each sample and incubated for 1 h at 45°C while shaking (700 rpm). DNA was extracted from the samples using the PCI method and diluted in water. Five independent ChIPs were performed, for a total of 10^8 mESCs used per condition, and pulled-down chromatin was pooled. ChIP efficiency was assessed by qPCR using primers amplifying Cdh1 promoter sequences bound by Zeb2 [[111]25]. All primers used are listed in [112]Table 1. 2.7. ChIP-Sequencing DNA libraries from input (i.e., control) and V5 ChIPs were prepared using ThruPLEX DNA protocol (TakaraBio, Kusatsu, Shiga, Japan) specific for low amounts of DNA and sequenced on Illumina HiSeq-2500, and single reads of 50 bp were generated. Adapter sequences were trimmed from the 3′-end of the reads, after which the reads were aligned to the mm10/GRCm38 genome using HISAT2 [[113]39]. From the alignments, secondary or supplementary, low-quality, and fragmented alignments (fragments longer than 150 bp) were filtered away. Peaks were called with MACS [[114]40], and coverage was determined. 42 and 25 million reads were generated for input and V5 ChIP, respectively. 2.8. ChIP-Sequencing Data Analysis Peak calling was performed with MACS2 (Galaxy version 2.1.1.20160309.6) [[115]40,[116]41], with default parameters (narrow peak calling, Mm1.87e9, FDR < 0.05) using the input sample as background. The No model parameter was used, and the extension size was set at 210 bp based on the predicted fragment lengths from the alignments (MACS2 predict-tool, Galaxy version 2.1.1.20160309.1) [[117]40,[118]41]. The distance of the aligned reads from the TSS of the gene was analyzed using ComputeMatrix (Galaxy version 3.3.2.0.0) and PlotHeatmap Galaxy version 3.3.2.0.1; the used matrix is based on the log2ratio of the aligned ChIP peaks over the input, calculated using BamCompare (Galaxy version 3.3.2.0.0) [[119]42]. 2.9. Transcription Factors Motif Enrichment Analysis To identify the transcription factor binding sites (TFBS) in Zeb2-binding regions associated with DEGs, we first extracted unique Zeb2-peaks located 10 kb −/+ from the transcription start site (TSS). Next, we analyzed the TFBS enrichment using a UniBind enrichment tool with motifs from the UniBind database [[120]43] (using reference genome GRCm38/mm10). As a background for the analysis, all Zeb2-peaks were used. The p-value from Fisher’s exact test after multitest adjustments was used to identify significantly enriched TFBS. Further, the max rank index calculated based on the odds ratio, p-value from Fisher’s exact test, and the number of overlapping regions, was applied to rank the top enriched motifs. 2.10. RNA-Sequencing The quality of total RNA (of biologically independent triplicates) of wild-type mESCs at D0, and at ND D4, D6, and D8, was checked on Agilent Technologies-2100 Bioanalyzer, using an RNA nano-assay. All samples had RIN values of 9.8 or higher. Triplicate RNA-seq libraries were prepared (Illumina, San Diego, CA, USA TruSeq stranded mRNA protocol; [121]www.illumina.com, accessed on 1 January 2020). Briefly, 200 ng of total RNA was purified using polyT-oligo-attached magnetic beads for ending with polyA-RNA. The polyA-tailed RNA was fragmented, and cDNA synthesized (SuperScript II, Invitrogen, Waltham, MA, USA, random primers, in the presence of Actinomycin D). cDNA fragments were end-repaired, purified (AMPureXP beads), and A-tailed using Klenow exo-enzyme and dATP. Paired-end adapters with dual index (Illumina) were ligated to the A-tailed cDNA fragments and purified (AMPureXP beads). The resulting adapter-modified cDNAs were enriched by PCR (using Phusion polymerase) as follows: 30 s at 98 °C, 15 cycles of (10 s at 98 °C, 30 s at 60 °C, 30 s at 72 °C), 5 min at 72 °C. PCR products were purified (AMPureXP beads) and eluted in 30 µL resuspension buffer. One μL was loaded on an Agilent 2100 Bioanalyzer using a DNA-1000 assay to determine the concentration and for a quality check. Cluster generation was performed according to the Illumina TruSeq SR Rapid Cluster kit v2 Reagents Preparation Guide ([122]www.illumina.com, accessed on 1 January 2020). After the hybridization of the sequencing primer, sequencing-by-synthesis was performed using a HiSeq-2500 with a single-read 50-cycle protocol followed by dual index sequencing. Illumina adapter sequences have been trimmed off the reads, which were subsequently mapped against the GRCm38 mouse reference (using HiSat2 version 2.1.0) [[123]39]. Gene expression values were called using HTSeq-count version 0.9.1 [[124]44] and Ensembl released 84 gene and transcript annotation. Sample QC and DEG analysis have been performed in the R environment for statistical computing (version 3.5.3, using DESeq2 version 1.22.1 [[125]45] and Tidyverse version 1.2.1 ([126]https://github.com/tidyverse;https://www.r-project.org/ from R Core Team, accessed on 1 January 2019). 2.11. Meta-Analysis, Pathways Enrichment, Gene Ontology, Function Analysis, and Gene to Disease Association RNA-seq datasets (as DEG tables, from [[127]23,[128]36], were downloaded from GEO ([129]https://www.ncbi.nlm.nih.gov/geo/, [130]GSE35616, and [131]GSE103003, respectively, accessed on 1 March 2019). Cross-referencing and visualization were performed in R using Tidyverse, VennDiagram, and pheatmap packages. The remaining analyses were performed with the StringDB package for R [[132]46], while for Gene-to-Disease association Disgenet2R for R was used [[133]47]. 2.12. Zeb2 Short Hairpin (sh) RNA-Mediated Knock-Down Zeb2 knock-down (KD) was carried out by transfecting Zeb2-shRNAs into ESCs, at ND-D8. For this, the CAs were dissociated (see above), and single-cell suspensions were transfected using Amaxa Nucleofector II (using kit V, program A-33). [134]Table 3 lists the shRNAs used in this study; the Zeb2 target sequence is indicated in bold. In total 4 μg of shRNA was used for the transfection of 4.5 × 10^6 cells. After transfection, the cells were plated in 5 mL of N2-medium on a poly-ornithine/laminin-coated 6-cm cell culture dish. Two hours post-transfection, the medium was refreshed, and 24 h after transfection was changed to N2-medium in the presence of puromycin (see above) for 48 h. The cells were then harvested, and KD efficiencies were examined using RT-qPCR. As a control, scrambled shRNA was used. Table 3. shRNAs used. Name Sequence shZeb2_1 CCGGCCGAATGAGAAACAATATCAACTCGAGTTGATATTGTTTCTCATTCGGTTTTTG shZeb2_2 CCGGCCTCAGGAATTTGTGAAGGAACTCGAGTTCCTTCACAAATTCCTGAGGTTTTTG shZeb2_3 CCGGCCAGTGTCAGATTTGTAAGAACTCGAGTTCTTACAAATCTGACACTGGTTTTTG shZeb2_4 CCGGCCCATTTAGTGCCAAGCCTTTCTCGAGAAAGGCTTGGCACTAAATGGGTTTTTG shCTRL CCGGCAACAAGATGAAGAGCACCAACTCGAGTTGGTGCTCTTCATCTTGTTGTTTTT [135]Open in a new tab 3. Results 3.1. Heterozygous Zeb2-V5 ESCs Differentiate as Wild-Type Cells The addition of short epitope(s) at the N- or C-terminus, as well as activation/repression domains of heterologous transcription factors (TFs) at the Zeb2 C-terminus was previously shown not to interfere with Zeb2’s DNA-binding (as tested in Xenopus embryos [[136]48], heterologous cells [[137]49], mouse forebrain [[138]36] and mESCs [[139]25]). Here, we have used a CRISPR/Cas9 approach (see Materials and Methods) to insert an in-frame Flag-V5-tag encoding sequence in Zeb2-ex9 of mESCs (ESC clone 2BE3; [140]Figure S1). Allele-specific RT-qPCR, using primers that amplify sequences between the ex9 and the V5-tag, showed mRNA expression from the tagged allele in ESC culture at day (D) 0, 4, 6, and 8 of ND as compared to the parental wild-type (WT) mESC line ([141]Figure 1A). Figure 1. [142]Figure 1 [143]Open in a new tab Characterization of the heterozygous Zeb2-Flag-V5 mESC line and ChIP-qPCR validation. (A) Allele-specific RT-qPCR using two sets of primers located either in exon7 (and therefore able to detect the whole Zeb2 mRNA produced by both alleles; orange bar) or located in exon9 and the V5-tag (thus recognizing specifically the knocked-in tagged allele; light blue bar); (B) Western blot analysis showing V5 epitope containing Zeb2 in ESC-derived NPCs (at D8 of neural differentiation, ND) in nuclear extracts (NE) and cytoplasmic extracts (CE). Membranes were blotted with anti-V5 antibody (left panel, αV5) or anti-Zeb2 antibody (right panel, αZeb2, [[144]20]). As a control, a fraction of Zeb2-rich extract obtained from HeLa cells transfected with a pcDNA3-V5Zeb2 vector was also separated in the same gel; (C) Zeb2 mRNA levels in wild-type (WT, grey bar) and Zeb2-V5 (orange bar, clone 2BE3, indicated as Zeb2-V5) mESCs during ND, as determined by RT-qPCR; (D) RT-qPCR for ND, using marker genes: pluripotency marker genes Nanog, Pou5f1 (Oct4) and Sox2 are downregulated in Zeb2-V5 mESCs similarly to WT. The neuronal marker gene Pax6 is also significantly upregulated during differentiation, such as in WT mESCs; (E) Scheme of the mouse Cdh1 promoter showing the three E-boxes located upstream of the ATG start codon. Zeb2 binds specifically to only two of these, indicated as R1 [[145]25]; (F) ChIP-qPCR showing enrichment for Zeb2-V5 binding to the R1 region of the Cdh1 promoter. Agarose beads were used as negative control (in grey). Western blot analysis in nuclear extracts of ND-ESCs at D8 (thus NPCs, [[146]25]) confirmed the presence of Zeb2 of expected molecular mass, using either anti-V5 (αV5) or anti-Zeb2 antibodies ([147]Figure 1B). Both the Zeb2-V5 and wild-type (WT) ESCs were then also verified during ND differentiation for temporal expression of Zeb2, core pluripotency genes (Pou5f1, Nanog, both downregulated upon ND, and Sox2, also an NPC TF) and an acknowledged NPC marker (Pax6) ([148]Figure 1C,D). The untagged and tagged Zeb2 ESC lines displayed comparable expression dynamics of Zeb2, indicating that Zeb2-V5 NPCs at D8 of ND can be used for chromatin immunoprecipitation sequencing (ChIP-seq). Further confirmation came from the selective pull-down of Zeb2 on the known target Cdh1, using ChIP-qPCR. Zeb2 binds to two of three E-boxes in the mouse Cdh1 promoter ([149]Figure 1E), which it represses during epithelial-to-mesenchymal transition (EMT) [[150]19,[151]25]. A ±25-fold enrichment for Zeb2-V5 was obtained when probing this Cdh1 region using anti-V5 antibody (αV5) conjugated beads compared to agarose beads as negative control ([152]Figure 1F). Hence, Zeb2-V5 binds to known Zeb2 target sites, and the NPCs are suitable for endogenous mapping of the Zeb2 genome-wide binding sites (GWBS). 3.2. One-Third of 2432 Zeb2 DNA-Binding Sites Map Close to the Transcription Start Site of System-Relevant Expressed and Protein-Encoding Genes, Including the Zeb2 Gene Itself αV5-precipitated samples from upscaled Zeb2-V5 NPCs were used for ChIP-seq (see materials and methods), followed by analysis with Galaxy Software [[153]50]. Of the 2432 total significant peaks, 2294 peaks (94% of total) mapped to 1952 loci that encode proteins, while 125 peaks (5% of total) mapped to micro-RNA (miRNA) genes, and 1% to regions that lack annotation (NA, using ENSEMBL-GRCm38.99; [154]Figure 2A; [155]Table S1). Figure 2. [156]Figure 2 [157]Open in a new tab Zeb2-V5 protein is recruited at the TSS of transcriptional regulator encoding genes, predominantly those classified in Wnt signaling. (A) 2432 peaks were selected from our ChIP-seq data set (see [158]Section 2). Of these, 94% are associated with protein-coding loci, 5% with miRNAs, and the remaining 1% map to regions without functional annotation (NA); (B) Frequency plot showing the binding of Zeb2-V5 at and around (−10 to +10 kb) the TSS; (C) The 2294 peaks map to 1952 protein-encoding genes, many of which operate in Wnt signaling ([159]Table S1) or are (as shown in panel (D)) transcriptional regulators; (E) Of the 1952 protein-encoding genes, 1244 are differentially expressed during ND when compared to the undifferentiated state (D0). Of these 1244 genes, 335 are differentially expressed at all three time points of ND. We also list a few examples of DEGs uniquely expressed at one time point, as well as these that are shared among three time points; a full list is provided in [160]Table S2; (F) Overlap of the Zeb2-bound regions with H3K27ac, H3K4me1, and H3K4me3 histone marks in the −10/+10 kb from the TSS of up or downregulated genes at D8 of mESCs differentiation; (G) Overlap of the Zeb2-bound regions outside the −10/+10 kb region from the TSS with histone marks. About 37.5% of all binding sites of Zeb2-V5 are located within −10/+10 kb of annotated transcription start sites (TSS, [161]Figure 2B). Gene ontology (GO) pathway enrichment analysis of the aforementioned 1952 loci revealed binding of Zeb2 to classes of genes annotated to signaling by Wnt, integrin, chemokine/cytokine (predominantly as defined in inflammation) and cadherin, respectively, as well as to developmental signaling by EGF, VEGF, TGFβ, and FGF family pathways ([162]Figure 2C). Among these 1952 loci, those for genes encoding transcription regulatory proteins, and post-translational modification as well as metabolic enzymes, are well-represented ([163]Figure 2D). In parallel, we applied bulk temporal RNA-seq of WT mESCs at D0 (undifferentiated), D4 (induction of ND), D6 (early NPCs), and D8 (NPCs) and checked the expression dynamics of the 1952 Zeb2-bound genes (from the D8 ChIP-seq sample). Among these, 1244 changed in steady-state transcription levels between D4-8 as compared to D0 ([164]Figure 2E; log2FoldChange < −0.5 or >0.5 and p-value < 0.05; low-stringency analysis was opted to assess also small differences in mRNA of Zeb2-bound genes). Further, 335 of these genes, including Zeb2 itself, are commonly expressed between D4-6-8, but at different levels (for lists of all DEGs, see [165]Table S2). [166]Figure S2A depicts the D4, D6, and D8 transcriptomes of ND-mESCs, each compared to D0, with an indication of whether the genes are bound or not by Zeb2, as determined by our Zeb2-V5 ChIP-seq. At each of these respective time points, hence at different Zeb2 mRNA levels, about 11–14% of the up-/down-regulated DEGs are bound by Zeb2 ([167]Figure S2B). Among the Zeb2-bound genes that are normally down-regulated, Dnmt3l and Esrrb are present, suggesting that upregulation of Zeb2 in ND-ESCs (D6 and D8) directly causes downregulation of these two genes accordingly ([168]Figure S2C; Table S2). Zeb2 has been suggested as a direct repressor of Dnmt3l and Esrrb, facilitating the switch from self-renewal of ESCs to their exit from pluripotency, and promoting differentiation, since expression levels of all Dnmt3 genes remained higher in Zeb2-knockout (KO) ND-ESCs [[169]25]. However, these Zeb2-KO cells also convert very inefficiently into EpiLSCs and fail to exit from primed pluripotency. Importantly, among the Zeb2-binding genes whose mRNA levels increased during ND, Zeb2 itself is also present (yellow dot, [170]Figure S2C), indicating autoregulation. In fact, in this ND model, the highest recruitment of Zeb2-V5 in ChIP-seq data was mapped upstream of the TSS of Zeb2 ([171]Table S1). Out of the 1244 Zeb2-bound genes that significantly changed steady-state mRNA levels in our D4 to D8 transcriptome data sets, 213 are exclusive DEGs in D8 NPCs ([172]Figure 2E and [173]Figure S2D). Among these, Tcf4 is bound by Zeb2 and increases in expression in NPCs ([174]Figure 2E and [175]Figure S2D). Tcf4 is a ubiquitous basic helix-loop-helix (bHLH) type TF that binds to E-boxes; its many isoforms [[176]51,[177]52] cooperate with cell-type specific bHLH TFs in heterodimers, which are active during CNS development [[178]53,[179]54] (for a review, see [[180]55]). In oligodendrocyte precursors (OPCs), Tcf4 is essential for their subsequent differentiation. It dimerizes with the lineage-specific bHLH-TF Olig2, further promoting their differentiation and maturation [[181]56], while Zeb2, together with upstream Olig1/2, is essential for myelinogenesis in the embryonic CNS [[182]21]. Here, Zeb2 generates anti-BMP(-Smad)/anti-Wnt(-β-catenin) activities, which is crucial for CNS myelinogenesis by differentiation of OPCs. The regulatory action of Zeb2 on the Tcf4 target gene, as found in mouse cells by our ChIP-seq, may underpin phenotypic similarities between MOWS and Pitt-Hopkins syndrome patients (PTHS, OMIM #610954; for a recent discussion, see [[183]57]) the latter caused by mutations in TCF4 [[184]58], making us speculate that TCF4 may be deregulated in neural cells in MOWS. 3.3. Zeb2 Peaks Overlap with Active Enhancers and Promoters To assess whether Zeb2-peaks are present in the regulatory regions of up or down-regulated genes from our transcriptome data, we cross-referenced the coordinates of the Zeb2 broad peaks within −10/+10 kb from the TSS with the mouse ChIP-seq datasets available in ENCODE for nervous systems (cerebellum, cortical plate, olfactory bulb, forebrain, midbrain, hindbrain, neural tube, and olfactory bulb, respectively). We found that, of these datasets from histone ChIP-seq available in ENCODE, the respective H3K27ac, H3K4me1, and H3K4me3 marks were overlapping with our ChIP-seq data ([185]Figure 2F). The H3K27ac signature strongly overlaps with the Zeb2 peaks in genes upregulated at D8 (19% in upregulated genes vs. 4% in downregulated genes). For H3K4me1 and H4K4me3 marks, no big difference in overlap between up and down-regulated genes was observed. While the H3K27ac mark is associated with active enhancers, H3K4me1 is associated with primed enhancers, and H3K4me3 is considered a “promoter” marker [[186]59]. Taken together, these data suggest an activating role for Zeb2 here. Outside the −10/+10kb considered range, about 48% of the identified peaks had a 42% overlap with H3K27Ac and a 58% overlap with H3K4me1 histone marks ([187]Figure 2G). We then did motif enrichment analysis using UniBind ([188]https://unibind.uio.no, 1 May 2022) for TFs that could bind the Zeb2-bound peaks or could do so in proximity to up or down-regulated genes at D8 of mESCs differentiation ([189]Figure S3A,B). In those peaks close to the TSS of upregulated genes, binding motifs for Sox2, Gata2, and Tcf3 are very abundant. These TFs are known to function during NPC or ND. For example, Sox2 is an acknowledged marker for neurogenesis [[190]60]. It has been demonstrated that Zeb2 is needed to elicit anti-Sox2 activities in (re)myelination by adult Schwann cells in the PNS, needed for normal progression of commitment, differentiation, and maturation in this glial cell lineage [[191]21,[192]24,[193]61]. Tcf3 (also known as E2A) plays a role in stem cell self-renewal [[194]62], but is also important during neural fate commitment and possibly repressing Nodal signaling during ND [[195]63]. Gata2 has been associated with negative regulation of proliferation in NPCs and, as a result, further differentiation of these cells [[196]64]. However, how Zeb2 acts upon or together with these TFs during NPC differentiation is not fully known yet. In the peaks close to the TSS of downregulated genes there is a prevalence for CTCF, Fos, Myc, and Stat5a binding. CTCF is of interest because it acts as a link between 3D genome architecture and gene expression regulation. During NPC differentiation however, it was observed that 40% of the NPC-specific DNA loops were not CTCF-dependent, whereas, in other cell-state specific loops, this was only 10%, indicating a less important role for CTCF in the regulation of NPC differentiation compared to other cell lineages [[197]65]. This might indicate an interesting role for Zeb2 in binding and possibly regulating CTCF mRNA levels during NPC differentiation, supporting the subsequent activation of NPC-specific genes arising from e.g., repressing CTCF. Also here, more studies are required to get more insights into the cooperativity or counteracting actions of Zeb2 with candidate TFs in the regulation of the candidate Zeb2 targets. 3.4. Meta-Analysis of Identified Binding Sites and Perturbed-Zeb2 RNA-Seq Data Reveal Overlapping Zeb2 Target Genes We performed a meta-analysis of three published transcriptome data sets from control and Zeb2-KO mice: sorted E14.5 mouse ventral forebrain interneurons (Nkx2.1-Cre driven Zeb2-KO; [[198]36]) and sorted (at P2) progenitors of the ventricular-subventricular zone (V-SVZ), an adult neurogenic niche in the forebrain (Gsh2-Cre; [[199]23,[200]36]). In addition, we used high-throughput RT-qPCR data generated on a Fluidigm platform and obtained after esiRNA-based knockdown (KD) of Zeb2, as part of a systems-biology study in ND-mESCs [[201]37] (with the Zeb2 KD data subset kindly provided by R. Dries, Boston University). From these respective datasets, the DEGs upon the Zeb2 perturbations (p-value < 0.05; log2FoldChange < −1 and >1) were filtered. This identified 108 genes in total, and these depend on normal Zeb2 levels for their (i) downregulation/repression (if directly by Zeb2, as a repressor) or (ii) other genes that depend on Zeb2 for their upregulation/activation (if directly by Zeb2, as an activator) (for Zeb2 as dual TF [[202]29,[203]30,[204]66]. In parallel, the 2294 Zeb2-V5 sites mapping to the 1952 protein-encoding genes were filtered from the complete ChIP-seq dataset and then used as references for the RNA data sets