Graphical abstract graphic file with name fx1.jpg [47]Open in a new tab Highlights * • A new drug repurposing pipeline designed to identify compounds with desired traits * • Identified definitive endoderm differentiation inducers, reducing growth factor needs * • A list of potential inducers for future experimental screens was compiled * • Endoderm-mesoderm single-cell RNA-seq reveals potential drug targets __________________________________________________________________ In this article, Novakovsky and Sasaki et al. identified small molecules that can induce stem cell differentiation toward definitive endoderm in the absence of growth factors. One of these chemicals was found to potentially inhibit the MYC pathway. The computational pipeline can be used for the identification of compounds for other differentiation targets. Introduction Over 400 million people worldwide live with diabetes, a chronic disease caused by insulin deficiency or resistance. A promising emerging treatment approach uses human embryonic stem cell (hESC)-derived insulin-producing cells to replace lost or dysfunctional patient cells. Current protocols for generating insulin-producing cells mimic pancreatic β cells’ developmental specification by differentiating hESCs into primitive streak (PS), an intermediate fate branchpoint ([48]Davis et al., 2008), and then definitive endoderm (DE), a crucial pancreatic precursor ([49]Mahaddalkar et al., 2020) ([50]Figure S1A). Activation of the transforming growth factor β (TGF-β) and WNT pathways through extrinsic signals and inhibition of the PI3K/mTOR pathway lead hESCs to the PS branchpoint. These well-studied processes have known gene markers ([51]Table S1). Improving differentiation protocols for DE from hESCs remains a research focus, with some protocols using CHIR99021 (CHIR), a WNT inducer, and the growth factor activin A (AA), a TGF-β inducer ([52]Loh et al., 2014; [53]Naujok et al. 2014) ([54]Figure S1B). The complex manufacturing process and high cost of growth factors like AA makes current DE differentiation protocols expensive. Small molecules are ideal replacements: they are more stable, easier to store, allow for greater specificity, and have greater activity and reproducibility ([55]Pan and Liu 2019). Small molecules can successfully induce differentiation, even in the absence of growth factors ([56]Korostylev et al., 2017; [57]Borowiak et al., 2009). However, small molecules rarely act on one single target, which can cause undesired off-target effects, and their discovery, for instance by high-throughput screening, is an expensive and time-consuming process ([58]Waring et al., 2015). Drug repurposing is an alternative method for identifying compounds with desired properties ([59]Pushpakom et al., 2019). The Connectivity Map (CMap) is a catalog of induced gene expression signatures for thousands of compounds ([60]Lamb et al., 2006). It provides a platform for identifying potential inducers/inhibitors of a process of interest based on the similarity of the expression signatures of the process and those of compounds in the database. The application of CMap for drug repositioning has been demonstrated in different fields ([61]Liu et al., 2015a; [62]Zhang et al., 2015; [63]Dyle et al., 2014; [64]Brum et al., 2018). Recent studies increasingly utilize the Library of Integrated Network-based Cellular Signatures (LINCS) database, an expansion of the CMap catalog, including >500,000 gene expression signatures from the screening of >20,000 small molecules across 99 different cell lines ([65]Subramanian et al., 2017). Here, using DE differentiation as a model developmental process, we describe three in silico screening approaches to discover small molecules that can be used for directed differentiation. To accomplish this, we used transcriptomic profiles of AA-based DE differentiation or pathway and transcription factor (TF) target enrichment to query the CMap/LINCS catalogs and identify candidate AA replacements. We tested the ability of a subset of these candidates to drive endoderm differentiation from hESCs. This approach presents an efficient alternate means to optimizing stem cell differentiation protocols. Results Bulk RNA-seq analysis of public datasets reveals DE-specific TFs and pathways To identify potential endoderm inducers, we built a profile of gene expression changes of hESC differentiation into DE ([66]Figure S1A). We analyzed two publicly available bulk RNA sequencing (RNA-seq) time-series datasets that capture gene expression changes in hESCs after the addition of AA to the media ([67]Figure S2A), GEO: [68]GSE75748 ([69]Chu et al., 2016) (H1 hESCs) and [70]GSE109658 ([71]Lu et al., 2018) (H9 hESCs). We then performed differential expression analysis at 0 (i.e., pluripotent stage) and 96 h (i.e., DE stage) and compared the differentially expressed genes between the two datasets. Both up- and down-regulated genes significantly overlapped between the two datasets, with 735 and 433 genes, respectively ([72]Table S2). The analysis confirmed not only agreement between the two datasets ([73]Figure S2B) but also highlighted groups of genes with similar and expected expression patterns ([74]Table S1). For instance, the pluripotency markers POU5F1, SOX2, and NANOG showed a steady decrease of expression over time. In contrast, expression of mesoderm markers, such as TBXT, FGF4, or CDX1, peaked at 24 h, declining shortly after, while the expression of endoderm markers including SOX17, GATA6, and EOMES increased over time ([75]Figure 1A). Notably, the number of differentially expressed genes at 96 h was in the order of thousands, which is consistent with the significant changes that cells undergo in response to AA treatment ([76]Chu et al., 2016) ([77]Figure S2C). Figure 1. [78]Figure 1 [79]Open in a new tab Analysis of public DE differentiation datasets (A) The expression trend of key genes involved in pluripotency and mesoderm/endoderm development; shows consistency among public datasets. (B) Heatmaps with activation values of pathways and transcription factors along differentiation timeline respectively; clustering is based on GEO: [80]GSE75748 values. Next, we explored pathway activities at each time point using single-sample gene set enrichment scores ([81]Hänzelmann et al. 2013) ([82]Figures 1B and [83]S3, left panels). For ease of interpretation, estimated activities of Hallmark pathways from the Molecular Signatures Database (MSigDB) ([84]Liberzon et al., 2015) were clustered. DNA repair and spermatogenesis pathways were significantly up-regulated in early stages ([85]Table S3), consistent with previous reports linking cell-cycle-related pathways with pluripotency ([86]Chappell and Dalton 2013; [87]Hsu et al., 2019). Similarly, MYC and E2F targets showed a strong initial activity that became negligible in later time points. The following pathways were activated at later time points: p53, apoptosis, hypoxia, epithelial-mesenchymal transition (EMT), PI3K/AKT/mTOR, tumor necrosis factor α (TNF-α) via nuclear factor κB (NF-κB), interleukin-2 (IL-2)/STAT5 signaling, WNT, and TGF-β. All identified pathways were significant ([88]Table S3). The activation of TGF-β, p53, and apoptotic pathways was consistent with expectations ([89]Wang et al., 2017; [90]Lanneau et al., 2007). However, activation of the PI3K/AKT/mTOR pathway was surprising, as it has been shown to promote neuroectoderm differentiation ([91]Yu and Cui 2016), and the roles of NF-κB and IL-2/STAT5 pathways in DE differentiation were unclear. Using the same procedure, we assessed TF activity given the enrichment of their respective targets ([92]Figures 1B and [93]S3, right panels). POU3F and NFI, among others ([94]Table S3), showed enhanced activity in early time points, which is consistent with their roles in neural plate development, possibly indicating the “readiness” of stem cells to adopt a neuroectoderm fate while awaiting an external signal ([95]Iwafuchi-Doi et al., 2011). In the subsequent 48 h, different TFs became active. Interestingly, receptor-regulated SMADs displayed different patterns of activity, with SMAD3 and its co-regulator SMAD4 exhibiting a stable increase. The activity of TFs GLI1, -2, and -3 increased steadily over time, in agreement with the important role of the Hedgehog signaling pathway in endoderm development ([96]Deol et al., 2017). We also observed an increase of HIF1A activity, which is a hypoxia-inducible factor. We could not find prior reports linking the observed JUN activity to the formation of endoderm lineage ([97]Figure S3, right). Taken together, the analysis of the bulk RNA-seq data highlighted the importance of TGF-β, hypoxia, mTOR, and other pathways, as well as the TFs SMAD3 and -4 in the differentiation process. Single-cell RNA-seq analysis of the PS branchpoint identifies potential regulators of DE fate Cell differentiation is a complex, heterogeneous, and continuous process; its analysis at single-cell resolution could potentially reveal novel TFs guiding cell fate decisions, as well as new intermediate cell types. We therefore performed single-cell RNA-seq (scRNA-seq) of the PS branchpoint ([98]Figure 2A). We profiled at 36 and 72 h post-induction with AA and at 72 h post-induction with CHIR, which induces mesoderm (ME) differentiation. After data pre-processing and clustering, cells were annotated into different developmental stages (hESC, PS, ME, and DE) based on known gene markers whose expression correlated well with the sampled time points ([99]Figure 2B; [100]Table S1). Then, we identified differentially expressed TFs in DE, ME, and PS. We hypothesized that TFs inducing DE lineage would be differentially expressed in both PS (the branchpoint) and DE. Similarly, common differentially expressed TFs in PS and ME would induce ME lineage. The set of DE inducers included such known TFs like EOMES, GATA6, GSC, HHEX, LHX1, and OTX2, whereas FOXH1, MSX1, SP5, and TBXT were among the set of ME inducers ([101]Table S4). Pseudotime analysis revealed that both groups of genes became strongly up-regulated through the course of differentiation at each respective stage, supporting the role of these TFs in regulating lineage specification ([102]Figures 2C and 2D). Figure 2. [103]Figure 2 [104]Open in a new tab Single-cell RNA-seq analysis (A) t-Stochastic neighbor embedding (tSNE) plot of the analyzed dataset; major cell clusters are labeled with the corresponding cell type names. (B) Distribution of the expression values of the marker genes, which were used to annotate the clusters. (C) Pseudotime ordering of the cells shows the expected bifurcation point with DE and ME branches. (D) Expression trends of the key mesoderm markers along the differentiation axis. The absence of the “spike” that is present in bulk data (A) can be explained by low depth and resolution of the single-cell data. (E) The regulon specificity score heatmap of the most informative TFs for each of the cell types. (F) The distribution of GSEA scores per single cell for the key developmental pathways. The asterisks (^∗∗∗∗) highlight the significance of the change with respect to hESCs (Wilcoxon test). Next, we identified “active” TFs during the course of differentiation using SCENIC ([105]Aibar et al., 2017). Overall, SCENIC results were in agreement with those of the bulk RNA-seq analysis. Consistent with previous reports, SCENIC identified endoderm-related factors, including LHX1 and KLF6/8, OTX2 and FOXA2, SOX17, EOMES and ETS1/2, FOXQ1, and SIX3. It also revealed potentially new markers such as RREB1, CUX1, IRF9, DDIT3, TFAP2C, PBX3, JUN, and JUND ([106]Figure 2E). Notably, RREB1 ([107]Lee et al., 2012) and CUX1 ([108]Ripka et al., 2010) are important for the development of the primitive gut tube and pancreas, while DDIT3 is a known apoptosis inducer ([109]Papathanasiou et al., 1991), and TFAP2C might be involved in mesoderm specification ([110]Madrigal et al., 2020). The identification of DDIT3, IRF, JUN, and TFAP2 was consistent with the previous bulk RNA-seq results ([111]Figure 1B). However, the role of these genes in endoderm formation is not yet understood. For ME, we identified the following activated TFs: CDX1/2/4, which are involved in hematopoiesis ([112]Paik et al., 2013), TCF4, LEF1, SP5, HOXB, TBX6, and FOXH1, which is also involved in heart development ([113]von Both et al., 2004). The role of other top ranked TFs in ME development, such as MAX, ZNF90, DBP, E2F1, and TFDP2, remains unknown ([114]Figure 2E). SCENIC also assigned high activities to the pluripotency markers POU5F1 (OCT4), NANOG, and MYC in stem cells. SCENIC did not detect any enriched TFs in PS cells. Finally, we performed gene set enrichment analysis (GSEA) at the single-cell level, identifying high TGF-β and low WNT pathway activities in DE cells ([115]Figure 2F). The activity of the mTORC1 Hallmark pathway was reduced in DE cells compared with hESCs and PS. We also observed down-regulation of the MYC pathway and activation of the hypoxia pathway in DE and ME compared with hESCs. GSEA of SMAD targets revealed a strong enrichment of SMAD2, -3, and -4 in PS and DE stages, corroborating their important role in DE differentiation. These results were consistent with those of the bulk RNA-seq data ([116]Figure 1B). Taken together, scRNA-seq provided additional evidence for the importance of TGF-β/SMAD activation and mTOR/WNT inhibition in DE differentiation. It also uncovered TFs that could be important for lineage specification at the PS branchpoint. Drug repurposing identifies candidate DE inducers We performed a CMap analysis using ssCMap ([117]Zhang and Gant 2009) on 300 common up- and down-regulated genes from the bulk RNA-seq analysis (150 for each cohort; [118]Figure 3A, right panel) that relies on the initial release of CMap. Similarity of ssCMap profiles to the DE differentiation profile was assessed by GSEA ([119]Subramanian et al., 2005). The analysis led to the identification of the PI3K inhibitors LY-294002 (LY29) and wortmannin, as well as the mTOR inhibitor sirolimus, whose expression profiles significantly correlated with the gene expression signature of DE (PC3 cell line) ([120]Figure 3B; [121]Table S5). The utility of this approach is highlighted by previous studies in which LY29 and wortmannin have been used along with AA to induce DE ([122]Pan and Liu 2019), and mTOR inhibition has been shown to improve PS induction ([123]Zhou et al., 2009). Using the signatures of the same 300 common up-/down-regulated genes, we performed GSEA on the updated CMap/LINCS catalog. We set a threshold on the enrichment score to 0.35, resulting in the selection of 313 out of ∼200,000 conditions corresponding to 298 unique compounds ([124]Figure 3C; [125]Table S5). Some of the top identified hits were heat shock protein 90 (HSP90) and histone deacetylase (HDAC) inhibitors. HSP90 has been reported to inhibit mesoderm and endoderm fates ([126]Bradley et al., 2012), while HDAC inhibitors have been linked to endoderm lineage formation ([127]Zhou et al., 2007). Together with ssCMAP, this analysis resulted in 299 (298 that include both sirolimus and wortmannin plus LY29) unique compounds. Later, we refer to this approach as “ssCMAP/LINCS” ([128]Figures 3B and 3C). Figure 3. [129]Figure 3 [130]Open in a new tab Drug repurposing pipeline (A) Different sources of input for drug repurposing; all are based on the RNA-seq results of the public datasets. (B) ssCMAP result for endoderm transcriptomic profile (intersection of GEO: [131]GSE109658 and [132]GSE75748). (C) Distribution of top 40,000 compound signatures ranked by GSEA enrichment score. The threshold of 0.35 is highlighted with a dashed line. (D and E) Spearman (D) and Pearson (E) correlation of LINCS signatures with transcriptomic profiles from the public datasets (GEO: [133]GSE75748 - x axis; [134]GSE109658 - y-axis); correlation threshold of 0.15 for both datasets is highlighted with transparent rectangles. (F) Pipeline for the identification of potential TGF-β/SMAD inducers. MCF-7 drug profiles are filtered based on dose and duration of treatment. The filtering is followed by GSEA with Hallmark and KEGG pathways and TF targets from RegNetwork. (G) Overlap between three different approaches. As an alternative to using a subset of genes for GSEA in drug reference profiles, we compared the differential expression metric of all genes (i.e., the T-statistic) with the LINCS perturbation Z scores using Pearson and Spearman correlations ([135]Figure 3A, middle panel). Concretely, we looked at how compound profiles from LINCS are correlated with the full gene expression profiles from both public RNA-seq datasets (GEO: [136]GSE75748 and [137]GSE109658). This approach (henceforth called the “correlation” approach; [138]Figures 3D and 3E) has the advantage that it avoids thresholding (no fixed size gene sets). Both correlation metrics agreed well within and between datasets ([139]Figures 3D and 3E; [140]Table S5), and results were consistent with the “ssCMAP/LINCS” approach: we observed the presence of PI3K and mTOR inhibitors, including LY29, wortmannin, and sirolimus, as well as HDAC and HSP90 inhibitors, such as geldanamycin and tacedinaline. Setting a threshold on the Pearson and Spearman correlation of 0.15, we identified 395 unique compounds. The overlap with the 299 “ssCMAP/LINCS” compounds was 37 unique molecules. During the bulk and scRNA-seq data analyses, we observed TFs/pathways that could potentially induce DE differentiation ([141]Figures 1B and [142]3A, left panel), including SMAD2, -3, and -4, known to be essential for the activation of the TGF-β signaling pathway ([143]Nakao et al., 1997). Instead of looking at individual genes, alternatively, we searched for compounds that could activate the pathways and respective TFs. Using GSEA, we identified a list of molecules from LINCS whose profiles were enriched for TGF-β pathways, as well as for genes regulated by SMADs ([144]Figure 3F). This “pathway/TF” approach is based on comparing pathway/TF enrichment profiles instead of individual genes. Briefly, we focused on the cell line with the most abundant data in LINCS, MCF-7. Overall, 1,287 compounds were enriched for the TGF-β signaling pathway according to definitions from either Hallmark ([145]Liberzon et al., 2015) or KEGG pathways ([146]Ogata et al., 1999). Similarly, 1,062 compounds were enriched for SMAD targets according to the RegNetwork database ([147]Liu et al., 2015b). The entire set of results is provided ([148]Table S6). A total of 408 unique molecules overlap between both datasets ([149]Table S5). Taken together, we applied three drug repurposing strategies ([150]Figure 3). The overall number of unique molecules was 999, with seven in common ([151]Figure 3G; [152]Table S5). For further experimental validation, we focused on the seven common compounds. We also purchased 10 molecules from “pathway/TF” for additional testing (which were selected based on the cost and availability). The reasons for the last purchase are detailed in the [153]discussion. Experimental validation of the compounds reveals new DE inducers In order to test candidate compounds for their ability to induce DE formation, we designed an endoderm reporter using CRISPR, as previously described ([154]Krentz et al. 2014): a SOX17-mNeonGreen (SOX17-mNG) knockin H1 hESC line that preserved the endogenous SOX17 mRNA sequence ([155]Figure 4A). Wild-type H1 hESCs and SOX17-mNG cells were differentiated into DE cells for 3 days ([156]Figures 4B–4D) using the standard protocol: 100 ng/mL AA + 3 μM CHIR for 1 day, followed by 100 ng/mL AA for 2 days (i.e., AC-A-A; [157]Figure 4E). No changes in the number of CXCR4+ cells were observed by fluorescence-activated cell sorting (FACS) ([158]Figure 5A), suggesting that SOX17-mNG targeting did not impact endoderm differentiation. Figure 4. [159]Figure 4 [160]Open in a new tab SOX17-NG system for molecule screening (A) Schematic representation of mNeonGreen cassette used in this study. (B) Schematic representation of hESC differentiation toward DE with AA protocol. (C) mNeonGreen-2A-SOX17 knockin H1 (WA01) cells (homozygous mutant) differentiated by planar culture for 3 days. (D) Suspension culture with standard protocol. Days 2 and 3. (E) DE-induction efficiency oriented by AA. AC-A-A, activin A and CHIR99021 on day 1 and AA on day 2 and 3; AC-N-N, AA and CHIR on day 1 and no morphogens on day 2 and 3, etc. (n = 5 independent biological replicates). Data are presented as the mean ± SEM. (F) DE-induction efficiency by small molecules. Every treatment includes CHIR on day 1, and AA is replaced by each molecule described on y axis. LoAA, treatment with CHIR on day 1 and AA by 1:10 concentration for 3 days; LoAA w/BRD-K42644990, treatment with CHIR on day 1 and BRD and AA simultaneously by 1:10 concentration for 3 days (n = 3 independent biological replicates). Data are presented as the mean ± SEM. (G) DE-induction efficiency by small molecules, MCF7 LINCS profile of those not enriched for TGF-β or SMAD targets. Every treatment includes CHIR on day 1, and AA is replaced by each molecule described on y axis (n = 3 independent biological replicates). Data are presented as the mean ± SEM. (H) DE-induction efficiency by small molecules that are listed by all three drug repurposing approaches (n = 3 independent biological replicates). ^∗p < 0.05 (compared with C-N-N in F and compared with C-N-N in E). Data are presented as the mean ± SEM. Figure 5. [161]Figure 5 [162]Open in a new tab Characterization of DE cells induced by small molecules (A) FACS data of DE cells differentiated from ESCs for 72 h by each treatment. DE markers (CXCR4 and mNeonGreen reporter for SOX17) are analyzed. (B) FACS data of induced DE cells for surface markers CD177/CD275 indicating DE heterogeneity. (C) Quantification analysis of induced DE cells for mNeonGreen, CXCR4, CD177, and CD275. ESC, non-treated ESCs as negative control; AA, activin A and CHIR99021 on day 1 and AA on days 2 and 3 as standard treatment; JNJ, JNJ and CHIR on day 1 and JNJ-only treatment on days 2 and 3; BRD, BRD and CHIR on day 1 and BRD-only treatment on days 2 and 3; BRD+1:10AA, treatment with CHIR on day 1 and BRD and AA simultaneously by 1:10 concentration for 3 days. ^∗p < 0.05; n = 4 independent biological replicates. Data are presented as the mean ± SEM. High-content screening (HCS) was conducted using a confocal high-content imaging system to quantify SOX17-mNG-expressing cells. On day 3, SOX17-mNG+ cells comprised 59.3% ± 2.1% of Hoescht 33342 staining cells ([163]Figure 4E) compared with 87.1% ± 2.1% using FACS ([164]Figure 5A). The lower endoderm induction efficiency observed with the HCS system may result from being unable to exclude dead cells that would be Hoechst 33342+ and mNG− ([165]Figures 4E, 4F, and [166]5A–5C). A single-day of AA + CHIR treatment induced 32.2% ± 2% cells to form DE (i.e., AC-N-N; [167]Figure 4E). Neither CHIR (i.e., C-N-N) nor AA treatments without CHIR on the first day (i.e., A-A-A) could induce DE formation (4.7% ± 0.7% and 0.2% ± 0%, respectively; [168]Figure 4E). These results confirm that for these screening conditions, the combination of CHIR and AA is required for inducing the formation of DE. Next, we tested 10 purchased compounds from the “pathway/TF” analysis for their capacity to induce DE in the absence of, or with a reduced concentration of, AA ([169]Figure 3F; [170]Table S5). Taking into account recent studies ([171]Li et al., 2019; [172]Madrigal et al., 2020), we were also interested in testing a small-molecule JNK inhibitor (JNKi). As the 408 compounds from the pathway/TF enrichment approach included several JNKis that were not accessible, we included an alternative JNKi that was available, making it 11 compounds in total. LY29 was added to the compound list due to its strong drug repurposing support (“ssCMAP/LINCS” and “correlation” approaches, see above) together with JNJ42041935 (henceforth called JNJ), a HIF1 stabilizer that mimics hypoxia. Hypoxic treatment of hESCs has been shown to enhance DE formation ([173]Chu et al., 2016), suggesting that a physiological hypoxic condition during early development contributes to the formation of DE cells. However, HIF1 stabilizers have not been used for in vitro DE formation. First, we determined whether the candidate compounds could improve DE differentiation after an initial 1 day treatment with AA; eight out of 13 drug repurposing compounds (10 plus JNJ, JNKi, and LY29, ∼62%) were more efficient than the baseline treatment (i.e., AC-N-N). Then, we sought to determine whether they could induce DE in the absence of AA; JNKi, LY29, BRD-K42644990 (henceforth called BRD), and JNJ could ([174]Figure 4F). DE induction by BRD and JNJ was further confirmed by FACS (BRD, 51.7% ± 1.6%; JNJ, 53.8% ± 2.2%) ([175]Figure 5B). As negative controls, we screened 10 additional compounds that were not expected to activate TGF-β nor SMAD signaling pathways according to our third drug repurposing approach. None induced DE formation ([176]Figure 4G), confirming the robustness of the drug repurposing strategy. Similar screening conditions (CHIR reduced to 1.75 μM; see [177]experimental procedures) were used to evaluate the set of seven compounds identified by all three in silico drug repurposing approaches ([178]Figure 4H). Of these, PI3K inhibitors PI-103 and wortmannin induced DE differentiation, with efficiencies near 63% and 46%, respectively. Cytochalasin B, an inhibitor of actin polymerization, affected DE differentiation to a lesser extent. While all were less effective than AA, which reached 89% in the [179]Figure 4H screen, the P13K inhibitors were more effective than treatment with CHIR alone (17.5%). For this particular screen, we observed a higher DE efficiency using CHIR alone. This may be attributable to the sensitivity of the screen to variations in CHIR source, which differed from the source in other screens. Among the seven candidate molecules, BRD is a novel compound with an unknown mechanism of action in the context of DE induction from hESCs. To test whether BRD could replace AA, at least partly, regarding its cost effectiveness, we investigated its efficacy for DE induction by combining BRD with a reduced amount of AA. Treatment with 3 μM CHIR for the first day only and BRD +10 ng/mL AA (1/10 the standard concentration) for all three days induced 59.8% ± 4.6% SOX17-mNG+ cells as detected by the HCS system and 72.5% ± 0.7% as detected using FACS, representing a DE efficiency similar to that of the standard protocol but at a 90% cost reduction ([180]Figure 4F). Therefore, among the compounds that were able to induce DE formation in the absence of AA, the subsequent analyses focused on JNJ and BRD (10 ng/mL AA). Population analysis of BRD-/JNJ-induced DE cells, and subsequent differentiation into pancreatic endoderm and β-like cells Similar to DE in vivo, in-vitro-differentiated DE is heterogeneous: it contains CD177+ and CD275+ cell subpopulations with a preference for differentiating into either pancreatic or liver lineages, respectively ([181]Mahaddalkar et al., 2020). To assess whether BRD-/JNJ-induced DE cells achieved such heterogeneity, we performed FACS on differentiation day 3 ([182]Figure 5B). mNG+ endodermal cells differentiated by BRD, JNJ, and/or AA were predominantly CD177+, suggesting a preferential competence for pancreatic progenitor cells ([183]Figure 5C). Next, in order to test whether BRD and JNJ induced DE cells in other pluripotent stem cells (PSCs), we differentiated another hESC line (CyT49) and one hiPSC line (1–21) into DE cells. Both compounds could successfully induce DE cells from these PSCs ([184]Figure S4). Then, we tested whether BRD- and JNJ-induced DE cells could further differentiate into organ-specific cells such as pancreatic progenitors and β-like cells. On differentiation day 15, gene expression levels of pancreatic progenitor markers NKX6-1 and PDX1, and the endocrine progenitor marker NEUROD1, were comparable between BRD-, JNJ-, and AA-induced DE cells after adjusting for the SOX17-GFP+ cell ratio on day 3 ([185]Figure 6A). Moreover, for NKX6-1 and PDX1, protein expression levels were also enriched by flow cytometry ([186]Figure 6B). Immunostaining revealed that NKX6-1 and PDX1 proteins were expressed in differentiated spheroids on day 15 ([187]Figure 6C). On differentiation day 22, differentiation efficiency toward β-like cells was assessed by flow cytometry using an INS-2A-GFP knockin add-on hESC line in which GFP+ cells represent insulin-expressing β-like cells. Interestingly, BRD and AA induced β-like cells ([188]Figure 6D; 16.1% and 69.8%, respectively) reflecting the DE-induction efficiency from day 3 ([189]Figure 6A; 19.5% and 81.9%, respectively), whereas JNJ could not induce β-like cells efficiently ([190]Figure 6D). Taken together, these results suggest that BRD- and JNJ-based protocols can induce endoderm that is largely similar to that generated with AA and that BRD- and JNJ-induced DE can differentiate toward pancreatic progenitor cells. Particularly, BRD could induce a pancreatic β-like and endocrine stage with similar efficiency to AA, suggesting that BRD at least partially shares the mechanism of action for DE induction with AA. Figure 6. [191]Figure 6 [192]Open in a new tab Characterization of pancreatic progenitor cells induced by small molecules (A) mRNA expression levels of a stem cell marker (Nanog) and pancreatic progenitor markers (PDX1, NKX6-1, and NEUROD1) for pancreatic progenitor cells differentiated by small molecules for 15 days are quantified by nCounter assay (Nanostring) (top panels). Each mRNA expression level is adjusted by the ratio of mNeonGreen cells in all cells on differentiation day 3 (bottom panels). ^∗p < 0.05 compared with the other three groups; n = 3 independent biological replicates. Data are presented as the mean ± SEM. (B) PDX1 protein+ cells are quantified for day 15 pancreatic progenitor (PP) cells by FACS. (C) Immunostaining for PDX1 and NKX6-1 proteins on AA- and BRD-induced PP cells. Scale bar, 50 μm. (D) BRD− or BRD+ AA-induced DE can support differentiation of insulin-expressing β-like cells during a 22 day differentiation protocol as assessed by flow cytometry using an INS-2A-GFP knockin add-on hESC line (n = 6 independent biological replicates for AA, n = 3 independent biological replicates for the other samples). Data are presented as the mean ± SEM. RNA-seq analysis of BRD-induced DE differentiation reveals down-regulation of the MYC pathway Both the targets and mechanism of action of BRD are unknown. Using DeepCodex ([193]Donner et al. 2018) and SwissTargetPrediction ([194]Daina et al. 2019), which can predict a compound’s targets based on its similarity to known molecules, we did not obtain any significant hits. To explore its mechanism of action, we designed a bulk RNA-seq experiment wherein we tested the effect of a BRD treatment on gene expression alone and in combination with CHIR (BRD + CHIR) and compared it against a control pluripotent state and the standard AA protocol. To ensure that we would identify BRD-specific effects, we compared the BRD treatments against a CHIR-only protocol ([195]Figure 7). Furthermore, we included a JNJ with CHIR (JNJ + CHIR) treatment as a positive control—another small molecule that can induce DE—and with a known mechanism of action. Finally, since low doses of AA could increase the efficiency of the BRD + CHIR treatment, we also tested the sample from this modified treatment at 72 h. Figure 7. [196]Figure 7 [197]Open in a new tab Effect of different compounds on hESC differentiation Heatmap of marker gene expression and pathway activities (left). Scaled expression of the marker genes for different differentiation protocols. (Right) Single-sample gene set enrichment scores of selected Hallmark pathways. An exploratory analysis of the bulk RNA-seq data revealed a strong confounding effect by CHIR during the initial differentiation stages; almost no genes showed significant differences in expression levels between the BRD + CHIR and CHIR-only treatments ([198]Table S7). After 6 h of BRD + CHIR treatment, we observed a significant increase in the expression levels of EOMES (vs. control pluripotent state), which is one of the key TFs guiding the endoderm lineage ([199]Table S7) ([200]Teo et al., 2011). However, AA induced additional regulators, including MIXL1, TBXT, WNT3, CER1, and FGF8, and induced higher EOMES expression levels than BRD + CHIR ([201]Table S7). Notably, AA strongly inhibited the pluripotency factor SOX2; in contrast, the expression levels of SOX2 with both the BRD + CHIR and JNJ + CHIR treatments remained relatively high ([202]Table S7). The expression profiles of key genes, such as EOMES, SOX17, and GATA6, for the BRD + CHIR and JNJ + CHIR treatments were almost identical, indicating either the existence of some commonalities or, most likely, a strong confounding effect by CHIR ([203]Figure 7). Interestingly, the only significantly expressed gene in the JNJ + CHIR vs. the BRD + CHIR treatment at 6 h was DDIT4, a HIF1 responsive protein involved in mTORC1 inhibition ([204]Corradetti et al. 2005) ([205]Table S7). Finally, regardless of the treatment, we observed clear endoderm signals at 72 h ([206]Figure 7; [207]Table S7); however, although trends in expression levels were comparable, cells derived from the AA treatment displayed higher expression levels of endoderm-specific markers ([208]Figure 7). Next, to characterize BRD-specific effects, we treated H1 hESCs with only BRD for 6 and 24 h. BRD-induced gene expression was clearly distinct from treatments leading to the endoderm lineage trajectory ([209]Figure 7). For instance, BRD induction of EOMES and other endoderm-important genes were strongly down-regulated; however, the expression levels of pluripotency markers (SOX2, NANOG, and POU5F1) were stable ([210]Figure 7). Noteworthy, the expression of BMP3, an inhibitor of both AA and BMP signaling ([211]Gamer et al., 2005) that can activate SMAD2-dependent signaling by binding to ActRIIB (activin receptor type 2B) ([212]Wen et al., 2019), was highly up-regulated at 24 h ([213]Figure 7, highlighted). Shifting the focus of the analysis on the enrichment of gene terms, such as pathways and TF targets, we found that BRD-specific responses were mostly associated with down-regulated pathways, among which MYC was the most strongly inhibited both at 6 and 24 h ([214]Figure 7, highlighted; [215]Table S7). Previous studies have found that a strong down-regulation of MYC is correlated with high expression of meso- and endoderm markers ([216]Cliff et al., 2017). The analysis also revealed a high activation of the protein secretion pathway, whose role in the differentiation process was unclear ([217]Figure 7, highlighted). Taken together, these results suggest a potential key role of MYC inhibition, and possibly BMP3 up-regulation, in the mechanistic effect of BRD on pluripotent state exit and promotion of endoderm differentiation under WNT signaling via CHIR. Discussion Using three orthogonal drug repurposing approaches, we identified 999 candidate inducers for DE differentiation ([218]Table S5). The set is enriched for potential inducers, including PI3K, HDAC, and mTOR inhibitors, some of which are used in existing protocols ([219]Thakur et al., 2020). Testing seven common compounds among the three approaches, we discovered at least two endoderm inducers, both (PI-103 and wortmannin) known PI3K inhibitors. While PI3K antagonists are used in endoderm differentiation protocols, to our knowledge, we are the first group to report induced endoderm with these molecules in the absence of AA. The three drug repurposing strategies did overlap broadly: “ssCMAP/LINCS” (299) and “correlation” (395) only overlapped for 37 molecules, while each had in common with the “pathway/TF” strategy 14 and 59 compounds, respectively ([220]Figure 3G). While these approaches differ in methods and data, greater overlap might be anticipated. These results may reflect limitations with both CMap- and LINCS-based drug repurposing ([221]Lim and Pavlidis 2019). Thus, intersecting multiple methods and focusing on the limited subset of consensus compounds could be beneficial. We tested 11 candidate compounds for their ability to induce DE, as well as three previously described or novel endoderm inducers, including IDE1, LY29, and HIF1a stabilizer JNJ, making it 14 compounds total. When adding AA only for the first day, DE was induced by 11 out of 14 compounds. Among these, BRD, JNJ, JNKi, and LY29 were able to induce DE formation in the absence of AA ([222]Figure 4F). Of note, alone, IDE1 could not induce hESC differentiation into endoderm, which is consistent with previous studies ([223]Tahamtani et al., 2014). None of 10 negative control compounds induced expression of endodermal markers. Previous blind screening approaches are relevant ([224]Figure 4G). For example, Korostylev et al. screened 23,406 compounds, of which only 67 induced FOXA2 expression (endoderm marker) ([225]Korostylev et al., 2017), and Borowiak et al. screened 4,000 compounds, identifying only two molecules, IDE1 and IDE2, that fit their criteria for DE induction ([226]Borowiak et al., 2009). These results demonstrate that our in silico approach of candidate selection improves the outcome of screening studies. Moreover, we report potential endoderm inducers suitable for future screening studies. Among the identified DE inducers, BRD lacks functional annotation. According to the LINCS pathway enrichment analysis, BRD is expected to be a TGF-β and SMAD inducer in MCF7 cells. However, our RNA-seq experiments revealed that this is not the case in hESCs. Moreover, the overall correlation between two BRD expression profiles (tested on ESCs vs. on MCF7 from LINCS) was very low, 0.08. This highlights a limitation of the drug repurposing approach, as different cells have distinct characteristics ([227]Lim and Pavlidis 2019). In hESCs, BRD strongly inhibits the MYC pathway while inducing the BMP3 (weakly) and protein secretion pathways ([228]Table S7). Both mechanisms seem relevant for DE differentiation: stauprimide, a molecule that primes hESCs toward DE is a known MYC inhibitor ([229]Tahamtani et al., 2014), and BMP3 activates SMAD2 signaling ([230]Wen et al., 2019). Recently, replacement of AA by dorsomorphin, an inhibitor of the BMP ALK2 receptor, and titration of CHIR concentration have been shown to induce DE ([231]Jiang et al., 2021). The authors also reported DE induction by CHIR in the absence of AA, whereas another group observed no induction ([232]Li et al., 2019), and in our protocol, CHIR alone only induced a small DE population (4.7%–17%; [233]Figures 4E and 4H). Some of these differences could be explained by different molecule compositions in each protocol, such as the presence of insulin and the absence of B27 in Li’s and our protocols, suggesting that the orchestration of multiple signaling pathways at various times is necessary for optimized DE induction. Differences in CHIR preparation or storage conditions could also contribute to variable effects observed. Of note, inhibition of BMP signaling was involved in DE induction both by dorsomorphin and BRD. In addition to detecting AA replacements, we generated an scRNA-seq dataset of the DE-ME branch. Our main goal was to discover key regulators of the branchpoint. Inhibiting the activity of lineage-specific TFs, especially in key developmental stages, might be a sound differentiation strategy ([234]Palii et al., 2019). We identified TFs co-expressed at the PS branchpoint that serve as either mesoderm or endoderm markers. The T-box TFs EOMES and Brachyury (see [235]Table S4 for the full list) have a potential cross-antagonizing effect at a mesoderm-endoderm precursor stage ([236]Tosic et al., 2019). Brachyury is important for ME development ([237]Faial et al., 2015), while EOMES is crucial for DE ([238]Teo et al., 2011). Inactivating Brachyury at this branchpoint might promote differentiation toward endoderm. In conclusion, we have developed an in silico approach for identifying small molecules intended to optimize differentiation of PSCs toward endodermal tissues. Our approach will be of utility to discover new molecules that could guide cell differentiation and reduce the expense of differentiation protocols. Experimental procedures Resource availability Corresponding authors Francis C. Lynn ([239]francis.lynn@ubc.ca) and Wyeth W. Wasserman ([240]wyeth@cmmt.ubc.ca) are the corresponding authors. Materials availability This study did not generate new unique reagents. Experimental methods H1 and CyT49 human ESC lines were used for experiments approved by the UBC Children’s and Women’s Research Ethics Board (H09-00676). The H1 and CyT49 hESC lines were generously provided by WiCell (Madison, WI, USA) and ViaCyte (San Diego, CA, USA) respectively. 1–21 human iPSCs were generated as previously described ([241]Krentz et al., 2017). The SOX17-mNG and INS-2A-EGFP hESC lines were generated in H1 cells as described in [242]Krentz et al. (2014). A homozygous SOX17-mNG clone was used for these experiments, and a heterozygous INS-2A-GFP clone was used. hESCs were differentiated using the protocol published by [243]Nair et al. (2019). Morphogens were replaced for specific experiments as described in the manuscript. ImageXpress Micro Confocal System (Molecular Devices) was used as a high content imaging system. hESC-derived spheroids were collected and dissociated into single cells for FACS. LSRFortessa (BD Bioscience, San Jose, CA, USA) was used for FACS. nCounter assay (Nanostring, Seattle, WA, USA) was performed as gene expression assay. For immunostaining, spheroids and sections were imaged using a Leica SP8 confocal microscope. Author contributions G.N., S.S., F.C.L., W.W.W., S.M., and A.C. conceived the idea; G.N., S.S., F.C.L., and W.W.W. designed and performed the analyses; G.N. wrote the code, performed the drug repurposing pipeline, and analyzed the bulk RNA-seq data; M.E.O. analyzed scRNA-seq data; P.P. and N.L. provided preprocessed LINCS data; S.S., F.C.L., C.L.B., H.H., and D.Z. performed the experiments; F.C.L., W.W.W., S.M., O.F., and A.C. provided advice on all aspects of the work; G.N., S.S., C.L.B., F.C.L., and O.F. wrote and edited the manuscript with contributions from all authors. Acknowledgments