Abstract Cardiomyocyte differentiation derived from embryonic stem cells (ESCs) is a complex process involving molecular regulation of multiple levels. In this study, we first identify and compare differentially expressed gene (DEG) signatures of ESC-derived cardiomyocyte differentiation (ESCDCD) in humans and mice. Then, the multiscale embedded gene co-expression network analysis (MEGENA) of the human ESCDCD dataset is performed to identify 212 significantly co-expressed gene modules, which capture well the regulatory information of cardiomyocyte differentiation. Three modules respectively involved in the regulation of stem cell pluripotency, Wnt, and calcium pathways are enriched in the DEG signatures of the differentiation phase transition in the two species. Three human-specific cardiomyocyte differentiation phase transition modules are identified. Moreover, the potential regulation mechanisms of transcription factors during cardiomyocyte differentiation are also illustrated. Finally, several novel key drivers of ESCDCD are identified with the evidence of their expression during mouse embryonic cardiomyocyte differentiation. Using an integrative network analysis, the core molecular signatures and gene subnetworks (modules) underlying cardiomyocyte lineage commitment are identified in both humans and mice. Our findings provide a global picture of gene-gene co-regulation and identify key regulators during ESCDCD. Keywords: embryonic stem cell, cardiomyocyte differentiation, differentially expressed gene, gene co-expression network, transcription factor, module Graphical Abstract graphic file with name fx1.jpg [51]Open in a new tab __________________________________________________________________ The article provides a strategy to compare the similarities and differences of core molecular signatures and modules based on a co-expression network during differentiation of human and mouse embryonic stem cells toward cardiomyocytes. The driver genes and the potential regulation of transcription factor are further illustrated in the sequential differentiation process Introduction The normal development of heart is essential for maintaining physiological functions. Embryonic stem cell (ESC)-derived cardiomyocyte (CM) differentiation (ESCDCD) provides an extremely valuable in vitro model for understanding the molecular mechanisms of heart development. CM differentiation is a dynamic process and involves generally four sequential phases: undifferentiated ESC, mesoderm (MES), cardiac progenitor (CP), and CM,[52]^1 depending on precise control of gene expression patterns at the transcriptional and post-transcriptional levels.[53]^2 In past decades, many transcription factors (TFs) have been identified in a variety of physiological and pathological conditions, including CM differentiation. For example, Yamanaka factors (OCT4, SOX2, KLF4, and cMYC), which are highly expressed in ESCs, have been shown to participate in the regulation of developmental pathways and the pluripotency of ESCs.[54]^3 During ESCDCD, MES specification is controlled by the T-box TFs (T and EOMES),[55]^4^,[56]^5 which could active another TF (MESP1), in turn driving the expression of cardiac TFs along with repressing these genes related to pluripotency,[57]6, [58]7, [59]8 Moreover, the BMP signaling pathway is triggered to induce TFs such as Tbx5, Mef2c, and Nkx2.5 for promoting the differentiation of CP.[60]^9^,[61]^10 In addition, UPS-mediated ASB2 proteolysis can play a role in the maturation of CM by targeting TF3 (TCF3).[62]^11 Even with so many studies and findings, the molecular regulatory mechanism of TFs in CM differentiation remains to be explored. Specifically, whether TFs work alone or with other proteins in a complex and the regulation patterns of TFs in CM differentiation are also obscure. Recent advances in sequencing technology and bioinformatics approaches have enabled genome-wide transcriptomic studies of CM differentiation derived from human ESCs (hESCs) and mouse ESCs (mESCs). The gene co-expression network has been widely used to elucidate complex biological processes (BPs) and identify key genes underlying various diseases. For example, a gene co-expression network based on time-series data of microarrays successfully identified the key genes related to desiccation tolerance in Arabidopsis thaliana seeds.[63]^12 In cancer research, the strategy has been shown to have powerful effects on clarifying the molecular mechanism of tumorigenesis.[64]^13^,[65]^14 Recently, through weighted gene co-expression network analysis (WGCNA), the phase-specific modules and genes involved in human CM differentiation were identified.[66]^15 However, co-expression network analysis of mouse ESCDCD has not yet been adequately carried out. Although human and mouse genomes are very similar,[67]^16 there lacks a detailed understanding of differential expression patterns of genes during differentiation phase transitions in the cardiac lineage common to hESCs and mESCs or specific to each species. In this study, we comprehensively identify core molecular signatures and modules underlying CM differentiation from ESCs in humans and mice. Furthermore, the synergistic and phase-specific regulation of TFs potentially related to the differentiation process are also predicted. Several novel key drivers of ESCDCD are identified and their expression patterns are experimentally validated. These findings present an overview of gene-gene coordinated regulation and crucial regulators in CM differentiation. Results Differentially Expressed Gene Signatures during CM Differentiation in hESCs and mESCs We obtained 16,204 human genes and 13,891 mouse genes based on their expression greater than 1 fragments per kilobase of transcript per million mapped reads (FPKM) in at least one sample during ESCDCD.[68]^1^,[69]^17 Given that CM differentiation is a dynamic and sequential process (ESCs, MES, CP, and CM), we performed differential expression analysis between any two adjacent phases and then took a union of the differentially expressed gene (DEG) signatures, leading to a total 4,868 and 7,208 DEGs (DEG_all) in humans and mice, respectively. As shown in [70]Table 1, the sizes of DEG signatures during CM differentiation of humans and mice were quite different. During the differentiation of hESCs and mESCs into MES, the shared multiple evolutionarily conserved pathways, such as Hippo and Wnt signaling, play important roles in the early differentiation of stem cells ([71]Figures 1A and 1B).[72]^18^,[73]^19 In the transition from MES into CP, the only shared pathway, Wnt signaling, is indispensable for the differentiation of stem cells into CM in both humans and mice ([74]Figures 1C and 1D), especially for the formation of MES and CP.[75]^20 During the differentiating into mature CM, the two DEG signatures in humans and mice are both enriched for the pathways related to heart diseases ([76]Figures 1E and 1F). Details on the BPs of Gene Ontology (GO) enriched in the DEG signatures are shown in [77]Table S1, which also shows the similarity of early and late differentiation during human and mouse ESCDCD. Not surprisingly, the well-known phase-specific molecular markers are almost identical ([78]Figure S1). For example, from ESCs differentiating into the MES, the expression levels of genes, such as SOX2, NANOG, NODAL, and EPCAM, which are known to maintain the pluripotency of ESCs, were inhibited, while the expression of MES markers, including MESP1, EOMES, and DKK1, were upregulated. The CP markers including ISL1, MEF2C, and BMP4 started to express during the transition from MES into CP. At the phase of terminally differentiated CM, the expression levels of ACTC1, MYH6, and NKX2-5 were upregulated to maintain heart function. To comprehensively dissect the dynamic changes of cardiac lineage from hESCs and mESCs, we further stringently defined six DEG signatures based on downregulation from upregulation genes between two adjacent phases. Therefore, a total of 10 DEG signatures in humans and mice were listed ([79]Table 1). Although the homologous genes in both species from MGI (Mouse Genome Informatics)[80]^21 account for only a small portion of each corresponding DEG signature, there is a significant overlap between them during CM differentiation ([81]Figures 1G and 1H). Table 1. Statistics for DEG Signatures in Humans and Mice Signatures Humans Mice Homologous Genes Human-Specific Mouse-Specific DEG_all 4,868 7,208 2,501 2,367 4,707 DEG_MES_vs_ESC 1,695 3,981 603 1,092 3,378 DEG_CP_vs_MES 2,672 2,537 533 2,139 2,004 DEG_CM_vs_CP 1,852 4,168 842 1,010 3,326 Up_MES_vs_ESC 486 1,895 126 360 1,769 Down_MES_vs_ESC 1,209 2,086 294 915 1,792 Up_CP_vs_MES 1,305 1,715 300 1,005 1,415 Down_CP_vs_MES 1,367 822 147 1,220 675 Up_CM_vs_CP 942 2,230 390 552 1,840 Down_CM_vs_CP 910 1,938 352 558 1,586 [82]Open in a new tab Figure 1. [83]Figure 1 [84]Open in a new tab Comparison of the DEG Signatures during Cardiomyocyte Differentiation in Humans and Mice (A–F) Bubble plots show the enriched KEGG pathways of the DEG signatures between two adjacent phases in humans and mice. The size of a bubble is proportional to the number of genes shared by the corresponding pathway and signature. The intensity of the fill-in color of a circle is proportional to the significance level. The pathways enriched in the DEG signatures in humans and mice are highlighted in blue font. (A, C, and E) Bubble plots show the KEGG pathways of DEG signatures during human ESCDCD. (B, D, and F) Bubble plots show the KEGG pathways of corresponding DEG signatures during mouse ESCDCD. (A and B) MES_vs_ESC. (C and D) CP_vs_MES. (E and F) CM_vs_CP. (G) Heatmap representation of the result from the hypergeometric test for the overlap of the DEG signatures in humans. (H) Heatmap representation of the result from the hypergeometric test for the overlap between the DEG signatures of humans and mice. In addition, [85]Table 2 shows the pathways enriched in the human- and mouse-specific DEG signatures between adjacent phases. The human-specific DEGs are associated with pathways involved in various diseases, but the pathways (e.g., mitogen-activated protein kinase [MAPK], phosphatidylinositol 3-kinase [PI3K]-Akt, and Rap1) enriched in the mouse-specific DEGs are related to the differentiation process.[86]22, [87]23, [88]24 Interestingly, the Rap1 pathway regulating diverse BPs is involved in the entire ESCDCD. Collectively, these results suggest that there is a certain degree of similarity in differential regulation during human and mouse CM differentiations, but a vast majority of molecular changes are species-specific. The findings pave a way to explore new molecular mechanisms for CM differentiation in humans and mice. Table 2. KEGG Pathways of Human- and Mouse-Specific DEGs between Two Adjacent Phases Human __________________________________________________________________ Mouse __________________________________________________________________ DEG Pathways Adjust.p DEG Pathways Adjust.p MES_vs_ESC none none MES_vs_ESC axon guidance 3.93E−14 MAPK signaling pathway 1.37E−08 Rap1 signaling pathway 3.71E−08 breast cancer 3.00E−07 PI3K-Akt signaling pathway 3.09E−07 Ras signaling pathway 8.17E−06 CP_vs_MES proteasome 1.25E−08 CP_vs_MES dilated cardiomyopathy (DCM) 6.36E−09 ribosome 3.52E−08 hypertrophic cardiomyopathy (HCM) 2.00E−07 Alzheimer’s disease 0.0002 Rap1 signaling pathway 2.00E−07 Parkinson’s disease 0.0002 ECM-receptor interaction 1.28E−06 systemic lupus erythematosus 0.0003 axon guidance 8.80E−06 non-alcoholic fatty liver disease (NAFLD) 0.0005 PI3K-Akt signaling pathway 9.41E−06 CM_vs_CP alcoholism 0.0038 CM_vs_CP ECM-receptor interaction 9.08E−07 systemic lupus erythematosus 0.0039 Alzheimer’s disease 1.73E−06 ribosome 0.0046 spliceosome 2.16E−06 steroid biosynthesis 0.0052 thermogenesis 2.16E−06 non-alcoholic fatty liver disease (NAFLD) 9.10E−06 Rap1 signaling pathway 0.0001 [89]Open in a new tab Gene Co-expression Network Underlying CM Differentiation in Humans To further dissect the complex co-regulation relationship among the genes during CM differentiation, we performed a co-expression network analysis for the human ESCDCD expression data of 16,204 genes from 71 samples using multiscale embedded gene co-expression network analysis (MEGENA).[90]^25 Finally, the human CM differentiation gene co-expression network (HCDGEN) is comprised of 9,919 nodes and 26,128 edges, in which we identified 212 CM differentiation modules with a hierarchical structure, presenting parent-child relationships ([91]Table S2). The topmost modules in the network are assigned at a particular compactness scale ([92]Figure 2A). The module hierarchical structure and the enriched BP terms are also summarized in [93]Figure 2B. Module 4 (M4), M126, M14, M3, M8, and M10 are most significantly enriched for embryo development, developmental process, cell differentiation, cardiovascular system development, CM differentiation, and heart process, respectively, suggesting potential functional relevance to CM differentiation. Figure 2. [94]Figure 2 [95]Open in a new tab The Human Cardiomyocyte Differentiation Gene Co-expression Network (HCDGEN) and Its Functional Modules (A) The co-expression network. Each color represents one module on the topmost in the network. The corresponding modules and the most significant GO terms are shown. The lncRNAs and mRNAs are shown as diamonds and circles, respectively. The hub gene with the highest connectivity is labeled. (B) Sunburst plot showing biological process most enriched in the modules. The darker the color, the more significant the enrichment. Concordant Modules of DEG Signatures in CM Differentiation in Humans and Mice Considering that the mouse ESCDCD data are too few, the information of the co-expression network analysis is insufficient. However, as the previous result showed, the DEG signatures between human and mouse ESCDCD are very similar because of the evolutionary conservation. Therefore, we hypothesize that the modules of HCDGEN can capture key gene regulatory relationships in the CM differentiation process not only in humans but also in mice. Toward this end, we tested how the DEG signatures in human and mouse ESCDCD are enriched in modules of the HCDGEN ([96]Figure 3A; [97]Figure S2). The human homologs of the mouse genes were based on the MGI database.[98]^21 It can be seen that the module enrichment patterns of DEG signatures in human and mouse ESCDCD are similar. In addition, 32 and 26 modules are significantly enriched for the overall human and mouse DEG signatures (DEG_all), respectively, while 15 modules are co-enriched for both signatures ([99]Figure 3B). The modules enriched for the human DEG signatures between two adjacent phases are similar to those for the corresponding mouse DEG signatures. For example, the DEG signatures between MES and ESCs in humans and mice are both enriched in M3, which is involved in the regulation of early differentiation of ESCs ([100]Figures 2B and [101]3B). M8, the most significantly common enriched module for human and mouse DEG signatures of CM versus CP, is involved in CM differentiation ([102]Figures 2B and [103]3B). These results indicate that the HCDGEN contains information of CM differentiation in both humans and mice. Figure 3. [104]Figure 3 [105]Open in a new tab Enrichment of Human and Mouse DEG Signatures in the Modules of the HCDGEN (A) Sunburst plots showing the enrichment of the human and mouse DEG signatures in the modules of the HCDGEN. Each DEG signature is represented by a different color. (B) Enrichment of the human and mouse DEG signatures in the modules of the HCDGEN. The modules are ranked by the enriched significance. Also, modules enriched and shared by the corresponding DEG signature in humans and the mice are labeled in blue and red, respectively. Co-expression Subnetworks (Modules) Underlying Differentiation Transition in Cardiac Commitment The aforementioned enrichment of the DEG signatures in 212 modules strongly suggests that the HCDGEN can dissect the complex pathways underlying CM differentiation in both humans and mice. To further understand the molecular interactions underlying adjacent phases, we focus on M3, M21, and M8, which are the most significantly enriched for the DEG signatures of MES versus ESC, CP versus MES, and CM versus CP, respectively, in both of humans and mice ([106]Figures 3B and [107]4). M3, with 915 nodes and 2,417 edges, represents the dynamic transition between the ESC and MES phases ([108]Figure 4A; [109]Table S3). Almost all of the genes in M3 are downregulated in the MES phase in both species. M3 is enriched for the pathways associated with stem cell differentiation, such as tumor necrosis factor (TNF), MAPK, Rap1, and Ras signaling and pluripotency regulation of stem cells ([110]Figures S3A, S3D, and S3E; [111]Table S3). Therefore, the genes in M3 might be involved in the differentiation of ESCs into MES. For example, the inhibition of SOX2, a key hub in M3, was found to induce the MES differentiation from human induced pluripotent stem cells.[112]^26 The DEG signatures between MES and CP in both species are significantly enriched in M21, which is associated with Wnt signaling, vasculogenesis, and heart development ([113]Figure S3B; [114]Table S3), indicating that this module could capture molecular interactions during the differentiation of MES into CP ([115]Figure 4B; [116]Table S3). Unlike M3, M21 harbors a large number of upregulated genes in CP in comparison with MES. FZD4, which codes a receptor of Wnt protein, is one of the 17 hub genes of M21 and plays a role in activating the Wnt signaling pathway with Norrin ligand, which may result in an increase of the ESC-derived CPs.[117]^27 Alternatively, the DEG signatures between CM and CP in humans and mice, especially the upregulated DEG signatures, are most enriched in M8 ([118]Figure 4C; [119]Table S3). The genes in M8 are mainly involved in maintaining the function of mature CM. Dysregulation of some genes in M8 such as SLC8A and SORBS2 might cause heart disease.[120]^28^,[121]^29 Figure 4. [122]Figure 4 [123]Open in a new tab Co-expression Subnetworks (Modules) Associated with Differentiation Phases The fill-in color of a node represents the direction of differential expression in humans, with green indicating downregulation and yellow indicating upregulation. The blue and red borders represent downregulation and upregulation in mice, respectively. The diamond and circle node shapes correspond to lncRNAs and mRNAs, respectively. (A) The subnetwork of M3 underlying the transition of ESC to MES. (B) The subnetwork of M21 underlying the differentiation of MES to CP. (C) The subnetwork of M8 underlying the differentiation of CP to CM. Taken together, the three modules detailed herein are the comprehensive molecular networks underlying the sequential transition of CM differentiation from ESCs, and many key hubs in these modules are potential novel regulators for the dynamic transformation in humans and mice. Human-Specific Co-expression Modules Underlying the Transition of CM Differentiation The species-specific subnetworks and key regulators are also crucial to understand the regulation of CM differentiation. As the co-expression network is constructed only on the human data, we highlight three human-specific modules, M338, M17, and M177, corresponding to three differentiation transitions during CM differentiation based on their specific enrichment for the human DEG signatures ([124]Figure 3B). M388 is specifically enriched for the DEGs between MES and ESC in humans ([125]Figure 5A; [126]Table S4). Among the pathways enriched in M388 ([127]Table S4), the Notch signaling has been well studied in human stem cell differentiation.[128]^30^,[129]^31 Interestingly, FTL, the top hub of M388, has much more dramatic expression during human ESCDCD than during mouse ESCDCD ([130]Figure 7D). The DEG signature in the differentiation from MES into CP, i.e., human DEG_CP_vs_MES, is specifically enriched in M17 ([131]Figure 5B; [132]Table S4), which is involved in the regulation of RNA process and metabolism. Moreover, long non-coding RNA (lncRNA) MGC4859 is a central hub gene of the module without any functional annotation. Among other hub genes, DHCR24 and UBE2H are the most significantly downregulated and upregulated, respectively. Unlike M388 and M17, only 25 of the 42 DEGs in M177 are specific to humans ([133]Figure 5C; [134]Table S4). M177 is mainly involved in cell cycle. However, SFPQ, the most downregulated hub gene in M177, was found to exert a role in human-induced pluripotent stem cell-derived CM differentiation.[135]^32 The most upregulated hub gene, GPI, in M177 was confirmed to attach glypicans to cell membrane in order to promote cardiac specification and differentiation.[136]^33 The three human-specific modules, M388, M17, and M177, represent the human-specific molecular interactions underlying the sequential phase transitions during human ESCDCD. Figure 5. [137]Figure 5 [138]Open in a new tab Human-Specific Modules Associated with Cardiomyocyte Differentiation The fill-in color of a node represents the direction of differential expression in humans, with blue indicating downregulation and red indicating upregulation. The diamond and circle node shapes correspond to lncRNAs and mRNAs, respectively. (A) The subnetwork of M388 underlying the transition of human ESC to MES. (B) The subnetwork of M17 underlying the differentiation of human MES to CP. (C) The subnetwork of M177 underlying the differentiation of human CP to CM. Figure 7. [139]Figure 7 [140]Open in a new tab Validation of the mRNA Expression Patterns of Key Driver Genes by qPCR Experiments (A–C) Expression patterns of the key driver genes (KDGs) in M3, M21, and M8 in the human and mouse RNA-seq datasets and the quantitative real-time PCR experiments. (A) LEFTY1 (upper) and AASS (lower). (B) FZD4 (upper) and SH3GLB1 (lower). (C) LRRC10 (upper) and SLC8A1 (lower). (D) Expression patterns of three global KDGs in the human and mouse RNA-seq datasets and the qPCR experiments. The curves show the expression levels of KDGs at each phase in the human and mouse RNA-seq datasets. The red and green curves represent the human and mouse data, respectively. The bar plots show the mRNA expression levels of KDGs using quantitative real-time PCR (n = 3). ∗p < 0.05, ∗∗p < 0.01. TFs for Regulating CM Differentiation TFs play pivotal roles in a variety of BPs, including heart development. However, how TFs are involved in CM differentiation is largely elusive. In our study, 12 TFs are differentially expressed during human CM differentiation and have binding information in hESCs according to ChIPBase.[141]^34 The targets regulated by each TF can be predicted based on whether the TF can bind to the region from 1 kb upstream or 1 kb downstream around a gene transcription start site (TSS). The targets of each TF are then intersected with genes and each DEG signature of the differentiation process in human CM, resulting in target gene sets and target DEG sets, respectively. The similarity between the 12 TFs was assessed by clustering analysis of their target gene sets and target DEG sets ([142]Figures 6A and 6B). The clustering results on target gene sets show that the regulation of ZNF143, ATF2, TEAD4, GABPA, MXI1, and MYC are similar, while ATF3 and USF1 regulate similar target genes. However, during the human CM differentiation, only MXI1, MYC, and ZNF143 regulate similar DEG. As shown in [143]Figure 6C, MXL1 may regulate all three transitions, and MYC and ZNF143 might regulate the differentiation of ESCs to MES to CP, while the rest of the nine TFs drive one particular transition. Alternatively, the enrichment of the TF targets in the DEG signatures suggests that NANOG may specifically regulate the pluripotency of ESCs, and EOME may regulate the differentiation from MES into CP ([144]Figure 6C; [145]Figure S4A). Finally, the enrichment of 12 TF target gene sets in the 212 modules suggests that TFs, including MXI1, MYC, TEAD4, UPS1, ZNF143, ATF2, ATF3, EGR1, and GABPA, play a potential and broad role in regulating the entire differentiation process from ESCs into CM ([146]Figure S4B). However, EOMES, FOXH1, and NANOG may specifically regulate transformation between two adjacent phases in differentiation ([147]Figure 6D). Figure 6. [148]Figure 6 [149]Open in a new tab Transcription Factors Potentially Regulating the Human Cardiomyocyte Differentiation Process (A) Clustering analysis of the 12 transcription factors (TFs) based on similarity between their target gene sets. (B) Clustering analysis of the 12 TFs based on similarity between their target DEG sets. (C) Enrichment of the DEG signatures in the 12 TF target gene sets. (D) Sunburst for enrichment of three TF target DEG sets in the 212 modules from the HCDGEN. The darker the color, the more significant the enrichment. Identification and Validation of Key Regulators in CM Differentiation The co-expression network of human CM differentiation provides much more information beyond the co-expressed gene modules. The connectivity will allow us to identify crucial key regulators in cardiac lineage commitment. Through the key driver analysis (KDA)[150]^35 for three modules (M3, M8, and M21) and the DEG signatures, we identified 60, 28, and 16 key driver genes (KDGs) in M3, M8, and M21, respectively ([151]Figures S5A–C). The KDGs with consistent expression in humans and mice may play important roles in controlling the transition of CM differentiation. For example, the inhibition of SOX2, a known ESC marker, could induce the differentiation from ESCs into MES.[152]^26 DCBLD2, as the only upregulated homologous regulator in M3, also shows a potential role in promotion of MES formation. Among the KDGs in M21, the mutation of KLF3 could lead to abnormal cardiovascular development.[153]^36 Similarly, some KDGs in M8, such as MYOM1, ACTC1, and ACTN2, are highly expressed in mature CM for maintaining heart morphogenesis.[154]37, [155]38, [156]39 To this end, for each of the three modules, we sorted KDGs according to the degree of nodes and highlighted expression patterns of two genes from the top five KDGs. The selected genes have not been reported to be related to CM differentiation, so that we could further explore its new molecular mechanism in CM differentiation later ([157]Figures 7A–7C). At the same time, we investigated the expression patterns of the top 30 KDGs in the whole co-expression network, in which 28 of them contained mouse homologous genes ([158]Figure S5D). Similarly, we also picked two KDGs (RPL8 and FTL) ([159]Figure 7D). Interestingly, the expression of CAP2 is quite different from the other 27 KDGs, and this expression pattern is consistent in humans and mice, so we also highlight its expression ([160]Figure 7D; [161]Figure S5D). Additionally, through differential expression analysis statistics, it was found that only eight genes are significantly differentially expressed in each state transition during human and mouse CM differentiation, and the differential expression during each phase transition is consistent between humans and mice. Also, they are defined dynamic genes, which may be involved in the whole process of differentiation stemming from ESCs into CM ([162]Figure 8). Figure 8. [163]Figure 8 [164]Open in a new tab Validation of the mRNA Expression Patterns of the Eight Dynamic Genes by qPCR Experiments The curves show the expression patterns of eight dynamic-expressed genes during cardiomyocyte differentiation in human and mouse RNA-seq datasets. The red and green curves represent humans and mice, respectively. The bar plots show the mRNA expression levels of the selected genes using quantitative real-time PCR (n = 3). ∗p < 0.05, ∗∗p < 0.01. In summary, we finally selected 17 genes (9 KDGs and 8 dynamic genes) as potential key regulators in CM differentiation. [165]Table S5 includes detailed information on the 17 key regulators from the GeneCards,[166]^40 GO,[167]^41 and Kyoto Encyclopedia of Genes and Genomes (KEGG)[168]^42 databases. The expression patterns of the 17 KDGs in CM differentiation were verified by in vitro differentiation experiments from mESCs into CM ([169]Figures 7, [170]8, and [171]S6). One of the reasons for the incomplete match of RNA sequencing (RNA-seq) and quantitative real-time PCR could be the difference of differentiation protocols and time kinetics between different batches and laboratories. Discussion Although ESCs have been widely used to investigate molecular mechanism of heart development, little is known about the relationships between humans and mice during CM differentiations from ESCs. Additionally, although humans and mice shared key pathways during CM differentiation, they have their own characteristics in terms of regulatory circuits ([172]Figures S3D and S3E). Therefore, we herein compare the transcriptomes to identify similarities and differences in human and mouse ESCDCD. During CM differentiation, the involved pathways and BPs of human and mouse DEGs at corresponding phases are similar, even though mice have many more DEGs with more dramatic changes. At the onset of differentiation, the signaling pathway regulating pluripotency of hESCs and mESCs is inhibited with the repression of pluripotency genes such as Nanog, Sox2, and Oct4.[173]^43 Both the Wnt and TGF-β signaling pathways have also been extensively studied for their roles in controlling fate commitment when ESCs start to differentiate into MES.[174]^44^,[175]^45 Similarly, the pathways of DEGs involved in transition from CP to CM are both associated with heart function and related diseases. However, only Wnt signaling regulates the transition from MES to CP in both species, suggesting that different CPs may have distinct molecular signatures, even in the same species.[176]^46 The pathways enriched in the human- and mouse-specific DEG signatures are quite different, indicating that the molecular changes are more species-specific. In the present study, we identify 212 hierarchical modules in the gene co-expression network underlying human CM differentiation. We find that M3 and M8 are associated with the cell differentiation phase, while M10 is involved in heart contraction. The enrichment analysis shows that these modules capture the molecular information during cardiac lineage. The DEG signatures in mice are significantly enriched in many modules of the human network, just as for the DEG signatures in humans. Moreover, many modules are enriched for both of human and mouse DEG signatures. The human-specific modules are associated with differentiation transitions of human ESCDCD. The binding of TFs to the distal region of genes is more likely to affect the transcription of non-coding RNA.[177]^34 Also, 1 kb upstream and downstream of the TSS is considered to be the promoter of the gene, so the binding of TFs in this region will initiate transcription of the gene.[178]^47 So, by the integration of DEGs, modules, and TF targets, we identified some potential regulation mechanisms of TFs during CM differentiation. First, the DEGs regulated by MXI1, MYC, and ZNF143 during CM differentiation are similar, so we speculate that they may synergistically regulate the differentiation process. Second, NANOG, EOMES, and FOXH1 specifically regulate a certain phase transition of sequential CM differentiation. Through gene co-expression network analysis, we also systematically uncovered a number of key regulators of CM differentiation, such as Meis1, Dkk1, FLRT2, Myl9, and Fabp3. Meis1 is known to play key roles in heart development and CM differentiation,[179]^1^,[180]^15 although the underlying molecular mechanisms remain elusive. Dkk1 has been demonstrated to promote CM differentiation through activating Wnt signaling and to suppress both hematopoietic and endodermal lineages.[181]^48 FLRT2 is also essential for maintaining heart development since its deficiency is associated with cardiac insufficiency.[182]^49 Myl9 and Fabp3 have been shown to be involved in myocardial infarction and cardiomyopathy,[183]^50^,[184]^51 but their roles in CM differentiation have not been elucidated. All of the genes identified herein are potentially involved in cardiac lineage specification, but further functional verification is needed to clarify specific regulatory mechanisms. A number of transcriptomic analyses of cardiac-specific differentiation from ESCs have been carried out to reveal the regulatory networks. However, almost all of them focus on the process only in one particular species. Our integrative network study of human and mouse transcriptomic data in cardiac-specific differentiation systematically reveals gene expression patterns specific to each species or common to both of them. There are some limitations with this work. First, the number of samples in the mouse ESCDCD is too small to construct a whole co-expression network, and thus key drivers of mouse ESCDCD may be missing. Second, the functions of the predicted key drivers of human ESCDCD need be further investigated to elucidate the molecular regulations in CM differentiation, as indicated by the co-expression network in the human ESCDCD. In conclusion, our comparative study of humans and mice reveals high similarity and specificity in transcriptomic changes in CM differentiation from hESCs and mESCs. Gene co-expression network analysis dissects the complex DEG signatures by providing a detailed gene regulatory relationship in the CM differentiation process. Some novel TFs, including NANOG, EOMES, and FOXH1, are identified to specifically regulate CM differentiation. More key regulators are also systematically predicted through the network analysis, and the expression patterns of several key genes are further experimentally validated. Taken together, our results present potential regulation information during CM differentiation in both humans and mice. Materials and Methods Cardiomyocyte Differentiation RNA-Seq Data Acquisition and Preprocessing The data for ESC-derived CM in mice (GEO: [185]GSE47948) and humans (GEO: [186]GSE64417) were obtained from the NCBI GEO databases.[187]^52^,[188]^53 Both datasets include four phases, for which differentiating cultures were enriched for ESCs, MES, cardiac precursors, and CM. For the mice, embryonic day 14 (E14) transgenic (Tg) (Nkx2-5-EmGFP) mESCs were cultured in serum-free med and were directly differentiated into CM following the protocols of previous studies.[189]^1^,[190]^54 RNA was extracted using TRIzol reagent and sequenced by an Illumina genome analyzer IIx. The mouse RNA-seq data included eight samples with two replicates at each phase. In terms of human data, the RUES2 hESCs were cultured in mouse embryonic fibroblast-conditioned medium and were directly differentiated in CM by a monolayer platform with a modified protocol based on previous studies.[191]^17^,[192]55, [193]56, [194]57 RNA was extracted using the mirVana kit (Life Technologies), and the RNA library was created according to the manufacturer’s protocol for the Ribo-Zero Gold sample prep kit. The human data for 71 samples were generated by Illumina HiSeq 2500 in biological triplicates. Cleaned fastq data for humans and mice were both downloaded from ArrayExpress (ArrayExpress: [195]E-GEOD-64417; ArrayExpress: [196]E-GEOD-47948),[197]^58 and the overall quality of the data was checked using FastQC, a Java-based tool for corresponding quality scores.[198]^59 Mapping and Differentially Expressed Analysis The reference genome sequence and annotation information for mm9[199]^16 and hg19 (GRCH38)[200]^60 were downloaded from the Ensembl database[201]^61 and were respectively used for the alignment of the reads for human or mouse RNA-seq samples. First, TopHat2[202]^62 aligned the reads to the indexed genome built by Bowtie 2[203]^63 for the reference genome, with default options. Subsequently, the FPKM values for genes were calculated after assembling and quantifying transcripts when the reads were aligned to the reference genome using Cufflinks v2.2.1 software.[204]^64 Finally, the analysis of DEGs was performed using Cuffdiff2 in Cufflinks2 with default options, except that the dataset was a time series (T). To filter the genes that were not expressed or were caused by sequencing error, the genes with greater than 1 FPKM in at least one time point sample were retained for subsequent analyses. DEGs were identified by an adjusted p value (adjust-p) of 0.05 and fold change (FC) of 2. In order to detect the different levels of DEGs, we defined 10 DEG signatures, including all DEGs (DEG_all), DEGs between two adjacent differentiation phases (DEG_MES_vs_ESC, DEG_CP_vs_MES, DEG_CM_vs_CP), and upregulated and downregulated DEG signatures (Up_MES_vs_ESC, Down_MES_vs_ESC, Up_CP_vs_MES, Down_CP_vs_MES, Up_CM_vs_CP, Down_CM_vs_CP) Regulatory Information Analysis of TFs We obtained the TF binding information from ChIPBase v2.0, which curates ∼10,200 chromatin immunoprecipitation sequencing (ChIP-seq) datasets across 10 species from multiple databases, such as ENCODE, GEO, and NIH Roadmap Epigenomics Project.[205]^34 Since we mainly focused on the CM differentiation of ESCs, we selected only TF target information for human ESC lines (H1 or H19). The regulatory region of TFs is defined between 1 kb upstream and 1 kb downstream of the TSS. Combined with the information of DEGs during human CM differentiation, the regulatory information of 12 differentially expressed TFs was downloaded for downstream analyses. We respectively intersect the target genes of the 12 TFs with the genes and DEG signatures in human CM differentiation as target gene sets and target DEG sets. Then, using the hypergeometric test, we clustered the target gene sets and target DEG sets of 12 TFs. Similarly, we also tested the enrichment of the target genes of 12 TFs in the DEG signatures between two adjacent phases as well as all of the modules in the human co-expression network. Gene Co-expression Network Analysis MEGENA is an R package for multiscale embedded gene co-expression network analysis. It shows improved performance over co-expression network construction.[206]^25 MEGENA contains three steps: (1) construction of a planar filtered network (PFN); we calculated the correlation based on the gene expression profiles, and then filtered the gene pairs using a parallelized screening procedure to obtain a fast planar filtered network; (2) multi-scale clustering analysis; from connected components of the initial PFN, multi-scale clustering was performed for each parent cluster to get a hierarchy clustering result; and (3) downstream analysis. The significant hubs were identified based on the topology of networks using multiscale hub analysis (MHA). Since the number of samples is recommended to be greater than 20,[207]^65 we only constructed the co-expression network based on the human CM differentiation data. MEGENA took as input the significant gene-gene correlations by an adjusted p-value threshold of 0.05 to construct a PFN, which was then used to identify the significant modules and the corresponding hubs with high connectivity. For each module, we performed the GO and KEGG pathway enrichment analysis by R packages in GO.db (the GO from GO) and KEGG.db (the KEGG pathway from KEGG). Hypergeometric Test A hypergeometric test was used to calculate the significance of enrichment and dataset intersection. Then, the probability of the hypergeometric test is [MATH: P(X=k)=CMkCNMnkCNn.< /mrow> :MATH] In the matter of intersection of DEG signatures, k is the overlap of two DEG signatures. N is the size of human genes or the number of union between human and mouse genes. The number of DEG signatures is respectively represented by M and n. For the enrichment of DEG signatures on modules, the N, M, n, and k mean the total number of genes in the network, the size of the module, the number of genes in the DEG signature, and the overlap between the module and the DEG signature, respectively. For the intersection of 12 TFs, k is the overlap of target gene sets or target DEG sets in the human CM differentiation regulated by two TFs. M and n mean the target gene sets or target DEG sets regulated by two TFs, respectively. N represents the background genes, which is the whole genes or all DEGs during the human CM differentiation. All hypergeometric tests were performed using the phyper function in R. The p values were corrected by the Benjamini-Hochberg procedure to get the adjust-p. GO Function and KEGG Pathway Enrichment Analysis ClusterProfiler,[208]^66 an R package, was used to perform the GO and KEGG pathway enrichment analyses. The threshold of significance was 0.05 for multiple testing corrected p values. KDG Analysis KDA is an R package and is developed to identify potential key regulatory components with respect to various biological contexts.[209]^35 KDA uses a set of DEGs and the directed co-expression network as input. Cell Culture Mouse D3 ESCs were cultured under 5% CO[2] at 37°C on feeder cells in the dishes coated with 0.1% gelatin in the media (500 mL of DMEM-high-glucose medium [Gibco, Carlsbad, CA, USA], 6.25 mL of non-essential amino acids [Gibco, Carlsbad, CA, USA], 6.25 mL of l-glutamine [Gibco, Carlsbad, CA, USA], 12.5 mL of penicillin/streptomycin [Gibco, Carlsbad, CA, USA], 94 mL of fetal bovine serum [Gibco, Carlsbad, CA, USA], 4.4 μL of β-mercaptoethanol [Sigma-Aldrich, St. Louis, MO, USA], and 62.5 μL of leukemia inhibitory factor [Millipore, Temecula, CA, USA]).[210]^67 Cells were passaged every 2–3 days using 0.25% trypsin-EDTA. The feeder layers’ cells were prepared according to the previous description.[211]^68 Cardiomyocyte Differentiation mESCs were differentiated into CM by the “hanging drop” method[212]^67 and then aggregated into embryoid bodies (EBs). The dissociation of mESCs was accomplished using 0.25% trypsin-EDTA and then resuspension in differentiation medium, both of which, and the medium of EB formation, were prepared as described.[213]^67 The hanging drops were hung for 2 days to form the EBs. The EBs were then collected and transferred onto 0.1% gelatin-coated six-well plates (15 EBs/well) at the following phases of differentiation: ESC phase, day 0; MES phase, days 4–6; CP phase, days 8–10; CM phase, days 12–15. Quantitative Real-Time PCR Total RNA was extracted using the standard TRIzol protocol (Invitrogen, Carlsbad, CA, USA). Then, 1,000 ng of RNA was used for the reverse transcription to synthesize cDNA with the PrimeScript RT reagent kit with genomic DNA (gDNA) Eraser (TaKaRa, Kusatsu, Japan). SYBR Premix Ex Taq II (TaKaRa, Kusatsu, Japan) was used in the real-time PCR reaction system and then carried out on a Bio-Rad CFX Connect real-time system. The related primers are listed in [214]Table S6. Tools for Plotting and Visualizing Networks All plots were drawn by using R packages. ggplot2[215]^69 is a well-known R package and is used to draw the bubble plots for function enrichment results and heatmaps. The sunburst plots were generated by the sunburstR package. The network and module-based subnetworks were visualized with Cytoscape.[216]^70 Statistical Analysis A Student’s test (two-tailed) was used to perform statistics analysis, and the error bar represents the standard error of the mean (SEM) of three independent experiments. p <0.05 was considered to represent a significant difference. Author Contributions Y.W., N.Y., and L.L. conceived the study and designed the experiments. Y.W. performed all bioinformatics analyses. X.Z., R.C., and Q.L. helped with the bioinformatics analysis. N.Y., Y.H., H.J., H.L., Y.G., and J.Z. were responsible for the experimental validation. B.Z. and L.L. supervised the entire research process. Y.W. wrote the manuscript. B.Z., L.L., L.P., C.T., and M.L. revised the manuscript. All authors reviewed and approved the manuscript. Conflicts of Interest The authors declare no competing interest. Acknowledgments