Abstract

   Cardiomyocyte differentiation derived from embryonic stem cells (ESCs)
   is a complex process involving molecular regulation of multiple levels.
   In this study, we first identify and compare differentially expressed
   gene (DEG) signatures of ESC-derived cardiomyocyte differentiation
   (ESCDCD) in humans and mice. Then, the multiscale embedded gene
   co-expression network analysis (MEGENA) of the human ESCDCD dataset is
   performed to identify 212 significantly co-expressed gene modules,
   which capture well the regulatory information of cardiomyocyte
   differentiation. Three modules respectively involved in the regulation
   of stem cell pluripotency, Wnt, and calcium pathways are enriched in
   the DEG signatures of the differentiation phase transition in the two
   species. Three human-specific cardiomyocyte differentiation phase
   transition modules are identified. Moreover, the potential regulation
   mechanisms of transcription factors during cardiomyocyte
   differentiation are also illustrated. Finally, several novel key
   drivers of ESCDCD are identified with the evidence of their expression
   during mouse embryonic cardiomyocyte differentiation. Using an
   integrative network analysis, the core molecular signatures and gene
   subnetworks (modules) underlying cardiomyocyte lineage commitment are
   identified in both humans and mice. Our findings provide a global
   picture of gene-gene co-regulation and identify key regulators during
   ESCDCD.

   Keywords: embryonic stem cell, cardiomyocyte differentiation,
   differentially expressed gene, gene co-expression network,
   transcription factor, module

Graphical Abstract

   graphic file with name fx1.jpg
   [51]Open in a new tab
     __________________________________________________________________

   The article provides a strategy to compare the similarities and
   differences of core molecular signatures and modules based on a
   co-expression network during differentiation of human and mouse
   embryonic stem cells toward cardiomyocytes. The driver genes and the
   potential regulation of transcription factor are further illustrated in
   the sequential differentiation process

Introduction

   The normal development of heart is essential for maintaining
   physiological functions. Embryonic stem cell (ESC)-derived
   cardiomyocyte (CM) differentiation (ESCDCD) provides an extremely
   valuable in vitro model for understanding the molecular mechanisms of
   heart development. CM differentiation is a dynamic process and involves
   generally four sequential phases: undifferentiated ESC, mesoderm (MES),
   cardiac progenitor (CP), and CM,[52]^1 depending on precise control of
   gene expression patterns at the transcriptional and
   post-transcriptional levels.[53]^2

   In past decades, many transcription factors (TFs) have been identified
   in a variety of physiological and pathological conditions, including CM
   differentiation. For example, Yamanaka factors (OCT4, SOX2, KLF4, and
   cMYC), which are highly expressed in ESCs, have been shown to
   participate in the regulation of developmental pathways and the
   pluripotency of ESCs.[54]^3 During ESCDCD, MES specification is
   controlled by the T-box TFs (T and EOMES),[55]^4^,[56]^5 which could
   active another TF (MESP1), in turn driving the expression of cardiac
   TFs along with repressing these genes related to pluripotency,[57]6,
   [58]7, [59]8 Moreover, the BMP signaling pathway is triggered to induce
   TFs such as Tbx5, Mef2c, and Nkx2.5 for promoting the differentiation
   of CP.[60]^9^,[61]^10 In addition, UPS-mediated ASB2 proteolysis can
   play a role in the maturation of CM by targeting TF3 (TCF3).[62]^11
   Even with so many studies and findings, the molecular regulatory
   mechanism of TFs in CM differentiation remains to be explored.
   Specifically, whether TFs work alone or with other proteins in a
   complex and the regulation patterns of TFs in CM differentiation are
   also obscure.

   Recent advances in sequencing technology and bioinformatics approaches
   have enabled genome-wide transcriptomic studies of CM differentiation
   derived from human ESCs (hESCs) and mouse ESCs (mESCs). The gene
   co-expression network has been widely used to elucidate complex
   biological processes (BPs) and identify key genes underlying various
   diseases. For example, a gene co-expression network based on
   time-series data of microarrays successfully identified the key genes
   related to desiccation tolerance in Arabidopsis thaliana seeds.[63]^12
   In cancer research, the strategy has been shown to have powerful
   effects on clarifying the molecular mechanism of
   tumorigenesis.[64]^13^,[65]^14 Recently, through weighted gene
   co-expression network analysis (WGCNA), the phase-specific modules and
   genes involved in human CM differentiation were identified.[66]^15
   However, co-expression network analysis of mouse ESCDCD has not yet
   been adequately carried out. Although human and mouse genomes are very
   similar,[67]^16 there lacks a detailed understanding of differential
   expression patterns of genes during differentiation phase transitions
   in the cardiac lineage common to hESCs and mESCs or specific to each
   species.

   In this study, we comprehensively identify core molecular signatures
   and modules underlying CM differentiation from ESCs in humans and mice.
   Furthermore, the synergistic and phase-specific regulation of TFs
   potentially related to the differentiation process are also predicted.
   Several novel key drivers of ESCDCD are identified and their expression
   patterns are experimentally validated. These findings present an
   overview of gene-gene coordinated regulation and crucial regulators in
   CM differentiation.

Results

Differentially Expressed Gene Signatures during CM Differentiation in hESCs
and mESCs

   We obtained 16,204 human genes and 13,891 mouse genes based on their
   expression greater than 1 fragments per kilobase of transcript per
   million mapped reads (FPKM) in at least one sample during
   ESCDCD.[68]^1^,[69]^17 Given that CM differentiation is a dynamic and
   sequential process (ESCs, MES, CP, and CM), we performed differential
   expression analysis between any two adjacent phases and then took a
   union of the differentially expressed gene (DEG) signatures, leading to
   a total 4,868 and 7,208 DEGs (DEG_all) in humans and mice,
   respectively. As shown in [70]Table 1, the sizes of DEG signatures
   during CM differentiation of humans and mice were quite different.
   During the differentiation of hESCs and mESCs into MES, the shared
   multiple evolutionarily conserved pathways, such as Hippo and Wnt
   signaling, play important roles in the early differentiation of stem
   cells ([71]Figures 1A and 1B).[72]^18^,[73]^19 In the transition from
   MES into CP, the only shared pathway, Wnt signaling, is indispensable
   for the differentiation of stem cells into CM in both humans and mice
   ([74]Figures 1C and 1D), especially for the formation of MES and
   CP.[75]^20 During the differentiating into mature CM, the two DEG
   signatures in humans and mice are both enriched for the pathways
   related to heart diseases ([76]Figures 1E and 1F). Details on the BPs
   of Gene Ontology (GO) enriched in the DEG signatures are shown in
   [77]Table S1, which also shows the similarity of early and late
   differentiation during human and mouse ESCDCD. Not surprisingly, the
   well-known phase-specific molecular markers are almost identical
   ([78]Figure S1). For example, from ESCs differentiating into the MES,
   the expression levels of genes, such as SOX2, NANOG, NODAL, and EPCAM,
   which are known to maintain the pluripotency of ESCs, were inhibited,
   while the expression of MES markers, including MESP1, EOMES, and DKK1,
   were upregulated. The CP markers including ISL1, MEF2C, and BMP4
   started to express during the transition from MES into CP. At the phase
   of terminally differentiated CM, the expression levels of ACTC1, MYH6,
   and NKX2-5 were upregulated to maintain heart function. To
   comprehensively dissect the dynamic changes of cardiac lineage from
   hESCs and mESCs, we further stringently defined six DEG signatures
   based on downregulation from upregulation genes between two adjacent
   phases. Therefore, a total of 10 DEG signatures in humans and mice were
   listed ([79]Table 1). Although the homologous genes in both species
   from MGI (Mouse Genome Informatics)[80]^21 account for only a small
   portion of each corresponding DEG signature, there is a significant
   overlap between them during CM differentiation ([81]Figures 1G and 1H).

Table 1.

   Statistics for DEG Signatures in Humans and Mice
     Signatures    Humans Mice  Homologous Genes Human-Specific Mouse-Specific
   DEG_all         4,868  7,208 2,501            2,367          4,707
   DEG_MES_vs_ESC  1,695  3,981 603              1,092          3,378
   DEG_CP_vs_MES   2,672  2,537 533              2,139          2,004
   DEG_CM_vs_CP    1,852  4,168 842              1,010          3,326
   Up_MES_vs_ESC   486    1,895 126              360            1,769
   Down_MES_vs_ESC 1,209  2,086 294              915            1,792
   Up_CP_vs_MES    1,305  1,715 300              1,005          1,415
   Down_CP_vs_MES  1,367  822   147              1,220          675
   Up_CM_vs_CP     942    2,230 390              552            1,840
   Down_CM_vs_CP   910    1,938 352              558            1,586
   [82]Open in a new tab

Figure 1.

   [83]Figure 1
   [84]Open in a new tab

   Comparison of the DEG Signatures during Cardiomyocyte Differentiation
   in Humans and Mice

   (A–F) Bubble plots show the enriched KEGG pathways of the DEG
   signatures between two adjacent phases in humans and mice. The size of
   a bubble is proportional to the number of genes shared by the
   corresponding pathway and signature. The intensity of the fill-in color
   of a circle is proportional to the significance level. The pathways
   enriched in the DEG signatures in humans and mice are highlighted in
   blue font. (A, C, and E) Bubble plots show the KEGG pathways of DEG
   signatures during human ESCDCD. (B, D, and F) Bubble plots show the
   KEGG pathways of corresponding DEG signatures during mouse ESCDCD. (A
   and B) MES_vs_ESC. (C and D) CP_vs_MES. (E and F) CM_vs_CP. (G) Heatmap
   representation of the result from the hypergeometric test for the
   overlap of the DEG signatures in humans. (H) Heatmap representation of
   the result from the hypergeometric test for the overlap between the DEG
   signatures of humans and mice.

   In addition, [85]Table 2 shows the pathways enriched in the human- and
   mouse-specific DEG signatures between adjacent phases. The
   human-specific DEGs are associated with pathways involved in various
   diseases, but the pathways (e.g., mitogen-activated protein kinase
   [MAPK], phosphatidylinositol 3-kinase [PI3K]-Akt, and Rap1) enriched in
   the mouse-specific DEGs are related to the differentiation
   process.[86]22, [87]23, [88]24 Interestingly, the Rap1 pathway
   regulating diverse BPs is involved in the entire ESCDCD. Collectively,
   these results suggest that there is a certain degree of similarity in
   differential regulation during human and mouse CM differentiations, but
   a vast majority of molecular changes are species-specific. The findings
   pave a way to explore new molecular mechanisms for CM differentiation
   in humans and mice.

Table 2.

   KEGG Pathways of Human- and Mouse-Specific DEGs between Two Adjacent
   Phases
   Human
     __________________________________________________________________

   Mouse
     __________________________________________________________________

   DEG Pathways Adjust.p DEG Pathways Adjust.p
   MES_vs_ESC none none MES_vs_ESC axon guidance 3.93E−14
   MAPK signaling pathway 1.37E−08
   Rap1 signaling pathway 3.71E−08
   breast cancer 3.00E−07
   PI3K-Akt signaling pathway 3.09E−07
   Ras signaling pathway 8.17E−06
   CP_vs_MES proteasome 1.25E−08 CP_vs_MES dilated cardiomyopathy (DCM)
   6.36E−09
   ribosome 3.52E−08 hypertrophic cardiomyopathy (HCM) 2.00E−07
   Alzheimer’s disease 0.0002 Rap1 signaling pathway 2.00E−07
   Parkinson’s disease 0.0002 ECM-receptor interaction 1.28E−06
   systemic lupus erythematosus 0.0003 axon guidance 8.80E−06
   non-alcoholic fatty liver disease (NAFLD) 0.0005 PI3K-Akt signaling
   pathway 9.41E−06
   CM_vs_CP alcoholism 0.0038 CM_vs_CP ECM-receptor interaction 9.08E−07
   systemic lupus erythematosus 0.0039 Alzheimer’s disease 1.73E−06
   ribosome 0.0046 spliceosome 2.16E−06
   steroid biosynthesis 0.0052 thermogenesis 2.16E−06
   non-alcoholic fatty liver disease (NAFLD) 9.10E−06
   Rap1 signaling pathway 0.0001
   [89]Open in a new tab

Gene Co-expression Network Underlying CM Differentiation in Humans

   To further dissect the complex co-regulation relationship among the
   genes during CM differentiation, we performed a co-expression network
   analysis for the human ESCDCD expression data of 16,204 genes from 71
   samples using multiscale embedded gene co-expression network analysis
   (MEGENA).[90]^25 Finally, the human CM differentiation gene
   co-expression network (HCDGEN) is comprised of 9,919 nodes and 26,128
   edges, in which we identified 212 CM differentiation modules with a
   hierarchical structure, presenting parent-child relationships
   ([91]Table S2). The topmost modules in the network are assigned at a
   particular compactness scale ([92]Figure 2A). The module hierarchical
   structure and the enriched BP terms are also summarized in
   [93]Figure 2B. Module 4 (M4), M126, M14, M3, M8, and M10 are most
   significantly enriched for embryo development, developmental process,
   cell differentiation, cardiovascular system development, CM
   differentiation, and heart process, respectively, suggesting potential
   functional relevance to CM differentiation.

Figure 2.

   [94]Figure 2
   [95]Open in a new tab

   The Human Cardiomyocyte Differentiation Gene Co-expression Network
   (HCDGEN) and Its Functional Modules

   (A) The co-expression network. Each color represents one module on the
   topmost in the network. The corresponding modules and the most
   significant GO terms are shown. The lncRNAs and mRNAs are shown as
   diamonds and circles, respectively. The hub gene with the highest
   connectivity is labeled. (B) Sunburst plot showing biological process
   most enriched in the modules. The darker the color, the more
   significant the enrichment.

Concordant Modules of DEG Signatures in CM Differentiation in Humans and Mice

   Considering that the mouse ESCDCD data are too few, the information of
   the co-expression network analysis is insufficient. However, as the
   previous result showed, the DEG signatures between human and mouse
   ESCDCD are very similar because of the evolutionary conservation.
   Therefore, we hypothesize that the modules of HCDGEN can capture key
   gene regulatory relationships in the CM differentiation process not
   only in humans but also in mice. Toward this end, we tested how the DEG
   signatures in human and mouse ESCDCD are enriched in modules of the
   HCDGEN ([96]Figure 3A; [97]Figure S2). The human homologs of the mouse
   genes were based on the MGI database.[98]^21 It can be seen that the
   module enrichment patterns of DEG signatures in human and mouse ESCDCD
   are similar. In addition, 32 and 26 modules are significantly enriched
   for the overall human and mouse DEG signatures (DEG_all), respectively,
   while 15 modules are co-enriched for both signatures ([99]Figure 3B).
   The modules enriched for the human DEG signatures between two adjacent
   phases are similar to those for the corresponding mouse DEG signatures.
   For example, the DEG signatures between MES and ESCs in humans and mice
   are both enriched in M3, which is involved in the regulation of early
   differentiation of ESCs ([100]Figures 2B and [101]3B). M8, the most
   significantly common enriched module for human and mouse DEG signatures
   of CM versus CP, is involved in CM differentiation ([102]Figures 2B and
   [103]3B). These results indicate that the HCDGEN contains information
   of CM differentiation in both humans and mice.

Figure 3.

   [104]Figure 3
   [105]Open in a new tab

   Enrichment of Human and Mouse DEG Signatures in the Modules of the
   HCDGEN

   (A) Sunburst plots showing the enrichment of the human and mouse DEG
   signatures in the modules of the HCDGEN. Each DEG signature is
   represented by a different color. (B) Enrichment of the human and mouse
   DEG signatures in the modules of the HCDGEN. The modules are ranked by
   the enriched significance. Also, modules enriched and shared by the
   corresponding DEG signature in humans and the mice are labeled in blue
   and red, respectively.

Co-expression Subnetworks (Modules) Underlying Differentiation Transition in
Cardiac Commitment

   The aforementioned enrichment of the DEG signatures in 212 modules
   strongly suggests that the HCDGEN can dissect the complex pathways
   underlying CM differentiation in both humans and mice. To further
   understand the molecular interactions underlying adjacent phases, we
   focus on M3, M21, and M8, which are the most significantly enriched for
   the DEG signatures of MES versus ESC, CP versus MES, and CM versus CP,
   respectively, in both of humans and mice ([106]Figures 3B and [107]4).
   M3, with 915 nodes and 2,417 edges, represents the dynamic transition
   between the ESC and MES phases ([108]Figure 4A; [109]Table S3). Almost
   all of the genes in M3 are downregulated in the MES phase in both
   species. M3 is enriched for the pathways associated with stem cell
   differentiation, such as tumor necrosis factor (TNF), MAPK, Rap1, and
   Ras signaling and pluripotency regulation of stem cells ([110]Figures
   S3A, S3D, and S3E; [111]Table S3). Therefore, the genes in M3 might be
   involved in the differentiation of ESCs into MES. For example, the
   inhibition of SOX2, a key hub in M3, was found to induce the MES
   differentiation from human induced pluripotent stem cells.[112]^26 The
   DEG signatures between MES and CP in both species are significantly
   enriched in M21, which is associated with Wnt signaling,
   vasculogenesis, and heart development ([113]Figure S3B; [114]Table S3),
   indicating that this module could capture molecular interactions during
   the differentiation of MES into CP ([115]Figure 4B; [116]Table S3).
   Unlike M3, M21 harbors a large number of upregulated genes in CP in
   comparison with MES. FZD4, which codes a receptor of Wnt protein, is
   one of the 17 hub genes of M21 and plays a role in activating the Wnt
   signaling pathway with Norrin ligand, which may result in an increase
   of the ESC-derived CPs.[117]^27 Alternatively, the DEG signatures
   between CM and CP in humans and mice, especially the upregulated DEG
   signatures, are most enriched in M8 ([118]Figure 4C; [119]Table S3).
   The genes in M8 are mainly involved in maintaining the function of
   mature CM. Dysregulation of some genes in M8 such as SLC8A and SORBS2
   might cause heart disease.[120]^28^,[121]^29

Figure 4.

   [122]Figure 4
   [123]Open in a new tab

   Co-expression Subnetworks (Modules) Associated with Differentiation
   Phases

   The fill-in color of a node represents the direction of differential
   expression in humans, with green indicating downregulation and yellow
   indicating upregulation. The blue and red borders represent
   downregulation and upregulation in mice, respectively. The diamond and
   circle node shapes correspond to lncRNAs and mRNAs, respectively. (A)
   The subnetwork of M3 underlying the transition of ESC to MES. (B) The
   subnetwork of M21 underlying the differentiation of MES to CP. (C) The
   subnetwork of M8 underlying the differentiation of CP to CM.

   Taken together, the three modules detailed herein are the comprehensive
   molecular networks underlying the sequential transition of CM
   differentiation from ESCs, and many key hubs in these modules are
   potential novel regulators for the dynamic transformation in humans and
   mice.

Human-Specific Co-expression Modules Underlying the Transition of CM
Differentiation

   The species-specific subnetworks and key regulators are also crucial to
   understand the regulation of CM differentiation. As the co-expression
   network is constructed only on the human data, we highlight three
   human-specific modules, M338, M17, and M177, corresponding to three
   differentiation transitions during CM differentiation based on their
   specific enrichment for the human DEG signatures ([124]Figure 3B). M388
   is specifically enriched for the DEGs between MES and ESC in humans
   ([125]Figure 5A; [126]Table S4). Among the pathways enriched in M388
   ([127]Table S4), the Notch signaling has been well studied in human
   stem cell differentiation.[128]^30^,[129]^31 Interestingly, FTL, the
   top hub of M388, has much more dramatic expression during human ESCDCD
   than during mouse ESCDCD ([130]Figure 7D). The DEG signature in the
   differentiation from MES into CP, i.e., human DEG_CP_vs_MES, is
   specifically enriched in M17 ([131]Figure 5B; [132]Table S4), which is
   involved in the regulation of RNA process and metabolism. Moreover,
   long non-coding RNA (lncRNA) MGC4859 is a central hub gene of the
   module without any functional annotation. Among other hub genes, DHCR24
   and UBE2H are the most significantly downregulated and upregulated,
   respectively. Unlike M388 and M17, only 25 of the 42 DEGs in M177 are
   specific to humans ([133]Figure 5C; [134]Table S4). M177 is mainly
   involved in cell cycle. However, SFPQ, the most downregulated hub gene
   in M177, was found to exert a role in human-induced pluripotent stem
   cell-derived CM differentiation.[135]^32 The most upregulated hub gene,
   GPI, in M177 was confirmed to attach glypicans to cell membrane in
   order to promote cardiac specification and differentiation.[136]^33 The
   three human-specific modules, M388, M17, and M177, represent the
   human-specific molecular interactions underlying the sequential phase
   transitions during human ESCDCD.

Figure 5.

   [137]Figure 5
   [138]Open in a new tab

   Human-Specific Modules Associated with Cardiomyocyte Differentiation

   The fill-in color of a node represents the direction of differential
   expression in humans, with blue indicating downregulation and red
   indicating upregulation. The diamond and circle node shapes correspond
   to lncRNAs and mRNAs, respectively. (A) The subnetwork of M388
   underlying the transition of human ESC to MES. (B) The subnetwork of
   M17 underlying the differentiation of human MES to CP. (C) The
   subnetwork of M177 underlying the differentiation of human CP to CM.

Figure 7.

   [139]Figure 7
   [140]Open in a new tab

   Validation of the mRNA Expression Patterns of Key Driver Genes by qPCR
   Experiments

   (A–C) Expression patterns of the key driver genes (KDGs) in M3, M21,
   and M8 in the human and mouse RNA-seq datasets and the quantitative
   real-time PCR experiments. (A) LEFTY1 (upper) and AASS (lower). (B)
   FZD4 (upper) and SH3GLB1 (lower). (C) LRRC10 (upper) and SLC8A1
   (lower). (D) Expression patterns of three global KDGs in the human and
   mouse RNA-seq datasets and the qPCR experiments. The curves show the
   expression levels of KDGs at each phase in the human and mouse RNA-seq
   datasets. The red and green curves represent the human and mouse data,
   respectively. The bar plots show the mRNA expression levels of KDGs
   using quantitative real-time PCR (n = 3). ∗p < 0.05, ∗∗p < 0.01.

TFs for Regulating CM Differentiation

   TFs play pivotal roles in a variety of BPs, including heart
   development. However, how TFs are involved in CM differentiation is
   largely elusive. In our study, 12 TFs are differentially expressed
   during human CM differentiation and have binding information in hESCs
   according to ChIPBase.[141]^34 The targets regulated by each TF can be
   predicted based on whether the TF can bind to the region from 1 kb
   upstream or 1 kb downstream around a gene transcription start site
   (TSS). The targets of each TF are then intersected with genes and each
   DEG signature of the differentiation process in human CM, resulting in
   target gene sets and target DEG sets, respectively. The similarity
   between the 12 TFs was assessed by clustering analysis of their target
   gene sets and target DEG sets ([142]Figures 6A and 6B). The clustering
   results on target gene sets show that the regulation of ZNF143, ATF2,
   TEAD4, GABPA, MXI1, and MYC are similar, while ATF3 and USF1 regulate
   similar target genes. However, during the human CM differentiation,
   only MXI1, MYC, and ZNF143 regulate similar DEG. As shown in
   [143]Figure 6C, MXL1 may regulate all three transitions, and MYC and
   ZNF143 might regulate the differentiation of ESCs to MES to CP, while
   the rest of the nine TFs drive one particular transition.
   Alternatively, the enrichment of the TF targets in the DEG signatures
   suggests that NANOG may specifically regulate the pluripotency of ESCs,
   and EOME may regulate the differentiation from MES into CP
   ([144]Figure 6C; [145]Figure S4A). Finally, the enrichment of 12 TF
   target gene sets in the 212 modules suggests that TFs, including MXI1,
   MYC, TEAD4, UPS1, ZNF143, ATF2, ATF3, EGR1, and GABPA, play a potential
   and broad role in regulating the entire differentiation process from
   ESCs into CM ([146]Figure S4B). However, EOMES, FOXH1, and NANOG may
   specifically regulate transformation between two adjacent phases in
   differentiation ([147]Figure 6D).

Figure 6.

   [148]Figure 6
   [149]Open in a new tab

   Transcription Factors Potentially Regulating the Human Cardiomyocyte
   Differentiation Process

   (A) Clustering analysis of the 12 transcription factors (TFs) based on
   similarity between their target gene sets. (B) Clustering analysis of
   the 12 TFs based on similarity between their target DEG sets. (C)
   Enrichment of the DEG signatures in the 12 TF target gene sets. (D)
   Sunburst for enrichment of three TF target DEG sets in the 212 modules
   from the HCDGEN. The darker the color, the more significant the
   enrichment.

Identification and Validation of Key Regulators in CM Differentiation

   The co-expression network of human CM differentiation provides much
   more information beyond the co-expressed gene modules. The connectivity
   will allow us to identify crucial key regulators in cardiac lineage
   commitment. Through the key driver analysis (KDA)[150]^35 for three
   modules (M3, M8, and M21) and the DEG signatures, we identified 60, 28,
   and 16 key driver genes (KDGs) in M3, M8, and M21, respectively
   ([151]Figures S5A–C). The KDGs with consistent expression in humans and
   mice may play important roles in controlling the transition of CM
   differentiation. For example, the inhibition of SOX2, a known ESC
   marker, could induce the differentiation from ESCs into MES.[152]^26
   DCBLD2, as the only upregulated homologous regulator in M3, also shows
   a potential role in promotion of MES formation. Among the KDGs in M21,
   the mutation of KLF3 could lead to abnormal cardiovascular
   development.[153]^36 Similarly, some KDGs in M8, such as MYOM1, ACTC1,
   and ACTN2, are highly expressed in mature CM for maintaining heart
   morphogenesis.[154]37, [155]38, [156]39 To this end, for each of the
   three modules, we sorted KDGs according to the degree of nodes and
   highlighted expression patterns of two genes from the top five KDGs.
   The selected genes have not been reported to be related to CM
   differentiation, so that we could further explore its new molecular
   mechanism in CM differentiation later ([157]Figures 7A–7C). At the same
   time, we investigated the expression patterns of the top 30 KDGs in the
   whole co-expression network, in which 28 of them contained mouse
   homologous genes ([158]Figure S5D). Similarly, we also picked two KDGs
   (RPL8 and FTL) ([159]Figure 7D). Interestingly, the expression of CAP2
   is quite different from the other 27 KDGs, and this expression pattern
   is consistent in humans and mice, so we also highlight its expression
   ([160]Figure 7D; [161]Figure S5D). Additionally, through differential
   expression analysis statistics, it was found that only eight genes are
   significantly differentially expressed in each state transition during
   human and mouse CM differentiation, and the differential expression
   during each phase transition is consistent between humans and mice.
   Also, they are defined dynamic genes, which may be involved in the
   whole process of differentiation stemming from ESCs into CM
   ([162]Figure 8).

Figure 8.

   [163]Figure 8
   [164]Open in a new tab

   Validation of the mRNA Expression Patterns of the Eight Dynamic Genes
   by qPCR Experiments

   The curves show the expression patterns of eight dynamic-expressed
   genes during cardiomyocyte differentiation in human and mouse RNA-seq
   datasets. The red and green curves represent humans and mice,
   respectively. The bar plots show the mRNA expression levels of the
   selected genes using quantitative real-time PCR (n = 3). ∗p < 0.05,
   ∗∗p < 0.01.

   In summary, we finally selected 17 genes (9 KDGs and 8 dynamic genes)
   as potential key regulators in CM differentiation. [165]Table S5
   includes detailed information on the 17 key regulators from the
   GeneCards,[166]^40 GO,[167]^41 and Kyoto Encyclopedia of Genes and
   Genomes (KEGG)[168]^42 databases. The expression patterns of the 17
   KDGs in CM differentiation were verified by in vitro differentiation
   experiments from mESCs into CM ([169]Figures 7, [170]8, and [171]S6).
   One of the reasons for the incomplete match of RNA sequencing (RNA-seq)
   and quantitative real-time PCR could be the difference of
   differentiation protocols and time kinetics between different batches
   and laboratories.

Discussion

   Although ESCs have been widely used to investigate molecular mechanism
   of heart development, little is known about the relationships between
   humans and mice during CM differentiations from ESCs. Additionally,
   although humans and mice shared key pathways during CM differentiation,
   they have their own characteristics in terms of regulatory circuits
   ([172]Figures S3D and S3E). Therefore, we herein compare the
   transcriptomes to identify similarities and differences in human and
   mouse ESCDCD.

   During CM differentiation, the involved pathways and BPs of human and
   mouse DEGs at corresponding phases are similar, even though mice have
   many more DEGs with more dramatic changes. At the onset of
   differentiation, the signaling pathway regulating pluripotency of hESCs
   and mESCs is inhibited with the repression of pluripotency genes such
   as Nanog, Sox2, and Oct4.[173]^43 Both the Wnt and TGF-β signaling
   pathways have also been extensively studied for their roles in
   controlling fate commitment when ESCs start to differentiate into
   MES.[174]^44^,[175]^45 Similarly, the pathways of DEGs involved in
   transition from CP to CM are both associated with heart function and
   related diseases. However, only Wnt signaling regulates the transition
   from MES to CP in both species, suggesting that different CPs may have
   distinct molecular signatures, even in the same species.[176]^46 The
   pathways enriched in the human- and mouse-specific DEG signatures are
   quite different, indicating that the molecular changes are more
   species-specific.

   In the present study, we identify 212 hierarchical modules in the gene
   co-expression network underlying human CM differentiation. We find that
   M3 and M8 are associated with the cell differentiation phase, while M10
   is involved in heart contraction. The enrichment analysis shows that
   these modules capture the molecular information during cardiac lineage.
   The DEG signatures in mice are significantly enriched in many modules
   of the human network, just as for the DEG signatures in humans.
   Moreover, many modules are enriched for both of human and mouse DEG
   signatures. The human-specific modules are associated with
   differentiation transitions of human ESCDCD. The binding of TFs to the
   distal region of genes is more likely to affect the transcription of
   non-coding RNA.[177]^34 Also, 1 kb upstream and downstream of the TSS
   is considered to be the promoter of the gene, so the binding of TFs in
   this region will initiate transcription of the gene.[178]^47 So, by the
   integration of DEGs, modules, and TF targets, we identified some
   potential regulation mechanisms of TFs during CM differentiation.
   First, the DEGs regulated by MXI1, MYC, and ZNF143 during CM
   differentiation are similar, so we speculate that they may
   synergistically regulate the differentiation process. Second, NANOG,
   EOMES, and FOXH1 specifically regulate a certain phase transition of
   sequential CM differentiation.

   Through gene co-expression network analysis, we also systematically
   uncovered a number of key regulators of CM differentiation, such as
   Meis1, Dkk1, FLRT2, Myl9, and Fabp3. Meis1 is known to play key roles
   in heart development and CM differentiation,[179]^1^,[180]^15 although
   the underlying molecular mechanisms remain elusive. Dkk1 has been
   demonstrated to promote CM differentiation through activating Wnt
   signaling and to suppress both hematopoietic and endodermal
   lineages.[181]^48 FLRT2 is also essential for maintaining heart
   development since its deficiency is associated with cardiac
   insufficiency.[182]^49 Myl9 and Fabp3 have been shown to be involved in
   myocardial infarction and cardiomyopathy,[183]^50^,[184]^51 but their
   roles in CM differentiation have not been elucidated. All of the genes
   identified herein are potentially involved in cardiac lineage
   specification, but further functional verification is needed to clarify
   specific regulatory mechanisms.

   A number of transcriptomic analyses of cardiac-specific differentiation
   from ESCs have been carried out to reveal the regulatory networks.
   However, almost all of them focus on the process only in one particular
   species. Our integrative network study of human and mouse
   transcriptomic data in cardiac-specific differentiation systematically
   reveals gene expression patterns specific to each species or common to
   both of them.

   There are some limitations with this work. First, the number of samples
   in the mouse ESCDCD is too small to construct a whole co-expression
   network, and thus key drivers of mouse ESCDCD may be missing. Second,
   the functions of the predicted key drivers of human ESCDCD need be
   further investigated to elucidate the molecular regulations in CM
   differentiation, as indicated by the co-expression network in the human
   ESCDCD.

   In conclusion, our comparative study of humans and mice reveals high
   similarity and specificity in transcriptomic changes in CM
   differentiation from hESCs and mESCs. Gene co-expression network
   analysis dissects the complex DEG signatures by providing a detailed
   gene regulatory relationship in the CM differentiation process. Some
   novel TFs, including NANOG, EOMES, and FOXH1, are identified to
   specifically regulate CM differentiation. More key regulators are also
   systematically predicted through the network analysis, and the
   expression patterns of several key genes are further experimentally
   validated. Taken together, our results present potential regulation
   information during CM differentiation in both humans and mice.

Materials and Methods

Cardiomyocyte Differentiation RNA-Seq Data Acquisition and Preprocessing

   The data for ESC-derived CM in mice (GEO: [185]GSE47948) and humans
   (GEO: [186]GSE64417) were obtained from the NCBI GEO
   databases.[187]^52^,[188]^53 Both datasets include four phases, for
   which differentiating cultures were enriched for ESCs, MES, cardiac
   precursors, and CM. For the mice, embryonic day 14 (E14) transgenic
   (Tg) (Nkx2-5-EmGFP) mESCs were cultured in serum-free med and were
   directly differentiated into CM following the protocols of previous
   studies.[189]^1^,[190]^54 RNA was extracted using TRIzol reagent and
   sequenced by an Illumina genome analyzer IIx. The mouse RNA-seq data
   included eight samples with two replicates at each phase. In terms of
   human data, the RUES2 hESCs were cultured in mouse embryonic
   fibroblast-conditioned medium and were directly differentiated in CM by
   a monolayer platform with a modified protocol based on previous
   studies.[191]^17^,[192]55, [193]56, [194]57 RNA was extracted using the
   mirVana kit (Life Technologies), and the RNA library was created
   according to the manufacturer’s protocol for the Ribo-Zero Gold sample
   prep kit. The human data for 71 samples were generated by Illumina
   HiSeq 2500 in biological triplicates. Cleaned fastq data for humans and
   mice were both downloaded from ArrayExpress (ArrayExpress:
   [195]E-GEOD-64417; ArrayExpress: [196]E-GEOD-47948),[197]^58 and the
   overall quality of the data was checked using FastQC, a Java-based tool
   for corresponding quality scores.[198]^59

Mapping and Differentially Expressed Analysis

   The reference genome sequence and annotation information for
   mm9[199]^16 and hg19 (GRCH38)[200]^60 were downloaded from the Ensembl
   database[201]^61 and were respectively used for the alignment of the
   reads for human or mouse RNA-seq samples. First, TopHat2[202]^62
   aligned the reads to the indexed genome built by Bowtie 2[203]^63 for
   the reference genome, with default options. Subsequently, the FPKM
   values for genes were calculated after assembling and quantifying
   transcripts when the reads were aligned to the reference genome using
   Cufflinks v2.2.1 software.[204]^64 Finally, the analysis of DEGs was
   performed using Cuffdiff2 in Cufflinks2 with default options, except
   that the dataset was a time series (T).

   To filter the genes that were not expressed or were caused by
   sequencing error, the genes with greater than 1 FPKM in at least one
   time point sample were retained for subsequent analyses. DEGs were
   identified by an adjusted p value (adjust-p) of 0.05 and fold change
   (FC) of 2. In order to detect the different levels of DEGs, we defined
   10 DEG signatures, including all DEGs (DEG_all), DEGs between two
   adjacent differentiation phases (DEG_MES_vs_ESC, DEG_CP_vs_MES,
   DEG_CM_vs_CP), and upregulated and downregulated DEG signatures
   (Up_MES_vs_ESC, Down_MES_vs_ESC, Up_CP_vs_MES, Down_CP_vs_MES,
   Up_CM_vs_CP, Down_CM_vs_CP)

Regulatory Information Analysis of TFs

   We obtained the TF binding information from ChIPBase v2.0, which
   curates ∼10,200 chromatin immunoprecipitation sequencing (ChIP-seq)
   datasets across 10 species from multiple databases, such as ENCODE,
   GEO, and NIH Roadmap Epigenomics Project.[205]^34 Since we mainly
   focused on the CM differentiation of ESCs, we selected only TF target
   information for human ESC lines (H1 or H19). The regulatory region of
   TFs is defined between 1 kb upstream and 1 kb downstream of the TSS.
   Combined with the information of DEGs during human CM differentiation,
   the regulatory information of 12 differentially expressed TFs was
   downloaded for downstream analyses. We respectively intersect the
   target genes of the 12 TFs with the genes and DEG signatures in human
   CM differentiation as target gene sets and target DEG sets. Then, using
   the hypergeometric test, we clustered the target gene sets and target
   DEG sets of 12 TFs. Similarly, we also tested the enrichment of the
   target genes of 12 TFs in the DEG signatures between two adjacent
   phases as well as all of the modules in the human co-expression
   network.

Gene Co-expression Network Analysis

   MEGENA is an R package for multiscale embedded gene co-expression
   network analysis. It shows improved performance over co-expression
   network construction.[206]^25 MEGENA contains three steps: (1)
   construction of a planar filtered network (PFN); we calculated the
   correlation based on the gene expression profiles, and then filtered
   the gene pairs using a parallelized screening procedure to obtain a
   fast planar filtered network; (2) multi-scale clustering analysis; from
   connected components of the initial PFN, multi-scale clustering was
   performed for each parent cluster to get a hierarchy clustering result;
   and (3) downstream analysis. The significant hubs were identified based
   on the topology of networks using multiscale hub analysis (MHA). Since
   the number of samples is recommended to be greater than 20,[207]^65 we
   only constructed the co-expression network based on the human CM
   differentiation data. MEGENA took as input the significant gene-gene
   correlations by an adjusted p-value threshold of 0.05 to construct a
   PFN, which was then used to identify the significant modules and the
   corresponding hubs with high connectivity. For each module, we
   performed the GO and KEGG pathway enrichment analysis by R packages in
   GO.db (the GO from GO) and KEGG.db (the KEGG pathway from KEGG).

Hypergeometric Test

   A hypergeometric test was used to calculate the significance of
   enrichment and dataset intersection. Then, the probability of the
   hypergeometric test is
   [MATH: <mrow><mtext>P</mtext><mrow><mo>(</mo><mrow><mtext>X</mtext><mo
   linebreak="badbreak">=</mo><mtext>k</mtext></mrow><mo>)</mo></mrow><mo
   linebreak="badbreak">=</mo><mfrac><mrow><msubsup><mi>C</mi><mi>M</mi><m
   i>k</mi></msubsup><msubsup><mi>C</mi><mrow><mi>N</mi><mo
   linebreak="badbreak">−</mo><mi>M</mi></mrow><mrow><mi>n</mi><mo
   linebreak="badbreak">−</mo><mi>k</mi></mrow></msubsup></mrow><mrow><msu
   bsup><mi>C</mi><mi>N</mi><mi>n</mi></msubsup></mrow></mfrac><mo>.</mo><
   /mrow> :MATH]

   In the matter of intersection of DEG signatures, k is the overlap of
   two DEG signatures. N is the size of human genes or the number of union
   between human and mouse genes. The number of DEG signatures is
   respectively represented by M and n. For the enrichment of DEG
   signatures on modules, the N, M, n, and k mean the total number of
   genes in the network, the size of the module, the number of genes in
   the DEG signature, and the overlap between the module and the DEG
   signature, respectively. For the intersection of 12 TFs, k is the
   overlap of target gene sets or target DEG sets in the human CM
   differentiation regulated by two TFs. M and n mean the target gene sets
   or target DEG sets regulated by two TFs, respectively. N represents the
   background genes, which is the whole genes or all DEGs during the human
   CM differentiation. All hypergeometric tests were performed using the
   phyper function in R. The p values were corrected by the
   Benjamini-Hochberg procedure to get the adjust-p.

GO Function and KEGG Pathway Enrichment Analysis

   ClusterProfiler,[208]^66 an R package, was used to perform the GO and
   KEGG pathway enrichment analyses. The threshold of significance was
   0.05 for multiple testing corrected p values.

KDG Analysis

   KDA is an R package and is developed to identify potential key
   regulatory components with respect to various biological
   contexts.[209]^35 KDA uses a set of DEGs and the directed co-expression
   network as input.

Cell Culture

   Mouse D3 ESCs were cultured under 5% CO[2] at 37°C on feeder cells in
   the dishes coated with 0.1% gelatin in the media (500 mL of
   DMEM-high-glucose medium [Gibco, Carlsbad, CA, USA], 6.25 mL of
   non-essential amino acids [Gibco, Carlsbad, CA, USA], 6.25 mL of
   l-glutamine [Gibco, Carlsbad, CA, USA], 12.5 mL of
   penicillin/streptomycin [Gibco, Carlsbad, CA, USA], 94 mL of fetal
   bovine serum [Gibco, Carlsbad, CA, USA], 4.4 μL of β-mercaptoethanol
   [Sigma-Aldrich, St. Louis, MO, USA], and 62.5 μL of leukemia inhibitory
   factor [Millipore, Temecula, CA, USA]).[210]^67 Cells were passaged
   every 2–3 days using 0.25% trypsin-EDTA. The feeder layers’ cells were
   prepared according to the previous description.[211]^68

Cardiomyocyte Differentiation

   mESCs were differentiated into CM by the “hanging drop” method[212]^67
   and then aggregated into embryoid bodies (EBs). The dissociation of
   mESCs was accomplished using 0.25% trypsin-EDTA and then resuspension
   in differentiation medium, both of which, and the medium of EB
   formation, were prepared as described.[213]^67 The hanging drops were
   hung for 2 days to form the EBs. The EBs were then collected and
   transferred onto 0.1% gelatin-coated six-well plates (15 EBs/well) at
   the following phases of differentiation: ESC phase, day 0; MES phase,
   days 4–6; CP phase, days 8–10; CM phase, days 12–15.

Quantitative Real-Time PCR

   Total RNA was extracted using the standard TRIzol protocol (Invitrogen,
   Carlsbad, CA, USA). Then, 1,000 ng of RNA was used for the reverse
   transcription to synthesize cDNA with the PrimeScript RT reagent kit
   with genomic DNA (gDNA) Eraser (TaKaRa, Kusatsu, Japan). SYBR Premix Ex
   Taq II (TaKaRa, Kusatsu, Japan) was used in the real-time PCR reaction
   system and then carried out on a Bio-Rad CFX Connect real-time system.
   The related primers are listed in [214]Table S6.

Tools for Plotting and Visualizing Networks

   All plots were drawn by using R packages. ggplot2[215]^69 is a
   well-known R package and is used to draw the bubble plots for function
   enrichment results and heatmaps. The sunburst plots were generated by
   the sunburstR package. The network and module-based subnetworks were
   visualized with Cytoscape.[216]^70

Statistical Analysis

   A Student’s test (two-tailed) was used to perform statistics analysis,
   and the error bar represents the standard error of the mean (SEM) of
   three independent experiments. p <0.05 was considered to represent a
   significant difference.

Author Contributions

   Y.W., N.Y., and L.L. conceived the study and designed the experiments.
   Y.W. performed all bioinformatics analyses. X.Z., R.C., and Q.L. helped
   with the bioinformatics analysis. N.Y., Y.H., H.J., H.L., Y.G., and
   J.Z. were responsible for the experimental validation. B.Z. and L.L.
   supervised the entire research process. Y.W. wrote the manuscript.
   B.Z., L.L., L.P., C.T., and M.L. revised the manuscript. All authors
   reviewed and approved the manuscript.

Conflicts of Interest

   The authors declare no competing interest.

Acknowledgments