Abstract

Liver cancer arises from the evolutionary selection of the dynamic tumor microenvironment (TME), in which tumor cells generally become more heterogeneous; however, the mechanisms of TME-mediated transcriptional diversity in liver cancer remain unclear. Here, we assess transcriptional diversity in 15 liver cancer patients by single-cell transcriptome analysis and observe that the transcriptional diversity of tumor cells is associated with stemness. Tumor-associated fibroblasts (TAFs), as a potential driving force behind the heterogeneity of tumor cells within and between tumors, were predicted to interact with highly heterogeneous tumor cells via COL1A1-ITGA2. Moreover, COL1A1-mediated YAP-signaling activation might be the mechanistic link between TAFs and tumor cells with increased transcriptional diversity. Strikingly, the levels of COL1A1, ITGA2, and YAP are associated with morphological heterogeneity and poor overall survival of liver cancer patients. Beyond providing a potential mechanistic link between the TME and heterogeneous tumor cells, this study establishes that collagen-stimulated YAP activation is associated with transcriptional diversity in tumor cells by upregulating stemness, providing a theoretical basis for individualized treatment targets.

Subject terms: Tumour heterogeneity, Cancer stem cells

Introduction

Most liver cancer cases develop mainly due to evolutionary selection by an adverse tumor microenvironment (TME), in which deterministic tumor features are preferentially selected for their survival fitness [1, 2]. Accordingly, complex genomics and TMEs create a molecular conundrum for the diagnosis and treatment of liver cancer and contribute to therapeutic failure and ultimately lethal outcomes [2]. Tumor cells with strong stemness can generate heterogeneous subtypes through multidirectional differentiation [3, 4].
However, the underlying mechanisms of TME-mediated heterogeneity in liver cancer remain unclear. Molecular characterization of cell communities at the single-cell level may help shed light on the complex interplay among tumor cells and stromal cells in the TME.

Liver cancer is etiologically and biologically heterogeneous, comprising many molecular subtypes [5, 6], which are clinically treated as separate entities. Interestingly, there is evidence that some molecular subtypes of hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (iCCA) share similar tumor biology and key drivers [7]. Accordingly, the hepatic microenvironment could direct lineage commitment to either HCC or iCCA in the presence of the same oncogenic drivers [8, 9]. The TME likely improves the capability of tumor cells to grow in a poor microenvironment [2, 8, 9]. The cellular components of the TME are highly complex, resulting in high microenvironmental diversity associated with poor prognosis [2, 9]. As an important constituent of the TME, the tumor-associated fibroblast (TAF) is regarded as a promising therapeutic target for limiting cancer progression [10]. Studies have established that TAFs can facilitate cancer progression via aberrant extracellular matrix (ECM) remodeling with collagen I (COL-I) enrichment [10, 11]. However, whether and how TAFs regulate the heterogeneity of tumor cells is unclear and requires further investigation.

Hippo-Yes-associated protein (YAP) signaling is known to regulate stem cell homeostasis, tissue regeneration and tumor progression [12]. YAP is a major downstream effector of the Hippo pathway [12, 13]. Phosphorylation of YAP results in its cytoplasmic retention and degradation [14, 15].
When dephosphorylated, YAP can enter the nucleus and bind transcription factors to regulate the expression of many target genes [16], usually increasing cell proliferation and decreasing apoptosis [17]. YAP hyperactivation has been observed in many tumors, including liver cancer [13, 18]. However, whether and how Hippo-YAP signaling responds to TME stimuli to affect the tumor heterogeneity of liver cancer remains unknown.

Here, upon assessing the single-cell sequencing results of liver cancer patients, we found that transcriptional diversity of tumor cells potentially coexisted with increased stemness. TAFs interacted with highly heterogeneous tumor cells via COL1A1-ITGA2, which activated YAP signaling to regulate transcriptional diversity in tumor cells by improving their stemness. Thus, our study uncovered intercellular crosstalk between TAFs and tumor cells involved in transcriptional diversity in liver cancer, suggesting potential targets for liver cancer therapy.

Experimental procedures

Cell culture and establishment of primary tumor-associated fibroblasts (TAFs)

HCC cell lines (Huh7, LM-3, and HepG2) were purchased from ATCC and cultured in DMEM (Gibco) with fetal bovine serum (FBS, 10%; Gibco) and penicillin-streptomycin (100 U; Gibco) at 37 °C, 5% CO2. Fresh human tumor tissues were used to harvest primary TAFs. Tissue samples were cut into small pieces (approximately 3 mm³) and placed onto six-well cell culture plates with Dulbecco's Modified Eagle's Medium (DMEM) containing 10% FBS (Gibco), 10 ng/mL basic fibroblast growth factor (bFGF), 100 U/mL penicillin and 100 mg/mL streptomycin (Gibco). TAFs started to migrate out of the tissue pieces 3 to 7 days later. After 2 weeks, the remnants of the tissue were carefully removed, and TAFs were subcultured every 3 days; passages 3–10 were used for this study [19, 20].
Reagents and antibodies

DEN (N0756) and antibodies specific for Flag (M3165) were obtained from Sigma-Aldrich (MO). Antibodies specific for YAP (sc-101199, immunofluorescence [IF]) were purchased from Santa Cruz Biotechnology (TX), and those specific for YAP (ab39361, IP/IHC), ITGA2 (ab181548), and COL1A1 (ab34710) were purchased from Abcam (MA).

Animal experiments

Male SD rats (10–12 weeks, 220–250 g) were obtained from the Shanghai Experimental Center, Chinese Science Academy, Shanghai. The rats were randomly grouped and maintained at an animal facility under pathogen-free conditions. All animal experiments were performed according to the animal protocols approved by the Shanghai Eastern Hepatobiliary Surgery Hospital Animal Care Committee. To induce the liver cancer model, 100 p.p.m. DEN (95 μg/ml) was added to the drinking water of rats for 16 weeks. Liver tumors were measured with electronic calipers and counted (for tumors with diameters ≥1 mm). Liver sections were preserved in 10% neutral-buffered formalin for histopathological and immunohistochemistry (IHC) analyses, blood was collected, and serum was isolated for biochemical analysis. All analyses were conducted in an investigator-blinded fashion.

Sphere formation assay

Sphere formation assays were performed in 6-cm culture dishes coated with 1% agarose. HPCs suspended in serum-free medium were seeded at a density of 5,000 cells/dish and incubated for 3–7 days in low or high concentrations of collagen. The numbers and sizes of spheres were assessed manually under a microscope.

Tissue microarray and IHC staining

Tissue microarray (TMA) sections of tumor and adjacent nontumor specimens were prepared by Shanghai Outdo Biotech Co., Ltd. (Shanghai, China). This TMA contains 77 paired fresh liver carcinoma and tumor-adjacent tissue samples, and was used to examine the expression profiles of YAP, ITGA2 and COL1A1 by IHC.
For IHC, TMA sections were incubated with anti-ITGA2 antibody (1:200 dilution), anti-YAP1 antibody (1:100 dilution), or anti-COL1A1 antibody (1:200 dilution). IHC staining was scored by two independent pathologists who were blinded to the clinical characteristics of the patients. The scoring system was based on the intensity and extent of staining: staining intensity was classified as 0 (negative), 1 (weak), 2 (moderate), or 3 (strong); and the staining extent was determined by the percentage of positive cells (out of 200 examined cells) and was classified as 0 (<5%), 1 (5–25%), 2 (26–50%), 3 (51–75%), or 4 (>75%). According to the staining intensity and staining extent scores, the IHC results were classified as 0–1, negative (-); 2–4, weakly positive (+); 5–8, moderately positive (++); and 9–12, strongly positive (+++).

Immunofluorescent staining

For antigen colocalization studies, fluorescence immunostaining of multiple proteins in tissues was performed with a sequential fluorescent method. Primary antibodies against ITGA2 (1:200 dilution), YAP (1:100 dilution) and COL1A1 (1:200 dilution) were used. Alexa 488-conjugated goat anti-mouse IgG (Invitrogen, Carlsbad, CA), Alexa 561-conjugated goat anti-rabbit IgG (Invitrogen), and Alexa 647-conjugated goat anti-rabbit IgG were used as secondary antibodies.

Real-time quantitative polymerase chain reaction (RT-PCR)

Total RNA was extracted from cells by using TRIzol Reagent (Invitrogen, Carlsbad, CA, USA), and further treated with RNase-free DNase (Promega, Madison, WI, USA) to eliminate any residual DNA. Complementary DNA was prepared by using oligo dT18 primers and MMLV reverse transcriptase (Promega). RT-PCR was performed on a Light Cycler 480 system (Roche Diagnostics, Mannheim, Germany).
The primers used in this experiment were as follows: SMAD2, forward, 5′-CCCACTCCATTCCAGAAAAC-3′, and reverse, 5′-GAGCCTGTGTCCATACTTTG-3′; SOX9, forward, 5′-AGGAAGCTGGCAGACCAGTA-3′, and reverse, 5′-ACGAAGGGTCTCTTCTCGCT-3′; MYC, forward, 5′-AACAGGAACTATGACCTCG-3′, and reverse, 5′-AGCAGCTCGAATTTCTTC-3′; and FGF8, forward, 5′-CAGTTGGAATTGTGGCAATCAAAG-3′, and reverse, 5′-CTTTTGATTTAAGGCAACGAACATTTC-3′.

Patients and follow-up analysis

The cohort in this study comprised 77 patients from January 1997 to December 2007. All patients were randomly selected from those with liver cancer who underwent hepatectomy in the Shanghai Eastern Hepatobiliary Surgery Hospital. Informed consent was obtained from each patient under a protocol approved by the Hospital Research Ethics Committees. None of the patients received preoperative treatment, and recurrence was confirmed by contrast-enhanced imaging studies or cholangiography according to standard guidelines for liver cancer. Overall survival (OS) was defined as the interval between surgery and either death or the last follow-up. The data were censored at the last follow-up for surviving patients.

Transfection and viral infection

Approximately 0.5–2 µg/ml plasmid was transfected using Lipofectamine 2000 (Invitrogen) according to the manufacturer's instructions. Where indicated, cells were infected with virus expressing GFP (multiplicity of infection (MOI), 10) or YAP (MOI, 10) in serum-free medium for the indicated times. Eight hours later, the cells were rinsed and cultured in fresh medium. After 48 h, the cells were cultured in DMEM supplemented with 10% FBS.

ChIP-qPCR

ChIP was performed as previously described [21]. Briefly, cells were cross-linked with freshly prepared formaldehyde (1.42%) for 15 min, and treated with glycine (125 mM) for 5 min at room temperature. After two rounds of washing with ice-cold PBS, the cells were scraped and collected by centrifugation.
Pelleted cells were resuspended in 400 μL of ChIP lysis buffer (50 mM HEPES/KOH, pH 7.5; 140 mM NaCl; 1 mM EDTA; 1% Triton X-100; 0.1% Na-deoxycholate and protease inhibitors) and subjected to sonication with a Bioruptor to shear the chromatin (30 s on at high power, 30 s off; 20 cycles). After sonication, samples were further diluted twofold with lysis buffer and centrifuged to clear the supernatant. Eighty microliters of supernatant (1/10 of total) was directly processed to extract total DNA as whole-cell input. The remaining supernatants were transferred to new Eppendorf tubes and incubated with either IgG or YAP antibodies (14074, Cell Signaling Technology) at 4 °C overnight. Samples were then treated with prewashed protein A/G beads (L2118; Santa Cruz) for another 3 h, washed five times with the indicated buffers and resuspended in 100 μL of 10% Chelex (1421253; Bio-Rad). The samples were boiled for 10 min and centrifuged at 4 °C for 1 min. Supernatants were transferred to new tubes. After that, another 120 μL of Milli-Q water was added to each bead pellet, which was vortexed for 10 s and centrifuged again to spin down the beads. The supernatants were combined as templates for follow-up qPCR analysis.

ChIP-seq

ChIP-seq was performed based on a previous protocol with minor modifications [22]. Cells stably expressing Flag-tagged YAP were subjected to the same treatments as described above to obtain cell pellets. After that, cells were resuspended in 400 μL ChIP digestion buffer (20 mM Tris-HCl, pH 7.5; 15 mM NaCl; 60 mM KCl; 1 mM CaCl2 and protease inhibitors). To shear the chromatin, cells were digested with an appropriate amount of micrococcal nuclease (MNase, M0247S, NEB) at 37 °C for 20 min to ensure that the majority of chromatin was mono- and di-nucleosomes. The reaction was stopped with 2X stop buffer (100 mM Tris-HCl, pH 8.1; 20 mM EDTA; 200 mM NaCl; 2% Triton X-100; 0.2% Na-deoxycholate and protease inhibitors).
Samples were further sonicated with a Bioruptor at high power for 15 cycles (30 s on, 30 s off) and then centrifuged to remove debris. Next, soluble chromatin was immunoprecipitated with FLAG antibodies (F3165, Sigma) at 4 °C together with prewashed protein A/G beads. After extensive washing with the indicated buffers, samples were eluted and reverse cross-linked in elution buffer (10 mM Tris-HCl, pH 8.0; 10 mM EDTA; 150 mM NaCl; 5 mM DTT and 1% SDS) at 65 °C overnight. Then, sequential digestion with DNase and Proteinase K was performed, and the DNA was purified with a PCR purification kit (B518141, Sangon Biotech). DNA that was successfully collected from three ChIP assays was pooled to generate libraries with the Ovation Ultra-Low Library Prep kit (NuGEN) according to the manufacturer's instructions. Sequencing was performed on an Illumina HiSeq 2500 platform.

Cell dissociation

HCC tumor tissues and adjacent tissues were collected after surgical resection into MACS Tissue Storage Solution (Miltenyi Biotec) in a 50 mL conical tube and transported on ice to the laboratory. Briefly, samples were first washed with phosphate-buffered saline (PBS), minced into small pieces (approximately 1 mm³) on ice, and enzymatically digested with collagenase I (Worthington) for 15 min at 37 °C, with agitation. After digestion, samples were sieved through a 70 µm cell strainer and centrifuged at 300 × g for 5 min. The cell pellet was resuspended in 1 mL freezing medium (Gibco) for long-term cryopreservation in liquid nitrogen. Throughout the dissociation procedure, cells were maintained on ice whenever possible, and the entire procedure was completed in less than 1 h.
scRNA-seq data processing

scRNA-seq data of tumor samples (n = 15; consistent with the original article, a sample was taken into account only if it contained more than 20 tumor cells) (GEO: GSE125449), healthy tissues (n = 4) and cirrhotic tissues (n = 3) (GEO: GSE136103) were filtered (both genes and cells) and normalized by using the Seurat package (version 3.0) [23] in R (version 3.5.3). Genes expressed in fewer than three cells per sample were excluded, as were cells that expressed fewer than 500 genes or had a mitochondrial gene content >20% of the total UMI count. The total number of transcripts in each single cell was normalized to 10,000. Highly variable genes were detected according to the average expression (between 0.05 and 3) and dispersion (above 0.5) of the genes, followed by data centering (subtracting the average expression) and scaling (dividing by the standard deviation). These variable genes were considered to account for cell-to-cell differences, and were further used for PCA. The first 20 PCs, selected according to their eigenvalues, were used for t-SNE analysis.

Identification of nonmalignant cell types

We extracted the transcriptome data of cells from the expression profiles of all the single cells evaluated. Similar to the total cell analysis, we first selected variable genes across cells, based on criteria of average expression (between 0.05 and 3) and dispersion (above 0.5) of the genes. Then, we performed data scaling followed by dimension reduction with PCA. The first 20 PCs were selected for t-SNE analysis. Different subclusters of cells were revealed on the t-SNE plot. We annotated the cells based on known cell lineage-specific marker genes as T cells (CD4, CD3E, CD3D, CD3G, CD8A, CD8B), B cells (CD79A, SLAMF7, BLNK, FCRL5), TECs (PECAM1, VWF, ENG, CDH5), CAFs (COL1A2, FAP, ACTA1, COL3A1, COL6A1), TAMs (CD14, CD163, CD68, CSF1R), and HPC-like cells (EPCAM, KRT19, PROM1, ALDH1A1, CD24).
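
The quality-control thresholds stated under scRNA-seq data processing (genes detected in at least three cells, cells with at least 500 detected genes and at most 20% mitochondrial UMIs, and per-cell totals rescaled to 10,000) can be illustrated with a minimal sketch. The study itself used Seurat 3.0 in R, so the Python below is not the authors' code. The function name `filter_and_normalize`, the nested-dictionary input layout, and the use of the `MT-` gene-symbol prefix to recognize mitochondrial genes are all assumptions made for illustration. The sketch also assumes that per-cell QC metrics are computed on raw counts before gene filtering, and it omits any log transformation that a Seurat workflow may apply after rescaling.

```python
from typing import Dict

# Hypothetical input layout: cell barcode -> {gene symbol -> UMI count}
Counts = Dict[str, Dict[str, int]]


def filter_and_normalize(
    counts: Counts,
    min_cells_per_gene: int = 3,      # genes seen in fewer cells are dropped
    min_genes_per_cell: int = 500,    # cells with fewer detected genes are dropped
    max_mito_fraction: float = 0.20,  # cells above this mitochondrial share are dropped
    scale_to: float = 10_000.0,       # target total per cell after rescaling
    mito_prefix: str = "MT-",         # assumption: human mitochondrial gene symbols
) -> Dict[str, Dict[str, float]]:
    """Apply the gene filter, the cell filter, and total-count rescaling."""
    # Gene filter: count the number of cells in which each gene is detected.
    cells_per_gene: Dict[str, int] = {}
    for genes in counts.values():
        for gene, n in genes.items():
            if n > 0:
                cells_per_gene[gene] = cells_per_gene.get(gene, 0) + 1
    kept_genes = {g for g, c in cells_per_gene.items() if c >= min_cells_per_gene}

    normalized: Dict[str, Dict[str, float]] = {}
    for cell, genes in counts.items():
        detected = {g: n for g, n in genes.items() if n > 0}
        total = sum(detected.values())
        if total == 0 or len(detected) < min_genes_per_cell:
            continue  # empty or low-complexity cell
        mito = sum(n for g, n in detected.items() if g.startswith(mito_prefix))
        if mito / total > max_mito_fraction:
            continue  # likely damaged or dying cell
        kept = {g: n for g, n in detected.items() if g in kept_genes}
        kept_total = sum(kept.values())
        if kept_total == 0:
            continue
        # Rescale so that the retained counts of each cell sum to `scale_to`.
        normalized[cell] = {g: n * scale_to / kept_total for g, n in kept.items()}
    return normalized
```

Calling `filter_and_normalize(counts)` with the default arguments applies the thresholds quoted in the text; smaller thresholds can be passed to try the function on toy data.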
CNV estimation

Cells defined as endothelial cells, fibroblasts, and macrophages were used as references to identify somatic copy number variations with the R