Graphical abstract [29]graphic file with name ga1.jpg [30]Open in a new tab Abbreviations: HCC, hepatocellular carcinoma; EGCG, epigallocatechin gallate; FIS, fisetin; ECM, extracellular matrix; RT-qPCR, reverse transcription-quantitative real time PCR; DEGs, differentially expressed genes; DMSO, dimethyl sulfoxide; MEM, minimum essential medium; STIM, stimulated; SD, standard deviation; TF, transcription factor; GO, Gene Ontology; SEM, standard error of mean; EMT, epithelial to mesenchymal transition; ADAM, a disintegrin and metalloproteinase with thrombospondin motifs; MMPs, matrix metalloproteinases; ADAMTS9, ADAM metallopeptidase with thrombospondin type 1 motif 9; MMP11, Matrix Metallopeptidase 11; MMP9, Matrix Metallopeptidase 9; SERPINE1, Serpin Family E Member 1; CTGF, Connective Tissue Growth Factor; PDGFRB, Platelet Derived Growth Factor Receptor Beta; HSPB1, Heat Shock Protein Family B (Small) Member 1; HSPA2, Heat Shock Protein Family A (Hsp70) Member 2; CLIC3, Chloride Intracellular Channel 3 Keywords: RNA-sequencing, Hepatocellular carcinoma, Epigallocatechin gallate, Fisetin, Transcription factors, Gene Ontology, Reactome Pathways Abstract Hepatocellular carcinoma (HCC) is an essentially incurable inflammation-related cancer. We have previously shown by network analysis of proteomic data that the flavonoids epigallocatechin gallate (EGCG) and fisetin (FIS) efficiently downregulated pro-tumor cytokines released by HCC through inhibition of Akt/mTOR/RPS6 phospho-signaling. However, their mode of action at the global transcriptome level remains unclear. Herein, we endeavor to compare gene expression alterations mediated by these compounds through a comprehensive transcriptome analysis based on RNA-seq in HEP3B, a responsive HCC cell line, upon perturbation with a mixture of prototypical stimuli mimicking conditions of tumor microenvironment or under constitutive state. Analysis of RNA-seq data revealed extended changes on HEP3B transcriptome imposed by test nutraceuticals. Under stimulated conditions, EGCG and FIS significantly modified, compared to the corresponding control, the expression of 922 and 973 genes, respectively, the large majority of which (695 genes), was affected by both compounds. Hierarchical clustering based on the expression data of shared genes demonstrated an almost identical profile in nutraceutical-treated stimulated cells which was virtually opposite in cells exposed to stimuli alone. Downstream enrichment analyses of the co-modified genes uncovered significant associations with cancer-related transcription factors as well as terms of Gene Ontology/Reactome Pathways and highlighted ECM dynamics as a nodal modulation point by nutraceuticals along with angiogenesis, inflammation, cell motility and growth. RNA-seq data for selected genes were independently confirmed by RT-qPCR. Overall, the present systems approach provides novel evidence stepping up the mechanistic understanding of test nutraceuticals, thus rationalizing their clinical exploitation in new preventive/therapeutic modalities against HCC. 1. Introduction Hepatocellular carcinoma (HCC) manifests as one of the most common and aggressive types of cancer on a global scale [31][1]. It develops under conditions of chronic inflammation [32][2] and liver cirrhosis [33][3], and is considered to be a highly vascularized tumor [34][4]. HCC incidence can be associated with a significant number of causative factors, the most important of which being viral hepatitis B and C, excessive consumption of alcohol, obesity, non-alcoholic fatty liver disease and dietary exposure to aflatoxin B1 [35][1], [36][3], [37][5], [38][6]. Underlying molecular traits of the disease, indicate that hepatocarcinogenesis is a particularly complex process, involving both genetic and epigenetic abnormalities [39][7]. Aberrant activation or inhibition of key signaling pathways that lead to excess cell proliferation and growth, along with an already established condition of oxidative stress, inflammatory tumor microenvironment, deregulated extracellular matrix (ECM) and rich angiogenic activity, shape the mosaic image of HCC incidence, progression and metastatic potential [40][8]. In most cases, HCC is diagnosed in advanced stages, resulting in the absence of many effective treatment regimes [41][9]. Therefore, it is urgent to identify reliable diagnostic biomarkers and develop promising drug candidates for disease prevention or therapy. To this end, a growing body of research has focused on natural dietary compounds known as nutraceuticals - a term coined from “nutrition” and “pharmaceuticals” [42][10] - due to their promising cancer chemopreventive/chemotherapeutic activities and limited side-effects [43][11]. Accordingly, several nutraceuticals have emerged, as inviting alternative candidates [44][12], [45][13] or complementary synergistic agents to conventional therapies [46][14], [47][15]. Their biological activities examined in several cancer types are associated with multiple effects on prominent cellular functions, including modulation of gene expression through epigenetic reprogramming or transcription factor alterations [48][11], [49][16], [50][17], [51][18], [52][19]. Flavonoids represent an important family of plant-derived nutraceuticals possessing potent anti-oxidative properties that render them attractive as inhibitors of oxidative stress, a cancer hallmark known to trigger signaling pathways of cell growth, survival, motility, inflammation, angiogenesis and metastasis, in both tumor and tumor-associated stroma cells [53][20]. Regardless of the growing attention to nutraceuticals there is not yet a solid mechanistic basis to support health claims in cancer patients. Thus, a more thorough understanding of their mode of action and molecular targets is strongly required. To this direction, we previously applied a systems biology approach combining data from high throughput Luminex immunoassays with network analysis in three human HCC cell lines exposed to selected nutraceuticals and/or distinct pro-tumor stimuli and explored the mechanisms underlying their modulatory effects on secreted cytokines at the level of phosphoproteins signaling [54][21]. Acquired results highlighted the polyphenol (-)-epigallocatechin gallate (EGCG) - the main catechin of green tea - and fisetin (FIS) - a flavonol widely found in many fruits and vegetables - as the most promising compounds against HCC [55][21], in agreement with other reports demonstrating their broad anti-cancer activities in numerous cancer preclinical models [56][22], [57][23], [58][24]. To pursue a further systems investigation of EGCG and FIS mechanisms at the transcriptome level, in the present work, HEP3B cells (a nutraceutical-responsive human HCC cell line [59][21]), are treated with test compounds under basal conditions or following induction with a mixture of prominent inflammatory/growth signaling stimuli, and the differential gene expression is monitored by RNA-seq analysis. Statistically significant differentially expressed genes (DEGs) are then searched for potentially co-modified targets and expression motifs, while thorough downstream data analyses are further conducted in order to explore significant connections of target genes to specific transcription factors, biological processes and pathways. Our results are expected to provide new mechanistic information about the modifications brought by EGCG and FIS on the global transcriptome landscape - especially under conditions simulating tumor inflammatory microenvironment - thus setting a base for their future usage in clinical chemopreventive and/or therapeutic interventions against HCC or other inflammation-related cancers. 2. Materials and methods 2.1. Nutraceutical acquisition and preparation EGCG ([60]Fig. 1A, purity ≥ 98%) was purchased from Sigma-Aldrich (St. Louis, USA) and STEMCELL^TM Technologies (Vancouver, Canada), while FIS ([61]Fig. 1B, HPLC purity ≥ 99) was provided by Carl Roth (Karlsruhe, Germany). Master stock solutions were prepared for both compounds in dimethyl sulfoxide (DMSO) and consequently were stored at −20 °C. Working dilutions contained up to 0.1% v/v DMSO. Fig. 1. [62]Fig. 1 [63]Open in a new tab Chemical structure of EGCG (A) and FIS (B). 2.2. Cell culture and treatment HEP3B cells were maintained as described before [64][21]. HCC cells, seeded the day before in a 6-well plate (45 × 10^4 cells per well), were serum-starved for 4 h and then treated with EGCG (100 μM), FIS (10 μM) or DMSO (0.1% v/v) for 2 h. Cells were subsequently exposed to a mixture of stimuli consisting of recombinant interleukins IL-6 (0.1 μg/ml), IL-1B (0.01 μg/ml) and tumor growth factor A (TGFA) (0.2 μg/ml) (all purchased from PeproTech, USA) or vehicle (medium) and were further incubated for 22 h. Working concentrations for reagents and incubation length were chosen based on preliminary experiments performed to update conditions described in detail previously [65][21]. Cell viability under the used experimental conditions was evaluated by the MTT assay as previously described [66][25]. Samples of all possible treatment combinations of HEP3B cells i.e. EGCG, FIS, or DMSO at either constitutive (MEM) or stimulated (STIM) state were subjected to RNA extraction from two independent experiments. 2.3. RNA isolation and RNA-sequencing Total RNA was isolated using the PureLink RNA Mini Kit (Invitrogen, USA) according to the manufacturer’s instructions. Quantification and quality control of isolated RNA was performed by measuring absorbance at 260 nm and 280 nm on a NANODROP ONE^C spectrophotometer (Thermo Scientific, USA). The RNA-seq run was performed on biological duplicates from two independent experiments per treatment group on a NextSeq 500 Illumina platform that provided single-end reads of 85 bp length. Quality assessment, library preparation (TruSeqLT) and the actual sequencing run was conducted in the Biomedical Research Foundation of the Academy of Athens (BRFAA) sequencing facility. The entire dataset of RNA-seq FASTQ files has been deposited to the repository Mendeley Data and is accessible through [67]https://doi.org/10.17632/n6bzf2nzj6.1 and [68]https://doi.org/10.17632/kxrsf6cghn.1, for EGCG and FIS, respectively. 2.4. Read alignment and gene quantification In the first step of the analysis we evaluated sequencing quality by implementing FastQC [69][26]. Quality control of sequenced reads indicated acceptable results, with the exception of two samples that were characterized by overall poor sequencing quality. Furthermore, we implemented the Trimmomatic read trimming tool [70][27] in an attempt to discard low quality fragments from the problematic samples. Sequenced reads of each sample were then mapped onto the reference human genome (hg19) using Tophat2 software [71][28] with default parameters. Apart from the two aforementioned samples, an overall mapping ratio of 82.9% ± 0.9 (mean, SD) was achieved. Low mapping rate (56.1% and 53.4%, respectively), along with poor sequencing quality, lead us to exclude the two samples from downstream analysis steps. Gene quantification for each sample was estimated by HTSeq-count [72][29] using the intersection-nonempty option as an overlap resolution mode. 2.5. Identification of differentially expressed genes Raw counts produced by HTSeq-count were normalized based on the DESeq2 [73][30] normalization method that internally corrects for library size. DESeq2 linear models in R environment were implemented in order to identify statistically significant DEGs between nutraceutical-treated stimulated cells and DMSO-treated stimulated cells. Genes with an adjusted (Benjamini-Hochberg correction for multiple hypothesis testing) p-value < 0.05 were considered as differentially expressed. No log[2](Fold Change) cut-off was implemented, as subtle-to-mild gene expression alterations are usually expected following nutraceutical treatment [74][31]. Lists of DEGs for each nutraceutical were then compared in order to highlight common targets. Co-modified genes were investigated by hierarchical clustering (Euclidean distance, average linkage) regarding their expression profile under the three examined conditions of stimulated cells i.e. DMSO-STIM, EGCG-STIM and FIS-STIM. Expression values used in the hierarchical clustering were normalized by DESeq2, transformed by the variance-stabilizing transformation [75][30] and finally mean-centered. 2.6. Transcription Factor, Gene Ontology and Reactome Pathway enrichment analyses ChEA3 database web-server application was implemented on the co-modified genes, in order to perform transcription factor (TF) enrichment analysis [76][32]. ChEA3 covers 1632 site-specific TFs and offers a selection of six primary reference gene set libraries, generated from various sources of distinct data: a. GTEx and ARCHS4 libraries containing TF-gene co-expression RNA-seq data, b. ENCODE, Literature ChIP-seq and ReMap containing TF-target associations from ChIP-seq experiments and c. Enrichr Queries comprised of TF-gene co-occurrence from user-submitted lists. Individual enrichment results for each library are consequently integrated, thus, producing an improved composite rank of potentially implicated prioritized TFs. The MeanRank integration method was selected, as it generally offers a higher predictive performance [77][32]. Functional Gene Ontology (GO) and Reactome Pathway enrichment analysis of the co-modified DEGs was conducted using the Bioinfominer software, a bioinformatic tool for intelligent, automated interpretation of genomic data, which performs statistical and network analysis on various biological hierarchical vocabularies [78][33], [79][34]. Significance threshold for altered biological processes/pathways was set at a corrected hypergeometric p-value of 0.05. 2.7. Reverse transcription-quantitative real time PCR (RT-qPCR) RNA (400 ng) was subjected to reverse transcription using the Takara PrimeScript^TM RT reagent Perfect Real Time Kit (Takara, Japan) and RT-qPCR was performed on a CFX Connect Real-Time System (Biorad, USA), using the SensiFAST^TM SYBR Lo-ROX Kit (Bioline, UK) according to the manufacturer’s protocol i.e. polymerase activation at 95 °C for 2 min followed by 40 cycles of denaturation at 95 °C for 5sec and annealing/extension at 60 °C for 30sec, concluding with a melting curve construction from 65 to 95 °C with a 0.5 °C increment. Primers for selected genes were provided by Invitrogen (USA) and are presented in [80]Table 1. Amplifications were performed in technical duplicates, using Glyceraldehyde 3-phosphate dehydrogenase (GAPDH) as a reference gene for normalization purposes. RNA used was derived from the same samples subjected to RNA-seq and also from two independent biological replicates. Data analysis of the resulting Ct values was performed by using the 2^-ΔΔCt method [81][35]. Table 1. Primers used for RT-qPCR of selected genes. HGNC Gene Symbols Forward Primer Reverse Primer ADAMTS9 5′-TCCGAGACTGCCGTAGAAAGA-3′ 5′-CCGACAAAACCTGAAGCAAAA-3′ MMP11 5′-CCGCCTCTACTGGAAGTTTG-3′ 5′-GCACAGCCAAAGAAGTCAGG-3′ MMP9 5′-TTGACAGCGACAAGAAGTGG-3′ 5′-CAGTGAAGCGGTACATAGGG-3′ SERPINE1 5′-CACAAATCAGACGGCAGCACT-3′ 5′-CATCGGGCGTGGTGAACTC-3′ CTGF 5′-AGGAGTGGGTGTGTGACGA-3′ 5′-CCAGGCAGTTGGCTCTAATC-3′ PDGFRB 5′-CGTCAAGATGCTTAAATCCACAGC-3′ 5′-TGATGATATAGATGGGTCCTCCTTTG-3′ HSPB1 5′-AAGGATGGCGTGGTGGAGATC-3′ 5′-TCGTTGGACTGCGTGGCTAG-3′ HSPA2 5′-AAAGGTCGTCTGAGCAAGGA-3′ 5′-ATAGGACTCCAGGGCGTTTT-3′ CLIC3 5′-TGTTTGTCAAGGCGAGTGAG-3′ 5′-CGTGGTGAGGGTGAAAGGTA-3′ GAPDH 5′-CAGTCAGCCGCATCTTCTTTTG-3′ 5′-AATCCGTTGACTCCGACCTTC-3′ [82]Open in a new tab 2.8. Statistical analysis RT-qPCR data were expressed as mean ± standard error of the mean (SEM) values, for each condition studied. Statistical analysis by Mann-Whitney test was performed using GraphPad Prism 6.0 software (GraphPad, San Diego, USA). A p-value < 0.05 indicates a statistically significant difference. 3. Results 3.1. Differential gene expression in EGCG and FIS-treated HEP3B cells In order to identify potential transcriptome alterations that might be related with the previously published inhibitory effects of EGCG and FIS on HEP3B cells [83][21], we followed a similar treatment protocol with the modification that after pre-exposure of HEP3B cell cultures for 2 h with the test compounds or DMSO, a mixture of pro-inflammatory/growth signaling stimuli (IL-6, IL-1B and TGFA) or medium was added and the incubation further continued for 22 h. Gene expression was next monitored by RNA-seq in samples from all different treatment combinations (DMSO-MEM, DMSO-STIM, EGCG-MEM, EGCG-STIM, FIS-MEM, FIS-STIM) and acquired data were searched for statistically significant gene expression differences versus a. basal expression levels represented by DMSO-MEM and b. “stimulated” expression levels represented by DMSO-STIM. These comparisons revealed that gene expression alterations in EGCG-MEM and FIS-MEM versus DMSO-MEM did not reach (with few exceptions) the preset statistical criteria of significance, whereas a substantial number of statistically significant DEGs were identified in EGCG-STIM and FIS-STIM versus DMSO-STIM. Based on these results, in our subsequent downstream analysis we focused on the latter set of genes that were significantly modified by test compounds under stimulated conditions which eventually are more representative of live interactions between HCC tumor and its stroma. Volcano plots shown in [84]Fig. 2 visualize the number of DEGs following exposure to EGCG ([85]Fig. 2A) and FIS ([86]Fig. 2B) compared to vehicle in stimulated cells. The horizontal dashed line indicates the preset statistical threshold corresponding to an adjusted p-value of < 0.05 whereas red and green dots represent up- and down-regulated genes, respectively. According to the aforementioned criterion the numbers of identified DEGs were 922 (458 over- and 464 under-expressed) after EGCG and 973 (498 over- and 475 under-expressed) after FIS treatment. It is worth noting that although no log[2](Fold Change) cut-off was initially implemented, all determined DEGs are characterized by an observed |log[2](Fold Change)| > 0.4. Differential analysis results for all investigated comparisons are provided in [87]Supplementary File S1 (Tables S1-S4). Fig. 2. [88]Fig. 2 [89]Open in a new tab Volcano plots illustrating DEGs in EGCG-stimuli versus DMSO-stimuli (A) and FIS-stimuli versus DMSO-stimuli (B). Red and green dots represent up- and down-regulated genes, respectively; grey dots correspond to non-statistically significant altered genes. The horizontal dashed line indicates a statistical threshold corresponding to an adjusted p-value of < 0.05; x-axis: mRNA log[2] Fold Change, y-axis: p-value in negative log[10] scale. (For interpretation of the references to color in this