Abstract

Background
While microRNAs (miRNAs) are widely considered to repress target genes at the mRNA and/or protein level, emerging evidence from in vitro experiments has shown that miRNAs can also activate gene expression in particular contexts. However, this counterintuitive observation has rarely been reported or interpreted under in vivo conditions.

Methods
We systematically explored positive correlations between miRNA and gene expression and their potential implications in tumorigenesis, based on 8375 patient samples across 31 major human cancers from The Cancer Genome Atlas (TCGA).

Findings
We found that positive miRNA-gene correlations are surprisingly prevalent and consistent across cancer types, and show patterns distinct from those of negative correlations. The top-ranked positive correlations are significantly involved in processes related to immune cell differentiation and cell membrane signaling, and display strong power in stratifying patients by survival rate. Although intragenic miRNAs generally tend to co-express with their host genes, a substantial portion of miRNAs show no obvious correlation with their host gene, plausibly due to non-conservation. A miRNA can upregulate a gene by inhibiting its upstream suppressor, or can share transcription factors with that gene; both lead to a positive correlation. The miRNA/gene sites associated with the top-ranked positive correlations are more likely to form super-enhancers than randomly chosen pairs. Wet-lab experiments revealed that the positive correlations are only partially retained under in vitro conditions.

Interpretation
Our study brings new insights into the critical role of miRNAs in gene regulation and the complex mechanisms underlying miRNA functions, and reveals both the biological and the clinical significance of miRNA-associated gene activation.
Keywords: Pan-cancer miRNA, miRNA activation, Intragenic miRNA, Super-enhancer

Abbreviations: miRNA, microRNA; 3′UTR, 3′-untranslated region; AGO, Argonaute protein; miRISC, miRNA-induced silencing complex; microRNPs, micro-ribonucleoproteins; SE, super-enhancer; GO, gene ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; TCGA, The Cancer Genome Atlas; ENCODE, Encyclopedia of DNA Elements; qRT-PCR, quantitative real-time PCR

Research in context

Evidence before this study
miRNAs have long been known to repress target genes at either the mRNA or the protein level, thereby keeping gene expression in a cell at a physiologically favorable balance. Recent evidence from in vitro studies indicates that miRNAs can also promote gene expression under particular conditions. However, this counterintuitive phenomenon has not been investigated in clinical samples. In addition, its implications in tumorigenesis and the general mechanisms underlying it have not been systematically explored.

Added value of this study
By integrative analysis of the miRNA and gene expression profiles of 8375 patient samples across 31 major human cancers from The Cancer Genome Atlas (TCGA), we showed for the first time that miRNA-associated gene activation, rather than suppression, is surprisingly prevalent and conserved across multiple cancer types. We further confirmed the biological and clinical significance of the positive correlations in terms of the essential biological processes in which they participate and the cancer hallmarks they present. Additionally, we proposed and explored four potential molecular mechanisms that can explain the observed gene upregulation associated with miRNAs.
Implications of all the available evidence
The present study established a striking phenomenon of miRNA-associated gene activation in human cancer samples, corroborated its biological and clinical meaning, and partially explained the underlying molecular basis. Our work sheds new light on the complex miRNA-mRNA interaction and its implications in tumorigenesis.

1. Introduction

Cancer is caused by uncontrolled cell growth reflective of multiple established hallmarks [1]. Underlying the aberrant cell proliferation is the activation of critical oncogenes and the inactivation of tumor suppressor genes, resulting from multiple genetic and epigenetic alterations in a cancer tissue-specific manner [2-6]. Among the regulators involved, microRNAs (miRNAs) are a class of small (~22 nucleotides), non-protein-coding RNAs known as important post-transcriptional regulators of gene expression [7,8]. miRNAs exert their regulatory functions by base-pairing with complementary sequences, typically in the 3′-untranslated region (3′UTR) of mRNAs, to target them for degradation or to prevent their translation [7,9]. It is estimated that >60% of human protein-coding genes are under selective pressure to maintain pairing to miRNAs, and over one third of human genes appear to be conserved miRNA targets [10,11]. This indicates that miRNAs can influence almost every critical signaling pathway in a cell [12].

A canonical miRNA is transcribed from a miRNA gene by RNA polymerase II (pol II) as a primary miRNA (pri-miRNA). The pri-miRNA transcript is first cleaved in the nucleus by the microprocessor, which contains the nuclear RNase III Drosha and its cofactor DGCR8, into a hairpin-structured precursor miRNA (pre-miRNA).
Then, the pre-miRNA is exported to the cytoplasm by Exportin 5 and RAN-GTP, and further processed by the endonuclease Dicer to generate the miRNA duplex, which contains the miRNA paired to its passenger strand, usually called miRNA* (miRNA star). One strand of the miRNA duplex, the mature miRNA, is loaded into an Argonaute protein (AGO) to form a miRNA-induced silencing complex (miRISC), whereas the other strand, the miRNA*, is degraded. Once loaded into the silencing complex, the miRNA pairs to complementary sites within mRNAs or other transcripts and the Argonaute protein exerts the post-transcriptional repression [13]. Considering that ~2000 miRNA genes have been identified in the human genome [14], that a miRNA can target hundreds or even thousands of different mRNAs, and that an individual mRNA might be influenced by multiple miRNAs [15], the miRNA biogenesis pathway plays an essential role in shaping gene regulatory networks. Accordingly, miRNA biogenesis is elaborately maintained in a favorable balance under physiological conditions, but can be severely impaired in cancer, resulting in differential expression of critical miRNAs compared to normal tissues [16-18].

Based on the canonical miRNA biogenesis pathway and mechanism of function, miRNAs have long been believed to elicit their effects exclusively through mRNA degradation and/or translation inhibition. Recently, evidence has accumulated to suggest that miRNAs can also promote gene expression in particular contexts via various mechanisms.
For instance, under cell cycle arrest, the human miRNA miR-369-3 was reported to activate tumor necrosis factor-α (TNFα) translation by directing micro-ribonucleoproteins (microRNPs), including AGO and FXR1, to a special type of miRNA binding site called AU-rich elements (AREs) in the 3′UTR of TNFα, although this translational activation can switch to repression in proliferating cells [19]. In most cases, however, miRNAs were reported to activate gene expression by binding to the promoter region, followed by recruitment of transcription factors to the miRNA binding sites. Striking examples include the following: miR-373 activates E-cadherin (CDH1) and cold-shock domain-containing protein C2 (CSDC2) by binding to their promoter regions [20]; miR-205 induces the expression of the tumor suppressor genes interleukin 24 (IL24) and IL32 by targeting specific sites in their promoters [21]; miR-744/1186/466d-3p induces Ccnb1 expression in mouse cell lines by targeting promoter elements [22]; miR-589 binds the promoter RNA and activates cyclooxygenase-2 (COX-2) transcription [23]; and let-7i binds to the TATA-box of IL2 and activates it at both the mRNA and the protein level [24]. Recently, a new mechanism of miRNA activation has been proposed, in which miR-24-1 activates the FBP1 and FANCC genes by targeting their enhancers [25]. In addition, these studies invariably corroborated that proteins related to miRNA biogenesis or function, such as AGO and Dicer, or transcription-related enzymes, are significantly enriched around the binding sites during gene activation. Collectively, these previous findings, mainly observed under in vitro conditions, established that miRNAs can also upregulate gene expression by directly binding to transcriptional regulatory regions.

While miRNA-mediated gene activation has been well studied under in vitro conditions and in animal experiments, it has rarely been explored in human samples.
To address this gap, we leveraged The Cancer Genome Atlas (TCGA) and conducted a comprehensive analysis of the miRNA-gene interaction profiles in 8375 patient samples across 31 major human cancer types. We checked the correlations between all 1046 miRNAs and 20,531 genes annotated in TCGA, and found that positive miRNA-gene correlation is surprisingly prevalent and consistent across human cancers, even when the gene bears conserved binding sites for the miRNA. Moreover, the positive correlations display patterns distinct from those of the negative correlations. We performed a series of stringent bioinformatics analyses to investigate whether this positive correlation has any biological or clinical implication, especially in the context of human oncogenesis. We found that the miRNA-gene pairs with positive correlation are extensively involved in many biological processes pertaining to immune response, cell membrane signaling, cell cycle control and other cancer hallmarks. In addition, the top-ranked miRNAs and genes can stratify patients by overall survival rate based on their single or combined expression levels. These results support the biological and clinical significance of the widespread positive correlations between miRNAs and genes across human cancers.

We further investigated the molecular mechanisms underlying the observed positive miRNA-gene correlation. Most of the positive correlations (~87%) can be explained by one or more of our four proposed indirect-regulation hypotheses: miRNA-host gene co-expression, gene activation by inhibition of an upstream suppressor, co-regulation by shared transcription factors, and co-activation by common histone modifications.
Because co-expression driven by shared genetic or epigenetic factors cannot be viewed as a causal relationship, our study stresses that, although some positive correlations are implicated in tumorigenesis, the expression levels of the miRNA and the gene in such pairs are independent of each other. On the other hand, the mechanisms related to miRNA-host gene co-expression and to indirect activation of a gene by inhibition of its upstream suppressor involve causation, in the sense that the expression level of one influences that of the other. We further hypothesized that at least part of the remaining positive correlations (~13%) can be explained by direct binding of the miRNA to particular transcriptional regulatory regions of the partner gene, as reviewed above. Our wet-lab experiments in the corresponding cancer cell lines only partially recapitulated the positive correlations observed in human patient samples, implying that miRNA-gene interactions differ dramatically between in vitro and in vivo conditions. This in turn highlights the significance of a comprehensive analysis of miRNA-directed gene activation in human samples.

2. Materials and methods

2.1. TCGA data acquisition, quality control and preprocessing

The miRNA and gene expression data, as well as the clinical information of each cancer, were downloaded from the TCGA data portal (https://portal.gdc.cancer.gov/). Three cancers, FPPP (FFPE Pilot Phase II), GBM (glioblastoma multiforme) and LAML (acute myeloid leukemia), were excluded due to small sample size or platform inconsistency. The pan-cancer analyses eventually comprised 31 cancer types: ACC, BLCA, BRCA, CESC, CHOL, COAD, DLBC, ESCA, HNSC, KICH, KIRC, KIRP, LGG, LIHC, LUAD, LUSC, MESO, OV, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, TGCT, THCA, THYM, UCEC, UCS and UVM. Sample size ranged from 36 (CHOL) to 1102 (BRCA); see Table S1.
A total of 1046 miRNAs and 20,531 genes (protein-coding and noncoding) were included in the TCGA IlluminaHiSeq miRNASeq and IlluminaHiSeq RNASeqV2 data, respectively. We used level 3 expression data for both miRNAs and genes. The miRNA expression counts the reads aligning with the corresponding precursor, while the gene expression is derived from the reads per kilobase of transcript per million reads mapped (RPKM). The miRNA and gene expression values were logarithmically transformed (base 2) prior to further analysis. Pearson correlation analysis was performed to assess the co-expression between miRNAs and genes in each of the 31 cancer types. Unless otherwise stated, a correlation was deemed significant in a cancer if its absolute correlation coefficient |R| > 0.1 (R > 0.1 for positive correlation and R < −0.1 for negative correlation) and its Hochberg adjusted p-value adj.P < .05.

2.2. CCLE data curation and correlation analysis

We downloaded cell line data for gene and miRNA expression from the Cancer Cell Line Encyclopedia (CCLE) database [26] for verification of the TCGA data. In summary, the miRNA data involve 654 human miRNAs and 954 cell lines spanning 25 cancer types, and the RNAseq data contain 18,361 genes and 1019 cell lines across 26 cancer types. We calculated the miRNA-gene correlation for each pair in each of the 25 overlapping cancers. Because the sample size for each cancer type is relatively small compared to TCGA, at this targeted validation step we imposed a looser significance cut-off on the p-value (P < .01) in CCLE. To compensate for this, we used the more conservative Spearman's rank correlation instead of Pearson correlation.

2.3. Functional characterization of positive correlation associated miRNAs and genes

To explore the biological significance of the top-ranked pairs, we conducted gene ontology (GO) and KEGG [27] signaling pathway enrichment analyses on the genes involved in the top-ranked significant pairs, using the R package clusterProfiler [28]. Enrichment profiles were checked separately for genes at different pan-levels (PanCan10/15/20/25), with a Fisher exact test p-value < .05 deemed significant.

To further investigate the clinical relevance of these pairs, we performed survival analysis on different groups based on their miRNA and gene expression profiles. Briefly, patients were first divided into non-overlapping groups based on their miRNA and gene expression in three ways: 1) patients were divided into three groups, high-middle-low (Hi-Mi-Lo), by miRNA expression; 2) patients were divided into three groups, high-middle-low (Hi-Mi-Lo), by gene expression; and 3) patients were divided into two groups, high-low (Hi-Lo), by gene:miRNA ratio. We then used the log-rank test to compare the overall survival probability of patients from different groups, categorized by single or combined stratification indexes. For example, for a particular miRNA~gene pair, the HiHi+Hi group refers to patients in the high miRNA expression, high gene expression and high gene:miRNA ratio groups. The power of miRNAs/genes from different pan-level groups to stratify patient survival was compared by the Kolmogorov-Smirnov test (K-S test) performed on the density distribution of their log-rank test p-values.

To further investigate the implications of the detected positive correlations in tumorigenesis, we associated the top-ranked miRNAs and genes with well-known cancer hallmark traits. Briefly, we checked the association of the PanCan10 miRNAs with the 10 cancer hallmarks proposed by Hanahan and Weinberg in 2011 [1], as done in a previous study [29].
We also examined the enrichment of the top-ranked genes of different pan-levels in 50 well-established hallmark gene sets related to critical biological processes (cellular component, development, DNA damage, immune, metabolic, pathway, proliferation and signaling) [30], which are presumed to be highly relevant to cancer initiation and progression.

2.4. Identification of miRNAs with host genes

We downloaded the gene annotations (hg38) of 27,423 genes (19,902 protein-coding genes and 7521 long non-coding RNAs, lncRNAs) from GENCODE (GTF v25) [31], and the annotations of 1881 human miRNAs from miRBase (hsa.gff3) [14]. We determined whether a miRNA is embedded in another gene (protein-coding or noncoding RNA) according to the genomic coordinates in the annotation files. A total of 591 miRNAs were found to be located inside a specific gene, termed the "host" gene. Of these, 451 pairs were covered in the TCGA data. These 591 host genes consisted of 474 protein-coding genes and 117 lncRNAs; see Table S4. We tested the preference of miRNAs to co-express with their host genes by a hypergeometric test. The conservation information of each miRNA was taken from TargetScanHuman v7.2 (miR_Family_Info.txt). A positive conservation score (Conservation? = 1, 2) indicates that a miRNA is conserved, while a negative score (−1) indicates non-conservation; the remaining miRNAs, with a score of zero, were ignored (Table S4).

2.5. Detection of double-negative patterns underlying positive correlation

To validate the hypothesis that a miRNA can upregulate a gene by inhibiting its upstream suppressor, we attempted to detect double-negative patterns. For a significant positive miRNA-gene pair, we first detected all the intermediate (IM) genes that negatively correlate with both the miRNA and the gene in the pair across multiple cancers (R < −0.1, adj.P < .05, cancer coverage ≥ 5).
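The first step of this double-negative screen can be sketched in Python as follows. This is only an illustrative sketch, not the authors' pipeline: the function names and the nested-dictionary layout are hypothetical, the per-cancer R and adjusted p-values are assumed to be precomputed, and "cancer coverage ≥ 5" is interpreted here as both negative correlations being significant in the same five or more cancer types, which the text does not state explicitly.

```python
from typing import Dict, List, Set, Tuple

# (Pearson R, Hochberg-adjusted p-value) for one feature pair in one cancer type
Corr = Tuple[float, float]


def negative_cancers(per_cancer: Dict[str, Corr],
                     r_cut: float = -0.1,
                     p_cut: float = 0.05) -> Set[str]:
    """Cancer types in which the correlation is significantly negative."""
    return {c for c, (r, p) in per_cancer.items() if r < r_cut and p < p_cut}


def intermediate_genes(mirna_corr: Dict[str, Dict[str, Corr]],
                       gene_corr: Dict[str, Dict[str, Corr]],
                       min_coverage: int = 5) -> List[str]:
    """Candidate intermediate (IM) genes for one positive miRNA-gene pair.

    mirna_corr[g][cancer] holds (R, adj.P) between the miRNA and candidate g;
    gene_corr[g][cancer] holds (R, adj.P) between the partner gene and g.
    Assumption: both correlations must be negative in the SAME cancer types.
    """
    hits = []
    for g in mirna_corr.keys() & gene_corr.keys():
        shared = negative_cancers(mirna_corr[g]) & negative_cancers(gene_corr[g])
        if len(shared) >= min_coverage:
            hits.append(g)
    return sorted(hits)
```

The returned candidates would then still need to be filtered against predicted miRNA targets before being treated as plausible upstream suppressors.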
Then we narrowed down the intermediate genes to real targets of the miRNA based on TargetScanHuman v7.2 [15]. We downloaded the predicted targets (TAR) with context++ and weighted context++ scores and applied a series of preprocessing steps, including extraction of human records, parsing/trimming of miRNA names and removal of duplicates. We eventually obtained 198,312 high-confidence targeting records (i.e., with a positive probability of conserved targeting, P_CT), which involve 321 different miRNAs and 13,035 genes.

2.6. Detection of double-positive patterns underlying positive correlation

To validate the hypothesis that a positively correlated miRNA-gene pair might be regulated by shared transcription activators, we tried to detect double-positive patterns. For a significant positive miRNA-gene pair, we first detected all the intermediate (IM) genes that positively correlate with both the miRNA and the gene in the pair across multiple cancers (R > 0.1, adj.P < .05, cancer coverage ≥ 5). Then we narrowed down the genes in two steps. First, we restricted the IMs to general transcription factors (gTF). We downloaded transcription factors (TF) and their targets with the R data package tftargets (https://github.com/slowkow/tftargets). This dataset includes human TF information curated from six published databases: TRED [32], ITFP [33], ENCODE [34], Neph2012 [35], TRRUST [36] and Marbach2016 [37]. We mapped the Entrez gene IDs to gene symbols using two R packages: annotate (10.18129/B9.bioc.annotate) and org.Hs.eg.db (10.18129/B9.bioc.org.Hs.eg.db). After integration and removal of duplicates, we obtained 2705 gTFs with their target genes, and for each pair we removed the intermediate genes that were not included in this gTF set.
In the second step, we further narrowed down the gTFs to specific transcription factors (sTF) by removing gTFs whose target genes do not include the gene in the miRNA-gene pair under investigation.

2.7. Super-enhancer (SE) identification from ENCODE H3K27ac ChIP-seq data

We downloaded H3K27ac ChIP-seq data (bed files for narrowPeak) for 64 samples from 15 human tissues from ENCODE (blood, lung, liver, kidney, brain, large intestine, stomach, pancreas, esophagus, prostate gland, adrenal gland, breast, ovary, thymus and urinary bladder; see Table S5) [34]. To detect the super-enhancer formation profile, we investigated the H3K27ac signal surrounding a miRNA or gene site based on the ROSE (Rank Ordering of Super-enhancers) pipeline [38] with minor modifications. Briefly, we first stitched the detected peaks if they were within a certain distance (called the "region"), and then calculated the average input-subtracted H3K27ac signal intensity within the region by a revised ROSE score:

$$\mathrm{ROSE}_{\mathrm{score}} = \frac{1}{L}\sum_{i=1}^{N} \mathrm{width}_{\mathrm{peak}_i} \times \mathrm{intensity}_{\mathrm{peak}_i}$$

In this formula, N is the number of peaks detected in the region, and L is the length (in bp) of the region in question. In this study, L = 100 kb, centered at the transcription start site (TSS) of each miRNA or gene. The width and intensity of each peak (peak_i) were obtained from its genomic coordinates (width = chromEnd − chromStart) and signal intensity (signalValue), respectively (see the ENCODE narrowPeak format description). We employed the scaled peak intensity ("score" in the bed file) instead of signalValue for better visualization. As in the original ROSE pipeline, we adopted a promoter exclusion zone of 4 kb, i.e., if a peak was entirely contained within a window of ±2 kb around the TSS, the peak was excluded from the calculation.
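As a rough illustration of the revised ROSE score defined above, the following Python sketch computes it for one TSS from a list of peaks. It is a minimal sketch under stated assumptions rather than the authors' implementation: the function name and the tuple layout are hypothetical, all peaks are assumed to lie on the same chromosome as the TSS, the intensities are assumed to be input-subtracted already, and a peak that only partially overlaps the region boundary is counted with its full width, a detail the text does not specify.

```python
from typing import Iterable, Tuple

# One narrowPeak record reduced to (chromStart, chromEnd, intensity)
Peak = Tuple[int, int, float]


def rose_score(peaks: Iterable[Peak],
               tss: int,
               region_len: int = 100_000,
               promoter_half: int = 2_000) -> float:
    """Width-weighted mean peak intensity over a region centered at the TSS.

    Implements (1 / L) * sum(width_i * intensity_i), skipping peaks that lie
    entirely inside the promoter exclusion zone (TSS +/- promoter_half).
    """
    half = region_len // 2
    region_start, region_end = tss - half, tss + half
    total = 0.0
    for start, end, intensity in peaks:
        # Ignore peaks that do not overlap the region at all
        if end <= region_start or start >= region_end:
            continue
        # Promoter exclusion: drop peaks fully contained in TSS +/- 2 kb
        if start >= tss - promoter_half and end <= tss + promoter_half:
            continue
        total += (end - start) * intensity
    return total / region_len


example_peaks = [
    (60_000, 61_000, 500.0),    # counted: 1000 bp x 500
    (99_500, 100_500, 900.0),   # excluded: inside the promoter zone
    (101_500, 103_000, 100.0),  # counted: extends beyond the promoter zone
    (200_000, 201_000, 999.0),  # ignored: outside the 100 kb region
]
score = rose_score(example_peaks, tss=100_000)
```

With the example peaks, the two counted peaks contribute 500,000 and 150,000 respectively, so the score is 650,000 / 100,000 = 6.5.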
It should be noted that, in contrast to the original ROSE pipeline, we removed the limit of 12.5 kb on the maximum distance between two constituent enhancers, in order to focus on the general H3K27ac intensity surrounding the miRNA/gene TSS. Under our framework, a region with a ROSE score ≥ 10 in at least 11 out of 15 tissues was considered a super-enhancer (SE).

2.8. Cell culture

BJ human foreskin fibroblasts were maintained in minimum essential medium supplemented with 10% fetal calf serum, nonessential amino acids, and antibiotics. 293T, MDA-MB-231 and Huh7 cells were grown in Dulbecco's modified Eagle medium supplemented with 10% fetal calf serum, glutamine, and antibiotics. HeLa, Huh7 and T47D cells were cultured in EMEM, DMEM and RPMI-1640, respectively; all basal media were supplemented with 10% FCS and 1% antibiotics.

2.9. Plasmids

The expression vectors for miRNAs and the 3′UTR dual-luciferase reporter plasmid (pmirGLO) were purchased from Biosettia, Inc. (San Diego, CA). To construct the target gene 3′UTR dual-luciferase reporters (pmirGLO-CTLA4-3′UTR, pmirGLO-IGFBP5-3′UTR, pmirGLO-ITK-3′UTR, pmirGLO-PDGFRA-3′UTR, pmirGLO-PIK3CG-3′UTR, pmirGLO-TGFBI-3′UTR, pmirGLO-IL7R-3′UTR), target gene 3′UTR exons containing miRNA seed sequences were amplified by PCR from the genomic DNA of 293T cells. The PCR primers are described in supplemental Materials. All 3′UTR fragments were inserted into pmirGLO using NheI-HF and XhoI. The pGL3-basic plasmid was purchased from Promega Corporation. To construct the pGL3-IL7R/PIK3CG-3Kb + 3′UTR reporters, a 3 kb promoter sequence from the IL7R or PIK3CG gene was amplified from the genomic DNA of 293T cells and inserted into the pGL3-basic plasmid using NheI and XhoI (IL7R promoter), or MluI and BglII (PIK3CG promoter). Then, the PCR product for the 3′UTR of IL7R or PIK3CG (still containing the miRNA seed sequence) was digested with XbaI and inserted into the corresponding pGL3-IL7R/PIK3CG-3Kb reporter, downstream of luc+.
The PCR primers are described in supplemental Materials.

2.10. Lentivirus-based gene transduction

The pLV-miR-ctrl, pLV-miR-21, pLV-miR-142, pLV-miR-155 and pLV-miR-214 recombinant lentiviruses were packaged in 293T cells in the presence of helper plasmids (pMDLg, pRSV-REV, and pVSV-G) using Lipofectamine 2000 (Invitrogen). BJ or MDA-MB-231 cells (1 × 10^5/well) were seeded into 6-well plates, grown overnight, infected with 300 μL virus in 3 mL fresh medium containing 8 μg/mL polybrene, and spun for 1 h at 1600 to 1800 rpm. Transduced cells were selected with 1.2 μg/mL of puromycin.

2.11. RNA isolation and quantitative real-time PCR

RNA was isolated from cells using TRIzol (Thermo Fisher Scientific) according to the manufacturer's protocol. 500 ng of RNA was reverse transcribed to cDNA with iScript™ Reverse Transcription Supermix (Bio-Rad). Quantitative real-time PCR was performed in triplicate with gene-specific primers and SsoAdvanced™ SYBR Green Supermix (Bio-Rad) in a Bio-Rad CFX96 Real-Time System following the manufacturer's protocols. GAPDH was used as the internal control to normalize the mRNA input for each gene. qPCR primers are described in supplemental Table S7.

2.12. Dual-luciferase reporter assay

The 3′UTR activity of the target genes was analyzed in both 293T and MDA-MB-231 cells by transient transfection of luciferase reporter constructs. On the first day, 6 × 10^5/well 293T or 1.5 × 10^5/well MDA-MB-231 cells were seeded into 12-well plates. On the next day, these cells were transfected with 0.17 μg pmirGLO reporter vector, 1.43 μg pLV-miR-ctrl/pLV-miR-142 vector and 4.0 μL Lipofectamine 2000 according to the manufacturer's instructions. 48 h after transfection, cell lysates were collected using Passive Lysis Buffer (E1941, Promega). Firefly and Renilla luciferase activity was detected using the Dual-Luciferase Reporter Assay System (E1960, Promega) on a GloMax®-Multi+ Microplate Multimode Reader (Promega).
For the IL7R and PIK3CG 3 kb promoter + 3′UTR reporter analysis, MDA-MB-231 cells were transiently transfected with 0.16 μg pGL3-IL7R/PIK3CG-3Kb + 3′UTR plasmid, 1.36 μg pLV-miR-ctrl/pLV-miR-142 vector and 0.08 μg of the control Rluc vector driven by the β-actin, TK or CMV promoter, using 4 μL Lipofectamine 2000 according to the manufacturer's instructions. All other procedures were the same as for the 3′UTR dual-luciferase assay.

3. Results

3.1. Positive miRNA-gene correlations are prevalent and consistent across human cancers

By an integrative miRNA-gene correlation analysis of 1046 miRNAs and 20,531 genes across 31 TCGA cancers (Table S1), we detected a total of 2,842,030 pairs that were significantly positively correlated (R > 0.1, adj.P < .05) in at least one TCGA cancer type. We then ranked these positive pairs according to the number of cancer types (called cancer coverage) in which their correlation is positive (R > 0.1) and significant (adj.P < .05). In most of the subsequent analyses, we focus on the top-ranked pairs with cancer coverage ≥ 10, totaling 18,996 miRNA-gene pairs and involving 348 miRNAs and 3074 genes (Table S2). Each of the top 56 significant positive pairs covered at least 27 cancer types (Fig. 1A). Three pairs, miR-196b~HOXA10, miR-335~MEST and miR-483~IGF2, covered all 31 cancers under investigation. Interestingly, miR-196b and HOXA10 have been reported to co-express, and their overexpression characterized poor prognosis in patients with gastric cancer [39]. The IGF2 intronic miR-483 has been widely recognized as an oncogenic miRNA that transcriptionally upregulates its host gene IGF2 [40,41], while miR-675, which is intragenic to the long non-coding RNA (lncRNA) H19, was shown to be the most highly conserved feature of H19 and to serve as the functional regulatory unit of this lncRNA [42].
These previous studies corroborate the significant biological implications of the positive correlations that are prevalent across multiple cancer types.

Fig. 1. Overview of the miRNA-gene positive correlation landscape in human cancers. (A) Heat map showing the top positively correlated miRNA-gene pairs covering ≥27 cancer types. TCGA cancer names and the corresponding sample sizes used for correlation calculation are shown on the left. Pairs are shown in the format miRNA~gene at the bottom, with the human miRNA prefix (hsa-mir-) omitted for better visualization. A positive correlation with Pearson correlation coefficient R > 0.1 and Hochberg adjusted p-value < 0.05 was deemed significant. n.s.: non-significant. (B) Top miRNAs ranked by the number of their positively correlated genes. For better visualization, only miRNAs with ≥100 targets across ≥10 cancer types are shown. (C) Detailed correlation profiles of a representative positive pair, hsa-mir-15 ~ITK, across 31 TCGA cancers. The sample size of each cancer, the Pearson correlation coefficient and the p-value are also shown. Red color indicates significant positive correlation. See also Fig. S1–3. (For interpretation of the references to color in