Abstract

Background

Microarray technology applied to microRNA (miRNA) profiling is a promising tool in many research fields; nevertheless, independent studies characterizing the same pathology have often reported poorly overlapping results. miRNA analysis methods have only recently been systematically compared, and only in a few cases using clinical samples.

Methodology/Principal Findings

We investigated the inter-platform reproducibility of four miRNA microarray platforms (Agilent, Exiqon, Illumina, and Miltenyi), comparing nine paired tumor/normal colon tissues. The most concordant and selected discordant miRNAs were further studied by quantitative RT-PCR. Globally, a poor overlap among differentially expressed miRNAs identified by each platform was found. Nevertheless, for eight miRNAs, high agreement in differential expression among the four platforms and comparability to qRT-PCR were observed. Furthermore, most of the miRNA sets identified by each platform are coherently enriched in data from the other platforms, and the great majority of colon cancer associated miRNA sets derived from the literature were validated in our data, independently of the platform. Computational integration of miRNA and gene expression profiles suggested that anti-correlated predicted target genes of differentially expressed miRNAs are commonly enriched in cancer-related pathways and in genes involved in glycolysis and nutrient transport.

Conclusions

Technical and analytical challenges in measuring miRNAs still remain, and further research is required to increase consistency between different microarray-based methodologies. However, better inter-platform agreement was found by looking at miRNA sets instead of single miRNAs and through an approach integrating miRNA and gene expression data.

Introduction

microRNAs (miRNAs) are small non-coding RNA molecules of 18–24 nucleotides in length that are widely conserved in all eukaryotic organisms and serve as regulators of gene expression. 
miRNAs are involved in all major cellular processes and are implicated in a large number of human diseases, including cancer [1]–[3].

Over the past decade, DNA microarray technology has become an increasingly cost-effective methodology that is able to quickly generate high-throughput data, paving the way to genome-wide (GW) analysis of gene expression, genomic copy number variations, SNPs, and epigenetic alterations. Microarray-based techniques have been extensively used in several areas of research, and molecular assays using patterns of gene expression and predetermined mathematical algorithms, such as Mammaprint® [4], are currently under validation by prospective multicentric clinical studies in breast cancer. More recently, microarray technology has been applied to miRNA profiling and is becoming a promising technique in many research fields, such as translational research in oncology, and can provide useful information on the role of miRNAs in both tumorigenesis and progression of cancer [2]. Nevertheless, independent studies characterizing the same pathology have often reported poorly overlapping results. This could be due to small sample size and high tumor variability and heterogeneity, but also to technical reasons. A major advantage of the microarray approach consists in the high-throughput simultaneous screening of up to thousands of molecules in a single assay, but this requires hybridization conditions to be the same for all probes on the array. This is not trivial for miRNA microarrays because the GC content of miRNAs is highly variable and the options for probe design are more limited than for mRNA due to their short length. For a complete review of the general concepts and special challenges relevant to miRNA profiling, refer to Pritchard et al. [5]. 
A multitude of platforms for miRNA profiling are commercially available, and each manufacturer has developed its own technical procedures to maximize sensitivity and specificity in measuring miRNA expression levels. As a result, probe signals are expected to differ largely among platforms, and a direct comparison is not possible. In spite of this, the general patterns of differentially expressed (DE) miRNAs should be coherently detected by all platforms. Only recently has the intra- and inter-platform reproducibility of miRNA microarrays been analyzed in more than three different platforms (see Table S1 for details) [6]–[11]. Taken together, these studies provide evidence that miRNA microarray platforms show excellent intra-platform reproducibility, but limited inter-platform concordance. Indeed, comparing miRNAs identified as DE within each platform, a significant variation in the total number as well as in the fold change of miRNAs has been noted. Three of these studies [6], [8], [9] based their conclusions on the comparison of tissues or pools of tissues of completely different origin. Sah et al. analyzed the expression of seven synthetic miRNAs spiked at known concentrations into RNA from placental tissue and hybridized on five platforms [11]. To be nearer to a miRNA microarray application in cancer research, Git et al. analyzed a pool of normal breast tissues and two breast cancer cell lines [7], and Dreher et al. compared untransfected and HPV-transfected human keratinocytes [10]. Although the former four comparisons represent a useful system to address technical issues, and the latter two studies are undoubtedly more realistic, the concordance of different platforms when clinical specimens are used has not yet been addressed. 
In the present study, we compared the miRNA expression profiles of nine colorectal cancer and normal colon mucosa samples from the same patients using four different commercial platforms (Agilent, Exiqon, Illumina and Miltenyi). The expression of the most concordant and selected discordant miRNAs among platforms was then evaluated with quantitative real time PCR (qRT-PCR). Finally, integrative analyses of miRNAs in the context of gene expression and literature data were performed as a proof of principle of the validity of microarray miRNA analysis in gaining insight into the biological role of these miRNAs.

Results

Experimental Setting

To highlight the influence of the sample origin and the study design on the obtained results, we made a computational comparison of expression data from four microarray studies. We selected the data obtained on a common miRNA platform, i.e., Agilent, from the two miRNA platform comparison studies (details in Table S1) whose expression data on human samples are publicly available (GSE13860 [6] and E-MTAB-96 [7]) and from two studies chosen as examples of experimental applications in a clinical setting, i.e., profiles associated with tumorigenesis of prostate [12] and gastric [13] cancer (GSE21036 and GSE28700, respectively; details in Fig. S1 legend). As shown in Figure S1, the number of DE miRNAs and the associated fold changes are considerably higher in the cross-platform analysis than in profiles looking at tumorigenesis. Imposing a uniform and arbitrary threshold (|log2 fold change| > 1), 88.5% (GSE13860) and 25.9% (E-MTAB-96) of miRNAs present in the arrays were differentially expressed in the cross-platform datasets; on the other hand, only 6.9% (GSE21036) and 6.7% (GSE28700) of miRNAs were identified as DE at the same threshold in the clinical datasets. 
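The thresholding step above can be illustrated with a minimal sketch. The function name `fraction_de` and the toy fold changes are hypothetical and are not taken from the datasets cited in the text; the sketch only shows how a percentage of DE miRNAs at |log2 fold change| > 1 could be computed.

```python
def fraction_de(log2_fold_changes, threshold=1.0):
    """Return the fraction of miRNAs whose |log2 fold change| exceeds the threshold."""
    if not log2_fold_changes:
        raise ValueError("no fold changes supplied")
    n_de = sum(1 for fc in log2_fold_changes if abs(fc) > threshold)
    return n_de / len(log2_fold_changes)


# Hypothetical log2 fold changes for eight miRNAs (illustration only).
toy = [2.3, -0.4, 1.0, -1.7, 0.2, 0.9, -3.1, 0.05]
print(f"{100 * fraction_de(toy):.1f}% of miRNAs pass |log2 FC| > 1")
```

Note that the inequality is strict, so a miRNA with a log2 fold change of exactly 1 is not counted as DE in this sketch.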
With these premises, we decided to evaluate the inter-platform reproducibility in a clinical setting by assessing the tumor and the normal counterpart miRNA profiles in samples collected from nine patients who underwent surgical resection for colon cancer (see Table S2 for clinical and pathological characteristics). RNA aliquots from these samples were hybridized on four microarray platforms: Agilent SurePrint G3 human miRNA Microarray, Exiqon miRCURY LNA microRNA Array, Illumina Human_v2 microRNA expression Beadchips, and Miltenyi miRXplore Microarray. The main features of the four platforms are described in Table 1.

Table 1. Platform description.

Feature | Agilent | Exiqon | Illumina | Miltenyi
Array version | Human miRNA V3 | miRCURY LNA microRNA Array | Human miRNA_V2 | miRXplore microarray V5
Arrays per slide | 8 | 1 | 12 | 1
Channels | Single | Single | Single | Dual
Input total RNA | 100 ng | 300 ng | 600 ng | 1200 ng
Labeling | Cy3 | Hy3 | Cy3 | Hy3/Hy5
Labeling process | Alkaline phosphatase and 3′ ligation | Alkaline phosphatase and 3′ ligation | Polyadenylation, RT, MSO§ pool annealing, PCR | Alkaline phosphatase and 3′ ligation
miRBase version | miRBase V12.0 | miRBase V14.0 | miRBase V12.0 | miRBase V14.0
N° hsa-miR | 866 | 891 | 858 | 911
N° probes/miR | 2 | 1 | 1 | 1
N° replicates/probe | 4–8 | 4 | 370 (average) | 4

§ MSO = miRNA specific oligo.

It should be noted that the Illumina platform was withdrawn in March 2010; however, we decided to include it in our comparison due to its extensive use in laboratories worldwide, including those in our Institute. Accordingly, the issues addressed in the present investigation can be of interest to users of the Illumina platform to better interpret their results and to enable a more rational switch to a different platform. The Agilent, Exiqon, and Illumina arrays were carried out in one color. Miltenyi was hybridized in two colors: tissue samples were labeled with Hy5, and a synthetic reference purchased from Miltenyi with Hy3. 
Since the synthetic reference was designed on miRBase 9.2 and covered only a portion of the miRNAs present on the arrays designed on miRBase 14.0, only the Hy5 data were considered and used for normalization in order to enable a more direct comparison with the other three platforms. The Agilent, Exiqon, and Illumina platforms contained probes designed either on viral miRNA sequences or on putative miRNAs not yet annotated in miRBase, derived from the literature and Next-Generation Sequencing studies. Since these sequences are present in only one platform, they were excluded from our analyses.

The miRBase database is the primary repository for all miRNA sequences and annotations used by all manufacturers for the design of the probes. However, the frequent updating of miRBase results in annotation problems. To avoid possible bias, we selected arrays designed on close miRBase versions; the probes of the four tested platforms were designed on either miRBase v12.0 or v14.0. We verified that the names and sequences of miRNAs present in v12.0 did not change in the newer miRBase version, while a set of new miRNAs was added. The difference in the total number of miRBase annotated miRNAs in the four platforms was relatively small (6%).

Evaluation of Data Distribution and Detection Rate

Non-normalized signal intensities showed a platform-dependent distribution reflecting the unique methods developed by manufacturers for labeling, hybridization stringency and data acquisition (Fig. 1A). For all platforms, the signals covered most of the dynamic range available for 16-bit scanners; Agilent, Exiqon, and Miltenyi signal distributions tended to have positive skewness (a long right-side tail) and differed from Illumina distributions, where many more probes showed intermediate to high expression levels.

Figure 1. Comparison of microarray platform performance. (A) Global non-normalized intensity distribution. 
(B) Graphical representation of miRNA detection; blue = detected, yellow = undetected, gray = not present. (C) Box-plot of the percentage of GC content in mature miRNA sequences; blue = detected, yellow = undetected. P-values were calculated by Student’s t test.

For the Agilent and Illumina platforms, we followed the detection call criteria recommended by the manufacturers. Illumina’s software provides a detection P-value that estimates to what extent a signal is greater than the noise represented by negative controls; similarly, Agilent’s software provides a flag (gIsPosAndSignif) that estimates whether the feature signal is positive and significant compared to the background. In contrast, for the Exiqon and Miltenyi platforms, detection call criteria were not defined; for these platforms, we established a threshold percentage of pixels for every spot in the array whose intensity was lower than background. Taking into account these filtering procedures, 675 (78% of miRBase annotated miRNAs present on the array), 775 (87%), 808 (94%), and 376 (41%) unique miRNAs were detectable in at least one of the samples in the Agilent, Exiqon, Illumina, and Miltenyi platforms, respectively (Fig. 1B). There were 233 miRNAs shared by all platforms, a number strongly limited by the low detection rate of the Miltenyi platform.

In order to estimate to what extent the GC content impacted the detection call of each platform, we calculated the GC percentage of the miRNAs assayed and compared, for each platform, the GC content between detected and undetected miRNAs. Although each manufacturer has adjusted probe design and hybridization procedures to overcome discrepancies in the thermodynamic stability of probe/target recognition, the GC content was significantly higher in the detected than in the undetected miRNAs in all platforms, and this difference was particularly evident for the Miltenyi platform (Fig. 1C). 
Normalization and Class Comparison Results

Several normalization and data processing procedures are available, most borrowed from gene-expression studies and with little consensus among laboratories. Considering the unique characteristics of each platform, it is unlikely that the same normalization procedure could perform equally well in all platforms to correct systematic differences. In order to choose the best normalization for each platform, we evaluated the ability of four different methods (loess, quantile, rank invariant, and Robust Spline Normalization, RSN) to reduce the intra-class variability in normal and tumor samples through the use of Relative Log Expression (RLE) (see Fig. S2). Moreover, we expected that the best normalization method should increase the fold changes and the number of differentially expressed miRNAs between tumor and normal tissue. According to these criteria, we chose RSN for Illumina and Agilent, loess for Exiqon and quantile for Miltenyi. Figure S3 reports the tumor/normal class comparisons in the 4 platforms, expressed as histograms of log P-value and FDR. The comparison identified, at a threshold of P<0.005, 29 miRNAs that were modulated on Agilent, 4 on Exiqon, 42 on Illumina, and 3 on the Miltenyi platform, corresponding to 4.3%, 0.5%, 5.2%, and 0.8% of miRNAs detected, respectively.

Inter-platform Agreement of Class Comparison Results

To assess inter-platform concordance, we examined the miRNAs that were DE at P<0.005 in at least one platform; by combining these miRNAs, a consensus list of 68 miRNAs was generated. To highlight concordance among the four platforms, the P-values and fold changes of the consensus list miRNAs are shown in a colorimetric scale in Figure 2A and B, respectively. Imposing P<0.005 on all four platforms, no miRNAs were commonly DE. 
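The consensus-list logic can be sketched as follows. The function names are hypothetical; the P-values for the three example miRNAs are those listed in Table 2 of this article, with a miRNA that was not detected on a platform simply absent from that platform's dictionary. This is a minimal sketch of the set operations, not the authors' analysis code.

```python
def consensus_list(p_values, alpha=0.005):
    """miRNAs with P < alpha on at least one platform (union across platforms)."""
    return {m for platform in p_values.values()
            for m, p in platform.items() if p < alpha}


def de_on_all_platforms(p_values, alpha):
    """miRNAs with P < alpha on every platform; undetected miRNAs count as not DE."""
    per_platform = [{m for m, p in platform.items() if p < alpha}
                    for platform in p_values.values()]
    return set.intersection(*per_platform)


# Tumor/normal P-values as reported in Table 2 for three miRNAs.
p_values = {
    "Agilent": {"hsa-miR-378": 0.0002, "hsa-miR-375": 0.0005, "hsa-miR-96": 0.0050},
    "Exiqon": {"hsa-miR-378": 0.0000, "hsa-miR-375": 0.0055, "hsa-miR-96": 0.0428},
    "Illumina": {"hsa-miR-378": 0.0003, "hsa-miR-375": 0.0441, "hsa-miR-96": 0.0024},
    "Miltenyi": {"hsa-miR-378": 0.0130, "hsa-miR-375": 0.0337},  # miR-96 not detected
}

print(sorted(consensus_list(p_values)))
print(sorted(de_on_all_platforms(p_values, alpha=0.005)))
print(sorted(de_on_all_platforms(p_values, alpha=0.05)))
```

On this three-miRNA subset, no miRNA passes P<0.005 on all four platforms, while relaxing the threshold to P<0.05 retains hsa-miR-378 and hsa-miR-375.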
At P<0.05, hsa-miR-378, hsa-miR-375, hsa-miR-21*, and hsa-miR-145 were detected as DE by all platforms and a further 4 miRNAs (hsa-miR-96, hsa-miR-21, hsa-miR-147b, and hsa-miR-143) were DE on all but one platform; in fact, on the Miltenyi platform, hsa-miR-96 and hsa-miR-147b were not detected, while hsa-miR-21 and hsa-miR-143 did not reach the significance threshold. Twelve, 2, and 25 miRNAs were found to be exclusively DE on the Agilent, Exiqon, and Illumina platforms, respectively. The remaining 29 miRNAs were DE in at least two platforms. The fold changes are concordant across platforms, with the only exception of two miRNAs (hsa-miR-218 and hsa-miR-302a) that were DE at P<0.05 in Illumina and Exiqon, but with discordant fold changes (Fig. 2B).

Figure 2. Cross-platform comparison of the consensus list of DE miRNAs at P<0.005 in at least one platform. (A) P-values of the tumor/normal class comparison visualized in a blue-white heat map; see scale in the figure. (B) Log2 fold changes in the tumor/normal class comparison visualized in a red-green heat map; red = up-regulated; green = down-regulated in tumors.

In order to verify that the limited number of commonly DE miRNAs was not a result of the normalization methods, we calculated the number of differentially expressed miRNAs in each platform and for each of the four normalization methods. For each of the 256 (= 4^4) possible combinations, we identified a list of shared DE miRNAs. The union of all these lists gathered four miRNAs (hsa-miR-378, hsa-miR-375, hsa-miR-145, hsa-miR-21*), suggesting that different normalization methods can be worse than or, at best, equal to our choice (Fig. S4A). Notably, among the 4 common miRNAs, hsa-miR-378 was identified in all possible combinations (Fig. S4B). 
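The enumeration of the 256 normalization combinations can be sketched as below. The data structure `de_sets`, the function name, and the toy DE sets (with placeholder names "miR-A" and "miR-B") are assumptions made for illustration; only the platform and method names come from the text.

```python
from itertools import product

PLATFORMS = ("Agilent", "Exiqon", "Illumina", "Miltenyi")
METHODS = ("loess", "quantile", "rank_invariant", "RSN")


def shared_de_over_combinations(de_sets):
    """Enumerate one normalization method per platform (4^4 = 256 combinations).

    de_sets[platform][method] is the set of DE miRNAs for that pairing.
    Returns (number of combinations, union of the per-combination shared
    lists, miRNAs shared in every combination).
    """
    union_shared = set()
    always_shared = None
    n_combinations = 0
    for combo in product(METHODS, repeat=len(PLATFORMS)):
        shared = set.intersection(
            *(de_sets[p][m] for p, m in zip(PLATFORMS, combo)))
        union_shared |= shared
        always_shared = shared if always_shared is None else always_shared & shared
        n_combinations += 1
    return n_combinations, union_shared, always_shared


# Hypothetical DE sets: "miR-A" is always DE, "miR-B" is lost under loess.
toy = {
    p: {m: {"miR-A"} | (set() if m == "loess" else {"miR-B"}) for m in METHODS}
    for p in PLATFORMS
}
n, union_shared, always_shared = shared_de_over_combinations(toy)
print(n, sorted(union_shared), sorted(always_shared))
```

In this toy setting, "miR-A" plays the role that hsa-miR-378 plays in the text (shared in every combination), while the union corresponds to the full list of miRNAs shared in at least one combination.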
The overall platform comparability in terms of accuracy and ability to identify DE miRNAs was evaluated by focusing, respectively, on fold changes and t-values obtained in the tumor/normal comparison for the 233 miRNAs commonly detected by the 4 platforms. After clustering analysis, the best correlation among log2 fold changes was observed between Agilent and Exiqon (Pearson’s correlation = 0.63), whereas Illumina showed the most different pattern and wider fold changes (Fig. 3A and Fig. S5A). In the same way, only a partial similarity in t-values (Pearson’s correlation; range = 0.28–0.48; average = 0.40) is present among the 4 platforms, but this time Miltenyi showed the most divergent behavior (Fig. 3B and Fig. S5B).

Figure 3. Clustering analysis of log2 fold changes and t-values. Hierarchical clustering (distance = Pearson correlation; linkage = average) of log2 fold changes (A) and t-values (B) obtained for each platform by comparing tumor and normal samples in the subgroup of commonly detected miRNAs. t-values were calculated using a t-test with random variance model.

Inter-platform Agreement using miRNA Sets

Previous studies comparing the performance of gene expression microarray platforms suggested that, although a relatively low overlap among lists of DE genes was obtained with different platforms, a good agreement was found when looking at biologically related gene sets instead of single genes [14]. To test whether similar conclusions can be drawn for miRNA microarray platforms, we performed a miRNA set enrichment analysis on our data, testing two series of miRNA sets: 1) the DE miRNAs identified by each platform in our study, to evaluate their enrichment among up- or down-regulated miRNAs on the other platforms; 2) miRNAs identified as up- or down-regulated between colon cancer and normal mucosa in other microarray-based studies from the literature (Table S3). 
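To give an intuition for set enrichment on a ranked miRNA list, the sketch below computes an unweighted, Kolmogorov-Smirnov-like running-sum enrichment score. This is a simplification chosen for illustration and is an assumption, not the analysis used in the study: GSEA as typically run weights hits by their ranking metric and assesses significance by permutation, both of which are omitted here. The function name and the toy ranking are hypothetical.

```python
def enrichment_score(ranked_mirnas, mirna_set):
    """Unweighted running-sum enrichment score of a set in a ranked list.

    Walk down the ranking (most up-regulated first); step up by 1/n_hit
    at each set member and down by 1/n_miss otherwise. The score is the
    running sum with the largest absolute deviation from zero.
    """
    hits = [m in mirna_set for m in ranked_mirnas]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("set must contain some, but not all, ranked miRNAs")
    running = 0.0
    score = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(score):
            score = running
    return score


# Hypothetical ranking of six miRNAs by t-value on one platform.
ranking = ["a", "b", "c", "d", "e", "f"]
print(enrichment_score(ranking, {"a", "b"}))  # set concentrated at the top
print(enrichment_score(ranking, {"e", "f"}))  # set concentrated at the bottom
```

A positive score indicates that the set is concentrated among the up-regulated miRNAs of the ranking, a negative score that it is concentrated among the down-regulated ones, which mirrors testing up- and down-regulated sets separately.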
Most of the miRNA sets identified by each platform are coherently enriched in data from the other platforms, with the Miltenyi miRNA set showing the lowest enrichments (Fig. 4A). Moreover, the great majority of colon cancer associated miRNA sets derived from the literature were also validated in our data and, at least in part, independently of the tested platform (Fig. 4B).

Figure 4. miRNA set enrichment analysis. Summary of miRNA set enrichment analysis performed using GSEA. Using the expression data obtained with the 4 different platforms, we tested the enrichment of miRNAs DE (when comparing colorectal cancer and normal mucosa) in our study (A) or reported in the literature (B). Up- and down-regulated miRNAs were tested separately. For the literature-derived miRNA sets, the first author and the platform used are indicated (see also Table S3). False Discovery Rates less than 5% or 10% were considered significant or marginally significant, respectively.

Comparison with qRT-PCR Data

Microarray data are regularly validated by qRT-PCR. Different systems are commercially available and, as pointed out for the microarray platforms, qRT-PCR manufacturers also have to deal with the continuous update of miRBase annotations. As a validation method, depending on the availability of selected miRNA assays at the time the experiments were performed, SYBR Green LNA assays from Exiqon or Applied Biosystems TaqMan assays were used. We focused our validation analysis on 18 miRNAs that summarize different situations found in the platform comparison (Table 2). The 8 miRNAs DE in at least 3 of 4 array platforms were validated as significantly DE by qRT-PCR. 
For these 8 miRNAs, high correlations between qRT-PCR and array expression values and in pair-wise contrasts of array data were observed (Table 3 and File S1), with two exceptions: in the case of hsa-miR-21*, although qRT-PCR data confirmed the differential expression found in all array platforms, its correlation with array data was limited (R coefficient range 0.27–0.44); for hsa-miR-21, the values on Illumina did not correlate with any other values obtained on arrays or by qRT-PCR. This latter discrepancy is likely attributable to the miR-21 expression values on Illumina being near saturation in all samples and, for this reason, concentrated in a limited range.

Table 2. miRNA arrays and qRT-PCR class comparison (tumor/normal).

miRNA | qPCR FC | qPCR p-val | Agilent FC | Agilent p-val | Exiqon FC | Exiqon p-val | Illumina FC | Illumina p-val | Miltenyi FC | Miltenyi p-val
Differentially expressed in at least 3/4 platforms (Concordant)
hsa-miR-378 | 0.18 | 0.0000 | 0.49 | 0.0002 | 0.40 | 0.0000 | 0.40 | 0.0003 | 0.67 | 0.0130
hsa-miR-375 | 0.14 | 0.0009 | 0.40 | 0.0005 | 0.70 | 0.0055 | 0.55 | 0.0441 | 0.57 | 0.0337
hsa-miR-21* | 1.54 | 0.0254 | 1.64 | 0.0460 | 1.32 | 0.0009 | 1.82 | 0.0009 | 1.47 | 0.0086
hsa-miR-145 | 0.10 | 0.0065 | 0.30 | 0.0027 | 0.49 | 0.0456 | 0.68 | 0.0019 | 0.35 | 0.0184
hsa-miR-96 | 3.73 | 0.0008 | 1.77 | 0.0050 | 1.18 | 0.0428 | 4.43 | 0.0024 | – | –
hsa-miR-21 | 1.86 | 0.0118 | 2.47 | 0.0033 | 2.11 | 0.0159 | 1.11 | 0.0081 | 1.42 | 0.1709
hsa-miR-147b | 0.12 | 0.0015 | 0.83 | 0.0000 | 0.81 | 0.0170 | 0.39 | 0.0005 | – | –
hsa-miR-143 | 0.17 | 0.0118 | 0.42 | 0.0437 | 0.30 | 0.0364 | 0.69 | 0.0018 | 0.47 | 0.1252
Differentially expressed in at least 2/4 platforms (Concordant)
hsa-miR-93 | 0.84 | 0.4667 | 1.61 | 0.0202 | 1.17 | 0.0790 | 1.36 | 0.0050 | 1.20 | 0.2097
hsa-miR-886-5p | 2.41 | 0.0130 | 1.02 | 0.3686 | – | – | 1.73 | 0.0189 | 1.48 | 0.0002
hsa-miR-886-3p | 0.93 | 0.7370 | 1.09 | 0.0004 | – | – | 1.25 | 0.0956 | 1.88 | 0.0002
hsa-miR-497 | 0.28 | 0.0051 | 0.61 | 0.0008 | – | – | 0.35 | 0.0060 | 0.81 | 0.1473
hsa-miR-30a | 0.27 | 0.0016 | 0.46 | 0.0002 | 1.32 | 0.2676 | 0.50 | 0.0015 | – | –
hsa-miR-182 | 3.60 | 0.0012 | 1.04 | 0.0256 | 1.10 | 0.4681 | 2.65 | 0.0000 | – | –
hsa-miR-139-5p | 0.10 | 0.0010 | 0.79 | 0.0025 | 0.76 | 0.0664 | 0.17 | 0.0023 | 0.77 | 0.2078
hsa-miR-136 | 0.23 | 0.0153 | 0.69 | 0.0037 | 0.98 | 0.7852 | 0.48 | 0.0176 | – | –
Differentially expressed in at least 2/4 platforms (Discordant)
hsa-miR-218 | 0.19 | 0.0410 | 0.96 | 0.3632 | 1.22 | 0.0363 | 0.34 | 0.0020 | – | –
hsa-miR-302a | 0.95 | 0.8335 | 1.02 | 0.3413 | 1.20 | 0.0247 | 0.54 | 0.0020 | – | –

FC = fold change; – = no value reported.

Table 3. Pair-wise correlations (Pearson correlation analysis).

miRNA | qPCR vs Agilent | qPCR vs Exiqon | qPCR vs Illumina | qPCR vs Miltenyi | Agilent vs Exiqon | Agilent vs Illumina | Agilent vs Miltenyi | Exiqon vs Illumina | Exiqon vs Miltenyi | Illumina vs Miltenyi
Differentially expressed in at least 3/4 platforms
hsa-miR-378 | 0.67 | 0.87 | 0.83 | 0.54 | 0.64 | 0.63 | 0.75 | 0.9 | 0.59 | 0.68
hsa-miR-375 | 0.91 | 0.7 | 0.86 | 0.49 | 0.67 | 0.85 | 0.32 | 0.63 | 0.86 | 0.52
hsa-miR-21* | 0.44 | 0.42 | 0.39 | 0.27 | 0.1 | 0.16 | 0.39 | 0.52 | 0.64 | 0.57
hsa-miR-145 | 0.94 | 0.81 | 0.96 | 0.83 | 0.78 | 0.92 | 0.81 | 0.81 | 0.96 | 0.84
hsa-miR-96 | 0.53 | 0.48 | 0.67 | – | 0.74 | 0.69 | – | 0.6 | – | –
hsa-miR-21 | 0.76 | 0.77 | −0.05 | 0.72 | 0.93 | 0.00 | 0.87 | −0.23 | 0.87 | 0.02
hsa-miR-147b | 0.67 | 0.55 | 0.72 | – | 0.47 | 0.76 | – | 0.61 | – | –
hsa-miR-143 | 0.66 | 0.68 | 0.86 | 0.59 | 0.91 | 0.6 | 0.88 | 0.64 | 0.88 | 0.57
Differentially expressed in at least 2/4 platforms (Concordant)
hsa-miR-93 | 0.33 | −0.25 | −0.19 | −0.14 | 0.07 | 0.5 | −0.07 | 0.4 | 0.61 | 0.44
hsa-miR-886-5p | 0.15 | – | 0.76 | 0.26 | – | 0.25 | 0.27 | – | – | 0.21
hsa-miR-886-3p | 0.41 | – | 0.48 | 0.00 | – | 0.69 | 0.71 | – | – | 0.4
hsa-miR-497 | 0.79 | – | 0.79 | 0.52 | – | 0.84 | 0.65 | – | – | 0.63
hsa-miR-30a | 0.7 | −0.01 | 0.65 | – | −0.47 | 0.46 | – | −0.2 | – | –
hsa-miR-182 | 0.48 | 0.17 | 0.86 | – | 0.55 | 0.58 | – | 0.04 | – | –
hsa-miR-139-5p | 0.83 | 0.76 | 0.87 | 0.27 | 0.63 | 0.79 | 0.22 | 0.74 | 0.91 | 0.48
hsa-miR-136 | 0.69 | 0.29 | 0.72 | – | 0.25 | 0.56 | – | 0.31 | – | –
Differentially expressed in at least 2/4 platforms (Discordant)
hsa-miR-218 | 0.5 | −0.41 | 0.77 | – | 0.06 | 0.55 | – | −0.32 | – | –
hsa-miR-302a | 0.24 | −0.01 | 0.06 | – | −0.2 | −0.04 | – | −0.52 | – | –

To better understand the basis of the poor overlap of class comparison results in the four platforms, we measured the expression of 10 further miRNAs (Table 3 and File S1). Six of them (hsa-miR-136, hsa-miR-139-5p, hsa-miR-182, hsa-miR-30a, hsa-miR-497, and hsa-miR-93) were selected among the 14 miRNAs DE (P<0.05) according to both Agilent and Illumina. 
We validated the array data by qRT-PCR for 5 of these 6 miRNAs, with the notable exception of hsa-miR-93. Correlation coefficients between qRT-PCR and either Agilent or Illumina data ranged from 0.65 to 0.87 for hsa-miR-136, hsa-miR-139-5p, hsa-miR-30a, and hsa-miR-497; for hsa-miR-182, whose probe intensities were at intermediate levels on Illumina (DE at P<0.005) and near background on Agilent (DE at P<0.05), the coefficients were 0.86 and 0.48, respectively. Two other miRNAs selected for qRT-PCR validation, hsa-miR-886-5p and hsa-miR-886-3p, were concordant in 2 of the four platforms. The differential expression of hsa-miR-886-5p, DE on the Illumina and Miltenyi platforms, was confirmed by qRT-PCR, while hsa-miR-886-3p, DE on the Miltenyi and Agilent platforms, did not appear to be DE by qRT-PCR. Finally, we selected two miRNAs (hsa-miR-218 and hsa-miR-302a) that were DE on the Exiqon and Illumina platforms but with opposite fold changes. The reduced expression of hsa-miR-218 in tumors observed on Illumina was confirmed by qRT-PCR, while the differential expression of hsa-miR-302a was not validated by qRT-PCR.

Real time PCR data are generally used to determine the sensitivity and specificity of data obtained with microarrays. To this aim, we compared our results to those obtained in an independent published qRT-PCR study, in which 70 of 665 unique miRNAs tested were found differentially expressed in 40 paired normal-colon cancer samples [15]. For each platform we selected miRNAs present in the qPCR dataset (527 for Agilent, 596 for Illumina, 545 for Exiqon and 278 for Miltenyi) and computed ROC curves using different P-value thresholds (Fig. 5). The values of the Area Under the ROC Curve (AUC) showed that Agilent and Illumina are very similar and are the most accurate platforms, while Miltenyi is the least accurate.

Figure 5. Performance assessment of the platforms. 
Taking as the gold standard the miRNAs identified as differentially expressed in a qPCR study on 40 paired tumor-normal samples, we evaluated the performance of each platform by calculating sensitivity and specificity at different P-value thresholds and plotting the resulting values in the ROC space.

Biological Insight

When the 68 miRNAs DE at P<0.005 in at least one of the four platforms were compared with literature data, we found that 25% of them were concordantly described in the literature as deregulated in colorectal cancer in comparison to the non-tumor counterpart (Table S4). Furthermore, we found that 12 miRNAs belong to known co-expressed family clusters. The main biological data associated with the four miRNA clusters are reported in Table 4. Looking at their expression we observed that: for the miR 25–106b cluster, only hsa-miR-25 and hsa-miR-93 are present in the list of 68 miRNAs at the thresholds we applied; the miR 182-96 cluster is particularly evident in Illumina, where hsa-miR-182, −182*, −183, and −96 are among the most up-regulated miRNAs (fold changes tumor vs normal ranging from 4.42 to 2.65); the miR 143–145 cluster is coherently deregulated in all four platforms of our study, with hsa-miR-143 being the most down-regulated miRNA in tumor tissues on the Exiqon platform (fold change tumor vs normal = 0.30; p = 0.036) and hsa-miR-145 the most down-regulated on Agilent and Miltenyi (fold change tumor vs normal = 0.30 and 0.35; p = 0.0027 and 0.018, respectively).

Table 4. Role in colon cancer of miRNA clusters DE in our study.

miRNA cluster | Members | Chromosome location | Role in colon cancer | Reference
miR 195–497 | hsa-miR-195, hsa-miR-497 | 17p13.1 | Chromosomal region frequently deleted in colorectal cancer. hsa-miR-195 is associated with lymph node metastasis, advanced tumor stage, and poor overall survival. | [35], [36]
miR 25–106b | hsa-miR-25, hsa-miR-93, hsa-miR-106b | 7q22.1 | hsa-miR-25 is associated with lymphatic and venous invasion, a more aggressive tumor phenotype. This cluster is closely related to oncomir1. | [37]
miR 182-96 | hsa-miR-182, hsa-miR-182*, hsa-miR-183, hsa-miR-183*, hsa-miR-96 | 7q32.2, intergenic region | Not reported; in medulloblastoma this cluster promotes tumorigenesis, regulating cellular migration. | [38]
miR 143–145 | hsa-miR-143, hsa-miR-145 | 5q32 | Altered expression is reported. This cluster is associated with negative regulation of cell proliferation. | [39]

Gene expression profiles of the same samples analyzed by miRNA expression arrays were available. Thus, we considered an integration approach to evaluate whether similar biological information could be retrieved from the four platforms, irrespective of the overlap in DE miRNAs. To this aim, using the MAGIA tool, negatively correlated putative target genes of DE miRNAs were identified in each platform (File S2) and an enrichment analysis was performed with IPA software. To highlight the concordance among the four platforms, enrichment P-values for all the cancer-related pathways significantly enriched in at least one platform are shown in a colorimetric scale in Figure 6A. Pathways related to cell cycle regulation and PTEN signalling were concordantly identified. When we looked at validated targets in TarBase, the number of miRNA-mRNA interactions negatively correlated at p<0.05 was very limited (Agilent = 35, Exiqon = 2, Illumina = 45 and Miltenyi = 0), precluding a comparison across the four platforms.

Figure 6. Computational integration of miRNA and gene expression profiles of the paired tumor/normal colon samples. (A) Pathway enrichment analysis of anti-correlated predicted target genes of differentially expressed miRNAs according to each microarray platform. 
(B) Network between the top 8 differentially expressed miRNAs and their anti-correlated target genes. The top 250 interactions were used to generate the network using the MAGIA tool.

Furthermore, by considering the qRT-PCR data of the 8 most concordant miRNAs and the gene expression profiles, the same integration approach identified a total of 803 interactions between miRNAs and negatively correlated genes predicted as their targets (File S2). The graphical representation of the top 250 interactions highlighted that many genes that were up-regulated in tumors are predicted targets of two or more down-regulated miRNAs (Fig. 6B). In detail, there are 70 genes co-targeted by at least two miRNAs, and 84% of them are regulated by the miR 143–145 cluster (Table S5). Among these genes, those related to glycolysis and nutrient transport pathways seemed over-represented.

Discussion

Despite their relatively recent discovery, there is a rapidly growing interest in the study of the role of miRNAs in many pathological processes, including cancer. Accordingly, high-throughput technologies, initially developed for GW gene expression evaluation, were rapidly adapted to GW measurement of miRNAs. However, as highlighted in recent reviews [5], [16], [17], several factors significantly complicate their analysis, including the short length of miRNAs, the high degree of homology in miRNA families, the high rate of new miRNA identification (the current number of miRNAs in miRBase 18, released in November 2011, is approaching two thousand) and the relatively high percentage (about 10%) of artefactual miRNAs not confirmed by resequencing experiments. The impact of these factors on the different methodologies applied by manufacturers of the available platforms must be considered in inter-platform comparison studies. 
The issues of intra- and inter-platform microarray reproducibility have mainly been addressed in experimental settings where tissues or cell lines of different origin are compared, on the assumption that the wide range of expression changes expected in such comparisons makes technical noise negligible. This type of approach mirrors the one followed in the first phase of the MicroArray Quality Control (MAQC) consortium study, which aimed to assess the inter-platform and inter-laboratory reproducibility of gene-expression microarray data using two different RNAs (human brain and a universal human reference) [18]. This approach was strongly questioned in 2007 for its lack of consistency with real research settings [19]. However, in the majority of miRNA inter-platform comparison studies, quoted in Aldridge & Hadfield [16] and reported in Table S1, the experimental design was biased toward samples of markedly different origin. Notably, only two studies [7], [10] compared the miRNA profiles of biologically meaningful samples on at least three different platforms, and even in these cases the samples were cell lines. Thus, our study represents the first attempt to compare miRNA platform performance in a clinical setting, where the inter-sample variability within the same class is expected to be higher than in cell lines.

The majority of profiling studies using clinical samples aim to reveal differences in expression that may be subtle but are associated with a specific clinical context. In these settings, technical replicates are frequently not feasible because of limited RNA quantity and economic considerations. Thus, in the present study we addressed inter-platform comparison using samples belonging to two classes (paired tumor and normal colon tissues), which could theoretically lead to new insights into tumor biology and clinical applications.
Our data, generated by profiling the same tissue-derived total RNAs on four different miRNA array platforms, showed little overlap between platforms, except for a limited number of miRNAs for which very high correlations were observed. These data are essentially in agreement with those obtained using cell lines, since in those studies too only a few miRNAs were shared among all platforms [7], [10].

The first issue we considered was the global distribution of the hybridization intensities. The Illumina platform showed the most divergent global distribution of intensities compared with the other three platforms. A possible explanation is that the Illumina protocol amplifies the starting material, whereas the other platforms label it directly. The amplification step allows the detection of a higher number of miRNAs expressed at low levels (e.g. hsa-miR-182), with the drawback that it can saturate the signal of more abundant miRNAs such as hsa-miR-21, which is expected to be both biologically and clinically relevant in many cancer types, including colorectal cancer [20]. Since the platform has been withdrawn, signal saturation remains a note of caution for former Illumina users.

The short length of miRNAs, their variable GC content, and the existence of miRNA families whose members differ by one or only a few nucleotides pose a set of technical challenges that each manufacturer has attempted to overcome through ad hoc approaches. An evaluation of the GC content of detected and undetected probes in each platform confirmed the relevance of this parameter in determining the detection performance of all of them, but also highlighted that the Miltenyi platform is exceedingly sensitive to GC content, which partially explains its low detection rate.
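The comparison of GC content between detected and undetected probes can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in the study: the function names (`gc_fraction`, `mean_gc_by_detection`), the sequences, and the detection calls are all made up for illustration, and a real evaluation would use each platform's probe annotation and its own detection criteria.

```python
from statistics import mean


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in an RNA or DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def mean_gc_by_detection(probes):
    """Average GC fraction of detected and undetected probes.

    `probes` is an iterable of (sequence, detected) pairs.
    A group with no probes is reported as None.
    """
    groups = {True: [], False: []}
    for seq, detected in probes:
        groups[bool(detected)].append(gc_fraction(seq))
    return {
        "detected": mean(groups[True]) if groups[True] else None,
        "undetected": mean(groups[False]) if groups[False] else None,
    }


# Hypothetical sequences and detection calls, NOT real miRNAs or study data.
toy_probes = [
    ("GGCCGGCAUGCC", True),   # 10 of 12 bases are G or C
    ("GCGCAUGC", True),       # 6 of 8
    ("AUUAUAAUGC", False),    # 2 of 10
    ("AAUUAUAU", False),      # 0 of 8
]

summary = mean_gc_by_detection(toy_probes)
print(summary)
```

In this toy input the detected group has the higher mean GC fraction by construction. On real data, a large gap between the two groups on one platform would be consistent with that platform being more sensitive to GC content, although a formal comparison would also need a statistical test.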
In the class comparison analysis between tumor and normal samples, many more modulated miRNAs were identified on the Agilent and Illumina platforms than on the Exiqon and Miltenyi platforms, where only a few were identified. In the Exiqon data, most of the miRNAs modulated in Agilent and Illumina were detectable, although they did not reach statistical significance; by contrast, the same miRNAs were frequently undetected on the Miltenyi platform. Focusing on the 233 commonly detected miRNAs, Miltenyi clustered separately from the other three platforms when considering t-values, while Illumina showed the worst correlations with the other three platforms when considering fold-changes.

qRT-PCR is frequently used as a “gold standard” to corroborate microarray data but, as previously reported by others [7], [17], qRT-PCR might also perform poorly in measuring some miRNAs, which challenges its role as a “gold standard”. Moreover, the use of qRT-PCR as a reference technique requires stringent standards and adherence to MIQE, i.e. the guidelines on the minimum information for publication of quantitative real-time PCR experiments [21]. Thus, in our analysis we used this technique as is generally done in a clinical setting, selecting only a small subset of miRNAs. It is worth noting that all 8 miRNAs concordantly DE on at least 3 of the 4 platforms were confirmed as DE by qRT-PCR, while 7 of the other 10 miRNAs assessed by qRT-PCR were validated.

Furthermore, previous studies suggested that, despite a relatively low overlap among lists of DE genes obtained with different platforms, a higher agreement can be obtained by looking at biologically related gene sets instead of single genes [14]; we therefore performed a miRNA set enrichment analysis on our data. In this case, we observed better inter-platform agreement than with an approach based on single miRNAs.
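The idea of testing a miRNA set rather than single miRNAs can be sketched as follows. The enrichment method used in the study is not restated in this section, so the sketch substitutes a simple hypergeometric over-representation test. It asks whether a miRNA set (for example, one derived from another platform) overlaps a DE list more than expected by chance. The function names and the placeholder identifiers `m0` to `m19` are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, de_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of set members among `de_size` items drawn without
    replacement from a universe that contains `set_size` set members.
    """
    if not 0 <= set_size <= universe_size or not 0 <= de_size <= universe_size:
        raise ValueError("set and DE list must fit inside the universe")
    upper = min(set_size, de_size)
    if not 0 <= overlap <= upper:
        raise ValueError("overlap is inconsistent with the set sizes")
    total = comb(universe_size, de_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, de_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def set_enrichment(universe, mirna_set, de_list):
    """Return (overlap, p-value) for a miRNA set against a DE list.

    Both inputs are first restricted to the common universe, e.g. the
    miRNAs detected on all platforms being compared.
    """
    universe = set(universe)
    members = set(mirna_set) & universe
    de = set(de_list) & universe
    overlap = len(members & de)
    p_value = hypergeom_enrichment_p(len(universe), len(members), len(de), overlap)
    return overlap, p_value


# Placeholder identifiers, NOT real miRNAs or study results.
universe = [f"m{i}" for i in range(20)]
mirna_set = ["m0", "m1", "m2", "m3", "m4"]   # set defined on platform A
de_list = ["m0", "m1", "m2", "m10"]          # DE list from platform B

overlap, p_value = set_enrichment(universe, mirna_set, de_list)
print(overlap, round(p_value, 4))
```

Restricting both lists to a shared universe matters here: miRNAs that one platform cannot detect would otherwise count against the overlap. A rank-based enrichment method would additionally use the full ordering of miRNAs instead of a DE cut-off, which this sketch does not attempt.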
In addition, a coherent enrichment was found for miRNA sets obtained from the literature, even when these had been derived on platforms different from the four analyzed in our study.

Undoubtedly, technical and analytical challenges in measuring miRNAs remain, and further research is required to increase consistency between different microarray-based methodologies. Overall, the poor inter-platform comparability seems reasonably attributable to a high false negative rate, with some probes performing poorly. Among the four tested platforms, Illumina and Agilent proved adequate for GW miRNA evaluation of clinical specimens, owing to their high-throughput performance, their good concordance with qRT-PCR for the most DE miRNAs, and their good sensitivity/specificity in ROC curve analysis. Finally, comparison studies could help other researchers not only to choose the best platform for their projects but also to interpret their results better.

Looking at literature data, we found that some miRNAs identified as DE in our study have already been implicated in colon cancer development and progression (see also comments and references in Table 4 and