Abstract

Background

   Microarray technology applied to microRNA (miRNA) profiling is a
   promising tool in many research fields; nevertheless, independent
   studies characterizing the same pathology have often reported poorly
   overlapping results. miRNA analysis methods have only recently been
   systematically compared, but only in a few cases using clinical samples.

Methodology/Principal Findings

   We investigated the inter-platform reproducibility of four miRNA
   microarray platforms (Agilent, Exiqon, Illumina, and Miltenyi),
   comparing nine paired tumor/normal colon tissues. The most concordant
   and selected discordant miRNAs were further studied by quantitative
   RT-PCR. Globally, a poor overlap among differentially expressed miRNAs
   identified by each platform was found. Nevertheless, for eight miRNAs
   high agreement in differential expression among the four platforms and
   comparability to qRT-PCR were observed. Furthermore, most of the miRNA
   sets identified by each platform are coherently enriched in data from
   the other platforms and the great majority of colon cancer associated
   miRNA sets derived from the literature were validated in our data,
   independently of the platform. Computational integration of miRNA and
   gene expression profiles suggested that anti-correlated predicted
   target genes of differentially expressed miRNAs are commonly enriched
   in cancer-related pathways and in genes involved in glycolysis and
   nutrient transport.

Conclusions

   Technical and analytical challenges in measuring miRNAs still remain
   and further research is required in order to increase consistency
   between different microarray-based methodologies. However, a better
   inter-platform agreement was found by looking at miRNA sets instead of
   single miRNAs and through a miRNA–gene expression integration
   approach.

Introduction

   MicroRNAs (miRNAs) are small non-coding RNA molecules of 18–24
   nucleotides in length that are widely conserved in all eukaryotic
   organisms and serve as regulators of gene expression. miRNAs are
   involved in all major cellular processes and are implicated in a large
   number of human diseases including cancer [42][1]–[43][3].

   Over the past decade, DNA microarray technology has become an
   increasingly cost-effective methodology that is able to quickly
   generate high-throughput data, paving the way to genome-wide (GW)
   analysis of gene-expression, genomic copy number variations, SNPs, and
   epigenetic alterations. Microarray-based techniques have been
   extensively used in several areas of research, and molecular assays
   based on gene expression patterns and predetermined mathematical
   algorithms, such as MammaPrint® [44][4], are currently under
   validation by prospective multicentric clinical studies in breast
   cancer.

   More recently, microarray technology has been applied to miRNA
   profiling and is becoming a promising technique in many research
   fields, such as translational research in oncology, and can provide
   useful information on the role of miRNAs in both tumorigenesis and
   progression of cancer [45][2]. Nevertheless, independent studies
   characterizing the same pathology have often reported poorly
   overlapping results. This could be due to small sample size and high
   tumor variability and heterogeneity, but also to technical reasons. A
   major advantage of the microarray approach is the high-throughput
   simultaneous screening of up to thousands of molecules in a single
   assay, but this requires hybridization conditions to be the same for
   all probes on the array. This is not trivial for miRNA microarrays
   because the GC content of miRNAs is highly variable and the options for
   probe design are more limited than for mRNA due to their short length.
   For a complete review of the general concepts and special challenges
   relevant to miRNA profiling, refer to Pritchard et al. [46][5]. A
   multitude of platforms
   for miRNA profiling are commercially available, and each manufacturer
   has developed its own technical procedures to maximize sensitivity and
   specificity in measuring miRNA expression levels. As a result, probe
   signals are expected to largely differ among platforms, and a direct
   comparison is not possible. In spite of this, the general patterns of
   differentially expressed (DE) miRNAs should be coherently detected by
   all platforms. Only recently has the intra- and inter-platform
   reproducibility of miRNA microarrays been analyzed across more than
   three different platforms (see [47]Table S1 for details)
   [48][6]–[49][11]. Taken together, these studies provide evidence that
   miRNA microarray platforms show excellent intra-platform
   reproducibility, but limited inter-platform concordance. Indeed,
   comparing miRNAs identified as DE within each platform, a significant
   variation in the total number as well as in the fold-change of miRNAs
   has been noted. Three of these studies [50][6]; [51][8]; [52][9] based
   their conclusions on the comparison of tissues or pools of tissues of
   completely different origin. Sah et al. analyzed the expression of
   seven synthetic miRNAs spiked at known concentrations into RNA from
   placental tissue and hybridized on five platforms [53][11]. To come
   closer to a miRNA microarray application in cancer research, Git et al.
   analyzed a pool of normal breast tissues and two breast cancer cell
   lines [54][7], and Dreher et al. compared untransfected and
   HPV-transfected human keratinocytes [55][10]. Although the former four
   comparisons represent a useful system to address technical issues, and
   the latter two studies are undoubtedly more realistic, the concordance
   of different platforms when clinical specimens are used has not yet
   been addressed.

   In the present study, we compared the miRNA expression profiles of nine
   colorectal cancer and normal colon mucosa samples from the same
   patients using four different commercial platforms (Agilent, Exiqon,
   Illumina and Miltenyi). The expression of the most concordant and
   selected discordant miRNAs among platforms was then evaluated with
   quantitative real time PCR (qRT-PCR). Finally, integrative analyses of
   miRNAs in the context of gene expression and literature data were
   performed as a proof of principle of the validity of microarray miRNA
   analysis in gaining insight into the biological role of these miRNAs.

Results

Experimental Setting

   To highlight the influence of the sample origin and the study design on
   the obtained results, we made a computational comparison of expression
   data from four microarray studies. We selected the data obtained on a
   common miRNA platform, i.e. Agilent, from the two miRNA platform
   comparison studies (details in [56]Table S1) whose expression data on
   human samples are publicly available ([57]GSE13860 [58][6] and
   E-MTAB-96 [59][7]) and from two studies chosen as examples of
   experimental applications in a clinical setting, i.e. profiles
   associated with tumorigenesis of prostate [60][12] and gastric [61][13]
   cancer ([62]GSE21036 and [63]GSE28700, respectively; details in the
   Fig. S1 legend). As shown in [64]Figure S1, the number of DE miRNAs and
   the associated fold changes are considerably higher in the
   cross-platform datasets than in the profiles looking at tumorigenesis.
   Imposing a uniform and arbitrary threshold (|log2 fold change| > 1),
   88.5% ([65]GSE13860) and 25.9% (E-MTAB-96) of the miRNAs present on the
   arrays were differentially expressed in the cross-platform datasets; in
   contrast, only 6.9% ([66]GSE21036) and 6.7% ([67]GSE28700) of miRNAs
   were identified as DE at the same threshold in the clinical datasets.
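
   As an illustration of the thresholding step described above, the
   following minimal sketch computes the fraction of miRNAs whose absolute
   log2 fold change exceeds 1. The function name and the intensity values
   are hypothetical and are not taken from the datasets cited here.

```python
import math


def fraction_above_threshold(group_a, group_b, threshold=1.0):
    """Fraction of miRNAs with |log2(a / b)| > threshold.

    group_a, group_b: mean linear-scale intensities per miRNA, same order.
    """
    log2_fc = [math.log2(a / b) for a, b in zip(group_a, group_b)]
    n_de = sum(1 for fc in log2_fc if abs(fc) > threshold)
    return n_de / len(log2_fc)


# Made-up intensities for five miRNAs (not data from the study).
group_a = [400.0, 90.0, 120.0, 30.0, 500.0]
group_b = [100.0, 100.0, 100.0, 100.0, 450.0]
print(f"{100 * fraction_above_threshold(group_a, group_b):.1f}% above threshold")
```

   With these toy values, two of the five miRNAs pass the threshold; a
   fold change of exactly 2 (log2 = 1) does not, because the inequality is
   strict.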

   On these premises, we decided to evaluate the inter-platform
   reproducibility in a clinical setting by assessing the tumor and the
   normal counterpart miRNA profiles in samples collected from nine
   patients who underwent surgical resection for colon cancer (see
   [68]Table S2 for clinical and pathological characteristics). RNA
   aliquots from these samples were hybridized on four microarray
   platforms: Agilent SurePrint G3 human miRNA Microarray, Exiqon miRCURY
   LNA microRNA Array, Illumina Human_v2 microRNA expression Beadchips,
   and Miltenyi miRXplore Microarray. Main features of the four platforms
   are described in [69]Table 1.

Table 1. Platform description.

   Feature | Agilent | Exiqon | Illumina | Miltenyi
   Array version | Human miRNA V3 | miRCURY LNA microRNA Array | Human miRNA_V2 | miRXplore microarray V5
   Arrays per slide | 8 | 1 | 12 | 1
   Channels | Single | Single | Single | Dual
   Input total RNA | 100 ng | 300 ng | 600 ng | 1200 ng
   Labeling | Cy3 | Hy3 | Cy3 | Hy3/Hy5
   Labeling process | Alkaline phosphatase and 3′ ligation | Alkaline phosphatase and 3′ ligation | Polyadenylation, RT, MSO§ pool annealing, PCR | Alkaline phosphatase and 3′ ligation
   miRBase version | miRBase V12.0 | miRBase V14.0 | miRBase V12.0 | miRBase V14.0
   N° hsa-miR | 866 | 891 | 858 | 911
   N° probes/miR | 2 | 1 | 1 | 1
   N° replicates/probe | 4–8 | 4 | 370 (average) | 4

   § MSO = miRNA specific oligo.

   It should be noted that the Illumina platform was withdrawn in March
   2010; however, we decided to include it in our comparison due to its
   extensive use in laboratories worldwide, including those in our
   Institute. Accordingly, the issues addressed in the present
   investigation can be of interest to users of the Illumina platform to
   better interpret their results and to enable a more rational switch to
   a different platform.

   The Agilent, Exiqon, and Illumina arrays were run in one-color mode.
   Miltenyi arrays were hybridized in two colors: tissue samples were
   labeled with Hy5, and a synthetic reference purchased from Miltenyi
   with Hy3. Since
   the synthetic reference was designed on miRBase 9.2 and covered only a
   portion of the miRNAs present on the arrays designed on miRBase 14.0,
   only the Hy5 data were considered and used for normalization in order
   to enable a more direct comparison with the other three platforms.

   The Agilent, Exiqon, and Illumina platforms contained probes designed
   either on viral miRNA sequences or on putative miRNAs not yet annotated
   in miRBase, derived from literature and Next-Generation Sequencing
   studies. Since these sequences are present only in one platform, they
   were excluded from our analyses.

   The miRBase database is the primary repository for all miRNA sequences
   and annotations and is used by all manufacturers for probe design.
   However, the frequent updates of miRBase result in annotation problems.
   To avoid possible bias, we selected arrays designed on close miRBase
   versions: the probes of the four tested platforms were designed on
   either miRBase v12.0 or v14.0. We verified that the names and sequences
   of miRNAs present in v12.0 did not change in the newer miRBase version,
   while a set of new miRNAs was added. The difference in the total number
   of miRBase annotated miRNAs in the four platforms was relatively small
   (6%).

Evaluation of Data Distribution and Detection Rate

   Non-normalized signal intensities showed a platform dependent
   distribution reflecting the unique methods developed by manufacturers
   for labeling, hybridization stringency and data acquisition ([72]Fig.
   1A). For all platforms, the signals covered most of the dynamic range
   available for 16-bit scanners; Agilent, Exiqon, and Miltenyi signal
   distributions tended to have positive skewness (a right side long tail)
   and differed from Illumina distributions where many more probes showed
   intermediate to high expression levels.

Figure 1. Comparison of microarray platform performance.

   (A) Global non-normalized intensity distribution. (B) Graphical
   representation of miRNA detection; blue = detected,
   yellow = undetected, gray = not present. (C) Box-plot of the percentage
   of GC content in mature miRNA sequences; blue = detected,
   yellow = undetected. P-values were calculated by Student’s t test.

   For the Agilent and Illumina platforms, we followed the detection call
   criteria recommended by the manufacturers. Illumina’s software provides
   a detection P-value that estimates to what extent a signal is greater
   than the noise represented by negative controls; similarly, Agilent’s
   software provides a flag (gIsPosAndSignif) that estimates if the
   feature signal is positive and significant compared to the background.
   In contrast, for the Exiqon and Miltenyi platforms detection call
   criteria were not defined; for these platforms, we established a
   threshold percentage of pixels for every spot in the array whose
   intensity was lower than background. Taking into account these
   filtering procedures, 675 (78% of miRBase annotated miRNAs present on
   the array), 775 (87%), 808 (94%), and 376 (41%) unique miRNAs were
   detectable in at least one of the samples in the Agilent, Exiqon,
   Illumina, and Miltenyi platforms, respectively ([75]Fig. 1B). Only 233
   miRNAs were detected on all platforms, a number strongly limited by the
   low detection rate of the Miltenyi platform.
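
   The detection rates quoted above can be re-derived from the counts in
   the text and in Table 1, and the shared subset is a plain set
   intersection. The sketch below is illustrative only: the function names
   and the toy miRNA sets are assumptions, while the counts are those
   reported above.

```python
def detection_rates(n_detected, n_on_array):
    """Percentage of annotated miRNAs detected in at least one sample."""
    return {p: round(100 * n_detected[p] / n_on_array[p]) for p in n_detected}


def shared_mirnas(detected_sets):
    """miRNAs detected on every platform (intersection of the sets)."""
    return set.intersection(*detected_sets.values())


# Counts reported in the text (detected) and in Table 1 (N° hsa-miR).
n_on_array = {"Agilent": 866, "Exiqon": 891, "Illumina": 858, "Miltenyi": 911}
n_detected = {"Agilent": 675, "Exiqon": 775, "Illumina": 808, "Miltenyi": 376}
rates = detection_rates(n_detected, n_on_array)
print(rates)

# Toy sets with made-up names, only to show the intersection step.
toy_sets = {
    "Agilent": {"miR-a", "miR-b", "miR-c"},
    "Exiqon": {"miR-a", "miR-b"},
    "Illumina": {"miR-a", "miR-b", "miR-c"},
    "Miltenyi": {"miR-a"},
}
print(shared_mirnas(toy_sets))
```

   As in the toy example, the size of the intersection is bounded by the
   platform with the fewest detected miRNAs.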

   In order to estimate to what extent the GC content impacted the
   detection call of each platform, we calculated the GC percentage of the
   miRNAs assayed and compared, for each platform, the GC content between
   detected and undetected miRNAs. Although each manufacturer has adjusted
   probe design and hybridization procedures to overcome discrepancies in
   the thermodynamic stability of probe/target recognition, the GC content
   was significantly higher in the detected than in the undetected miRNAs
   in all platforms, and this difference was particularly evident for the
   Miltenyi platform ([76]Fig. 1C).
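
   A minimal sketch of this comparison is given below: the GC percentage
   is computed for each mature sequence, and the detected and undetected
   groups are compared with a pooled-variance Student's t statistic. The
   sequences are invented for illustration, and the helper names are
   assumptions; the P-value lookup is omitted.

```python
import math
from statistics import mean, variance


def gc_percent(seq):
    """Percentage of G and C bases in an RNA or DNA sequence."""
    seq = seq.upper()
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


# Invented sequences, not real miRNAs.
detected = ["GGCCGCAUGGCCGAUCGGCG", "CGGCAUGCGGCCAUGCGGAU"]
undetected = ["AUUAUAUUAGCAUUAUAAUU", "UUAUAAUGCAUAUAUUAAUA"]
t = student_t([gc_percent(s) for s in detected],
              [gc_percent(s) for s in undetected])
print(f"t = {t:.2f}")
```

   A positive t statistic here corresponds to a higher GC content among
   the detected sequences.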

Normalization and Class Comparison Results

   Several normalization and data processing procedures are available,
   most adapted from gene-expression studies and with little consensus
   among laboratories. Considering the unique characteristics of each
   platform, it is unlikely that the same normalization procedure could
   perform equally in all platforms to correct systematic differences.

   In order to choose the best normalization for each platform, we
   evaluated the ability of four different methods (loess, quantile,
   rank invariant, and Robust Spline Normalization [RSN]) to reduce the
   intra-class variability in normal and tumor samples through the use of
   Relative Log Expression (RLE) (see [77]Fig. S2). Moreover, we expected
   the best normalization method to increase the fold changes and the
   number of differentially expressed miRNAs between tumor and normal
   tissue. According to these criteria, we chose RSN for Illumina and
   Agilent, loess for Exiqon, and quantile for Miltenyi.
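
   For readers unfamiliar with RLE, the sketch below shows one common way
   to compute it: each miRNA's log expression is centered on its median
   across the samples of a class, and the per-sample spread of these
   deviations is then summarized. The summary used here (median absolute
   deviation from zero) and all names are assumptions chosen for brevity,
   not necessarily the exact procedure behind Fig. S2.

```python
from statistics import median


def relative_log_expression(log_expr):
    """log_expr: one row per miRNA, one log2 value per sample.

    Returns, for each sample, the deviations from each miRNA's median.
    """
    n_samples = len(log_expr[0])
    per_sample = [[] for _ in range(n_samples)]
    for row in log_expr:
        row_median = median(row)
        for j, value in enumerate(row):
            per_sample[j].append(value - row_median)
    return per_sample


def rle_spread(log_expr):
    """Median absolute RLE per sample; smaller means less variability."""
    return [median([abs(v) for v in sample])
            for sample in relative_log_expression(log_expr)]


# Toy matrix: the third sample is shifted upward by one log2 unit.
toy = [[8.0, 8.0, 9.0], [5.0, 5.0, 6.0], [10.0, 10.0, 11.0]]
print(rle_spread(toy))
```

   A normalization method that works well should bring every sample's RLE
   distribution close to zero with a small spread; the shifted sample in
   the toy matrix stands out.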

   [78]Figure S3 reports the tumor/normal class comparisons in the 4
   platforms, expressed as histograms of log P-value and FDR. The
   comparison identified, at a threshold P<0.005, 29 miRNAs that were
   modulated on Agilent, 4 on Exiqon, 42 on Illumina, and 3 on the
   Miltenyi platform, corresponding to 4.3%, 0.5%, 5.2%, and 0.8% of
   miRNAs detected, respectively.

Inter-platform Agreement of Class Comparison Results

   To assess inter-platform concordance, we examined the miRNAs that were
   DE at P<0.005 in at least one platform; by combining these miRNAs, a
   consensus list of 68 miRNAs was generated. To highlight concordance
   among the four platforms, the P-values and fold-changes of the
   consensus list miRNAs are shown in a colorimetric scale in [79]Figure
   2A and B respectively. Imposing a P<0.005 on all four platforms, no
   miRNAs were commonly DE. At P<0.05, hsa-miR-378, hsa-miR-375,
   hsa-miR-21*, and hsa-miR-145 were detected as DE by all platforms, and
   a further 4 miRNAs (hsa-miR-96, hsa-miR-21, hsa-miR-147b, and
   hsa-miR-143) were DE on all but one platform; specifically, on the
   Miltenyi platform, hsa-miR-96 and hsa-miR-147b were not detected, while
   hsa-miR-21 and hsa-miR-143 did not reach the significance threshold.
   Twelve, 2, and 25
   miRNAs were found to be exclusively DE on the Agilent, Exiqon, and
   Illumina platforms, respectively. The remaining 29 miRNAs were DE in at
   least two platforms. The fold changes are concordant across platforms
   with the only exception of two miRNAs (hsa-miR-218 and hsa-miR-302a)
   that were DE at P<0.05 in Illumina and Exiqon, but with discordant
   fold-changes ([80]Fig. 2B).
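
   The construction of the consensus list and of the commonly DE subset
   can be sketched as a union followed by an all-platform filter. The
   function name is hypothetical; the P-values are the array values
   reported for three miRNAs in Table 2, with hsa-miR-96 left out of the
   Miltenyi dictionary because it was not detected there.

```python
def consensus_and_common(pvalues, union_cutoff=0.005, common_cutoff=0.05):
    """pvalues: platform -> {miRNA: P-value}; undetected miRNAs are absent.

    Returns (miRNAs below union_cutoff on at least one platform,
             consensus miRNAs below common_cutoff on every platform).
    """
    consensus = set()
    for per_platform in pvalues.values():
        consensus |= {m for m, p in per_platform.items() if p < union_cutoff}
    common = {
        m for m in consensus
        if all(m in pv and pv[m] < common_cutoff for pv in pvalues.values())
    }
    return consensus, common


pvalues = {
    "Agilent": {"hsa-miR-378": 0.0002, "hsa-miR-96": 0.0050, "hsa-miR-21": 0.0033},
    "Exiqon": {"hsa-miR-378": 0.0000, "hsa-miR-96": 0.0428, "hsa-miR-21": 0.0159},
    "Illumina": {"hsa-miR-378": 0.0003, "hsa-miR-96": 0.0024, "hsa-miR-21": 0.0081},
    "Miltenyi": {"hsa-miR-378": 0.0130, "hsa-miR-21": 0.1709},
}
consensus, common = consensus_and_common(pvalues)
print(sorted(consensus), sorted(common))
```

   On this three-miRNA excerpt, all three enter the consensus list, but
   only hsa-miR-378 is DE at P<0.05 on every platform, in line with the
   text.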

Figure 2. Cross-platform comparison of the consensus list of DE miRNAs at
P<0.005 in at least one platform.

   (A) P-values of the tumor/normal class comparison visualized in a
   blue-white heat map; see scale in the figure. (B) Log2 fold changes
   in the tumor/normal class comparison visualized in a red-green heat
   map; red = up-regulated; green = down-regulated in tumors.

   In order to verify that the limited number of commonly DE miRNAs was
   not a result of the normalization methods, we calculated the number of
   differentially expressed miRNAs in each platform and for each of the
   four normalization methods. For the 256 ( = 4^4) possible combinations,
   we identified a list of shared DE miRNAs. The union of all these lists
   comprised only four miRNAs (hsa-miR-378, hsa-miR-375, hsa-miR-145,
   hsa-miR-21*), suggesting that other combinations of normalization
   methods perform worse than or, at best, equal to our choice ([83]Fig.
   S4A). Notably, among the 4 common miRNAs, hsa-miR-378 was identified in
   all possible combinations ([84]Fig. S4B).
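
   The enumeration of the 4^4 = 256 combinations can be sketched with a
   Cartesian product, assigning one normalization method to each platform
   and intersecting the four resulting DE lists. The DE lists below are
   toy data and the identifiers are assumptions made for this example.

```python
from itertools import product

PLATFORMS = ("Agilent", "Exiqon", "Illumina", "Miltenyi")
METHODS = ("loess", "quantile", "rank_invariant", "rsn")


def shared_de_per_combination(de_lists):
    """de_lists[(platform, method)] -> set of DE miRNAs.

    Returns {method combination: miRNAs DE on all four platforms}.
    """
    shared = {}
    for combo in product(METHODS, repeat=len(PLATFORMS)):
        sets = [de_lists[(p, m)] for p, m in zip(PLATFORMS, combo)]
        shared[combo] = set.intersection(*sets)
    return shared


# Toy DE lists: one miRNA is always DE, another only after "rsn".
de_lists = {
    (p, m): {"hsa-miR-378"} | ({"hsa-miR-145"} if m == "rsn" else set())
    for p in PLATFORMS
    for m in METHODS
}
shared = shared_de_per_combination(de_lists)
union = set().union(*shared.values())
print(len(shared), sorted(union))
```

   In the toy data, the miRNA that is DE under every method appears in all
   256 shared lists, whereas the method-dependent one appears only in the
   single combination that uses "rsn" on all platforms.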

   The overall platform comparability in terms of accuracy and ability to
   identify DE miRNAs was evaluated focusing respectively on fold changes
   and t-values obtained in the tumor/normal comparison for the 233 miRNAs
   commonly detected by the 4 platforms. After clustering analysis, the
   best correlation among log2 fold changes was observed between
   Agilent and Exiqon (Pearson’s correlation = 0.63), whereas Illumina
   showed the most different pattern and wider fold changes ([85]Fig. 3A
   and [86]Fig. S5A). In the same way, only a partial similarity in
   t-values (Pearson’s correlation; range = 0.28–0.48; average = 0.40) is
   present among the 4 platforms, but this time Miltenyi showed the most
   divergent behavior ([87]Fig. 3B and [88]Fig. S5B).
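
   The pairwise comparison underlying this analysis can be sketched as
   follows: for every pair of platforms, the Pearson correlation is
   computed over the commonly detected miRNAs. The fold-change vectors are
   invented, and the helper names are assumptions.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(profiles):
    """profiles: platform -> values for the same miRNAs, in the same order."""
    return {(a, b): pearson(profiles[a], profiles[b])
            for a, b in combinations(profiles, 2)}


# Invented log2 fold changes for four shared miRNAs.
profiles = {
    "Agilent": [-1.0, 0.7, -1.7, 0.1],
    "Exiqon": [-1.3, 0.4, -1.0, 0.2],
    "Illumina": [-1.3, 0.9, -0.6, 1.4],
    "Miltenyi": [-0.6, 0.5, -1.5, -0.3],
}
for pair, r in pairwise_correlations(profiles).items():
    print(pair, round(r, 2))
```

   Four platforms give six pairs; a hierarchical clustering such as the
   one in Figure 3 would typically use a correlation-based distance (for
   example 1 − r) derived from these values.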

Figure 3. Clustering analysis of log2 fold changes and t-values.

   Hierarchical clustering (distance = Pearson correlation;
   linkage = average) of log2 fold changes (A) and t-values (B) obtained
   for each platform by comparing tumor and normal samples in the subgroup
   of commonly detected miRNAs. t-values were calculated using a t-test
   with random variance model.

Inter-platform Agreement using miRNA Sets

   Previous studies comparing the performance of gene expression
   microarray platforms suggested that, although a relatively low overlap
   among lists of DE genes was obtained with different platforms, a good
   agreement was found when looking at biologically related gene sets
   instead of single genes [90][14]. To test whether similar conclusions
   can be drawn for miRNA microarray platforms, we performed a miRNA set
   enrichment analysis on our data testing two series of miRNA sets: 1)
   the DE miRNAs identified by each platform in our study, to evaluate
   their enrichment among up or down-regulated miRNAs on the other
   platforms; 2) miRNAs identified as up- or down- regulated between colon
   cancer and normal mucosa in other microarray based studies from the
   literature ([91]Table S3). Most of the miRNA sets identified by each
   platform are coherently enriched in data from the other platforms, with
   the Miltenyi miRNA set showing the lowest enrichment ([92]Fig. 4A).
   Moreover, the great majority of colon cancer associated miRNA sets
   derived from the literature were also validated in our data and, at
   least in part, independently of the tested platform ([93]Fig. 4B).
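
   The idea behind set enrichment can be illustrated with a simplified,
   unweighted running-sum statistic: walking down a ranked list of miRNAs,
   the score increases at each set member and decreases otherwise, and the
   largest deviation from zero is taken as the enrichment score. This is a
   sketch of the principle only; it is not the weighted statistic or the
   permutation-based FDR computed by the GSEA software, and the names are
   assumptions.

```python
def enrichment_score(ranked, mirna_set):
    """Unweighted running-sum enrichment score.

    ranked: miRNAs ordered from most up- to most down-regulated.
    Positive scores mean the set clusters at the top of the ranking.
    """
    hits = [m in mirna_set for m in ranked]
    n_hit = sum(hits)
    n_miss = len(ranked) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("set must overlap the ranking only partially")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1 / n_hit if is_hit else -1 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


ranked = ["a", "b", "c", "d", "e", "f"]  # placeholder miRNA names
print(enrichment_score(ranked, {"a", "b"}))
print(enrichment_score(ranked, {"e", "f"}))
```

   A set concentrated among the most up-regulated miRNAs yields a positive
   score and one concentrated among the most down-regulated miRNAs a
   negative score, which is why up- and down-regulated sets are tested
   separately.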

Figure 4. miRNA set enrichment analysis.

   Summary of miRNA set enrichment analysis performed using GSEA. Using
   the expression data obtained with the 4 different platforms, we tested
   the enrichment of miRNAs DE (when comparing colorectal cancer and
   normal mucosa) in our study (A) or reported in the literature (B).
   miRNAs up- or down-regulated were tested separately. For the
   literature-derived miRNA sets, the first author and the platform used
   are indicated (see also [95]Table S3). False Discovery Rates below 5%
   or 10% were considered significant or marginally significant,
   respectively.

Comparison with qRT-PCR Data

   Microarray data are regularly validated by qRT-PCR. Different systems
   are commercially available and, as pointed out for the microarray
   platforms, qRT-PCR manufacturers also have to deal with the continuous
   update of miRBase annotations. For validation, depending on the
   availability of the selected miRNA assays at the time the experiments
   were performed, either SYBR Green LNA assays from Exiqon or TaqMan
   assays from Applied Biosystems were used.

   We focused our validation analysis on 18 miRNAs that summarize
   different situations found in the platform comparison ([96]Table 2).
   The 8 miRNAs DE in at least 3 of the 4 array platforms were validated
   as significantly DE by qRT-PCR. For these 8 miRNAs, high correlations
   between qRT-PCR and array expression values and in pair-wise contrasts
   of array data were observed ([97]Table 3 and File S1), with two
   exceptions: in the case of hsa-miR-21*, although qRT-PCR data confirmed
   the differential expression found in all array platforms, its
   correlation with array data was limited (correlation coefficients
   ranging from 0.27 to 0.44); for hsa-miR-21, the values on Illumina did
   not correlate with any other values obtained on arrays or by qRT-PCR.
   This latter discrepancy is likely attributable to the miR-21 expression
   values on Illumina being close to saturation in all samples and
   therefore concentrated in a limited range.

Table 2. miRNA arrays and qRT-PCR class comparison.

   Class comparison tumor/normal

   miRNA | qPCR FC | qPCR p-val | Agilent FC | Agilent p-val | Exiqon FC | Exiqon p-val | Illumina FC | Illumina p-val | Miltenyi FC | Miltenyi p-val
   Differentially expressed in at least 3/4 platforms, concordant
   hsa-miR-378 | 0.18 | 0.0000 | 0.49 | 0.0002 | 0.40 | 0.0000 | 0.40 | 0.0003 | 0.67 | 0.0130
   hsa-miR-375 | 0.14 | 0.0009 | 0.40 | 0.0005 | 0.70 | 0.0055 | 0.55 | 0.0441 | 0.57 | 0.0337
   hsa-miR-21* | 1.54 | 0.0254 | 1.64 | 0.0460 | 1.32 | 0.0009 | 1.82 | 0.0009 | 1.47 | 0.0086
   hsa-miR-145 | 0.10 | 0.0065 | 0.30 | 0.0027 | 0.49 | 0.0456 | 0.68 | 0.0019 | 0.35 | 0.0184
   hsa-miR-96 | 3.73 | 0.0008 | 1.77 | 0.0050 | 1.18 | 0.0428 | 4.43 | 0.0024 | – | –
   hsa-miR-21 | 1.86 | 0.0118 | 2.47 | 0.0033 | 2.11 | 0.0159 | 1.11 | 0.0081 | 1.42 | 0.1709
   hsa-miR-147b | 0.12 | 0.0015 | 0.83 | 0.0000 | 0.81 | 0.0170 | 0.39 | 0.0005 | – | –
   hsa-miR-143 | 0.17 | 0.0118 | 0.42 | 0.0437 | 0.30 | 0.0364 | 0.69 | 0.0018 | 0.47 | 0.1252
   Differentially expressed in at least 2/4 platforms, concordant
   hsa-miR-93 | 0.84 | 0.4667 | 1.61 | 0.0202 | 1.17 | 0.0790 | 1.36 | 0.0050 | 1.20 | 0.2097
   hsa-miR-886-5p | 2.41 | 0.0130 | 1.02 | 0.3686 | – | – | 1.73 | 0.0189 | 1.48 | 0.0002
   hsa-miR-886-3p | 0.93 | 0.7370 | 1.09 | 0.0004 | – | – | 1.25 | 0.0956 | 1.88 | 0.0002
   hsa-miR-497 | 0.28 | 0.0051 | 0.61 | 0.0008 | – | – | 0.35 | 0.0060 | 0.81 | 0.1473
   hsa-miR-30a | 0.27 | 0.0016 | 0.46 | 0.0002 | 1.32 | 0.2676 | 0.50 | 0.0015 | – | –
   hsa-miR-182 | 3.60 | 0.0012 | 1.04 | 0.0256 | 1.10 | 0.4681 | 2.65 | 0.0000 | – | –
   hsa-miR-139-5p | 0.10 | 0.0010 | 0.79 | 0.0025 | 0.76 | 0.0664 | 0.17 | 0.0023 | 0.77 | 0.2078
   hsa-miR-136 | 0.23 | 0.0153 | 0.69 | 0.0037 | 0.98 | 0.7852 | 0.48 | 0.0176 | – | –
   Differentially expressed in at least 2/4 platforms, discordant
   hsa-miR-218 | 0.19 | 0.0410 | 0.96 | 0.3632 | 1.22 | 0.0363 | 0.34 | 0.0020 | – | –
   hsa-miR-302a | 0.95 | 0.8335 | 1.02 | 0.3413 | 1.20 | 0.0247 | 0.54 | 0.0020 | – | –

   FC = fold change; – = no value available for that platform.

Table 3. Pair-wise correlations.

   Pearson correlation analysis

   miRNA | qPCR vs Agilent | qPCR vs Exiqon | qPCR vs Illumina | qPCR vs Miltenyi | Agilent vs Exiqon | Agilent vs Illumina | Agilent vs Miltenyi | Exiqon vs Illumina | Exiqon vs Miltenyi | Illumina vs Miltenyi
   Differentially expressed in at least 3/4 platforms
   hsa-miR-378 | 0.67 | 0.87 | 0.83 | 0.54 | 0.64 | 0.63 | 0.75 | 0.9 | 0.59 | 0.68
   hsa-miR-375 | 0.91 | 0.7 | 0.86 | 0.49 | 0.67 | 0.85 | 0.32 | 0.63 | 0.86 | 0.52
   hsa-miR-21* | 0.44 | 0.42 | 0.39 | 0.27 | 0.1 | 0.16 | 0.39 | 0.52 | 0.64 | 0.57
   hsa-miR-145 | 0.94 | 0.81 | 0.96 | 0.83 | 0.78 | 0.92 | 0.81 | 0.81 | 0.96 | 0.84
   hsa-miR-96 | 0.53 | 0.48 | 0.67 | – | 0.74 | 0.69 | – | 0.6 | – | –
   hsa-miR-21 | 0.76 | 0.77 | −0.05 | 0.72 | 0.93 | 0.00 | 0.87 | −0.23 | 0.87 | 0.02
   hsa-miR-147b | 0.67 | 0.55 | 0.72 | – | 0.47 | 0.76 | – | 0.61 | – | –
   hsa-miR-143 | 0.66 | 0.68 | 0.86 | 0.59 | 0.91 | 0.6 | 0.88 | 0.64 | 0.88 | 0.57
   Differentially expressed in at least 2/4 platforms, concordant
   hsa-miR-93 | 0.33 | −0.25 | −0.19 | −0.14 | 0.07 | 0.5 | −0.07 | 0.4 | 0.61 | 0.44
   hsa-miR-886-5p | 0.15 | – | 0.76 | 0.26 | – | 0.25 | 0.27 | – | – | 0.21
   hsa-miR-886-3p | 0.41 | – | 0.48 | 0.00 | – | 0.69 | 0.71 | – | – | 0.4
   hsa-miR-497 | 0.79 | – | 0.79 | 0.52 | – | 0.84 | 0.65 | – | – | 0.63
   hsa-miR-30a | 0.7 | −0.01 | 0.65 | – | −0.47 | 0.46 | – | −0.2 | – | –
   hsa-miR-182 | 0.48 | 0.17 | 0.86 | – | 0.55 | 0.58 | – | 0.04 | – | –
   hsa-miR-139-5p | 0.83 | 0.76 | 0.87 | 0.27 | 0.63 | 0.79 | 0.22 | 0.74 | 0.91 | 0.48
   hsa-miR-136 | 0.69 | 0.29 | 0.72 | – | 0.25 | 0.56 | – | 0.31 | – | –
   Differentially expressed in at least 2/4 platforms, discordant
   hsa-miR-218 | 0.5 | −0.41 | 0.77 | – | 0.06 | 0.55 | – | −0.32 | – | –
   hsa-miR-302a | 0.24 | −0.01 | 0.06 | – | −0.2 | −0.04 | – | −0.52 | – | –

   To better understand the basis of the poor overlap of class comparison
   results in the four platforms, we measured the expression of 10 further
   miRNAs ([100]Table 3 and File S1).

   Six of them (hsa-miR-136, hsa-miR-139-5p, hsa-miR-182, hsa-miR-30a,
   hsa-miR-497, and hsa-miR-93) were selected among the 14 DE miRNAs
   (P<0.05) according to both Agilent and Illumina. We validated the array
   data by qRT-PCR for 5 of these 6 miRNAs, with the relevant exception of
   hsa-miR-93. Correlation coefficients between qRT-PCR and either Agilent
   or Illumina data ranged from 0.65 to 0.87 for hsa-miR-136,
   hsa-miR-139-5p, hsa-miR-30a, and hsa-miR-497; for hsa-miR-182, whose
   probe intensities were at intermediate levels on Illumina (DE at
   P<0.005) and close to background on Agilent (DE at P<0.05), the
   coefficients were 0.86 and 0.48, respectively.

   Two other miRNAs selected for qRT-PCR validation, hsa-miR-886-5p and
   hsa-miR-886-3p, were concordant in 2 of the four platforms. The
   differential expression of hsa-miR-886-5p, DE on the Illumina and
   Miltenyi platforms, was confirmed by qRT-PCR, whereas hsa-miR-886-3p,
   DE on the Miltenyi and Agilent platforms, did not appear to be DE by
   qRT-PCR.

   Finally, we selected two miRNAs (hsa-miR-218 and hsa-miR-302a) that
   were DE on the Exiqon and Illumina platforms but with opposite fold
   changes. The reduced expression of hsa-miR-218 in tumors observed on
   Illumina was confirmed by qRT-PCR, whereas that of hsa-miR-302a was
   not.

   Real time PCR data are generally used to determine the sensitivity and
   specificity of data obtained with microarrays. To this aim, we compared
   our results to those obtained in an independent published qRT-PCR
   study, in which 70 of 665 unique miRNAs tested were found
   differentially expressed in 40 paired normal-colon cancer samples
   [101][15]. For each platform we selected miRNAs present in the qPCR
   dataset (527 for Agilent, 596 for Illumina, 545 for Exiqon and 278 for
   Miltenyi) and computed ROC curves using different thresholds of
   P-value ([102]Fig. 5). The values of the Area Under the ROC Curve (AUC)
   showed that Agilent and Illumina are very similar and are the most
   accurate platforms, while Miltenyi is the least accurate.
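
   The ROC construction can be sketched as follows: at each P-value
   threshold, array calls are compared with the reference list to obtain
   sensitivity and 1 − specificity, and the AUC is then estimated with the
   trapezoidal rule. The function names and the toy P-values are
   assumptions; the reference set stands in for the published qRT-PCR
   list.

```python
def roc_points(pvalues, reference_de, thresholds):
    """Return (1 - specificity, sensitivity) for each P-value threshold.

    pvalues: {miRNA: array P-value}; reference_de: miRNAs DE by the
    gold standard. Only miRNAs present in pvalues are evaluated.
    """
    positives = [m for m in pvalues if m in reference_de]
    negatives = [m for m in pvalues if m not in reference_de]
    points = []
    for t in sorted(thresholds):
        tp = sum(pvalues[m] <= t for m in positives)
        fp = sum(pvalues[m] <= t for m in negatives)
        points.append((fp / len(negatives), tp / len(positives)))
    return points


def auc(points):
    """Trapezoidal area under the curve, anchored at (0, 0) and (1, 1)."""
    pts = sorted(set([(0.0, 0.0)] + points + [(1.0, 1.0)]))
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))


# Toy example: both reference miRNAs have the smallest P-values.
toy_p = {"a": 0.001, "b": 0.01, "c": 0.2, "d": 0.5}
points = roc_points(toy_p, {"a", "b"}, [0.005, 0.05, 0.3, 1.0])
print(points, auc(points))
```

   An AUC near 1 indicates that the platform ranks the reference miRNAs
   ahead of the others, whereas a value near 0.5 indicates no
   discrimination.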

Figure 5. Performance assessment of the platforms.

   Taking as the gold standard the miRNAs identified as differentially
   expressed in a qPCR study on 40 paired tumor-normal samples, we
   evaluated the performance of each platform by calculating sensitivity
   and specificity at different P-value thresholds and plotting the
   resulting values in ROC space.

Biological Insight

   When the 68 miRNAs DE at P<0.005 in at least one of the four platforms
   were compared with literature data, we found that 25% of them were
   concordantly described in the literature as deregulated in colorectal
   cancer in comparison to the non-tumor counterpart ([104]Table S4).
   Furthermore, we found that 12 miRNAs belong to known co-expressed
   family clusters. The main biological data associated with the four
   miRNA clusters are reported in [105]Table 4. Looking at their
   expression, we observed that: for the miR 25–106b cluster, only
   hsa-miR-25 and hsa-miR-93 are present in the list of 68 miRNAs at the
   thresholds we applied; the miR 182-96 cluster is particularly evident
   in Illumina, where hsa-miR-182, −182*, −183, and −96 are among the most
   up-regulated miRNAs (fold changes tumor vs normal ranging from 2.65 to
   4.42); the miR 143–145 cluster is coherently deregulated in all four
   platforms of our study, with hsa-miR-143 being the most down-regulated
   miRNA in tumor tissues on the Exiqon platform (fold change tumor vs
   normal = 0.30; p = 0.036) and hsa-miR-145 the most down-regulated on
   Agilent and Miltenyi (fold change tumor vs normal = 0.30 and 0.35;
   p = 0.0027 and 0.018, respectively).

Table 4. Role in colon cancer of miRNA clusters DE in our study.

   miRNA cluster | Members | Chromosome location | Role in colon cancer | Reference
   miR 195–497 | hsa-miR-195, hsa-miR-497 | 17p13.1 | Chromosomal region frequently deleted in colorectal cancer. hsa-miR-195 is associated with lymph node metastasis, advanced tumor stage, and poor overall survival. | [106][35] [107][36]
   miR 25–106b | hsa-miR-25, hsa-miR-93, hsa-miR-106b | 7q22.1 | hsa-miR-25 is associated with lymphatic and venous invasion, a more aggressive tumor phenotype. This cluster is closely related to oncomir1. | [108][37]
   miR 182-96 | hsa-miR-182, hsa-miR-182*, hsa-miR-183, hsa-miR-183*, hsa-miR-96 | 7q32.2, intergenic region | Not reported; in medulloblastoma this cluster promotes tumorigenesis by regulating cellular migration. | [109][38]
   miR 143–145 | hsa-miR-143, hsa-miR-145 | 5q32 | Altered expression is reported. This cluster is associated with negative regulation of cell proliferation. | [110][39]

   Gene expression profiles of the same samples analyzed by miRNA
   expression arrays were available. Thus, we used an integration approach
   to evaluate whether similar biological information could be retrieved
   from the four platforms, irrespective of the overlap in DE miRNAs. To
   this aim, using the MAGIA tool, negatively correlated putative target
   genes of DE miRNAs were identified in each platform (File S2), and an
   enrichment analysis was performed with the IPA software. To highlight
   the concordance among the four platforms, enrichment P-values for all
   the cancer-related pathways significantly enriched in at least one
   platform are shown on a colorimetric scale in [112]Figure 6A. Pathways
   related to cell cycle regulation and PTEN signalling were concordantly
   identified. When we restricted the analysis to validated targets from
   TarBase, the number of negatively correlated miRNA-mRNA interactions
   at p<0.05 was very limited (Agilent = 35, Exiqon = 2, Illumina = 45,
   and Miltenyi = 0), precluding a comparison across the four platforms.
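
   The core of such an integration can be sketched as a filter on
   predicted miRNA-target pairs: for each pair, the correlation between
   miRNA and gene expression across the matched samples is computed, and
   only negatively correlated pairs are kept. This is a simplified
   stand-in for what MAGIA does (no significance testing, Pearson
   correlation only); the names and the expression values are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def anticorrelated_targets(mirna_expr, gene_expr, predicted_targets, max_r=0.0):
    """Return (miRNA, gene, r) for predicted pairs with r < max_r.

    mirna_expr / gene_expr: name -> expression across the same samples.
    predicted_targets: miRNA -> iterable of predicted target genes.
    """
    pairs = []
    for mirna, genes in predicted_targets.items():
        for gene in genes:
            r = pearson(mirna_expr[mirna], gene_expr[gene])
            if r < max_r:
                pairs.append((mirna, gene, r))
    return sorted(pairs, key=lambda item: item[2])


# Invented profiles over four samples.
mirna_expr = {"miR-x": [1.0, 2.0, 3.0, 4.0]}
gene_expr = {"GENE_A": [8.0, 6.0, 4.0, 2.0], "GENE_B": [1.0, 2.0, 3.0, 5.0]}
predicted = {"miR-x": ["GENE_A", "GENE_B"]}
pairs = anticorrelated_targets(mirna_expr, gene_expr, predicted)
print(pairs)
```

   Only the gene whose expression falls as the miRNA rises is retained;
   the resulting gene list is what would then be passed to a pathway
   enrichment tool.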

Figure 6. Computational integration of miRNA and gene expression profiles of
the paired tumor/normal colon samples.

   (A) Pathway enrichment analysis of anti-correlated predicted target
   genes of differentially expressed miRNAs according to each microarray
   platform. (B) Network between the top 8 differentially expressed miRNAs
   and their anti-correlated target genes. The top 250 interactions were
   used to generate the network using the MAGIA tool.

   Furthermore, by considering the qRT-PCR data of the 8 most concordant
   miRNAs and the gene expression profiles, the same integration approach
   identified a total of 803 interactions between a miRNA and a
   negatively correlated gene predicted as its target (File S2). The
   graphical representation of the top 250 interactions highlighted that
   many genes that were up-regulated in tumors are predicted targets of
   two or more down-regulated miRNAs ([115]Fig. 6B). In detail, 70 genes
   are co-targeted by at least two miRNAs, and 84% of them are regulated
   by the miR 143–145 cluster ([116]Table S5). Among these genes, those
   related to glycolysis and nutrient transport pathways seemed
   over-represented.
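
   The co-targeting summary above (genes shared by at least two miRNAs,
   and the share of those genes regulated by a given cluster) amounts to
   a simple count over the interaction list. The sketch below shows one
   way to compute it. The helper names and the toy interactions are
   hypothetical, and "regulated by the cluster" is assumed to mean that
   at least one cluster member is among the gene's regulators.

```python
from collections import defaultdict


def co_targeted_genes(interactions, min_mirnas=2):
    """Map each gene targeted by at least min_mirnas miRNAs to its regulators."""
    regulators = defaultdict(set)
    for mirna, gene in interactions:
        regulators[gene].add(mirna)
    return {g: m for g, m in regulators.items() if len(m) >= min_mirnas}


def fraction_with_cluster(co_targeted, cluster):
    """Fraction of co-targeted genes with at least one regulator in cluster."""
    if not co_targeted:
        return 0.0
    hits = sum(1 for mirnas in co_targeted.values() if mirnas & cluster)
    return hits / len(co_targeted)


# Hypothetical (miRNA, gene) interactions, not taken from the study.
interactions = [
    ("miR-x", "G1"), ("miR-y", "G1"),
    ("miR-x", "G2"), ("miR-z", "G2"),
    ("miR-z", "G3"),
    ("miR-y", "G4"), ("miR-z", "G4"),
]

shared = co_targeted_genes(interactions)          # G1, G2 and G4
frac = fraction_with_cluster(shared, {"miR-x"})   # G1 and G2 out of three
```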

Discussion

   Despite their relatively recent discovery, there is a rapidly growing
   interest in the study of the role of miRNAs in many pathological
   processes including cancer. Accordingly, high throughput technologies,
   initially developed for GW gene expression evaluation, were rapidly
   adapted to GW measurement of miRNAs. However, as highlighted in recent
   reviews [117][5]; [118][16]; [119][17], several factors, including
   short miRNA length, high degree of homology in miRNA families, the high
   rate of new miRNA identification (the number of miRNAs in miRBase 18,
   released in November 2011, is approaching two thousand) and the
   relatively high percentage (about 10%) of artefactual miRNAs not
   confirmed by resequencing experiments, significantly complicate their
   analysis. The impact of these factors on the different methodologies
   applied by manufacturers of different available platforms must be
   considered in inter-platform comparison studies.

   The issues of intra- and inter-platform reproducibility of microarrays
   have mainly been addressed using experimental settings where tissues or
   cell lines of different origin are compared, on the assumption that,
   given the wide range of expression modulations expected in such
   comparisons, technical noise becomes negligible. This type of approach
   mirrors the one followed by the MicroArray Quality Control (MAQC)
   consortium in its first-phase study, which aimed to assess the
   inter-platform and inter-laboratory reproducibility of gene-expression
   microarray data using two different RNAs (human brain and a universal
   human reference) [120][18]. This approach was strongly questioned in
   2007 for its lack of consistency with real research settings [121][19].
   However, in the majority of miRNA inter-platform comparison studies,
   quoted in Aldridge & Hadfield [122][16] and reported in [123]Table S1,
   the experimental design was biased toward the use of samples of
   markedly different origin. Notably, only two studies [124][7];
   [125][10] compared the miRNA profiles of biologically meaningful
   samples on at least three different platforms, and even in these cases
   the samples were cell lines. Thus, our study represents the first
   attempt to compare miRNA platform performance in a clinical setting,
   where the inter-sample variability within the same class is expected
   to be higher than in cell lines.

   The majority of profiling studies using clinical samples aim to reveal
   differences in expression that may be subtle but are associated with a
   specific clinical context. In these settings, technical replicates are
   frequently not feasible because of limited RNA quantity and economic
   considerations. Thus, in the present study we addressed the issue of
   inter-platform comparison using samples belonging to two classes
   (paired tumor and normal colon tissues), which could theoretically
   lead to new insights in tumor biology and clinical applications. Our
   data, generated by profiling the same tissue-derived total RNAs on
   four different miRNA array platforms, showed little overlap between
   platforms except for a limited number of miRNAs for which very high
   correlations were observed. These data are essentially in agreement
   with those obtained using cell lines, since in those studies, too,
   only a few miRNAs were shared among all platforms [126][7];
   [127][10].

   The first issue we considered was the global distribution of the
   hybridization intensities. The Illumina platform showed the most
   divergent global distribution of intensities compared to the other
   three platforms. One explanation could be that the Illumina protocol
   includes an amplification step of the starting material, whereas the
   other platforms label the starting material directly. The
   amplification step allows the detection of a higher number of miRNAs
   expressed at low levels (e.g. hsa-miR-182), but with the drawback that
   it can lead to saturation of signals for more abundant miRNAs such as
   hsa-miR-21, which is expected to be both biologically and clinically
   relevant in many cancer types including colorectal cancer [128][20].
   Although the platform has since been withdrawn, the saturation of
   signals remains a note of caution for former Illumina users.

   The short length of miRNAs, their variable GC content, and the
   existence of families of miRNAs differing by one or only a few
   nucleotides pose a set of technical challenges that each manufacturer
   has attempted to overcome through ad hoc approaches. An evaluation of
   the GC content of detected and undetected probes in each platform
   confirmed the relevance of this parameter in determining detection
   performance on all of them, but also highlighted that the Miltenyi
   platform is exceedingly sensitive to GC content, partially explaining
   its low detection rate.
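
   The GC content comparison between detected and undetected probes can
   be illustrated with a short calculation. The sketch below is only an
   assumption about how such a comparison might be set up: the function
   names are hypothetical, the sequences are invented, and the direction
   of the difference in the toy data carries no biological meaning.

```python
def gc_content(seq):
    """Fraction of G and C bases in an RNA or DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for base in seq if base in "GC") / len(seq)


def mean_gc(sequences):
    """Mean GC fraction over a non-empty collection of sequences."""
    return sum(gc_content(s) for s in sequences) / len(sequences)


# Hypothetical sequences standing in for probe targets on one platform.
detected = ["GGCAUCGC", "GCGCAAUU"]      # GC fractions 0.75 and 0.50
undetected = ["AAUUAUGA", "UUAACAUA"]    # GC fractions 0.125 and 0.125

gc_detected = mean_gc(detected)
gc_undetected = mean_gc(undetected)
```

   Comparing the two means (or, more informatively, the two full
   distributions) for each platform shows how strongly detection depends
   on GC content.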

   In class comparison analysis between tumor and normal samples, many
   more modulated miRNAs were identified on the Agilent and Illumina
   platforms than on the Exiqon and Miltenyi platforms. In Exiqon data,
   most of the miRNAs modulated in Agilent and Illumina were detectable,
   although they did not reach statistical significance; on the other
   hand, the same miRNAs were frequently undetected on the Miltenyi
   platform. Focusing on the 233 commonly detected miRNAs, Miltenyi
   clustered separately from the other three platforms when considering
   t-values, while Illumina showed the weakest correlations with the
   other three platforms when considering fold-changes. qRT-PCR is
   frequently used as a “gold standard” to corroborate microarray data,
   but, as previously reported by others [129][7]; [130][17], qRT-PCR
   might also perform poorly in measuring some miRNAs, thus challenging
   its role as a “gold standard”. Moreover, using qRT-PCR as a reference
   technique requires stringent standards and adherence to MIQE, i.e. the
   specific guidelines for minimum information for publication of
   quantitative real time PCR experiments [131][21]. Thus, in our
   analysis, we decided to use this technique as is generally done in a
   clinical setting, selecting only a small subset of miRNAs. It is
   worth noting that all 8 miRNAs concordantly DE on at least 3 of the 4
   platforms were confirmed as DE by qRT-PCR, while of the other 10
   miRNAs assessed by qRT-PCR, 7 were validated.
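
   Selecting miRNAs that are DE on at least 3 of the 4 platforms is a
   counting exercise over the per-platform DE lists. The sketch below
   uses invented miRNA names and a hypothetical helper. It counts list
   membership only and ignores the direction of change, which a full
   concordance analysis would also need to check.

```python
from collections import Counter


def concordant_mirnas(de_by_platform, min_platforms=3):
    """Return miRNAs listed as DE on at least min_platforms platforms."""
    counts = Counter(
        mirna for de_list in de_by_platform.values() for mirna in set(de_list)
    )
    return sorted(m for m, c in counts.items() if c >= min_platforms)


# Hypothetical DE lists; the miRNA names are placeholders.
de_by_platform = {
    "Agilent": {"miR-a", "miR-b", "miR-c"},
    "Exiqon": {"miR-a", "miR-b"},
    "Illumina": {"miR-a", "miR-b", "miR-d"},
    "Miltenyi": {"miR-a"},
}

concordant = concordant_mirnas(de_by_platform)  # miR-a (4 platforms), miR-b (3)
```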

   Furthermore, since previous studies suggested that, despite a
   relatively low overlap among lists of DE genes obtained with different
   platforms, a higher agreement could be obtained looking at biologically
   related gene sets instead of single genes [132][14], we performed a
   miRNA set enrichment analysis on our data. In this case, we observed
   better inter-platform agreement than with an approach based on single
   miRNAs. In addition, a coherent enrichment was found for miRNA sets
   obtained from the literature, even when these were derived using
   platforms different from the four analyzed in our study.
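
   To make the idea of set-level agreement concrete, the sketch below
   computes a one-sided hypergeometric (over-representation) p-value for
   the overlap between a miRNA set and a platform's DE list. This is a
   deliberately simpler technique than a rank-based set enrichment
   analysis and is not presented as the procedure used in this study.
   The function name and the numbers are assumptions for illustration.

```python
from math import comb


def hypergeom_upper_tail(universe, set_size, n_selected, overlap):
    """P(X >= overlap) under the hypergeometric distribution.

    universe   : number of miRNAs that could have been selected
    set_size   : number of those miRNAs belonging to the set of interest
    n_selected : number of miRNAs called DE on the platform
    overlap    : number of DE miRNAs that belong to the set
    """
    total = comb(universe, n_selected)
    upper = min(set_size, n_selected)
    favourable = sum(
        comb(set_size, i) * comb(universe - set_size, n_selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical example: 10 detectable miRNAs, a set of 5, 5 called DE,
# all 5 of which fall inside the set.
p_value = hypergeom_upper_tail(universe=10, set_size=5, n_selected=5, overlap=5)
```

   A small p-value for the same set on several platforms would indicate
   agreement at the set level even when the individual DE miRNAs differ.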

   Undoubtedly, technical and analytical challenges in measuring miRNAs
   still remain and further research is required in order to increase
   consistency between different microarray-based methodologies. Overall,
   the poor inter-platform comparability seems most plausibly due to a
   high false negative rate, with some probes performing poorly. Among the
   four tested platforms, Illumina and Agilent proved adequate for GW
   miRNA evaluation of clinical specimens, owing to their high-throughput
   performance, their good concordance with qRT-PCR for the most DE
   miRNAs, and their good sensitivity/specificity in ROC curve analysis.
   Finally, comparison studies could help other researchers not only to
   choose the best platform for their projects but also to interpret
   their results better.

   Looking at literature data, we found that some miRNAs identified as DE
   in our study have already been implicated in colon cancer development
   and progression (see also comments and references in [133]Table 4 and