Abstract
Rice (Oryza sativa L.) is one of the essential staple food crops and
tillering, panicle branching and grain filling are three important
traits determining the grain yield. Although miRNAs have been reported
being regulating yield, no study has systematically investigated how
miRNAs differentially function in high and low yield rice, in
particular at a network level. This abundance of data from
high-throughput sequencing provides an effective solution for
systematic identification of regulatory miRNAs using developed
algorithms in plants. We here present a novel algorithm, Gene
Co-expression Network differential edge-like transformation (GRN-DET),
which can identify key regulatory miRNAs in plant development. Based on
the small RNA and RNA-seq data, miRNA-gene-TF co-regulation networks
were constructed for yield of rice. Using GRN-DET, the key regulatory
miRNAs for rice yield were characterized by the differential expression
variances of miRNAs and co-variances of miRNA-mRNA, including
osa-miR171 and osa-miR1432. Phytohormone cross-talks (auxin and
brassinosteroid) were also revealed by these co-expression networks for
the yield of rice.
Introduction
MicroRNAs (miRNAs) are approximately 21 nucleotides small non-coding
RNAs that regulate gene expression at the post-transcriptional
level^[38]1. In plants, miRNAs are transcribed by pol II enzyme and the
mature miRNA enters the RNA-induced silencing complex (RISC) and
negatively regulates gene expression via perfect or near-perfect
sequences complementary with their target mRNA resulting mRNA cleavage
or inhibiting mRNA translation^[39]2. The basic biological function of
miRNAs is that they interact with target mRNAs to interfere with the
expression level of mRNAs, which can be encoding proteins or factors
that control developmental and physiological process in plants and
animals^[40]3,[41]4. Due to the master modulators of gene expression,
microRNAs (miRNAs) and their target genes can be exploited for
improving agronomic traits in crops^[42]5.
As a key compontent of the gene regulatory networks, miRNAs have
attracted increasing attention with respect to the mechanisms of
miRNA-mediated gene regulation^[43]6. Transcription factors (TFs) are
also paramount regulators of gene expression in plants, and thus, the
triple-network among miRNA, target-genes and TFs (e.g. miRNA-gene-TF
co-regulation network) may conceivably be an important system in
regulating plant development. Interaction networks between miRNAs,
target genes and TFs are critical for an appropriate balance of gene
expression in plants. Many studies integrated the miRNA-mRNA expression
profile data for regulatory networks using miRNA target prediction. And
the overlapped genes list of these targeted genes of differentially
expressed miRNAs and differentially expressed genes in RNA-seq, as well
as the mRNA-miRNA pairs exhibited opposite expression profiles, both
provided important clues for plant development^[44]7,[45]8. It is well
known that miRNA mainly negatively regulates the expression of its
target genes^[46]1. However, in some cases, the targets of a miRNA are
not negatively correlated at the expression level^[47]9,[48]10,
suggesting that miRNA regulation in plant development may be a dynamics
process and involve many other factors.
Rice (Oryza sativa L.) is one of the essential staple food crops in the
world, and improving the yield has been the focus of rice breeding
programs. Tillering, panicle branching and grain filling are important
traits determining the grain yield. It has been reported that OsmiR156
regulates the famous yield gene OsSPL14^[49]11 and over-expression of
OsmiR397 can promote panicle branching, enhancing the grain
yield^[50]12. And miRNAs have also been reported in different tissues
or at a certain development stage of rice yield using small RNA
sequencing^[51]9,[52]10,[53]13–[54]16. However, regulation network of
these miRNAs with target genes and the signal pathways cross-talk for
yield of rice is still unknown. Both osa-miR397 and osa-miR396 invovled
in regulating the brassinosteroid (BR) signalling to control grain size
and affect yield in rice, their relationship is
elusive^[55]5,[56]12,[57]17. Thus, extensive studies are needed for
systematically elucidating the regulation networks in yield of rice,
which will provide useful strategies for crop improvement.
Recently, the abundance of data from high-throughput sequencing has
greatly facilitated plants research and makes to analyze the gene
regulation on systemic level possible. And new challenges arise to
effectively integrate the different omics data for studying the
biological complexity of yield in rice. Mathematical models of
biological systems by integrating experimental and theoretical
techniques are required to unravel the complexity of gene regulation in
the complex processes^[58]18. Gene Regulatory Networks (GRNs) is aim to
infer complex networks representing transcriptional regulatory
relationships from gene expression profiles, such as RNA-seq
data^[59]19,[60]20. Using co-expression network analysis, thousands of
genes/transcripts of special interest (e.g. differentially expressed)
are utilized to construct the network, identifying key
regulators/targets^[61]21. Using publicly available data and
protein-protein interactions (PPIs), a Gene Co-expression Network (GCN)
can be constructed on individual sample for candidate gene or
regulators selection and improving understanding of regulatory
pathways. Although existing non-linearity in GRN, the linear
correlation would be more effectively to approximate the network when
the samples are not large, especially combining with prior-known
background network. However, the linear correlation even might be less
practical when the samples are few^[62]22. Thus, we propose novel
edge-like correlations of gene-pairs even in one sample rather than
original expressions of genes to reconstruct the GCN. Such
correlation-based calculation would usually obtain undirected
association, so that, as widely applied in integrative study^[63]23, in
this work, the prior-known miRNA-> Target gene, TF-> miRNA and TF->
gene information are combined with GCN analysis, which come from the
well-established public interaction database. Therefore, our GCN can
supply potential (directed) regulatory relationships consistent with
prior-knowledge in a regulatory network screening manner.
The relations between miRNA-target genes enable users to derive
co-expressed genes that may be involved in similar biological processes
and functions in plants, similar to previous hypothesis and study in
human^[64]24. The target genes of miRNAs may be co-expression when they
are regulated by multiple miRNAs^[65]25. Using these co-expressed
genes, we can theoretically reconstruct the GCNs related to plant
development. In contrast to widely used analysis of differential
expression in traditional studies, such dynamic regulations (i.e.
sample-specific regulation or network) among different individuals
(samples) can be characterized by the differential expression variances
of miRNAs and co-variances of miRNAs-mRNAs, which would also be
important phenotype-related and dynamics-related features in biological
processes^[66]26. In the present study, we develop a new algorithm,
called differential edge-like transformation (DET), to analyze the gene
regulatory networks and identify key regulatory miRNA in plant
development.
Results
Gene regulatory network based on DET
DET transforms original expressions of genes to the edge-like
correlations of gene-pairs even in one sample (Fig. [67]1 and
Supplemental Methods). From the statistic viewpoint, those edge-like
correlations follow a correlated product distribution (Supplemental
Methods), which can be used for significance test of gene-pair
associations. This method can be used to not only characterize a single
sample by its network according to the weights of edges (i.e. the
edge-like correlations or gene-pair associations), but also estimate
the dys-regulations of genes in one pair of samples according to their
network changes (e.g. the topological differences between networks from
control and case samples), thus opening a new way to study the
molecular mechanism (e.g. regulatory miRNAs of rice yield) at a network
level even with one sample. Using this novel differential network model
based on DET, in the present study, the sample-specific miRNAs (SmiRNAs
with sample-specific network structures rather than sample-specific
expression) and their regulation networks for different tissues or
samples were identified.
Figure 1.
[68]Figure 1
[69]Open in a new tab
Identification of miRNAs invovled in rice yield using DET method.
Workflow of the differential network model based on Differential
Edge-like Transformation (DET).
Although conventional co-expression analysis, e.g. WGCNA (Weight Gene
Co-Expression Network Analysis)^[70]22, can be applied to study
co-expression networks, those methods require many samples (N ≥ 10) and
thus cannot be applied to the network or miRNA-mRNA interaction
analysis with fewer or even two samples (Table [71]1). In many plant
studies, the samples in the data are usually from different tissues or
developmental stages, but there are only a few samples for each
tissue/stage and thus the traditional network analyses failed.
Table 1.
Comparison of GRN-DET and WGCNA methods.
Co-expression network GRN-DET WGCNA
Model HMM Hierarchical clustering
Samples (N≥) 1 10
Gene expression Yes Yes
Pearson correlation coefficients Yes Yes
Expression variance Yes No
Expression covariance Yes No
Network changes Yes No
miRNA-mRNA interaction Yes No
[72]Open in a new tab
Based on the theory of DET, a new differential network model by
combining the miRNA-mRNA regulatory network and its edge-like
correlations (Gene regulatory network with DET, GRN-DET) is proposed in
this study (Fig. [73]1). In one condition or in one tissue, there is
usually only one sample, we can apply DET to capture the significant
gene expression and correlation changes simultaneously in a dynamical
and network manner.
Single sample analysis dependent on the statistics of edge-like correlations
From a viewpoint of statistics, the random variable, edge-like
correlation Z (∈ [−∞, ∞]), can be described as the product of two
random variables X and Y
[MATH: Z=XY=x−
μxσ
x⋅y−μyσ
y :MATH]
1
which follows a correlated product distribution of X and Y, i.e., a
product distribution with correlation between X and Y. In particular,
given X and Y are statistically independent, the edge-like correlation
Z = XY follows a product distribution, e.g., the probability density
function of Z as:
[MATH: fZ(z)=∫−∞∞fX(x)fY(z/x)1|x|dx :MATH]
2
Given X and Y are independent normal distributions (e.g. X and Y are
multivariate normal distribution with covariance as 0), Z is normal
product distribution; and even when X and Y are statistically dependent
(e.g. X and Y are multivariate normal distribution with covariance
closed to 1 or −1), Z will be long-tail normal product distribution
(Fig. [74]2).
Figure 2.
[75]Figure 2
[76]Open in a new tab
The numerical simulation of the correlated product distribution of
edge-like correlation. (A) The distribution of edge-like correlation
with given inherent correlations. (B) The distribution of edge-like
correlation with negative-correlation condition. (C) The distribution
of edge-like correlation with independent condition. (D) The
distribution of edge-like correlation with positive-correlation
condition.
Furthermore, when X and Y are standard normal distributions, the
expectation of Z is just the Pearson’s correlation coefficient ρ[X,Y]
between X and Y, i.e.
[MATH: ρX,Y=E[(X−μX)(Y−μY)]σXσY=E(XY)=E(Z) :MATH]
3
To mimic such correlated product distributions, our numerical
simulation is carried on by using matlab function ‘mvnrnd’ to produce
multivariate normal distribution X and Y, where the parameter
covariance is controlled by the inherentcorrelation between X and Y,
i.e. the Pearson’s correlation coefficient between X and Y. Given
acovariance matrix, N (N = 20, with 10000 times replication) samples
are produced for X and Y, so that, 200000 samples of Z are also be
transformed by above DET. Next, the distribution of Z can be estimated
by Kernel smoothing function. Finally, the distribution landscape of Z
determined by inherent correlation and edge-like correlation can be
shown in the 3-D plot (Fig. [77]2A). Obviously, when the determinant as
inherent correlation are from negative correlated (ρ[X,Y] = −1) to
independent (ρ[X,Y] = 0), and to positive correlated (ρ[X,Y] = 1), the
distribution of edge-like correlation displays left long-tail product
distribution (Fig. [78]2B), and symmetrical product distribution
(Fig. [79]2C), and right long-tail product distribution(Fig. [80]2D)
respectively. Thus, the Mann–Whitney U test is simply used to evaluate
the statistic significance of edge-like correlations in this work.
In addition to above numerical validation on the work of edge-like
correlation, the theoretical result of edge-like correlation on a
single sample is also supplied in Supplemental Methods.
Validation of GRN-DET method using public rice grain filling data
Firstly, we conducted the GRN analyses based on DET using the public
small RNA and gene expression dataset from NCBI, which was originally
used to investigate rice grain filling. During grain development, poor
grain-filling in inferior spikelets greatly decreased the yield of
Oryza sativa spp. japonica cv. ‘Xinfeng 2’^[81]9. Grains of superior
spikelets and inferior spikelets (10 days after flowering (DAF), 15DAF,
21DAF, and 27DAF, respectively) from rice cultivar ‘Xinfeng 2’ were
collected and sequenced for small RNA and mRNA profiling by these two
previous studies^[82]9,[83]27. The co-expression analysis of superior
spikelets and inferior spikelets showed that 47 differentially
expressed miRNAs (DEmiRNAs) might influence grain-filling of rice, and
the differential networks as co-variances of these miRNAs with mRNA
were also constructed. For example, the expressions of the target genes
regulated by osa-miR164 were decreased during the grain-filling.
Meanwhile, these expression patterns were significantly different
between superior and inferior spikelets. In addition, 20 SmiRNAs
related to grain-filling between the two different spikelets were
identified by DET (Fig. [84]3A and Table [85]S1). Most of the SmiRNAs
have been reported to be involved in rice grain-filling, including
osa-miR444b, osa-miR1861, osa-miR172c, osa-miR1862 and so on^[86]28
(Fig. [87]3A). Using GRN-DET, we identified some miRNAs which are no
differential expressions but have differential regulations (networks)
between superior spikelets and inferior spikelets, including
osa-miR395n, osa-miR164 a/b/f, osa-miR2102-5p, osa-miR1432 and
osa-miR166k/l (Table [88]S1 and Fig. [89]S1). Our network analysis
further showed that osa-miR1861 regulated many genes or TFs in the
superior spikelets and had cross-talk with the verified
yield-associated osa-miR159a.1 (Fig. [90]3A). The results were
consistent with previous studies, suggesting it could be an important
regulator of rice yield^[91]9,[92]13. GO enrichment analysis showed
that the genes co-regulated by miRNAs identified by DET were involved
in ‘nitrogen compound metabolic process’, ‘anatomical structure
morphogenesis’, ‘cell differentiation’ and ‘cell death’ (Fig. [93]3B).
These results were consistent with previous studies about metabolic
pathways from ‘embryo differentiation’ at the early phase to
‘senescence and dormancy’ at the late filling phase^[94]29.
Figure 3.
[95]Figure 3
[96]Open in a new tab
Co-expression networks and GO enrichment of identified candidate yield
miRNAs. (A) Co-expression networks of identified candidate yield miRNAs
with verified yield-associated miRNAs and their regulated genes or TFs
in superior and inferior spikelets. (B) GO enrichment of the
co-regulated genes with SmiRNAs identified by DET in the three key
stages (tiller, panicle and grain filling) for rice yield. (C)
Co-expression networks of identified candidate yield miRNAs with
verified yield-associated miRNAs and their regulated genes or TFs in
Tao yuan ultra-high yield rice. (D) Quantitative real-time PCR
(qRT-PCR) validation of osa-miR393a and osa-miR171a in tillers and
young panicles at Taoyuan and Jinghong rice, respectively. The
significant difference of expression level between Taoyuan and Jinghong
in IR64 was determined by Student’s t test, **p < 0.01.
Secondly, to validate the identified SmiRNAs by our network analysis
based on DET in the grain filling, we also performed a conventional
WGCNA study on 438 miRNAs from four published rice grain filling small
RNA sequencing data^[97]9,[98]10,[99]13,[100]28. On the combined 22
samples, WGCNA can be applied to obtain one network (e.g. common
associations) for miRNAs across multiple samples, rather than
individual samples. A total of 9 co-expression modules (e.g. common
co-expression pattern across multiple samples) were obtained from
highly correlated gene expression patterns (Fig. [101]S2A,B). Eight
miRNAs were both identified in DET network and WGCNA module, including
osa-miR169a, osa-miR166l, osa-miR444b, osa-miR1432 and so on
(Fig. [102]S2C and Table [103]S1). However, unlike WGCNA, different
interactive networks of these miRNAs with other miRNAs or mRNAs were
revealed by GRN-DET (Figs [104]3A and [105]S1). Thus, osa-miR1432 and
osa-miR444b might be the key yield-associated miRNA in grain filling of
rice (Fig. [106]3A). Therefore, our method can actually detect
sample-specific network and miRNAs (SmiRNAs), and some of them are also
interactive on the conventional multi-sample based common network,
which support the effectiveness of our method.
Indentify regulatory miRNAs for yield of rice using GRN-DET on in-house data
Using the effective method of GRN-DET, we can also analyze other
important stages for yield of rice. Taoyuan, Yunnan, China, has been a
well-known amazing place where the highest rice yield in the world was
recorded^[107]30. We collected the tillers, young panicles and flag
leaves from the rice variety IR64 at Taoyuan and the yield control
place Jinghong. In order to investigate the miRNA regulation roles for
ultra-high yield at Taoyuan, we sequenced small RNAs and mRNAs of
tissues from three developmental stages (i.e. tillering, panicle
branching and grain filling) of rice planted at Taoyuan and Jinghong,
respectively. Totally, 234, 82 and 134 DEmiRNAs were identified between
the three cases and their controls, respectively (Table [108]S2).
Moreover, using GRN-DET, 40, 35 and 34 SmiRNAs for tillers, panicles
and flag leaves were identified, respectively (Table [109]S1). Using
the corresponding transcriptome data, differential networks of the
SmiRNAs and their regulated genes or TFs derived from our GRN-DET have
also shown regulatory differences between Taoyuan and Jinghong rice
(Figs [110]3C and [111]S3, [112]S4). Some of these SmiRNAs were not
differential molecules which cannot be measured by traditional method
but have differential regulations or networks which were identified by
our method, such as osa-miR396b-5p in tillers, osa-miR171a in young
panicles and osa-miR812m in flag leaves, respectively (Fig. [113]S3 and
Tables [114]S1, [115]S2). At young panicle stage, the osa-miR171a
regulated the expression levels of more target genes in Jinghong than
that of Taoyuan, which may suppress some genes invovled in panicle
development, leading to decrease yield of rice (Fig. [116]S3B). In
Taoyuan and Jinghong, the three miRNAs were regulated different targets
and had differential networks, affecting the key development tissues
for yield in rice (Fig. [117]S3).
Two miRNAs (osa-miR393a and osa-miR171a) were randomly selected to
validate the expression level by quantitative RT-PCR (qRT-PCR) in IR64
from tillers and panicles of Taoyuan and Jinghong rice, respectively,
and the results are consistent with the sequencing data (Fig. [118]3D).
In other words, the functions of those SmiRNAs are facilitated not at
the expression level but at the network level. GO and KEGG enrichment
analyses of the genes regulated by these SmiRNAs in Taoyuan rice showed
that they involved in flower development, embryonic development,
nucleic acid metabolism, nitrogen compound metabolic process and so on
(Figs [119]3B and [120]S5).
Plant yield has demonstrated to be control by various plant hormones,
including auxin, brassinosteroid, gibberellic and cytokinin^[121]31. In
our study, we found that miRNAs invovled in two phytohormones, auxin
and BR signaling pathways to affect rice yield (Figs [122]4 and
[123]5). Auxin is mainly participate in the growth periods (vegetable
and reproductive growth stages), while BR is invovled in grain filling,
affecting the grain size (Fig. [124]5).
Figure 4.
[125]Figure 4
[126]Open in a new tab
Co-expression networks of candidate yield miRNAs identified by
differential edge-like transformation (DET) and reported yield miRNAs
with their target genes for high yield in rice from tillering to grain
filling stages. The bold line indicated the reported yield miRNAs and
their confirmed targets (See Table [127]S4). BR, brassinosteroid, GA,
gibberellin.
Figure 5.
[128]Figure 5
[129]Open in a new tab
The potential regulatory network model of miRNAs for yield at three
stages (tillering, panicle branching and grain filling) in rice. Solid
and dashed arrows are the verified and predicted regulatory
relationships, respectively.
Discussion
MiRNA-mRNA interactions have been predicted by some toolkits or
pipelines, including Mtide and Sparta^[130]32,[131]33. Most of the
packages or tools were made for animals or human which have much other
information (for instance, validated gene-gene interactions) to
integrate for identifying key miRNAs. However, the algorithm for key
regulatory miRNA identification in plant development is lacking.
Herein, the new algorithm of differential edge-like transformation
(DET) was developed for effectively identifying the key regulatory
miRNA for rice yield. Although miRNAs have been reported being
regulating yield, no study has systematically investigated how miRNAs
differentially function in high and low yield rice, in particular at a
network level. In this study, based on DET, we construct miRNA
differential regulation networks in the three key developmental stages
between high and low yield rice, which are further exploited to reveal
novel regulatory roles of netted miRNAs in grain yield. The application
of DET to the grain-filling and yield of rice datasets demonstractes
that GRN-DET provides high accuracy in identifying regulatory miRNAs
using RNA-seq and small RNA sequencing.
Using GRN-DET, osa-miR171 and osa-miR1432 has been screened to be
involved in panicle branching and grain filling (Figs [132]3 and
[133]S1, [134]S3). Scarecrow (SCR) is a member of GRAS family which is
essential for asymmertric cell division of shoot in
Arabidopsis^[135]34. Many members of the osa-miR171 target the SCR
transcription factor (Table [136]S3). In tomato, over-expression of the
target gene of miR171 (SlGRAS24) has been reported to cause alteration
of inflorescence architecture and lateral branch number^[137]35. The
target gene of osa-miR1432 is alpha-amylase (LOC_Os08g36910) that is a
starch-hydrolyzing enzyme. It’s reported that suppression of α-amylase
can ameliorates the grains during the ripening period under high
temperature^[138]36.
Some of the miRNAs showed cross-talks for yield in rice, for example
LOC_Os02g53690 (GRF) targeted by osa-miR396 and LOC_Os03g52320
(GRF-interacting factor 1, GIF1) which is the target of osa-miR393
(Fig. [139]4). GIF genes have been reported to play important role in
cell proliferation and shoot apical meristem, detemining organ size in
Arabidopsis^[140]37. The two miRNAs (osa-miR396 and osa-miR393) are
functionally characterized to be involved in tiller and panicle
branching, respectively^[141]38,[142]39. Thus, the cross-talk may
provide a clue for better understanding the miRNA regualtion in rice
yield.
Phytohormones play vital roles in the plant growth and development,
including yield in crops^[143]40. Auxin has demostrated to regulate
stem elongation, lateral branching and vascular development^[144]31.
Different miRNAs (osa-miR393a, osa-miR396b and osa-miR167) with their
targets were involved in auxin signaling to affect tillering, panicle
branching and grain filling in rice (Fig. [145]5). The results of our
study also provide some insights to the cross-talk and co-expression
network of osa-miR397 and osa-miR396 which involved in regulating the
BR signalling to control grain size and affect yield in rice
(Fig. [146]4). As the negative regulator of BR signalling, GSK2
interacts with OsGRF4 (target of osa-miR396) and inhibits its
transcription activation activity to mediate the specific regulation of
grain length in rice^[147]17. And suppressing of osa-miR396 (MIM396)
up-regulated many auxin synthesis and response genes (YUCCA, ARFs and
GH3), revealing OsGRF6 is a positive regulator of auxin signalling
pathway^[148]38. Overexpression of osa-miR397 also altered lots of the
brassinosteroid-related genes^[149]12. Therefore, we documented here
that some key miRNAs via phytohormone (auxin and BR) involved in the
regulation of yield in rice from tillering to grain filling stages
(Fig. [150]5). Furthermore, BR can promote GA accumulation by
regulating the expression of GA metabolic genes to stimulate cell
elongation^[151]41. SCR (target of miR171) has also been reported to
interact with DELLA porteins, mediating the GA-regulated chlorophyll
biosynthesis^[152]42 (Fig. [153]4). Using GRN-DET, the
phytohormone-related miRNAs and targets interactions as well as their
networks were revealed (Fig. [154]4). Thus, the cross-talk of BR and
auxin or BR and GA may play important roles in the yield of rice.
Based on the small RNA, RNA-seq and reported verified miRNA-target
interaction data, highly reliable and biologically meaningful
co-expression networks based on DET have been constructed for better
elucidating the regulatory roles of miRNAs in high yield of rice. This
work not only identified new regulatory miRNAs affecting the yield of
rice, but also provides a method to systematically reveal miRNA
regulation networks in limited but key samples. The results also
provide clues for future efforts of increasing rice yield using
non-coding RNAs.
Materials and Methods
Plant materials and high-throughput sequencing
Small RNA and transcriptome sequencing were performed for the variety
IR64 in Taoyuan (ultra-high yield) and Jinghong (natural yield) at
three different stages (tillering, panicle branching and grain filling)
for the yield of rice. Total RNA was extracted from three tissues
(tillers, young panicles and flag leaves) of rice IR64 using the Trizol
(Invitrogen). Libraries were generated according to the manufacturer’s
recommendations and sequenced by Illumina HiSeq2500 platform
(Supporting Information). Bioinformatic analysis of small
RNA-sequencing and RNA-seq data were as previous studies^[155]2,[156]43
(Supporting Information). All the small and RNA sequencing data were
deposited in the NCBI Short Read Archive (SRA)
([157]http://www.ncbi.nlm.nih.gov/sra) under the accession number:
SRP134071 and SRP144409, respectively.
Data sources
To collect the miRNAs and target genes involved in rice yield, we
conducted literature search for studies that directly assessed miRNA
regulation in tiller, young panicle, flag leaf and grain
filling^[158]13,[159]16,[160]39,[161]44. Then, a total of 150 yield
associated conserved miRNAs were retrieved from miRBase (release 21;
[162]http://www.mirbase.org/)^[163]45. The degradome sequencing data
from different tissues in rice ([164]GSM434596, [165]GSM455938,
[166]GSM455959 and [167]GSM476257) were also used in this study. The
targets of these miRNAs by predicted, degradome sequencing and
experimentally verified were merged^[168]46 (Table [169]S3). In
addition, the small RNA and transcriptome sequencing data from superior
and inferior spikelets in ‘Xingfeng2’ rice grains reported previously
were also collected for methold validation^[170]13,[171]16,[172]27.
Gene regulatory networks with DET (GRN-DET)
Generally, the differential co-expression analysis is to see if or not
the expression correlation of a gene-pair (e.g. two genes or molecules)
changes between control and case samples^[173]22,[174]47. Thus, the
Pearson correlation coefficient (PCC) between genes i and j in control
or case samples can be calculated as:
[MATH: Correlationincontrolcondition:1mx−1∑k=1mx(x<
mrow>ik−μ
mi>xi
σxi⋅xjk−μ
mi>xj
σxj)Correlationincasecondition:1my−1∑k=1my(y<
mrow>ik−μ
mi>yi
σyi⋅yjk−μ
mi>yj
σyj) :MATH]
4
where, there are m[x] control samples and m[y] case samples; for a
control sample k, its expressions on genes i and j are x[ik] and x[jk];
the expression average and variance for gene i (or gene j) on control
samples are μ[xi] and σ[xi] (or μ[xj] and σ[xj]); and conveniently, the
sample in case has these similar variables and annotations but with
index y. Note that a gene can be replaced by a molecule in this
mathematical framework.
DET transforms the expression of genes to the edge-like correlation of
gene-pairs in one sample, and the mean of edge-like correlation of a
gene-pair in all control and case samples is just the Pearson
correlation coefficient on all samples, so that this measurement has
equivalent numerical meaning for any control or case sample.
For each tissue, the selected miRNAs or mRNAs and their edge-like
correlations will consist of a tissue-specific network and displayed in
a topological structure, where the strength of each pair of molecules
(e.g., molecules i and j) in the network is the corresponding edge-like
correlation (e.g., edge between molecules i and j). Besides, for any
miRNA, its average PCC (i.e. the edge-like correlation) with other
relevant miRNAs or mRNAs in control or case is defined as AP[control]
and AP[case], then a factor as PCC induced key associated score for
this miRNA is computed as |AP[case] − AP[control]|.
miRNA-TF-gene network analysis and key miRNAs identifying
Based on RiceNetDB^[175]48 and RiceNet^[176]49 (version 2), the
TF-gene/miRNA regulatory relations were deciphered. To further
illustrate the regulatory structure of miRNA-TF-gene, we re-analyzed
the topological structures among miRNA-targets, miRNA-TF and TF-gene
associations. Known yield-associated miRNAs from literatures were
collected (Table [177]S4). The key miRNAs selection was performed by
Pearson correlation coefficient (PCC) between each pair of miRNAs or
genes and known yield-associated miRNAs calculated based on their
edge-like correlation profiles, and the sample-specific miRNAs,
(SmiRNAs) (i.e. with highest key-associated score) were found in
tillers, panicles, flag leaves and grains, respectively.
Identified key miRNAs for yield of rice using GRN-DET
To estimate the accuracy of GRN-DET, we have analyzed the small RNA
sequencing data for yield in rice. To construct a miRNA-target gene-TF
co-regulation network for the trait of rice yield by DET, the GRN was
conducted in the following ways (see the detail methods in Supporting
Information): (i) a set of 150 reported sequencing-screened yield
related miRNAs and target genes from literatures or degradome data were
collected, including 12 experimentally verified yield-associated miRNAs
(Tables [178]S3 and [179]S4); (ii) DET was used to transform the
original gene expression profiles (i.e. gene v.s. sample data matrix)
to edge-like correlation profiles (i.e. gene-pair v.s. sample data
matrix); (iii) the correlation between each pair of miRNAs or mRNAs
(target genes or TFs) and 12 verified yield-associated miRNAs on each
sample were further obtained based on such edge-like correlation
profiles (the correlation of a gene-pair in a sample is significant
when its edge-like correlation is large, otherwise non-significant when
the edge-like correlation is small, where the difference significance
of edge-like correlation in one v.s. multiple samples can be evaluated
by Mann–Whitney U test); (iv) on the edge-like correlation weighted
miRNA-target gene-TF co-regulation network, the key regulatory SmiRNAs
related to rice yield were identified according to the PCC-induced
key-associated scores (Supporting Information).
GO and KEGG pathway enrichment analysis
According to their degrees of nodes, network hubs were determined and
the top 5% of miRNAs, TFs and genes were considered as hub components.
Functional classifying the targets of these miRNAs was enriched by
AgriGO (Gene Ontology) ([180]http://bioinfo.cau.edu.cn/agriGO/)^[181]50
and KEGG database (/ftp.genome.jp/pub/kegg/pathway/). A p-value with
0.05 as the cutoff for enriched terms or pathways in GO and KEGG.
Weighted correlation network analysis (WGCNA)
Based on a group of the collected miRNA profiles for grain filling
(DAF, day after flowering) in rice^[182]9,[183]10,[184]13,[185]28, a R
package WGCNA^[186]22 has been carried on and several co-expression
modules have been identified. The miRNA expression profilings were
included at the three filling stages: milk-ripe (5DAF, 10DAF),
soft-dough (12DAF, 17DAF) and hard-dough (21DAF, 27DAF). Using Pearson
correlation coefficient, the gene co-expression similarity were
identified and clustered into network modules.
Stem-loop RT-PCR and quantitative real-time PCR
Total RNA was extracted from tillers and young panicles of the variety
IR64 in Taoyuan and Jinghong. For each reverse-transcription (RT)
reaction, 2 μg of total RNA was reverse transcribed into cDNA using a
miRNA specific stem-loop primers and reverse transcriptase (Takara,
Dalian, China) as previously described^[187]51. Reverse transcription
was performed with pulsed RT: the reactions were incubated for 30 min
at 16 °C, followed by 60 cycles at 30 °C for 30 s, 42 °C for 30 s and
50 °C for 1 s and finally the reactions were terminated at 70 °C for
5 min. Real time qRT-PCR analysis of the miRNA and their targets was
performed using the FastStart Universal SYBR Green Master Mix (Roche)
on the StepOne plus PCR platform (AppliedBiosystems). U6 snRNA was used
as an endogenous control. The primers were listed in Table [188]S5. To
avoid non-specific amplification, melting curve was carried out for
each PCR product. All qRT-PCR reactions were performed with three
biological replicates and the relative gene expression level was
analyzed using comparative 2^−ΔΔCt method^[189]52.
Electronic supplementary material
[190]Supplementary Information^ (8.4MB, pdf)
Acknowledgements