Abstract

   Two novel approaches were recently suggested for genome-wide
   identification of aspects of protein synthesis at a given time. Ribo-Seq
   is based on sequencing all the ribosome protected mRNA fragments in a
   cell, while PUNCH-P is based on mass-spectrometric analysis of only
   newly synthesized proteins. Here we describe the first Ribo-Seq/PUNCH-P
   comparison via the analysis of mammalian cells during the cell-cycle
   for detecting relevant differentially expressed genes between G1 and M
   phase. Our analyses suggest that the two approaches significantly
   overlap with each other. However, we demonstrate that there are
   biologically meaningful proteins/genes that can be detected to be
   post-transcriptionally regulated during the mammalian cell cycle only
   by each of the approaches, or their consolidation. Such gene sets are
   enriched with proteins known to be related to intra-cellular signalling
   pathways such as central cell cycle processes, central gene expression
   regulation processes, processes related to chromosome segregation, DNA
   damage, and replication, that are post-transcriptionally regulated
   during the mammalian cell cycle. Moreover, we show that combining the
   approaches better predicts steady state changes in protein abundance.
   The results reported here support the conjecture that for gaining a
   full post-transcriptional regulation picture one should integrate the
   two approaches.
     __________________________________________________________________

   Gene expression is a multi-step process, with the first stage of this
   process (transcription) and its product (mRNA levels) comprehensively
   studied and measured. However, it was shown that the correlation
   between mRNA and protein levels is relatively
   limited[28]^1,[29]^2,[30]^3. Consequently, various technologies for
   studying post-transcriptional regulation, and specifically
   translation, have recently emerged to close this
   gap[31]^1,[32]^3,[33]^4,[34]^5,[35]^6,[36]^7,[37]^8,[38]^9,[39]^10,
   [40]^11,[41]^12,[42]^13,[43]^14,[44]^15,[45]^16,[46]^17,[47]^18,
   [48]^19,[49]^20,[50]^21,[51]^22,[52]^23,[53]^24,[54]^25,[55]^26.

   Currently the most common technology for studying translation is
   ribosomal profiling (Ribo-Seq). Although ribosomal profiling was
   introduced only several years ago it has already been successfully
   employed for answering fundamental biological questions related to post
   transcriptional regulation of gene
   expression[56]^5,[57]^27,[58]^28,[59]^29,[60]^30. [61]Figure 1A
   includes the major steps of the ribosomal profiling approach: Cells are
   treated with cycloheximide (or a different drug) to arrest translating
   ribosomes; extracts from these cells are then treated with RNase to
   degrade regions of mRNAs not protected by ribosomes; the resulting 80S
   monosomes, many of which contain a ~30-nucleotide ribosomal protected
   footprint (RPF), are purified (e.g. using sucrose cushion) and then
   treated to release the RPFs, which are processed for Illumina
   high-throughput sequencing. The next steps are computational: the RPFs
   are mapped to the transcriptome, and based on them it is possible to
   infer various biophysical properties related to the translation
   elongation process. For example, each ribosomal footprint read is
   related to a certain codon along the mRNA, and was generated when the
   codon in one of the mRNA molecules is covered by a ribosome. Thus, from
   a biophysical perspective, relatively slower codons along the mRNA can
   be detected based on the fact that they are covered by ribosomes for
   longer periods of time, creating a higher number of reads.
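   The codon-level reasoning above can be illustrated with a minimal
   sketch. It assumes that each footprint read has already been assigned
   to a single codon index (the assignment rule is not specified here),
   and the function name and the toy read list are hypothetical. Counts
   are divided by the gene's mean count per codon, so values above 1
   suggest codons that are covered by ribosomes for relatively longer.

```python
from collections import Counter

def codon_occupancy(footprint_codon_positions, n_codons):
    """Per-codon read counts divided by the gene's mean count per codon."""
    counts = Counter(footprint_codon_positions)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * n_codons
    mean_per_codon = total / n_codons
    return [counts.get(i, 0) / mean_per_codon for i in range(n_codons)]

# Toy gene with 5 codons; each entry is the codon index assigned to one read.
reads = [0, 1, 1, 2, 2, 2, 2, 3, 4, 4]
occupancy = codon_occupancy(reads, n_codons=5)

# The codon with the highest normalized count is the candidate "slow" codon.
slowest = max(range(5), key=lambda i: occupancy[i])
```

   In this toy example codon 2 collects four of the ten reads, twice the
   gene average, and is therefore flagged as the slowest position.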

Figure 1.

   A schematic illustration of the study: Ribo-Seq (A) and PUNCH-P (B)
   data are used together in order to augment the information that can
   be extracted from each alone (a description of the methods appears in
   the main text). (C) The power of the two approaches to predict
   steady-state protein levels is assessed. (D) Protein-protein
   interaction (PPI) analyses are performed on the differentially
   expressed (DE) genes between the cell-cycle phases M and G1 detected
   by each approach. (E) Pathway enrichment analyses are performed on the
   DE genes detected by each approach. (F) Clustering of the PPI
   sub-networks induced by the DE genes detected by each approach is
   performed.

   Recently a new approach called PUNCH-P[64]^31,[65]^32 was proposed.
   This approach is based on the combination of biotinylated puromycin
   with MS analysis to globally label newly synthesized proteins, enabling
   identifying the proteins translated in a certain condition. The method
   involves isolation of ribosomes by ultracentrifugation followed by
   cell-free labeling of nascent polypeptide chains with 5′
   biotin-dC-puromycin 3′ (Biot-PU), capture on immobilized streptavidin,
   and analysis by liquid chromatography-tandem MS (LC-MS/MS). This work
   flow leads to the identification of thousands of newly synthesized
   proteins in a certain condition, generating a snapshot of the cellular
   translatome, see [66]Fig. 1B.

   It is easy to see that both approaches measure very similar but
   non-identical aspects related to protein synthesis ([67]Fig. 1A,B).
   Roughly, Ribo-Seq is based on the total number of ribosomes on the mRNA
   molecules related to a certain gene; PUNCH-P, on the other hand, is
   based on the total amount of nascent peptide emerging from the
   ribosomes on the mRNA molecules related to a certain gene which are
   translating at the time of the experiment. Since not all ribosomes on
   the mRNA (i.e. those that can be detected by Ribo-Seq) are actually
   translating[68]^33,[69]^34,[70]^35 at a given moment (i.e. can be
   detected by PUNCH-P), the signals detected by the two approaches are
   not identical.

   Furthermore, the different approaches are expected to have different
   experimental biases/noise as they are based on different
   experimental/analysis techniques: sequencing vs. proteomics.

   It is important to mention that a priori it is not clear which approach
   (or if there is an approach that) performs better. This is true not
   only due to the different biases related to the different methods, but
   also due to the fact that each of them is expected to capture
   biologically meaningful signals not detected by the other: changes in
   the number of translating ribosomes, but also in the total number of
   ribosomes, are expected to be relevant to the regulation of protein
   levels.

   Thus, the aim of this study is to compare these two methods, which were
   both performed at the G1 and M phases of the cell cycle, and discern if
   their integration can yield improved predictions of relevant
   genes/proteins, and uncover otherwise elusive biological phenomena. To
   this end we: 1. Tested the predictive power of steady-state protein
   levels of each approach and their combination ([71]Fig. 1C). 2.
   Uncovered significant M/G1 differentially expressed genes with each of
   the approaches. 3. Exploiting these genes we discovered relevant
   intra-cellular pathways with each of the approaches ([72]Fig. 1E). 4.
   Discerned biologically relevant properties related to the
   differentially expressed genes detected by each approach and the
   protein-protein interaction network ([73]Fig. 1D,F). In points 2 and
   3 we specifically studied the genes/proteins detected by only one of
   the methods.

   Data for PUNCH-P was taken from[74]^31 (see Methods), while we
   generated the Ribo-Seq data via two experiments, one with 3 replicates,
   and the other with one replicate, totalling 4 technical replicates for
   each cell-cycle phase, G1 and M (see [75]Methods and Supplementary
   Methods).

Results

Correlations based on Ribo-Seq and PUNCH-P with steady state protein levels

   Steady state protein levels are expected to be affected by all gene
   expression steps (e.g. transcription, translation, mRNA degradation,
   protein degradation). Thus, steady state protein levels (PSS) are
   expected to correlate with mRNA levels, PUNCH-P (PP), and Ribo-Seq
   (RP). In addition, it is easy to see that RP and PP (which encapsulate
   both the mRNA levels and the translation step, or the number of
   ribosomes on the mRNA molecules) are expected to have higher
   correlation than mRNA levels with steady state protein levels. Moreover
   we expect to see relatively high correlation between PP and RP as they
   measure similar variables. Finally, we also expect that the combination
   of the different measures can improve the prediction of steady state
   proteins, as each of them encapsulates non-identical aspects of gene
   expression and exhibits different experimental biases. All these points
   are verified in this sub-section.

   As a first step, we aimed to estimate the effect of transcription and
   translation on steady-state protein levels via the correlation of the
   products of these stages (all correlations reported in the paper are
   Spearman, see Methods). Our analyses demonstrate that the correlations
   between steady-state protein levels and Ribo-Seq (r(PSS,RP)) for the
   G1 and M phases are 0.70 (p < 10^−454) (see [76]Fig. 2A,E) and 0.70
   (p < 10^−454) (see [77]Fig. 2B,F), respectively; the correlation is
   significant and high also when controlling for mRNA levels
   (r(PSS,RP|mRNA)): 0.45 (p = 2.4·10^−252) (see [78]Fig. 2C,E) and 0.47
   (p = 4·10^−280) (see [79]Fig. 2D,F). The correlations between
   steady-state protein levels and PUNCH-P (r(PSS,PP)) in the two phases
   are 0.68 (p < 10^−454) (see [80]Fig. 3A,E) and 0.68 (p < 10^−454) (see
   [81]Fig. 3B,F); the correlation is significant and high also when
   controlling for mRNA levels (r(PSS,PP|mRNA)): 0.48 (p = 3.3·10^−213)
   (see [82]Fig. 3C,E) and 0.48 (p = 2.7·10^−208) (see [83]Fig. 3D,F). The
   correlation of PSS with mRNA levels is indeed lower than with both RP
   and PP (r(PSS,mRNA)): 0.61 (p < 10^−454) and 0.60 (p < 10^−454) for
   the G1 and M phases, respectively (see [84]Supplementary Figure S4).
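   To make the quantities r(PSS,RP) and r(PSS,RP|mRNA) concrete, the
   following is a minimal sketch of a Spearman correlation and of a
   first-order partial Spearman correlation. The exact partial
   correlation procedure used in the study is described in its Methods
   and is not reproduced here; the standard first-order formula applied
   to rank correlations is an assumption, and the toy vectors are
   invented for illustration only.

```python
from math import sqrt

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def partial_spearman(x, y, z):
    """Correlation of x and y after controlling for z (first-order formula)."""
    rxy, rxz, ryz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Invented toy values for six genes (not data from the study).
pss = [1.0, 2.5, 2.0, 4.0, 3.5, 6.0]
rp = [0.5, 1.0, 2.0, 3.0, 2.5, 5.0]
mrna = [1.0, 3.0, 2.0, 4.0, 6.0, 5.0]

r_pss_rp = spearman(pss, rp)
r_pss_rp_given_mrna = partial_spearman(pss, rp, mrna)
```

   As in the reported results, the partial correlation in this toy
   example is lower than the plain correlation, because part of the
   agreement between PSS and RP is carried by mRNA levels.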

Figure 2.

   (A) Scatter plot of steady state protein levels (PSS) (y-axis, data is
   log2-scaled) and Ribo-Seq (RP) (x-axis, read count log2-scaled RPKM
   (see Methods)) G1 phase. (B) Scatter plot of PSS (y-axis
   log2(intensity)) and RP levels (x-axis, read count log2-scaled RPKM
   (see Methods)) M phase. (C) Correlation between PSS and RP G1 phase for
   different bins of genes sorted by mRNA levels (the y-axis is the
   correlation, the x-axis is the mRNA levels RPKM ranges, scatter plots
   are log2-scaled). (D) Correlation between PSS and RP M phase for
   different bins of genes sorted by mRNA levels (the y-axis is the
   correlation, the x-axis is the mRNA levels RPKM ranges, scatter plots
   are log2-scaled). (E) A summary of the 2 correlations performed with
   PSS: RP, and RP controlled for mRNA levels (partial correlation) for G1
   phase. (F) A summary of the 2 correlations performed with PSS: RP, and
   RP controlled for mRNA levels (partial correlation) for M phase.

Figure 3.

   (A) Scatter plot of steady state protein levels (PSS) (y-axis
   log2(intensity)) and PUNCH-P (PP) (x-axis log2(intensity)) G1 phase.
   (B) Scatter plot of PSS (y-axis log2(intensity)) and PP levels (x-axis
   log2(intensity)) M phase. (C) Correlation between PSS and PP G1 phase
   for different bins of genes sorted by mRNA levels (the y-axis is the
   correlation, the x-axis is the mRNA levels RPKM ranges, scatter plots
   are log2-scaled). (D) Correlation between PSS and PP M phase for
   different bins of genes sorted by mRNA levels (the y-axis is the
   correlation, the x-axis is the mRNA levels RPKM ranges, scatter plots
   are log2-scaled). (E) A summary of the 2 correlations performed with
   PSS: PP, and PP controlled for mRNA levels (partial correlation) for G1
   phase. (F) A summary of the 2 correlations performed with PSS: PP, and
   PP controlled for mRNA levels (partial correlation) for M phase. PP
   data is log scaled.

   Next we estimated the correlation between the two methods, PP and RP,
   to evaluate the similarity between the predictions obtained by them.
   Our analyses demonstrate that the correlations between the M
   and G1 phases of PUNCH-P and Ribo-Seq are 0.63 (p < 10^−454) (see
   [89]Fig. 4A,E) and 0.63 (p < 10^−454) (see [90]Fig. 4B,F) respectively;
   the correlations are high also when controlling for mRNA levels (r
   (PP,RP|mRNA)): 0.31 (p = 2.5·10^−112) (see [91]Fig. 4C,E) and 0.32
   (p = 2.2·10^−117) (see [92]Fig. 4D,F) for M and G1 respectively.

Figure 4.

   (A) Scatter plot of PUNCH-P (PP) (y-axis log2(intensity)) and Ribo-Seq
   (RP) G1 phase (x-axis, read count log2-scaled RPKM (see Methods)). (B)
   Scatter plot of PP (y-axis log2(intensity)) and RP levels M phase
   (x-axis, read count log2-scaled RPKM (see Methods)). (C) Correlation
   between PP and RP G1 phase for different bins of genes sorted by mRNA
   levels (the y-axis is the correlation, the x-axis is the mRNA levels
   RPKM ranges, scatter plots are log2-scaled). (D) Correlation between PP
   and RP M phase for different bins of genes sorted by mRNA levels (the
   y-axis is the correlation, the x-axis is the mRNA levels RPKM ranges,
   scatter plots are log2-scaled). (E) A summary of the 2 correlations
   performed with PP: RP, and RP controlled for mRNA levels (partial
   correlation) for G1 phase. (F) A summary of the 2 correlations
   performed with PP: RP, and RP controlled for mRNA levels (partial
   correlation) for M phase. PP data is log scaled.

   Finally, as can be seen in [95]Fig. 5, regressors based on PP and RP
   for the M and G1 phases (see Methods), as a function of RP coverage
   from >0 to ≥60%, achieve improved correlation with steady state protein
   levels in comparison to a regressor based only on either PP or RP,
   while including mRNA levels further improves the correlation (but not
   substantially), see [96]Supplementary file
   Supplementary_Table_S1_RegressorCorrs.xlsx.

Figure 5. Correlations with M and G1 steady state protein levels (PSS) for
three regressors based on PP, PP and RP, and PP, RP and mRNA respectively, as
a function of the RP coverage (>0 – 60%), the y-axis is the correlation.

   We performed a 2-fold cross validation 100 times for each Spearman
   linear regressor; the standard deviation over all regressors was
   between 0.0062 and 0.0141, and the variation was for the most part
   lower as coverage increased and as more measurements were combined.
   One can see that combining all 3 measurements improves the
   correlation with steady state protein levels. See [99]Supplementary
   file Supplementary_Table_S1_RegressorCorrs.xlsx.
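   The repeated 2-fold cross validation can be sketched as follows. This
   is an illustration under stated assumptions, not the regressor used
   in the study (see its Methods): an ordinary least squares fit stands
   in for the regressor, held-out predictions are scored by Spearman
   correlation, and the data are synthetic (a shared latent synthesis
   signal plus independent noise for PP, RP, and PSS). All function
   names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(xs, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, xs):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], x)) for x in xs]

def spearman(a, b):
    """Spearman correlation; ties are not averaged (fine for continuous toy data)."""
    def rank(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for pos, i in enumerate(order):
            r[i] = pos
        return r
    ra, rb = rank(a), rank(b)
    ma, mb = mean(ra), mean(rb)
    num = sum((p - ma) * (q - mb) for p, q in zip(ra, rb))
    den = sqrt(sum((p - ma) ** 2 for p in ra) * sum((q - mb) ** 2 for q in rb))
    return num / den

def repeated_two_fold(xs, y, repeats=100, seed=0):
    """Mean and standard deviation of held-out Spearman scores."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    scores = []
    for _ in range(repeats):
        rng.shuffle(idx)
        half = len(idx) // 2
        folds = (idx[:half], idx[half:])
        # Train on one half and test on the other, then swap the roles.
        for train, test in (folds, folds[::-1]):
            beta = fit([xs[i] for i in train], [y[i] for i in train])
            pred = predict(beta, [xs[i] for i in test])
            scores.append(spearman(pred, [y[i] for i in test]))
    return mean(scores), stdev(scores)

# Synthetic data: PP and RP are noisy views of the same latent signal.
rng = random.Random(1)
latent = [rng.gauss(0, 1) for _ in range(600)]
pp = [s + rng.gauss(0, 1) for s in latent]
rp = [s + rng.gauss(0, 1) for s in latent]
pss = [s + rng.gauss(0, 0.3) for s in latent]

pp_only = repeated_two_fold([[a] for a in pp], pss)
pp_and_rp = repeated_two_fold([[a, b] for a, b in zip(pp, rp)], pss)
```

   Because the noise in the two synthetic measurements is independent,
   the regressor that uses both PP and RP reaches a higher held-out
   correlation than the one using PP alone, which mirrors the
   qualitative trend reported for the real data.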

   The results demonstrate that the correlation between PP and RP is high
   (as expected) but far from perfect. In addition, these results
   support the hypothesis that the variance in protein levels can be
   explained by PP, RP, and mRNA levels; thus both changes in mRNA levels
   (regulated, among others, via transcription) and changes in ribosomal
   densities (as part of the translation step) affect the changes in
   protein abundance (translation, and not only transcription as
   traditionally thought, makes an important contribution to changes in
   protein levels). The results also show that PP and RP have significant
   predictive power for protein levels. Finally, we demonstrate that a
   regression based on both PP and RP improves the prediction of steady
   state protein levels over a predictor based on PP or RP alone. A
   likely reason is that steady state protein levels may also be
   affected by proteins not translated at the moment of the experiment.

There are relevant genes detected to be differentially expressed exclusively
by each method

   As a next step, our objective was to show that both PP and RP can be
   used for detecting relevant genes that are differentially regulated
   at the transcriptional and post-transcriptional levels, and that each
   of these methods exclusively detects some relevant genes.

   To demonstrate this point we first inferred the set of differentially
   expressed (DE) genes between the G1 and M phases of the cell cycle
   detected for PUNCH-P (PP) and Ribo-Seq (RP) separately. M/G1
   differentially expressed (DE) genes were determined according to
   DESeq[100]^36 for Ribo-Seq (RP), where the top 10% most significant FDR
   p-values were selected (See [101]Methods and Supplementary Methods),
   and for PUNCH-P (PP) according to the top 10% ANOVA significant fold
   change (see Methods,[102]^31). At the next step we defined three DE
   gene groups: 1. RP-PP (genes that are significantly DE in RP but not in
   PP; 1,090 genes). 2. PP-RP (genes that are significantly DE in PP but
   not in RP; 200 genes). 3. RP∩PP (genes that are significantly DE both
   in PP and in RP; 125 genes). These two DE sets, and the three DE groups
   derived from them will be employed throughout the paper. We performed
   pathway and biological process enrichment for each of the groups
   (Methods). To achieve our objective, we aimed to show that relevant
   pathways and biological processes are significantly enriched with DE
   genes in all three cases.
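   The construction of the three DE groups amounts to taking the top 10%
   of genes for each method and applying set operations. A minimal
   sketch, assuming each method provides one significance score per gene
   where smaller means more significant (the actual DESeq and ANOVA
   selection is described in the Methods); the gene names and scores are
   hypothetical:

```python
def top_fraction(scores, fraction=0.10):
    """Genes with the smallest scores (e.g. FDR-adjusted p-values)."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get)
    return set(ranked[:k])

# Hypothetical scores for 20 genes (not data from the study).
genes = [f"g{i}" for i in range(20)]
rp_padj = {g: 0.05 + 0.01 * i for i, g in enumerate(genes)}
pp_pval = dict(rp_padj)
rp_padj.update({"g0": 0.001, "g1": 0.002})
pp_pval.update({"g1": 0.001, "g5": 0.003})

rp_de = top_fraction(rp_padj)   # DE according to Ribo-Seq
pp_de = top_fraction(pp_pval)   # DE according to PUNCH-P

rp_minus_pp = rp_de - pp_de     # group 1: RP-PP
pp_minus_rp = pp_de - rp_de     # group 2: PP-RP
rp_and_pp = rp_de & pp_de       # group 3: RP∩PP
```

   By construction the three groups are disjoint, so every DE gene is
   assigned to exactly one of them.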

   As can be seen in [103]Fig. 6 (for a full pathway list please see
   [104]Supplementary Information Table 1 ([105]section 3.2), and for a
   full biological process list see Supplementary files
   [106]Supplementary_Table_S3_RPDavidReports.xlsx,
   [107]Supplementary_Table_S4_PPDavidReports.xlsx,
   [108]Supplementary_Table_S5_RPiPPDavidReports.xlsx, see further details
   in Methods), each technique enables detecting meaningful genes/proteins
   that are not detectable by the other. The differentially expressed
   post-transcriptionally regulated pathways detected via enrichment
   analysis of the three sets include: central cell cycle, central gene
   expression regulation, DNA damage and replication, and chromosome
   arrangement.

Figure 6.

   Selected pathways and biological processes which are significantly
   enriched by the 3 groups of DE genes (for a full pathway list please
   see [110]Supplementary Information Table 1 ([111]section 3.1), and for
   a full biological process list see Supplementary files
   Supplementary_Table_S3_RPDavidReports.xlsx,
   Supplementary_Table_S4_PPDavidReports.xlsx,
   Supplementary_Table_S5_RPiPPDavidReports.xlsx).

   For example, all three sets are enriched with genes related to the cell
   cycle and M phase; RP-PP and RP∩PP are enriched with genes related to
   apoptosis regulation, while PP-RP is enriched with genes related to
   cell proliferation; RP-PP is enriched with genes related to Spindle
   Organization and DNA Damage response, while PP-RP and RP∩PP are
   enriched with genes related to Spindle Microtubule/Microtubule
   organization center and DNA replication.

   We would like to emphasize that, aside from detecting distinct
   biologically relevant pathway enrichments, there are cases in which
   the sets RP-PP and PP-RP are enriched with genes related to the same
   (or very similar) pathways, suggesting that the different techniques
   tend to find different parts of the same relevant pathways. This
   evidence again demonstrates the advantage of combining/considering
   the two methods.

   Now, in order to further demonstrate that each of the techniques, RP
   and PP, uncovers biologically relevant protein-protein interactions
   that cannot be detected by the other technique, three PPI network
   colouring schemes were defined, where “black” nodes represent
   differentially expressed genes (DE; see Methods) as above between the
   G1 and M phases of the cell cycle. In the first case, the black nodes
   were defined as genes that are DE according to RP but not PP (RP-PP);
   in the second case the black nodes were defined as genes that are DE
   according to PP but not RP (PP-RP); in the third case the black nodes
   were defined as genes that are DE according to both RP and PP;
   similarly to the previous analysis.

   We computed the mean distance (md) between all black nodes in each of
   the aforementioned three cases. For each case, we computed a PPI
   empirical p-value by randomizing the PPI network 100 times, each time
   generating a random network with a degree distribution similar to that
   of the original one, and calculating the mean black node distance; the
   mean distances are shorter in the real graph than in the random ones
   (see details in the Methods section). Shorter distances between DE PPI
   nodes indicate more meaningful biological signals: if we indeed
   uncover real regulatory changes in signalling pathways, we expect the
   DE genes to be clustered/close in the PPI network (we expect to see
   physical interactions between DE genes). All p-values were <10^−2
   (when 100 permutations are performed, a p-value <10^−2 means that the
   observed distance was shorter than the distances obtained in all 100
   random permutations), with the mean distance being shorter (2.01) in
   the case of the RP∩PP group than in the case of the RP-PP and the
   PP-RP groups (2.12 and 2.13, respectively) (see [112]Fig. 7).
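   The distance test can be sketched as follows. This is a minimal
   illustration, not the procedure from the Methods: it assumes an
   unweighted, undirected network, measures distance by breadth-first
   search, skips unreachable pairs, and uses double edge swaps as one
   common way to randomize a network while preserving every node's
   degree. The toy network (a ring of 40 nodes with chords) and the
   choice of black nodes are hypothetical.

```python
import random
from collections import deque

def adjacency(edges, nodes):
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def mean_distance(adj, black):
    """Mean shortest-path length over all reachable pairs of black nodes."""
    black = list(black)
    total, pairs = 0, 0
    for s in black:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for t in black:
            if t != s and t in dist:
                total += dist[t]
                pairs += 1
    return total / pairs if pairs else float("inf")

def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization: swap a-b, c-d into a-d, c-b."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges

# Toy network: ring of 40 nodes plus chords; black nodes are neighbours.
nodes = list(range(40))
edges = [(i, (i + 1) % 40) for i in nodes] + [(i, (i + 7) % 40) for i in nodes]
black = [0, 1, 2, 3, 4]

observed = mean_distance(adjacency(edges, nodes), black)
rng = random.Random(0)
null = []
for _ in range(100):
    shuffled = rewire(edges, n_swaps=10 * len(edges), rng=rng)
    null.append(mean_distance(adjacency(shuffled, nodes), black))

# Fraction of randomized networks in which the black nodes are at least as close.
p_emp = sum(d <= observed for d in null) / len(null)
```

   With 100 randomizations the smallest non-zero value of this fraction
   is 0.01, which is why an observed distance shorter than all 100
   random ones is reported as p < 10^−2.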

Figure 7. Three PPI network colouring schemes were defined, where black nodes
represent DE genes (based on PP and/or RP): 1. RP-PP. 2. PP-RP. 3. RP∩PP DE.

   We compute the mean distance (md) between all black nodes in each
   case. For each case we compute a PPI empirical p-value by randomizing
   the PPI network 100 times and calculating the black node distance.
   Shorter distances between DE PPI nodes indicate more meaningful
   biological signals (we expect to see physical interactions between DE
   genes).
   Our analyses demonstrate that genes detected by each of the methods
   (even if not detected by the other) tend to be closer to each other
   than expected by the null model in the PPI network. Thus, this result
   supports the hypothesis that not all biologically meaningful genes
   detected by one of the methods are detected by the other.

Modules of differentially post-transcriptionally expressed genes and physical
interactions

   To better understand the differentially expressed genes detected by PP
   and RP we performed a clustering analysis (Newman algorithm[114]^37,
   see Methods), on the PPI network using the previously described DE
   genes according to RP and PP respectively, divided into the following
   three aforementioned groups: 1. RP-PP. 2. PP-RP. 3. RP∩PP (See
   [115]Supplementary Figure S5 for RP∪PP ([116]Supplementary section
   3.3)). We projected each of the 3 groups onto the PPI network
   and selected from each group only the genes that have a neighbour from
   that group in the PPI network. In each case, the Newman algorithm
   partitions the PPI network into sub-networks, where each sub-network
   is modular and includes nodes related only to the corresponding group.
   To
   understand the pathways related to each module we performed pathway
   enrichment based on the genes in each module (for all significantly
   enriched pathways see [117]Supplementary file
   Supplementary_Table_S6_ClusterPathwayEnrichment.xlsx). As can be seen
   in [118]Fig. 8, the number of modules detected for each of the groups
   RP-PP/PP-RP/RP∩PP were 4/13/15 respectively. The modules in all cases
   were enriched with relevant pathways related to the cell-cycle, DNA
   Damage and replication, and gene expression regulation and signalling.
   This analysis demonstrates again that meaningful sub networks of
   physical interactions are detected by each of the methods separately
   and together.
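   Two steps of this analysis can be illustrated with a short sketch:
   keeping only the group genes that have a PPI neighbour within the
   same group, and scoring a candidate partition by its modularity Q.
   This does not implement the Newman algorithm itself; it assumes that
   the algorithm referred to is modularity-based and only evaluates the
   quantity such methods optimize. The toy network, gene labels, and
   function names are hypothetical.

```python
def with_neighbour_in_group(ppi_edges, group):
    """Genes of the group that interact with at least one other group gene."""
    kept = set()
    for a, b in ppi_edges:
        if a != b and a in group and b in group:
            kept.update((a, b))
    return kept

def modularity(edges, community_of):
    """Q = sum over communities of (intra edges / m) - (degree sum / 2m)^2."""
    m = len(edges)
    intra, degree_sum = {}, {}
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            intra[ca] = intra.get(ca, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Toy PPI: two triangles joined by one edge, plus an edge to a non-DE gene X.
ppi = [("A", "B"), ("B", "C"), ("A", "C"),
       ("D", "E"), ("E", "F"), ("D", "F"),
       ("C", "D"), ("C", "X")]
de_group = {"A", "B", "C", "D", "E", "F", "Y"}  # Y has no interactions

kept = with_neighbour_in_group(ppi, de_group)
induced = [(a, b) for a, b in ppi if a in kept and b in kept]

two_modules = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1, "F": 1}
one_module = {g: 0 for g in kept}
q_two = modularity(induced, two_modules)
q_one = modularity(induced, one_module)
```

   Splitting the toy network into its two triangles gives a positive Q,
   whereas placing all genes in a single module gives Q = 0, so a
   modularity-based method would prefer the two-module partition.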

Figure 8.

   (A) RP-PP clusters: 879 genes participate, resulting in 4 clusters. (B)
   PP-RP clusters: 96 genes participate, resulting in 13 clusters. (C)
   RP∩PP clusters: 90 genes participate, resulting in 15 clusters. The
   functional enrichment related to each cluster appears in the figure.
   There are 4 node sizes depicted in the figure, according to their
   centrality (the 4^th size being equal for most nodes is a
   coarse-grained portrayal for simplicity). For the full cluster pathway
   enrichment see Supplementary_Table_S6_ClusterPathwayEnrichment.xlsx.

Genes detected to be of opposite regulatory direction based on the different
methods

   Finally, we aimed to examine whether there are genes that are detected
   to be significantly differentially expressed based on both RP and PP
   but in opposite directions. To this end we looked at the following
   groups:

   (a) Genes that have RP M/G1 fold-change >0 and PP M/G1 fold-change <0

   (b) Genes that have RP M/G1 fold-change <0 and PP M/G1 fold-change >0

   In total 78 genes appear in the first group and 68 genes in the second
   (the list of genes appears in [121]Supplementary file
   Supplementary_Table_S7_RPopPPdiffGenes.xlsx). Both lists of genes were
   enriched with relevant pathways related to gene regulation and cell
   cycle (see [122]Supplementary table 2 in Supplementary section 3.4).
   For example, the first group is enriched with genes related to DNA
   Replication and cell cycle control, while the second group is enriched
   with genes related to various central signalling pathways.
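   The two groups (a) and (b) defined above can be expressed as a simple
   filter. The sketch assumes that the fold-changes are given on a log
   scale (so that the sign indicates the direction of change) and that
   the input dictionaries already contain only genes passing the
   significance criterion, which is not restated here; the gene labels
   and values are hypothetical.

```python
def opposite_direction_groups(rp_log_fc, pp_log_fc):
    """Split shared genes by the sign pattern of their M/G1 log fold-changes."""
    shared = rp_log_fc.keys() & pp_log_fc.keys()
    group_a = {g for g in shared if rp_log_fc[g] > 0 and pp_log_fc[g] < 0}
    group_b = {g for g in shared if rp_log_fc[g] < 0 and pp_log_fc[g] > 0}
    return group_a, group_b

# Hypothetical M/G1 log fold-changes (not data from the study).
rp_log_fc = {"geneA": 1.2, "geneB": -0.8, "geneC": 0.5, "geneD": -1.1}
pp_log_fc = {"geneA": -0.6, "geneB": 0.9, "geneC": 0.7, "geneE": 0.3}

group_a, group_b = opposite_direction_groups(rp_log_fc, pp_log_fc)
```

   Genes that change in the same direction in both measurements, or that
   are measured by only one method, fall into neither group.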

   This result suggests that increasing/decreasing ribosomal density as
   detected by Ribo-Seq is not always related to increasing/decreasing the
   ribosomal density involved in protein synthesis at a certain time point
   as detected by PUNCH-P. There can be various explanations for this
   discrepancy which may be related (among others) to the fact that
   translation elongation (and not only translation initiation) is
   controlled during the mammalian cell cycle. For example, regulatory
   changes that cause ribosomal stalling during
   elongation[123]^33,[124]^34,[125]^35 may cause traffic jams, for
   example, near the beginning of the ORF where the ribosomes are not
   translating, or there is no nascent peptide emerging from the ribosome;
   since such ribosomes can theoretically be detected by RP and not PP
   they may increase RP but decrease PP. It is also possible that in some
   cases, due to traffic jams, the RNase does not accurately digest the
   mRNA between ribosome protected regions. This may result in
   underestimation of ribosome density and may lead to a decrease in
   measured ribosome density when the actual density increases (see, for
   example,[126]^38).

   It is also important to emphasize that aspects related to changes in
   mRNA levels cannot trivially explain the observed discrepancies, since
   both RP and PP are expected to be proportional to mRNA levels (if there
   are no traffic jams and biases).

Details regarding some of the post-transcriptionally regulated genes detected

   The major aim of this study was to show in an objective, large scale,
   quantitative manner that combining RP and PP measurements (in
   comparison to each measure independently) is expected to improve the
   ability to detect meaningful post transcriptional regulation signals.
   Thus, we focused on objective quantitative measures. Nevertheless, in
   this section we provide some biological examples related to
   meaningful/relevant biological cell cycle signals detected by PP and
   RP. To this end, we will focus on the module inference/clustering
   analysis performed based on protein-protein interactions among genes
   detected to be differentially expressed based both on PP and RP (90
   genes, see [127]Fig. 8C and [128]supplementary table
   Supplementary_Table_S9_RPiPP_ClusterPEDetails.xlsx). As mentioned, we
   detected 15 modules (see [129]Fig. 8C); here we will discuss in further
   detail the four largest modules.

   The first module of size 11 genes/proteins includes many genes that
   encode ribosomal proteins (e.g. RPL3, RPL34, RPS10, RPL35, RPL32,
   RPL29) which are down regulated (in terms of both RP and PP) in M in
   comparison to G1. This result supports the hypothesis that translation
   (specifically the canonical regulatory mechanisms) is globally down
   regulated during M phase[130]^39,[131]^40,[132]^41 in mammalian cells,
   and that the down regulation occurs and can be detected also post
   transcriptionally.

   The second module of size 27 genes/proteins includes various M phase
   specific genes/proteins mainly related to spindle morphogenesis and
   chromosome movement that are found to be up-regulated based on PP and
   RP in M phase. For example, one hub in this module is the gene/protein
   ESPL1, which is related to the cohesion between sister chromatids
   before anaphase and to their timely separation during anaphase, both
   of which are critical for chromosome inheritance. Another hub is the
   gene/protein BUB3, which is involved in spindle checkpoint function
   and is up-regulated in M phase together with BUB1. Interestingly, the
   module also includes several kinesins (KIF22, KIF20A, KIF18A, KIF23,
   KIF2C, KIFC1); it was suggested that kinesins and proteins interacting
   with them have important roles in spindle morphogenesis and chromosome
   movement during cell division[133]^42,[134]^43,[135]^44,[136]^45, and
   our analysis emphasizes their post-transcriptional regulation.
   Naturally, this module also includes (among others) cell cycle
   regulatory proteins such as CDC20 and CDC8 that are involved in
   nuclear movement prior to anaphase, chromosome separation, and spindle
   formation. It also includes various kinases (e.g. PLK1, CDK1, and TTK)
   that are involved in regulating the processes mentioned above.

   The third module includes 13 genes/proteins related mainly to DNA
   replication. One hub in this module is the gene FZR1; it is
   up-regulated in M-phase and is a key regulator of ligase activity of
   the anaphase promoting complex/cyclosome. The module includes
   genes/proteins related to DNA replication regulation, and activation
   and maintenance of the checkpoint mechanisms in the cell cycle that
   coordinate S phase and mitosis: MCM6, CDC6, MCM3, PCNA, RFC4; all these
   genes are down regulated (based on RP and PP) at the M-phase as there
   is no DNA replication during M-phase[137]^46. The module also includes
   various genes related to gene expression regulation and proliferation
   such as the gene DMAP1 which represses transcription and is
   up-regulated in M-phase. Finally, it includes genes related to cell
   cycle progression such as the genes CCNA2 and CDK4 which are
   up-regulated in M-phase.

   The fourth module (module number 14) includes 12 genes/proteins which
   are related, among others, to dynamic microtubule polymerization,
   which is an important step of the M-phase[138]^46. For example, the
   module’s main hub, TUBB4B (Tubulin, beta 4B class IVb), and 3
   additional tubulins (TUBB6, TUBA4A, TUBB4A) are up-regulated
   (according to RP and PP) in M-phase; this fact emphasizes the
   post-transcriptional regulation of microtubule polymerization during
   M-phase.

   To summarize the details depicted above, the genes/proteins detected by
   RP and PP are highly relevant to cell-cycle biology and teach us about
   the central role of post transcriptional regulation during the cell
   cycle.

Discussion

   This study includes the first comparison of RP and PP. We report
   various analyses that demonstrate that RP and PP can exclusively detect
   relevant differentially expressed genes. Specifically, based on
   enrichment and PPI network analyses, we show that genes that are
   detected by each of these methods, but not by the other, tend to
   include biologically relevant signals. We show that the prediction of
   steady state protein levels can be improved by combining PP and RP
   measurements. Furthermore, we show that the relevant DE genes detected
   by each of the methods may have opposite fold-change, demonstrating
   that the two techniques can detect different aspects of translational
   regulation, and are thus in part synergistic.

   There are three major explanations for the fact that the correlation
   between 1) a model based on RP, PP, and mRNA and 2) steady state
   protein levels is not perfect: First, steady state protein levels are a
   result of many gene expression steps such as the regulation of protein
   degradation, post-translational regulation, and secretion of proteins.
   Second, there are different biases in the cases of the various
   experiments/measurements. For example, the sequencing based experiments
   have biases related to RNase, while the proteomic based approaches have
   biases related to protein digestion; in addition, the distribution of
   protein/peptide length is different in PUNCH-P (where truncated
   proteins are generated at the first stage) and in steady state protein
   levels measurements.

   Third, some of the differences are due to natural variability among
   technical repeats and may also be related to the stochasticity
   (specifically for lowly expressed genes) of the gene expression steps
   (see, for example,[139]^47).

   We would like to summarize some of the different biases in the RP and
   PP experiments. The RP major biases can be related to
   preferences/non-uniform efficiency of the RNase, sequencing biases,