Abstract

   The steady-state localisation of proteins provides vital insight into
   their function. These localisations are context specific with proteins
   translocating between different subcellular niches upon perturbation of
   the subcellular environment. Differential localisation, that is a
   change in the steady-state subcellular location of a protein, provides
   a step towards mechanistic insight of subcellular protein dynamics.
   High-accuracy high-throughput mass spectrometry-based methods now exist
   to map the steady-state localisation and re-localisation of proteins.
   Here, we describe a principled Bayesian approach, BANDLE, that uses
   these data to compute the probability that a protein differentially
   localises upon cellular perturbation. Extensive simulation studies
   demonstrate that BANDLE reduces the number of both type I and type II
   errors compared to existing approaches. Application of BANDLE to
   several datasets recovers well-studied translocations. In an
   application to cytomegalovirus infection, we obtain insights into the
   rewiring of the host proteome. Integration of other high-throughput
   datasets allows us to provide the functional context of these data.

   Subject terms: Proteomics, Mass spectrometry, Bayesian inference, Data
   processing
     __________________________________________________________________

   Changes in protein subcellular localization can be determined using
   mass spectrometry. Here, the authors present a statistical approach to
   determine relocalising proteins from spatial proteomics experiments.

Introduction

   The cell is compartmentalised into organelles and subcellular niches,
   allowing many biological processes to occur in synchrony^[36]1.
   Proteins are localised to these niches in accordance to their function
   and thus to shed light on the function of a protein, it is necessary to
   determine its subcellular location. A number of pathologies have
   implicated incorrect localisation as a contributing factor, including
   obesity^[37]2, cancers^[38]3, neurological disorders^[39]4, as well as
   multiple others^[40]5. It is estimated that up to 50% of proteins
   reside in multiple locations^[41]6,[42]7, which complicates the study
   of their localisations. Community approaches have led to substantial
   improvements in our understanding of subcellular
   localisation^[43]7,[44]8. However, image-based approaches are often low
   in throughput and high-throughput alternatives are desirable.
   Furthermore, many biological processes are regulated by re-localisation
   of proteins, such as transcription factors shuttling from the cytoplasm
   to the nucleus, which are difficult to map using imaging methods at
   scale^[45]9.

   To simultaneously study the steady-state localisation and
   re-localisation of proteins, one approach is to couple gentle cell
   lysis and whole-cell fractionation with high-accuracy mass spectrometry
   (MS)^[46]6,[47]10–[48]12. These approaches have already led to
   high-resolution subcellular maps of mouse embryonic stem cell
   (mESC)^[49]6, human cell lines^[50]11,[51]12, S. cerevisiae (bakers’
   yeast)^[52]13, cyanobacterium^[53]14 and the apicomplexan Toxoplasma
   Gondii^[54]15. Dynamic experiments have given us unprecedented insight
   into HCMV infection^[55]16, EGF stimulation^[56]17, and EGFR
   inhibition^[57]12. In addition, CRISPR-Cas9 knockouts coupled with
   spatial proteomics has given insights into AP-4 vesicles^[58]4, as well
   as AP-5 cargo^[59]18. These adaptor protein complexes are involved in
   facilitating the transport of cargo proteins between membranous
   organelles with both AP-4 and AP-5 associated mutations implicated in
   severe neurological disease. In a study by Shin et al.^[60]19, the
   golgin long coiled-coil proteins that selectively capture vesicles
   destined for the Golgi were re-located to the mitochondria by replacing
   their Golgi targeting domains with a mitochondrial transmembrane
   domain^[61]19. This allowed the authors to readily observe the vesicle
   cargo and regulatory proteins that are redirected to the mitochondria,
   while avoiding technical issues that arise because of the redundancy of
   the golgins and their transient interaction with vesicles. Together,
   these collections of experiments suggest spatial proteomics can provide
   unprecedented insight into biological function.

   Mass spectrometry-based spatial proteomics currently relies on
   supervised machine learning methods, such as support vector machines,
   to assign proteins to a subcellular niche using marker proteins with
   known localisations^[62]20,[63]21. Advanced computational approaches
   have also been developed, including novelty detection
   algorithms^[64]22,[65]23 and transfer learning approaches^[66]24. These
   approaches are implemented in the pRoloc software suite^[67]25,[68]26,
   which builds on the MSnbase software^[69]27 as part of the Bioconductor
   project^[70]28,[71]29. However, most machine learning methods fail to
   quantify uncertainty (estimate reliability) in the assignment of a
   protein to an organelle, which is paramount to obtaining a rich
   interrogation of the data.^[72]30 developed a Bayesian model to analyse
   spatial proteomics data and highlighted that uncertainty quantification
   can give insights into patterns of multi-localisation. The method is
   implemented as a tool as part of the Bioconductor project^[73]31.

   In dynamic and comparative experiments; that is, those where we expect
   re-localisation upon some stimulus to subcellular environment, the data
   analysis is more challenging. The task can no longer be phrased as a
   supervised learning problem, but the question under consideration is
   clear: which proteins have different subcellular niches after cellular
   perturbation? Procedures to answer this question have been presented by
   authors^[74]16,[75]17,[76]32,[77]33 and reviewed in ref. [78]34. The
   approach of refs. [79]17, [80]32 relies on coupling a multivariate
   outlier test and a reproducibility score—termed the
   movement-reproducibility (MR) method. A threshold is then applied to
   these scores to obtain a list of proteins that re-locate; "moving"
   proteins. However, these scores can be challenging to interpret, since
   their ranges differ from one experiment to another and require
   additional replicates to calibrate the scores. Furthermore, the test
   ignores the spatial context of each protein, rendering the approach
   inefficient with some applications allowing false discovery rates of up
   to 23%^[81]18. Finally, the approach does not quantify uncertainty
   which is of clear importance when absolute purification of subcellular
   niches is impossible and multi-localising proteins are present.
   Recently, Kennedy et al.^[82]33 introduced a computational pipeline for
   analysing dynamic spatial proteomics experiments by reframing it as a
   classification task. However, this formulation ignores that some
   changes in localisation might be shifts in multi-localisation patterns
   or only partial changes, and it relies on the success of the
   classification. Furthermore, their approach cannot be applied to
   replicated experiments and so its applicability is limited. In
   addition, the authors found that they needed to combine several of the
   organelle classes together to obtain good results. Finally, the framing
   of the problem as a classification task only allows a descriptive
   analysis of the data. These considerations motivate the development of
   a more sophisticated and reasoned methodology.

   We present Bayesian ANalysis of Differential Localisation Experiments
   (BANDLE)—an integrative semi-supervised functional mixture model, to
   obtain the probability of a protein being differentially localised
   between two conditions. Posterior Bayesian computations are performed
   using Markov-chain Monte-Carlo and so uncertainty estimates are also
   available^[83]35. We associate the term differentially localised to
   those proteins which are assigned different subcellular localisations
   between two conditions. Then, we refer precisely to this phenomenon as
   differential localisation, throughout the text. Hence, our main
   quantity of interest is the probability that a protein is
   differentially localised between two conditions.

   BANDLE models the quantitative protein profiles of each subcellular
   niche in each replicate of each experiment^[84]36. A first layer of
   integration combines replicate information in each experiment to obtain
   the localisation of proteins within a single experimental condition. To
   probabilistically integrate the two conditions, we use a probability
   distribution that combines localisation information in each condition
   so that the datasets are modelled together. By examining the
   differences in localisations, we can obtain a differential localisation
   probability. Two prior distributions are proposed: one using a matrix
   extension of the Dirichlet distribution and another, more flexible
   prior, based on Pólya-Gamma augmentation^[85]37–[86]39.

   In this work, we demonstrate the utility of BANDLE, by first performing
   extensive simulations and compare to the MR approach. Our simulation
   study shows that our approach reduces the number of Type I and Type II
   errors, and, as a result, can report an increased number of
   differentially localised proteins. These simulations also highlight the
   robustness of our approach to a number of experimental scenarios
   including batch effects. Our simulation studies also highlight that
   BANDLE provides interpretative improvements and clearer visualisations,
   and makes less restrictive statistical assumptions. We then apply our
   method to a number of datasets with well-studied examples of
   differential localisation, including EGF stimulation and AP-4-dependent
   localisation. We recover known biology and provided additional cases of
   differential localisation, and suggest that TMEM199, a transmembrane
   protein involved in Golgi homoeostasis, localisation is potentially
   AP-4 dependent. Finally, we apply BANDLE to a human cytomegalovirus
   (HCMV) dataset—a case where MR approach is not applicable because the
   MR approach requires multiple replicates. Integration of
   high-throughput transcriptomic and proteomic data, along with
   degradation assays, acetylation experiments and a cytomegalovirus
   interactome allows us to provide the functional context of these data.
   In particular, we provide the spatial context of the HCMV interactome
   data.

Results

The BANDLE workflow

   To perform statistical analysis of differential subcellular
   localisation experiments we developed BANDLE. BANDLE is a
   semi-supervised integrative functional mixture model that allows the
   computation of a differential localisation probability. The BANDLE
   workflow, visualised in Fig. [87]1, begins with a
   mass-spectrometry-based spatial proteomics experiment. A cellular
   perturbation of interest is performed alongside control experiments in
   wild-type cells or another suitable control, depending on the
   application. To avoid confounding factors, control and treatment
   experiments should be performed in parallel with identical
   mass-spectrometry settings^[88]20,[89]21. We further recommend checking
   that the experiment is successful by performing western blots on
   organelle markers prior to mass-spectrometry analysis, as well as
   examining the quality of clustering of marker proteins
   computationally^[90]20,[91]21,[92]40. Our Bayesian approach, BANDLE, is
   applicable to experiments with any number of replicates, as well as
   several subcellular fractionation approaches or mass-spectrometry
   methods. BANDLE supports multiple mass-spectrometry-based methods,
   including label-free and labelled quantitation (e.g. SILAC and isobaric
   tags), and data-dependent and data-independent acquisition. BANDLE
   models the mass-spectrometry profiles of the subcellular niches using a
   model that learns spatial correlations known as a Gaussian random
   field, which is also frequently referred to as a Gaussian Process (see
   ‘Methods’). Niches are modelled separately across replicates and
   conditions to allow for variations. Each dataset is hence modelled as a
   mixture of the different subcellular niches. A distribution of
   localisations is determined for each protein in each condition.
   Subsequently, this information is shared across the two conditions by
   using a probability distribution that integrates protein localisation
   information between the datasets. A visualisation in the case of two
   replicates and two conditions is given in Fig. [93]1c. By examining the
   inferred subcellular localisation across the two conditions and using
   Bayesian inference, we can compute a differential localisation
   probability. To apply the Bayesian model, we first calibrate the prior
   based on prior predictive checks (simulation of data from the prior
   probability distribution)^[94]41. In all scenarios, we check the prior
   expectation of the number of differentially localised proteins and the
   probability that more than l proteins are differentially localised.
   These are reported in the supplement. We then proceed with Bayesian
   parameter inference using Markov-Chain Monte-Carlo (MCMC), which
   samples from posterior distribution of the model parameters^[95]35, and
   the checking of convergence. We visualise our results principally using
   rank plots, where proteins are ranked from those most likely to be
   differentially localised or not (Fig. [96]1d).

Fig. 1. An overview of the BANDLE workflow.

   [97]Fig. 1
   [98]Open in a new tab

   a A differential localisation experiment is set-up with a
   perturbation/treatment of interest. Orbitrap Image was generated by
   Fredrik Edfors at the Noun Project. b Mass-spectrometry-based spatial
   proteomics methods are applied to generate abundance profiles across
   the subcellular fraction. c BANDLE is applied by first calibrating the
   prior. The prior parameter is denoted by α. The model is visualised as
   follows: each dataset is described as a mixture of subcellular niches,
   modelled as Gaussian random fields. Allocations are obtained for each
   condition and then integrated between the two conditions, using a joint
   prior. π is a K × K matrix, where K is the number of subcellular niches
   analysed, where entry (j, h) is the probability that a protein is
   localised to niche j in the control and niche h in the treatment.
   z[i,d]is a categorical variable denoting the localisation of protein i
   in condition d. d The major results of BANDLE are represented in a rank
   plot. Sample size = 1000. Data are presented as medians with notches
   representing the interquartile range (IQR), and the whiskers extending
   to 1.5*IQR. e Example circos plot generated from the results of BANDLE.
   Source data are provided as a Source data file in the Supplementary
   material.

Simulations demonstrate superior performance of BANDLE

   To assess the performance of BANDLE and the MR approach, we run a
   number of simulations allowing us to ascertain the difference between
   each method in scenarios where we know the ground truth. Two variations
   of the MR approach were proposed and we compare to both (see ‘Methods’
   section ‘The movement-reproducibility method’). We first start with a
   real dataset from Drosophila embryos and simulate replicates, as well
   as 20 protein re-localisations^[99]42. To simulate these datasets a
   bootstrapping approach is used, coupled with additional noise effects.
   The first simulation uses a simple bootstrapping approach, where a
   niche-specific noise component is included (see [100]Supplementary
   Methods). The subsequent simulations start with the basic bootstrapping
   approach and add additional effects. The second and third simulations
   add batch effects: random and systematic respectively
   (see [101]Supplementary Methods). While the fourth simulation generates
   misaligned features, i.e. subcellular fraction, across datasets by
   permuting them (fraction swapping)—this models misaligned fractions
   between replicates (see [102]Supplementary Methods). The final
   simulation includes both batch effects and feature permutations. The
   simulations are repeated 10 times, where each time we simulate entirely
   new datasets and re-localisations—this is repeated for each simulation
   task. We assess the methods on two metrics—the area under the curve
   (AUC) of the true positive rate and false positive rate for the
   detection of differential localised proteins. Furthermore, we determine
   the number of correctly differentially localised proteins at fixed
   thresholds (see [103]Supplementary Methods).

   Our proposed method, BANDLE, significantly outperforms the MR methods
   with respect to AUC in all scenarios (t-test p < 0.01, Fig. [104]2).
   Furthermore, it demonstrates that BANDLE is robust to a variety of
   situations, including batch effects. We used two separate approach to
   incorporate prior information into BANDLE. The performance of BANDLE
   based on the Dirichlet prior is already very good and thus it is
   unsurprising that we do not observe any significant improvements in AUC
   by including prior information on correlations captured by the
   Pólya-Gamma prior. Additional comparisons are made in the supplement
   where we make similar observations (see Supplementary Note [105]2).

Fig. 2. Comparison of BANDLE with other approaches.

   [106]Fig. 2
   [107]Open in a new tab

   Boxplots comparing the performance of the MR approach (2016 and 2017)
   and our proposed method BANDLE. BANDLE is separated into whether a
   Dirichlet-based prior was used or if the Polya-Gamma augmentation was
   applied. Each boxplot corresponds to a different simulation scenario.
   The boxplots show BANDLE has significantly improved AUC in all
   scenarios. Simulation are over 10 datasets. Data are presented as
   medians with notches representing the interquartile range (IQR), and
   the whiskers extending to 1.5*IQR. Source data are provided as a Source
   data file in the [108]Supplementary Material.

   The improved AUC, which demonstrates improved control of false
   positives and increased power, translates into increased discovery of
   differentially localised proteins. Indeed, BANDLE with the Dirichlet
   prior discovers around twice as many such re-localising proteins than
   the MR approach (see Supplementary Figs. [109]2 and [110]3). Allowing
   prior correlations through the Pólya-Gamma prior demonstrates that
   additional differentially localised proteins are discovered. This is an
   important reality of those performing comparative and dynamic spatial
   proteomics experiments, since the experiments become more worthwhile
   with additional biological discoveries. In practice, the authors of the
   MR approach advocate additional replicates to calibrate which
   thresholds are used to declare a protein differentially localised. This
   assumes that the perturbation of interest does not have a strong effect
   on the properties of the subcellular niches, which restricts
   applicability. In contrast, BANDLE does not need additional
   mass-spectrometry experiments to calibrate its probabilistic ranking
   meaning more discoveries are made at lower cost. In the Supplementary
   Methods [111]21.19), we demonstrate that the differential localisation
   probabilities provide estimates of the FDR at a 1% level and at all
   levels once outliers are filtered.

   In the following section, we examine the differences between the
   approaches in a simulated example. There we focus on the output,
   interpretation and statistical qualities of each approach, rather than
   the predictive performance of the methods.

BANDLE quantifies uncertainty and is straightforward to interpret

   In this section, we further explore the application of BANDLE with a
   Dirichlet prior and the MR approach, focusing on the interpretation and
   statistical properties of the two methods. Again, we simulate dynamic
   spatial proteomics data, starting from the Drosophila experiment in the
   scenario in which the MR method performed best. This is where there are
   cluster-specific noise distributions but no other effects, such as
   batch effects, were included (see Fig. [112]2). Sample PCA plots of the
   data are presented in Fig. [113]3a. There is a clear pattern of
   localisations across the data where proteins with known subcellular
   localisations are closer to each other. However, the organelle
   distributions clearly overlap and in some cases are highly dispersed—a
   representation of the challenges faced in real data. These data are
   annotated with 11 subcellular niches and 888 proteins are measured
   across 3 replicates of control and 3 of treatment (totalling 6
   experiments). Re-localisations are simulated for 20 proteins.

Fig. 3. A detailed comparison of BANDLE and the movement-reproducibility
approach.

   [114]Fig. 3
   [115]Open in a new tab

   a Example PCA plots where pointers correspond to proteins. Marker
   proteins are coloured according to their subcellular niche, while
   proteins with unknown localisation are in grey. Simulated
   translocations are indicated in black letters, where the left
   corresponds to first control and right to first the perturbed simulated
   datasets. b An MR-plot showing movement score against reproducibility
   score. Each pointer corresponds to a protein and orange pointers
   correspond to simulated translocations and blue otherwise. Teal lines
   are drawn at suggested thresholds with proteins in the top right corner
   considered hits. c A histogram of the raw statistics underlying the MR
   method. A Chi-square (orange) and Gamma (blue) fit are overlaid
   (obtained using maximum likelihood estimation). The Gamma distribution
   clearly captures the tail behaviour. d A BANDLE rank plot where
   proteins are ranked from most to least likely to differentially
   localised. The differentially localisation probability is recorded on
   the y-axis. (right) A BANDLE rank plot of the top 30 differentially
   localised proteins with uncertainty estimates for the differential
   localisation probability. Proteins marked in orange were simulated
   translocations. e Violin plots for the differential localisation
   probabilities (BANDLE), the M score (MR method) and R score (MR
   method). The distributions are split between differential localised
   (movers) and spatially stable proteins. Clearly, the differential
   localisation probabilities correlate most closely with the phenomena of
   interest. Boxplots are presented as medians with notches representing
   the interquartile range (IQR), and the whiskers extending to 1.5*IQR.
   Sample size is 888. Source data are provided as a Source data file in
   the [116]Supplementary Material.

   We first apply the MR method according to the methods in refs. [117]17,
   [118]32. We provide a brief description of the approach with full
   details in the ‘Methods’. To begin, the difference profiles are
   computed by subtracting the quantitative values for each treatment from
   each control. Then the squared Mahalanobis distance is computed to the
   centre of the data and under a Gaussian assumption the null hypothesis
   is that these distances follow a Chi-squared distribution, ergo a
   p-value is obtained. This process is repeated across the 3 replicates
   and the largest p-value was then cubed and then corrected from multiple
   hypothesis testing using the Benjamini–Hochberg procedure^[119]43. A
   negative
   [MATH: <msub><mrow><mi>log</mi></mrow><mrow><mn>10</mn></mrow></msub>
   :MATH]
   transform is then performed to obtain the M-score. To produce the
   R-score, Pearson correlations are computed between each difference
   profile for all pairwise combination of difference profiles. The lowest
   of the three R-scores is reported. The M-score and R-score are plotted
   against each other (see Fig. [120]3b) and the proteins with high
   M-score and high R-score are considered differentially localised.

   There are a number of assumptions underlying the MR methodology.
   Firstly, comparing difference profiles pairwise assumes that the
   features in both datasets exactly correspond. However, this precludes
   any stimuli that changes the biochemical properties of the organelles,
   since changing these properties may result in differing buoyant
   densities, pelleting of niches at different centrifugation speeds or
   differential detergent solubility. Thus, whether density-gradient,
   differential centrifugation or differential solubilisation is used for
   organelle separation this assumption must be carefully assessed.
   Secondly, the Gaussian assumption ignores the natural clustering
   structure of the data because of the different organelle properties.
   Indeed, examination of the p-value distributions in a histogram
   (Supplementary Fig. [121]5) shows that it clearly deviates from the
   mixture of distributions expected (p-values are uniformly distributed
   under the null). The peaking of p-value towards 1 suggests poor
   distributional assumptions^[122]44. Thus perhaps the Chi-squared
   distribution is a poor fit for the statistic of interest. Exploring
   this further, we fit a Chi-squared and Gamma distribution empirically
   to the statistics using maximum likelihood estimation (MLE) of the
   parameters. Figure [123]3c shows that the Gamma distribution is a
   better distributional fit—successfully capturing the tail behaviour of
   the statistic (log Likelihood ratio: 1644 on 1 degree of freedom). The
   Chi-squared family is nested in the Gamma family of distributions, so
   if the theoretical Chi-squared distribution was a good fit the
   distributions would overlap. For a quantitative assessment of model
   fits we compute the negative log-likelihood of the data given the
   optimal distributions—the Gamma distribution has a markedly lower
   negative log-likelihood (log Likelihood Ratio 1644). A p-value
   histogram is provided in Supplementary Note [124]4. This provides
   strong evidence that the underlying Gaussian assumptions are likely
   violated. Thirdly, it is inappropriate to cube p-values: to combine
   p-value across experiments one could use Fisher’s
   method^[125]45–[126]47 or the Harmonic mean p-value
   (HMP)^[127]48,[128]49 depending on the context. Indeed, the cube of the
   p-value is no longer a p-value. To elaborate, if
   [MATH: <mi class="MJX-tex-caligraphic" mathvariant="script">P</mi>
   :MATH]
   are a set of p-values, then under the assumption of the null hypothesis
   [MATH: <mi class="MJX-tex-caligraphic" mathvariant="script">P</mi>
   :MATH]
   is uniformly distributed; however, the cube is clearly not uniformly
   distributed. Since we no longer work with p-values, Benjamini–Hochberg
   correction becomes meaningless in this context. Transforming these
   values to a "Movement score", conflates significance with effect size
   which confounds data interpretation. Finally, summarising to a single
   pair of scores ignores their variability across experimental
   replicates.

   BANDLE first models each subcellular niche non-parametrically (since
   the underlying functional forms are unknown^[129]36). Visualisation of
   the posterior predictive distributions from these fits for selected
   subcellular niches is given in Supplementary Fig. [130]4 (Supplementary
   Note [131]3)—we observe a good correspondence between the model and the
   data. We can see that the different subcellular niches have contrasting
   correlation structures and thus niche-specific distributions are
   required. These distributions are specific for each replicate of the
   experiment and also the two experimental conditions. The information
   from the replicates, and the control and treatment are combined using
   an integrative mixture model. Briefly, mixing proportions are defined
   across datasets allowing information to be shared between the control
   and treatment (see ‘Methods’ for more details). This formulation allows
   us to compute the probability that a protein is assigned to a different
   subcellular niche between the two experiments—the differential
   localisation probability. The proteins can then be ranked from most
   probably differentially localised to least (Fig. [132]3d). The figure
   is simple to interpret: the proteins with highest rank are the most
   likely to have differentially localised between the experiment, having
   been confidently assigned to different subcellular niches in the
   control versus treatment. The proteins with lowest rank are highly
   unlikely to have moved during the experiment—the localisations are
   stable. This is important information in itself, especially when
   combined with other information; such as, changes in abundance or
   post-translational modification. Figure [133]3d (right) shows the 30
   proteins with highest rank visualising the uncertainty in the
   differential localisation probability (see ‘Methods’). This ranking
   allows us to prioritise which proteins to follow up in validation
   experiments. Example changes in localisation are provided in
   Supplementary Note [134]5. The ranking can also be mapped onto other
   experimental data, such as expression or protein–protein interaction
   data. The probabilistic ranking produced by BANDLE is more closely
   aligned with the phenomenon of interest. Indeed, if we divide the data
   into the proteins that were differentially localised and those that
   were not. Then from plotting the distribution of the statistics from
   the respective methods, it is clear that output from BANDLE is most
   closely associated with re-localisation events (Fig. [135]3e).

   To examine the calibration of the probabilities we computed the
   expected calibration error—the average difference between the predicted
   probability and true probability (see [136]Supplementary Methods). We
   found that the expected calibration error was 0.042, suggesting that
   these probabilities are well-calibrated, since they are similar to
   typical errors found on approaches that have undergone post hoc
   calibration^[137]50. For further simulations, see
   the [138]Supplementary Methods. We also performed a prior-sensitivity
   analysis and found that the influence of the prior is weak (see
   Supplementary Note [139]11). We also examine the effect of
   normalisation procedures on the outcomes of the analysis and found that
   this has a limited influence on the performance of BANDLE but can
   influence the results of the MR approach considerably
   (see [140]Supplementary Methods).

Characterising differential localisation upon EGF stimulation

   Having carefully assessed the statistical properties of our approach,
   BANDLE, and the MR method, we apply these approaches to a number of
   datasets. First, we consider the Dynamic Organeller Maps (DOMs) dataset
   of ref. [141]17, exploring the effects of EGF stimulation in HeLa
   cells. In this experiment, SILAC labelled HeLa cells were cultured and
   recombinant EGF was added to the culture at a concentration of
   20 ng ml^−1 (see ref. [142]17). A total of 2237 complete protein
   profiles were measured across 3 replicates of control and 3 replicates
   of EGF treated HeLa cells. Principal Component Analysis (PCA)
   projections of the data can be visualised in the supplement
   (Supplementary Figs. [143]24 and [144]25). A quality control assessment
   was performed using the approach of ref. [145]40. As a result, nuclear
   pore complex, peroxisome and Golgi annotations were removed, since the
   marker proteins of these classes were highly dispersed.

   The MR method was applied as described in the ‘Methods’ and the results
   can be visualised in Fig. [146]4a. 7 proteins are predicted to be
   differentially localised using the MR method with the thresholds
   suggested by ref. [147]17. These include 3 core proteins of the EGF
   signalling pathway SHC1, GRB2 and EGFR^[148]51 and other, potentially
   related, proteins TMEM214, ACOT2, AHNAK, PKN2. Since the MR approach
   does not provide information about how the functional residency of the
   proteins change, it is challenging to interpret these results without
   further analytical approaches.

Fig. 4. Analysis of an EGF stimulation dataset.

   [149]Fig. 4
   [150]Open in a new tab

   a An MR-plot where dark green lines are drawn at suggested threshold
   and hits are highlighted in orange. b BANDLE rank plot showing the
   distribution of differentially localised proteins. c The top
   differentially localised proteins from BANDLE plotted with uncertainty
   estimates. d Bootstrap distributions of correlations with a
   phosphoproteomic time-course experiment. The BANDLE confidence
   intervals differ significantly from 0, while the MR method do not. 5000
   bootstrap samples were used with data presented as medians with notches
   representing the interquartile range (IQR), and the whiskers extending
   to 1.5*IQR. e PCA plots with (smoothed) localisation probabilities
   project onto them. Each colour represents an organelle and ellipses
   represent lines of isoprobability. The inner ellipse corresponds to
   0.99 and the proceed line 0.95 with further lines decreasing by 0.05
   each time. The annotated proteins demonstrate examples of differential
   localisations. EGFR (P005330) clearly relocalises from the PM to
   endosome, while SHC-1 ([151]P29353) and GRB2 ([152]P62993) relocalise
   from unknown localisation to the lysosome. Source data are provided as
   a Source Data file in the [153]Supplementary Material.

   To quantify uncertainty and gain deeper insight into the perturbation
   of HeLa cell after EGF stimulation we applied our BANDLE pipeline, with
   an informative prior (see ‘Methods’). Sensitivity to these prior
   choices are assessed in the [154]Supplementary Materials (Supplementary
   Note [155]11). Firstly, the rank plots display a characteristic shape
   suggesting that most proteins are unlikely to be differentially
   localised upon EGF stimulation (Fig. [156]4b). Furthermore, we provide
   uncertainty estimates in the probability that a protein is
   differentially localised for selected top proteins (Fig. [157]4c).
   Furthermore, we visualise the change in localisation for the proteins
   known to re-localise upon EGF stimulation: SHC1, GRB2 and EGFR
   (Fig. [158]4e). This is displayed by projecting the posterior
   localisation probabilities on to the corresponding PCA coordinates.
   These probabilities are then smoothed using a Nadaraya-Watson kernel
   estimator^[159]52,[160]53 and visualised as contours. PCA plots of the
   raw data are found in the [161]Supplementary Materials (Supplementary
   Note [162]13).

   Given the well-documented interplay between phosphorylation and
   subcellular localisation^[163]54–[164]57, we hypothesised that proteins
   with the greatest differential phosphorylation would correlate with
   proteins that were more likely to be differentially localised. To this
   end, we integrated our analysis with a time-resolved phosphoproteomic
   dataset of EGF stimulation using MS-based quantitation^[165]58. In
   their study, cells were harvested at eight different time points after
   EGF stimulation: 0, 2, 4, 8, 16, 32, 64 and 128 min (see supplementary
   Note [166]7). Cells were harvested and protein digested to peptides
   using trypsin. Peptides corresponding to each time point were labelled
   with a different iTRAQ tag before combining all samples together and
   quantifying using LC-MS/MS. Immunoprecipitation was used to enrich for
   phosphorylated tyrosine residues^[167]59 and the enrichment of
   phosphosites on serine and threonine residues was performed via
   immobilized metal affinity chromatography (IMAC)^[168]60,[169]61.

   For each phosphopeptide corresponding to a unique protein, we computed
   the largest log[2] fold change observed across the time course. Given
   that the changes in localisation occur within 20 min, we restricted
   ourselves to the first 6 time points^[170]17. We then took the top 10
   proteins ranked by each of the MR method and BANDLE. These rankings are
   then correlated with rankings obtained from the changes in
   phosphorylation. The Spearman rank correlations were recomputed for
   5000 bootstrap resamples to obtain bootstrap distributions of
   correlations (see Fig. [171]4). We report the mean correlation and the
   95% bootstrap confidence intervals. The correlation between the ranks
   of the MR method and the phosphoproteomic dataset was ρ[S] = 0.40
   (−0.49, 0.85), while the correlation when using the ranking of BANDLE
   was ρ[S] = 0.68 (0.02, 0.98). That is, phosphorylated proteins are more
   likely to be differential localised and this signal is more clear using
   the ranking obtained from using BANDLE. Alongside the statistical and
   interpretable benefits of BANDLE, it is clear the approach has the
   utility to provide insight into localisation dynamics.

BANDLE obtains deeper insights into AP-4-dependent localisation

   The adaptor protein (AP) complexes are a set of heterotetrameric
   complexes, which transport transmembrane cargo protein
   vesicles^[172]62. The AP1-3 complexes are well characterised: AP-1
   mediates the transport of lysosomal hydrolases from the trans-Golgi to
   the endosomes^[173]63,[174]64; AP-2 has a significant role in the
   regulation of endocytosis^[175]65; AP-3 is involved in the sorting of
   trans-Golgi proteins targeted to the lysosome^[176]66. The role of the
   AP-4 complex has become better understood in recent
   years^[177]4,[178]67–[179]73 and is of noted interest because
   loss-of-function mutations resulting in early-onset progressive spastic
   paraplegia^[180]74. The altered subcellular distribution of ATG9A, as a
   result of loss-of-functions AP-4 mutation^[181]4,[182]69,[183]70, is
   believed to be a key contributor to the pathology of AP-4 deficiency
   syndrome^[184]4,[185]69–[186]73.

   AP-4 consists of four subunits (β4, ε, μ4 and σ4) forming an obligate
   complex^[187]66.^[188]4 study the functional role of AP-4 using spatial
   proteomics; in particular, the DOM workflow mentioned previously. As
   part of their study, they used AP-4 CRISPR knockout cells to
   interrogate the effect on the spatial proteome when AP-4 function has
   been ablated.

   Re-analysis of this subcellular proteomics experiment provides full
   quantitative measurements for 3926 proteins across two replicates of
   wild-type cells and two replicates where the β4 subunit has been
   knocked-out. They also produce two replicates where AP4E1 is
   knocked-out but this is not considered here for brevity. The data are
   visualised as PCA plots (see Supplementary Note [189]14). As in the
   previous analysis, we run a quality control step removing the actin
   binding protein and nuclear pore complex annotations^[190]40. This
   dataset is particularly challenging to analyse because there are only
   two replicates for each condition. The value of Bayesian analysis is
   the ability to provide prior information to regularise, as well as the
   quantification of uncertainty, which is more critical in data sparse
   scenarios.

   Previous application of the MR methods led to authors to find that
   SERINC 1 ([191]Q9NRX5), SERINC 3 ([192]Q13530) were differentially
   localised, as well as an altered subcellular distribution for ATG9A
   ([193]Q7Z3C6)^[194]4. Their results suggest these are cargo proteins of
   the AP-4 complex that are packaged into vesicles at the trans-Golgi
   before being transported to the cell periphery. Altogether, their
   results suggest AP-4 provides spatial regulation of autophagy and that
   AP-4 neurological pathology is linked to disturbances in membrane
   trafficking in neurons^[195]4,[196]69.

   We apply our method BANDLE to gain further insights into AP-4-dependent
   localisation. We compute the differential localisation probability; the
   associated uncertainty estimates and rank proteins according to this
   statistic (see Fig. [197]5a and Supplementary Note [198]9).
   Characteristic S-shaped plots are observed with most proteins not
   differentially localised upon knock-out of AP-4 β4. The results of both
   SERINC 1 and 3 are validated, as we compute a differential localisation
   probability greater than 0.95 for these proteins. Furthermore, 16 of
   the top 20 proteins are membrane-bound or membrane-associated proteins
   (FDR < 0.01 hyper-geometric test). To demonstrate the benefit of our
   probabilistic ranking, we perform two-sided KS rank test against the
   functional annotations provided in the STRING database (corrected for
   multiple testing within each functional framework)^[199]75. We find
   that processes such as ER to Golgi transport and lipid metabolism are
   more highly ranked than would be expected at random (FDR < 0.01), as
   well as endosomes and Golgi localisations (FDR < 0.01). While processes
   associated with translation, ribosome localisation and function appear
   significantly lower in the ranking (FDR < 0.01). As expected, this
   provides a high-level overview and evidence for the functional nature
   of AP-4 in the secretory pathway.

Fig. 5. BANDLE applied to the AP-4 dataset.

   [200]Fig. 5
   [201]Open in a new tab

   a The top differentially localised proteins from BANDLE plotted with
   uncertainty estimates. Data are presented as medians with notches
   representing the interquartile range (IQR), and the whiskers extending
   to 1.5*IQR. 667 MCMC samples were used. b A Spearman correlation
   heatmap showing strong correlation behaviours of proteins that have
   AP-4 dependent localisation (lower). Normalised mass-spectrometry
   profiles plotted as a heatmap from the AP-4 knockout data. Proteins are
   shown to have similar behaviour with greater intensity in fraction 5,
   where light membrane organelles are likely to pellet (upper). c PCA
   plots with (smoothed) localisation probabilities project onto them.
   Each colour represents an organelle and ellipses represent lines of
   isoprobability. The inner ellipse corresponds to 0.99 and the proceed
   line 0.95 with subsequent lines decreasing by 0.05 each time. The
   proteins SERINC 1 and 3, as well as TMEM199 are highlighted
   demonstrating example relocalisations. d Normalised abundance profiles
   showing that SERINC 1, SERINC 3 and TMEM199 show similar behaviour upon
   knockout of AP-4. Source data are provided as a Source data file in
   the [202]Supplementary Material.

   Taking a more precise view on our results, we examine the top 20
   differentially localised proteins in more detail. We compute the
   Spearman correlation matrix between these proteins and observe strong
   correlation, suggesting the proteins act in a coordinated way (see
   Fig. [203]5b). Visualising the data in a heatmap (Fig. [204]5b), after
   mean and variance normalisation, we observe a highly concordant
   pattern: most proteins are enriched in fractions 4 and 5. These
   fractions are obtained from the highest centrifugation speeds and so
   differentially pellet light membrane organelles, such as endosomes and
   lysosomes^[205]11,[206]17. Again, further evidence for the role of
   AP-4-dependent localisation dynamics within the secretary pathway.

   In Fig. [207]5b, we observed a large cluster of 9 proteins, which
   included SERINC 1 and 3. Amongst these 9 proteins is SLC38A2, a
   ubiquitously expressed amino-acid transporter that is widely expressed
   in the central nervous system and is recruited to the plasma membrane
   from a pool localised in the trans-Golgi^[208]76–[209]79. Thus, its
   suggested differential localisation here provides further evidence for
   the role of AP-4 as a membrane trafficker from the trans-Golgi. Another
   protein in this cluster is TMEM 199 ([210]Q8N511) a protein of unknown
   function that is involved in lysosomal degradation^[211]80.
   Furthermore, it has been implicated in Golgi homoeostasis but the
   functional nature of this process is unknown^[212]81. Probing further,
   we observe that TMEM199 acts in a coordinated fashion with SERINC 1 and
   3. Marked re-localisations are observed on PCA plots toward the
   endo/lysosomal regions (see Fig. [213]5c) and we note that the
   quantitative profiles of SERINC 1, SERINC 3 and TMEM199 act in an
   analogous way upon AP-4 knockout (see Fig. [214]5d). Our findings
   motivate additional studies to elucidate AP-4 dependent localisation
   and separate these observations from potential clonal artefacts.

Rewiring the proteome under cytomegalovirus infection

   Human cytomegalovirus (HCMV) infection is a ubiquitous herpesvirus that
   burdens the majority of the population^[215]82. In healthy immune
   systems, HCMV establishes latent infection following initial viral
   communication^[216]83 and reactivation can lead to serious pathology in
   some imunno-compromised individuals^[217]84. HCMV has the largest
   genome of any known human virus, at 236 kbp it encodes for over 170
   proteins that modulate almost all aspects of the hosts cellular
   environment for its benefit^[218]85–[219]87.

   Initial viral infection involves endocytosis of the virion into the
   cell^[220]88, host machinery is then used to transport viral capsids
   into the nucleus^[221]89. Within the host nucleus viral transcription
   and genome replication occurs^[222]90–[223]92. Meanwhile, other viral
   proteins are targeted to the secretory pathway to inhibit the host
   immune response and regulate the expression of viral
   genes^[224]93–[225]98, rewire signalling pathways^[226]99 and modulate
   metabolism^[227]100. In later phases, the cellular trafficking pathways
   and the secretory organelles are hijacked for the formation of the
   viral assembly complex (vAC)^[228]101–[229]105. Due to the diversity of
   cellular processes manipulated during HCMV infection, it is often used
   as a paradigm to analyse virus-host interactions^[230]106.

   There has been a recent flurry in applying system-wide proteomic
   approaches to the HCMV infection model.^[231]106 developed quantitative
   temporal viromics a multiplexed proteomic approach to understand the
   temporal response of thousands of cellular host and viral proteins.
   More recently, to discover proteins involved in the innate immune
   response, a multiplexed proteasome-lysosome degradation assay found
   that more than 100 host proteins are degraded shortly after
   viral-infection^[232]107. Meanwhile, a comprehensive mass spectrometry
   interactome analysis has identified thousands of host-virus
   interactions^[233]108. Furthermore, high-throughput temporal proteomic
   analysis has revealed the importance of protein acetylation
   (post-translational modification of lysine amino acids), as an integral
   component during HCMV infection^[234]109.

   Reference [235]16 use spatial and temporal proteomics to investigate
   the response of the human host proteome to HCMV infection. The authors
   perform subcellular fractionation on uninfected (control) and
   HCMV-infected (treated) cells at 5 different time points:
   24, 48, 72, 96, 120 h post infection (hpi). The authors then used
   neural networks to classify proteins to subcellular niches at each time
   point in the control and treated cells, allowing a descriptive initial
   analysis of the data. Proteins with differential classification at each
   time point are those that are believed to be differential localised.
   However, the challenge of this study is that only a single replicate is
   produced in each condition. This renders the MR method of ref. [236]17
   inapplicable.

   Differential classification is a reasonable approach to probe
   differential localisation though it neglects information shared across
   both experiments and it is not quantitative (i.e. no p-value or
   posterior probability of change). In the case of single replicates, by
   sharing information and providing prior information we are able to
   improve inference and obtain deeper insights. We apply BANDLE to
   control and HCMV-treated cells at 24 hpi, in the interest of brevity,
   to explore further the host spatial-temporal proteome (see
   Supplementary Notes [237]15 and [238]20). Our analysis reflects a
   potential rewiring of the proteome with many possible proteins
   differentially localised on HCMV infection. We highlight an example of
   differential localisation with SCARB1 (see Fig. [239]6a), with a
   localisation in the secretory pathway shifting toward a PM/cytosolic
   localisation, similar to what has previously been observed^[240]16.

Fig. 6. BANDLE applied to the HCMV dataset.

   [241]Fig. 6
   [242]Open in a new tab

   a PCA plots with (smoothed) localisation probabilities projected onto
   them. Each colour represents a subcellular niche and ellipses represent
   lines of isoprobability. The inner ellipse corresponds to 0.99 and the
   proceed line 0.95 with further lines decreasing by 0.05 each time. The
   relocalisation of SCARB1 is highlighted on the plot. b The spatial
   allocation derived from BANDLE where each entry of the heatmap is the
   number of proteins. Off-diagonal entries only include confident
   differential localisations with probability >0.999. c UL148A
   interactome mapped onto the BANDLE determined spatial patterns. Source
   data are provided as a Source data file in the [243]Supplementary
   Material.

   To obtain global insights into the functional behaviour of the
   differentially localised proteins, we performed a Gene Ontology (GO)
   enrichment analysis. An extensive list of terms is enriched and these
   can be divided broadly into subcategories such as translation and
   transcription; transport; viral processes; and immune process (see
   Supplementary Note [244]16). These results reflect closely the early
   phase of HCMV infection^[245]87. Pathway enrichment analysis highlights
   terms related to a viral infection (Viral mRNA Translation, Influenza
   Life Cycle, Infectious disease, Innate Immune System, Immune System,
   MHC class II antigen presentation, Antigen processing-Cross
   presentation, Host Interactions of HIV factors, HIV Infection) (see
   Supplementary Note [246]16). Pathway analysis also reveals known
   processes that are modulated during HCMV infection, such as membrane
   trafficking^[247]110–[248]112, Extracellular matrix
   organization^[249]113 and Rab regulation of trafficking^[250]114.

Integrating HCMV proteomic datasets to add functional relevance to spatial
proteomics data

   The spatial information obtained here allows us to perform careful
   integration with other high-resolution proteomic datasets. The
   degradation screens by ref. [251]107 identified proteins that were
   actively degraded during HCMV infection but gave no information
   regarding the spatial location of the targets. To determine the
   location of host proteins targeted by HCMV for degradation, the BANDLE
   revised spatial data at 24 hpi was overlapped with proteins that were
   degraded by the proteasome or lysosome. The subcellular location of the
   host proteins is displayed for the 24 h timepoint. To determine the
   spatial granularity of the degradation data we tested whether the
   proteins assigned to each spatial pattern had a significantly different
   degradation distribution than the distribution of all proteins in the
   experiment (t-test). We note that proteins that are differentially
   localised are no more likely to be targeted for degradation than those
   that are not (see Supplementary Note [252]17).

   Analysis of changes in protein abundance can be used to generate
   turnover rates in both HCMV-infected and mock-infected cells. Comparing
   these turnover rates allows us to calculate the rescue ratio, which
   identifies proteins that exhibit increased degradation during viral
   infection compared to baseline. Specifically, the rescue ratio is
   obtained by comparing abundance during HCMV infection ∓ inhibitor with
   protein abundance during mock infection ∓ inhibitor. Degradation data
   from ref. [253]107 are overlaid as a heatmap, showing a
   [MATH:
   <mo>−</mo><msub><mrow><mi>log</mi></mrow><mrow><mn>10</mn></mrow></msub
   ><mrow><mo>(</mo><mrow><mi>p</mi><mspace
   width="0.25em"></mspace><mstyle><mtext>-value</mtext></mstyle></mrow><m
   o>)</mo></mrow> :MATH]
   for each inhibitor (see Supplementary Figs. [254]37 and [255]39). For
   proteasomal targeted proteins (MG132 inhibitor), the data highlight a
   high number of proteins degraded from the mitochondria. The
   mitochondria act as a signalling platform for apoptosis and innate
   immunity and it is already well-established that HCMV can subvert these
   processes to its advantage^[256]115. Furthermore, there is a high
   degree of protein degradation as one might expect in proteasome
   fractions (dense cytosol), with an enrichment of proteins recruited
   from the ER and cytosol (see Supplementary Fig. [257]37). For lysosomal
   targeted proteins (leupeptin) there was a high degree of proteins
   degraded from the mitochondria, cytosol and plasma membrane. There were
   also several proteins degraded that moved from the cytosol to the dense
   cytosol (see Supplementary Note [258]17).

   Many host proteins are up-or-down regulated upon HCMV
   infection^[259]106. We examine more recent abundance data from ref.
   [260]109 at 24 hpi and first we note that differentially localised
   proteins are not more abundant than spatially stable proteins (see
   Supplementary Fig. [261]43). However, we see a strong spatial pattern
   when we overlay the abundance pattern on a heatmap. In Supplementary
   Fig. [262]36, we report the mean log2 fold change for proteins
   stratified according to predicted subcellular localisation. It is
   important to combine spatial and abundance data, since a differentially
   localised protein may not undergo a true translocation event but rather
   a new pool of proteins is synthesised. Which of these options is true
   could be further investigated by coupling spatial proteomics workflows
   with time-resolved incorporation of stable isotope labelled amino acids
   or azido-homoalanine to interrogate the location of newly synthesized
   proteins. The significance of these abundance changes is highlighted in
   Supplementary Fig. [263]35. For example, there is a significant
   decrease in the abundance of the protein recruited to the dense cytosol
   from the ER (see Supplementary Fig. [264]44). Some of the larger
   changes are not significant because there are too few proteins with the
   same spatial pattern. We note that FAM3C, a protein involved in
   platelet degranulation, is upregulated at 24 hpi. Furthermore, FAM3C
   relocalises from the Golgi to the lysosome, its Golgi localisation is
   in concordance with the Human Protein Atlas (HPA)^[265]7 and its
   lysosome relocalisation suggests that it is trafficked through the
   secretory pathway before undergoing degranulation.

   Upon integration of the acetylation data of ref. [266]109, the spatial
   patterns are much more nuanced (see Supplementary Note [267]18).
   Perhaps surprisingly, we do not observe increased acetylation levels
   amongst differentially localised proteins (see Supplementary
   Fig. [268]47). The only significant pattern is for proteins
   relocalising from the dense cytosol to the cytosol; however, we observe
   this is driven by a single protein Skp1 (see Supplementary
   Fig. [269]48), which shows a 2.5-fold increase in acetylation at 24 hpi
   for Skp1 and there is an increase in its RNA transcript at
   24 hpi^[270]107. The Skp1 protein is part of an E3 ubiquitin ligase
   complex that targets proteins for degradation. E3 ligases are often
   manipulated by viruses in order to control cellular processes to create
   a cell state that benefits viral replication and survival^[271]116. It
   is therefore possible that HCMV is controlling Skp1 activity through
   acetylation at its C-terminus, leading to its translocation and likely
   change in function.

   The recent publication of the HCMV interactome has provided a wealth of
   data that gives insights into the function of the 170 canonical and two
   non-canonical viral protein-coding genes^[272]108. However, a common
   difficulty with analysing large interactome projects is the ability to
   reduce the number of false-positive interactions, leading to poor
   agreement between experimental and computational datasets. This can be
   controlled through replicates, supervised machine learning and
   increased statistical stringency; however, background contamination can
   never be eliminated. If a protein is located in a single location, you
   would expect true positive interactors to be located in the same
   subcellular compartment. Therefore, to narrow the list of viral-protein
   interactors, we overlapped spatial information from Beltran et
   al.^[273]16 with the viral interactors from Nobre et al.^[274]108
   (Fig. [275]6c, Supplementary Figs. [276]51 and [277]52). This provides
   a far more stringent set of high confidence protein–protein
   interactions. Although, this comes at the cost of removing some
   interactions between proteins that are located in more than one
   subcellular location and are therefore absent from the spatial dataset
   from Beltran et al.^[278]16.

   We plot a heatmap to indicate the spatial distribution of the host
   proteins (Fig. [279]6c). The overall distribution is plotted in the
   heatmap of Fig. [280]6b. Firstly, we are interested in scenarios where
   the interacting host proteins were more likely to retain their
   localisation upon HCMV infection (than the computed posterior
   distribution would have predicted). Thus, for each viral bait, we
   simulated from a binomial A ~ Bin(n, p) where p is the posterior
   probability that a random protein was assigned to the same localisation
   and n is the number of interactors of that viral bait. We then
   simulated from this distribution 5000 times to obtain a histogram (see
   Supplementary Note [281]19). Viral baits of interest are those where
   the observed statistic was in the tails of these histograms.

   Examples of such cases are shown for viral proteins UL8 and UL70
   (Supplementary Figs. [282]52 and [283]53). The majority of UL8
   interactors were located in the plasma membrane and cytosol. UL8 is a
   transmembrane protein that is transiently localised at the cell
   surface, with a small cytoplasmic pool^[284]117, perfectly mimicking
   the location of the majority of UL8 interactors. Practically all UL70
   interactors were located in the cytosol. Viral UL70 is a primase known
   to locate to both the nucleus and cytoplasmic compartments during HCMV
   infection^[285]118. As the nucleus was removed prior to fractionation
   then one expects only to be able to interrogate cytosolic interactors.
   An example where the host proteins were spatially diffuse was UL148A an
   elusive viral protein of unknown function, believed to be involved with
   modulating the innate immune response^[286]119. UL148A appears to
   interact with host proteins distributed throughout the cell suggesting
   it is highly promiscuous (Fig. [287]6c). Perhaps UL148A is a protein
   with multiple possible functions^[288]120 making its function hard to
   pinpoint and such an observation would not be uncommon for viral
   proteins because of limited genomic size^[289]121,[290]122. These
   results illustrate the strength in overlapping spatial proteomics with
   interactome studies to decrease the number of false positives and focus
   research on higher confidence protein–protein interactions. The entire
   list of spatially resolved viral protein interactions is shown in
   the [291]Supplementary Material.

Discussion

   We have presented a Bayesian model for comparative and dynamic spatial
   proteomic experiments. Unlike current approaches, our flexible
   integrative mixture model allows any number of replicate experiments to
   be included. Furthermore, subcellular profiles are modelled separately
   for each condition and each replicate, allowing cases where the
   correlation profiles differ between experiments. Crucially, our model
   facilitates the computation of differential localisation probability,
   which cannot be performed by other methods in the literature.
   Furthermore, BANDLE probabilistically assigns proteins to organelles
   and can model outliers meaning that further supervised machine learning
   after application of BANDLE is not required. The probabilistic ranking
   obtained from BANDLE can be used for downstream pathway or GO
   enrichment analysis, likewise it can be mapped onto other orthogonal
   high-throughput datasets.

   We compared BANDLE to the MR approach of refs. [292]17, [293]32. The MR
   method is not as broadly applicable as BANDLE, and BANDLE does not
   require additional experiments to interpret the thresholds. In our
   careful simulation study, we demonstrate reduced Type 1 error and
   increased power when using our approach. In a further simulation, we
   demonstrated that BANDLE has more desirable statistical properties than
   the MR approach, the results are easier to interpret and more
   information is available. Since we are in a Bayesian framework, our
   approach also quantifies uncertainty allowing us to obtain differential
   localisation probabilities.

   Application of our approach to three dynamic and comparative
   mass-spectrometry-based spatial proteomic experiments demonstrates the
   broad applicability of our approach. We validated many previously known
   findings in the literature, placing confidence in these results. When
   BANDLE was applied to EGF stimulation dataset, we saw increased
   correlation between our differential localisation results and a
   phosphoproteomic timecourse than when compared to the results of the MR
   approach.

   We applied BANDLE to an AP-4 knockout dataset to investigate AP-4
   dependent localisation and, as with other studies, we observe SERINC 1
   and SERINC 3 are examples AP-4 of Cargo. Furthermore, we implicate
   TMEM199 as potentially overlooked AP-4 cargo, though it remains to rule
   it out as a potential clonal artefact. We apply BANDLE to a dataset
   where the MR approach is not applicable—an HCMV infection spatial
   proteomic dataset. Pathway and GO enrichment results implicate
   differentially localised protein in well-studied processes of early
   viral infection; such as, membrane trafficking and immune response.

   We then carefully integrated several HCMV proteomic datasets and placed
   a spatial perspective on these data, including proteins targeted for
   degradation, as well as abundance and acetylation dataset. In addition,
   we augment a recent HCMV interactome by placing it in its spatial
   context and note that most host protein interactomes are in the same
   localisation as their viral bait. This provides an excellent resource
   for the community and highlights the benefit of integrating spatial
   proteomics and interactomics datasets.

   Our analysis here highlights the potential role for post-translational
   modifications (PTMs) and their influence on localisation. The current
   datasets are limited because the spatial information is averaged over
   different PTMs. Thus, it is vital to develop methods to obtain spatial
   PTM information and develop corresponding computational tools to
   analyse these data. Furthermore, our approach here can only look at
   pairs of conditions at a time. In the future, more complex spatial
   proteomics designs will be available that will study multiple
   perturbations simultaneously and our approach will be adapted
   accordingly.

   Overall, differential localisation experiments seek to add an
   orthogonal perspective to other assays, such as classical
   high-throughput differential abundance testing. Currently, differential
   localisation has not been extensively explored in high-throughput. We
   hope rigorous statistical methods will spur extensive and illuminating
   applications. An R-package is provided for analysis at
   [294]https://ococrook.github.io/bandle/, building on a suite of
   packages for analysing spatial proteomics data^[295]25,[296]27,[297]31.

Methods

The movement-reproducibility method

   The movement-reproducibility (MR) method was proposed by refs. [298]17,
   [299]32 and this is our interpretation of their method. We suppose that
   we are given two spatial proteomics experiments under a single
   contrast/perturbation/treatment, and denote unperturbed by (d = 1) and
   (d = 2) for the perturbed condition. Furthermore, assume we measure
   each condition with r = 1, . . . , R biological replicates. Let
   [MATH:
   <msub><mrow><mi>X</mi></mrow><mrow><mn>1</mn></mrow></msub><mo>=</mo><m
   row><mo>[</mo><mrow><msubsup><mrow><mi>X</mi></mrow><mrow><mn>1</mn></m
   row><mrow><mrow><mo>(</mo><mrow><mn>1</mn></mrow><mo>)</mo></mrow></mro
   w></msubsup><mo>,</mo><mo>.</mo><mo>.</mo><mo>.</mo><mo>,</mo><mspace
   width="0.16em"></mspace><msubsup><mrow><mi>X</mi></mrow><mrow><mn>1</mn
   ></mrow><mrow><mrow><mo>(</mo><mrow><mi>R</mi></mrow><mo>)</mo></mrow><
   /mrow></msubsup></mrow><mo>]</mo></mrow> :MATH]
   denote the concatenation of replicates for condition 1 and, likewise,
   for condition 2 we denote
   [MATH:
   <msub><mrow><mi>X</mi></mrow><mrow><mn>2</mn></mrow></msub><mo>=</mo><m
   row><mo>[</mo><mrow><msubsup><mrow><mi>X</mi></mrow><mrow><mn>2</mn></m
   row><mrow><mrow><mo>(</mo><mrow><mn>1</mn></mrow><mo>)</mo></mrow></mro
   w></msubsup><mo>,</mo><mo>.</mo><mo>.</mo><mo>.</mo><mo>,</mo><mspace
   width="0.16em"></mspace><msubsup><mrow><mi>X</mi></mrow><mrow><mn>2</mn
   ></mrow><mrow><mrow><mo>(</mo><mrow><mi>R</mi></mrow><mo>)</mo></mrow><
   /mrow></msubsup></mrow><mo>]</mo></mrow> :MATH]
   . We first compute delta matrices as follows
   [MATH: <mi
   mathvariant="normal">Δ</mi><mo>=</mo><msub><mrow><mi>X</mi></mrow><mrow
   ><mn>1</mn></mrow></msub><mo>−</mo><msub><mrow><mi>X</mi></mrow><mrow><
   mn>2</mn></mrow></msub><mo>,</mo> :MATH]
   1

   where Δ = [Δ^(1), . . . , Δ^(R)]. This assumes that both features and
   replicates are comparable in some way; that is, a feature in the rth
   replicate is directly comparable to the same feature in another
   replicate. Then, for each Δ[r], r = 1, . . , R, the squared Mahalanobis
   distance D[M] from each protein to the empirical mean is computed using
   a robust estimate of the covariance matrix—the minimum covariance
   determination method^[300]123. Under a Gaussian assumption on Δ[r],
   D[M](p[i]) follows a chi-squared distribution with degrees of freedom
   equal to the dimension of the data G. Then, for each protein and each
   replicate a p-value is computed, such that there are R such p-values
   for each protein. These p-values are combined into a score by taking
   the cube of the largest p-value for each protein, correcting for
   multiple hypothesis testing using the Benjamini–Hochberg procedure and
   computing the
   [MATH:
   <mo>−</mo><msub><mrow><mi>log</mi></mrow><mrow><mn>10</mn></mrow></msub
   > :MATH]
   of the resultant value. For ref. [301]17, the p-value is not cubed and
   simply the largest p-value is taken. The final score is called the M
   score.

   This process means that the computed value can no longer be interpreted
   as truly derived from a p-value. To maintain this interpretation, one
   could instead combine p-values using Fisher’s method^[302]45.
   Furthermore, the authors are, implicitly, concerned with finding any
   false positives and as such control over the FWER is desired rather
   than the FDR. Since FWER ≥ FDR, control of the FDR does not lead to
   control over the FWER.

   A so-called reproducibility (R) score is obtained by first computing
   the Pearson correlation pairwise between matrices Δ[i], Δ[j], i ≠ j for
   each protein. A final R score, for each protein, is obtained by taking
   the minimum value for each protein. Again this score could be
   interpreted in a formal testing procedure using a permutation
   test^[303]124 and furthermore includes an assumption of bivariate
   normality. Moreover, Pearson’s correlation is unresponsive to many
   non-linear relationships which might be present.

   Finally, each protein has an associated pair of scores, referred to as
   the MR-score. To determine thresholds for these scores the authors take
   a desired FDR = 0.01. Thus they repeat a control experiment 6 times to
   determine thresholds M = 2, R = 0.9: a region with no false
   discoveries.

   Repeating the control experiment 6 times is a costly process and likely
   to be prohibitive for most experiments, particularly for cells that are
   expensive to culture. Furthermore, since the thresholds are empirically
   derived, this process needs to be repeated for every new experiment to
   determine optimal thresholds.

BANDLE

A model for differential localisation

   In the following, we lay out our model for BANDLE, along with methods
   for inference, and approaches for summarising and visualising the
   output. Firstly, suppose we have two spatial proteomics experiments
   with unperturbed (d = 1) and perturbed conditions (d = 2). Furthermore,
   assume we measure each condition with r = 1, . . . , R biological
   replicates. Let
   [MATH:
   <msub><mrow><mi>X</mi></mrow><mrow><mn>1</mn></mrow></msub><mo>=</mo><m
   row><mo>[</mo><mrow><msubsup><mrow><mi>X</mi></mrow><mrow><mn>1</mn></m
   row><mrow><mrow><mo>(</mo><mrow><mn>1</mn></mrow><mo>)</mo></mrow></mro
   w></msubsup><mspace
   width="0.16em"></mspace><mo>,</mo><mo>.</mo><mo>.</mo><mo>.</mo><mo>,</
   mo><mspace
   width="0.16em"></mspace><msubsup><mrow><mi>X</mi></mrow><mrow><mn>1</mn
   ></mrow><mrow><mrow><mo>(</mo><mrow><mi>R</mi></mrow><mo>)</mo></mrow><
   /mrow></msubsup></mrow><mo>]</mo></mrow> :MATH]
   denote the concatenation of replicates for condition 1 and likewise for
   condition 2 denotes
   [MATH:
   <msub><mrow><mi>X</mi></mrow><mrow><mn>2</mn></mrow></msub><mo>=</mo><m
   row><mo>[</mo><mrow><msubsup><mrow><mi>X</mi></mrow><mrow><mn>2</mn></m
   row><mrow><mrow><mo>(</mo><mrow><mn>1</mn></mrow><mo>)</mo></mrow></mro
   w></msubsup><mo>,</mo><mo>.</mo><mo>.</mo><mo>.</mo><mo>,</mo><msubsup>
   <mrow><mi>X</mi></mrow><mrow><mn>2</mn></mrow><mrow><mrow><mo>(</mo><mr
   ow><mi>R</mi></mrow><mo>)</mo></mrow></mrow></msubsup></mrow><mo>]</mo>
   </mrow> :MATH]
   . We introduce the following latent allocation variable z[i,d],
   representing the localisation of protein i in condition d. Thus, if
   z[i,d] = k this means that protein i localises to organelle k in
   dataset d. Given this latent allocation variable, we assume that the
   data from replicate r = 1, . . . , R arises from some component density
   [MATH:
   <mi>F</mi><mrow><mo>(</mo><mrow><mo>⋅</mo><mo>∣</mo><msubsup><mrow><mi>
   θ</mi></mrow><mrow><mi>k</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</
   mi></mrow><mo>)</mo></mrow></mrow></msubsup></mrow><mo>)</mo></mrow>
   :MATH]
   . Hence, letting θ be the set of all component parameters, we can write
   [MATH:
   <msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</mo><mi>d</mi></m
   row><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mro
   w></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>
   ,</mo><mi>d</mi></mrow></msub><mo>,</mo><mspace
   width="0.16em"></mspace><mi>θ</mi><mo>~</mo><mi>F</mi><mfenced
   close=")"
   open="("><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</m
   o><mi>d</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup><mo>∣</mo><msubsup><mrow><mi>θ</mi></mrow><m
   row><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</mo><mi>d</mi></
   mrow></msub></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup></mrow></mfenced><mo>.</mo> :MATH]
   2

   We assume that biological replicates are independent and so we
   factorise as follows
   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>x</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mi>d</mi></mrow></msub><mo>∣</mo><msub><mrow><mi>z</mi
   ></mrow><mrow><mi>i</mi><mo>,</mo><mi>d</mi></mrow></msub><mo>,</mo><ms
   pace
   width="0.16em"></mspace><mi>θ</mi></mrow><mo>)</mo></mrow><mo>=</mo><mu
   nderover accent="false"
   accentunder="false"><mrow><mo>∏</mo></mrow><mrow><mi>r</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>R</mi></mrow></munderover><mi>p</mi><mrow><mo>
   (</mo><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</mo><
   mi>d</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo>
   </mrow></mrow></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><mrow><m
   i>i</mi><mo>,</mo><mi>d</mi></mrow></msub><mo>,</mo><mspace
   width="0.16em"></mspace><msubsup><mrow><mi>θ</mi></mrow><mrow><msub><mr
   ow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</mo><mi>d</mi></mrow></msub><
   /mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></m
   row></msubsup></mrow><mo>)</mo></mrow><mo>.</mo> :MATH]
   3

   To couple the two conditions together we assume a joint prior structure
   for the latent allocation variable in each dataset. To be more precise,
   we construct a prior for the pair (z[i,1], z[i,2]). We fix the possible
   number of subcellular niches to which a protein may localise to be K.
   Now, we introduce the matrix Dirichlet distribution, which we denote as
   [MATH: <mi class="MJX-tex-caligraphic" mathvariant="script">M</mi><mi
   mathvariant="normal">Dir</mi><mrow><mo>(</mo><mrow><mi>α</mi><mo>,</mo>
   <mspace width="0.16em"></mspace><mi>K</mi></mrow><mo>)</mo></mrow>
   :MATH]
   . The concentration parameter α is a K × K matrix, such that for a
   matrix π, the pdf of the matrix Dirichlet distribution is
   [MATH: <mi>f</mi><mrow><mo>(</mo><mrow><mi
   mathvariant="bold-italic">π</mi><mo>∣</mo><mi>α</mi></mrow><mo>)</mo></
   mrow><mo>=</mo><munderover accent="false"
   accentunder="false"><mrow><mo>∏</mo></mrow><mrow><mi>k</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>K</mi></mrow></munderover><mfrac><mrow><mn>1</
   mn></mrow><mrow><mi class="MJX-tex-caligraphic"
   mathvariant="script">B</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>α</mi>
   </mrow><mrow><mi>k</mi></mrow></msub></mrow><mo>)</mo></mrow></mrow></m
   frac><munderover accent="false"
   accentunder="false"><mrow><mo>∏</mo></mrow><mrow><mi>j</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>K</mi></mrow></munderover><msubsup><mrow><mi>π
   </mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow><mrow><msub><mrow><mi>α</m
   i></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><mo>−</mo><mn>1</mn></
   mrow></msubsup><mo>,</mo> :MATH]
   4

   where
   [MATH: <mi class="MJX-tex-caligraphic" mathvariant="script">B</mi>
   :MATH]
   denotes the beta function, α[k] denotes the kth row of α, and
   ∑[j,k] π[jk] = 1. Thus, we propose the following hierarchical structure
   [MATH: <mi
   mathvariant="bold-italic">π</mi><mo>∣</mo><mi>α</mi><mo>~</mo><mi
   class="MJX-tex-caligraphic" mathvariant="script">M</mi><mi
   mathvariant="normal">Dir</mi><mrow><mo>(</mo><mrow><mi>α</mi><mo>,</mo>
   <mspace width="0.16em"></mspace><mi>K</mi></mrow><mo>)</mo></mrow>
   :MATH]
   5
   [MATH:
   <mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>
   ,</mo><mn>1</mn></mrow></msub><mo>,</mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><m
   o>,</mo><mn>2</mn></mrow></msub></mrow><mo>)</mo></mrow><mo>~</mo><mi>c
   </mi><mi>a</mi><mi>t</mi><mrow><mo>(</mo><mrow><mi
   mathvariant="bold-italic">π</mi></mrow><mo>)</mo></mrow><mo>,</mo>
   :MATH]
   6

   where (z[i,1], z[i,2]) ~ cat(π) means that the prior allocation
   probabilities are given by
   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><mi>k</mi><mo>,</mo><
   mspace
   width="0.16em"></mspace><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><m
   o>,</mo><mn>2</mn></mrow></msub><mo>=</mo><msup><mrow><mi>k</mi></mrow>
   <mrow><mo>′</mo></mrow></msup><mo>∣</mo><mi
   mathvariant="bold-italic">π</mi></mrow><mo>)</mo></mrow><mo>=</mo><msub
   ><mrow><mi>π</mi></mrow><mrow><mi>k</mi><msup><mrow><mi>k</mi></mrow><m
   row><mo>′</mo></mrow></msup></mrow></msub><mo>.</mo> :MATH]
   7

   The above model is conjugate, and so if
   [MATH:
   <msub><mrow><mi>n</mi></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi></mrow
   ></msub><mo>=</mo><mo>∣</mo><mfenced close="}"
   open="{"><mrow><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow
   ><mi>i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>,</mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><m
   o>,</mo><mn>2</mn></mrow></msub></mrow><mo>)</mo></mrow><mo>=</mo><mrow
   ><mo>(</mo><mrow><mi>j</mi><mo>,</mo><mspace
   width="0.16em"></mspace><mi>k</mi></mrow><mo>)</mo></mrow></mrow></mfen
   ced><mo>∣</mo> :MATH]
   , it follows that the conditional posterior of π is
   [MATH: <mi
   mathvariant="bold-italic">π</mi><mo>∣</mo><mrow><mo>(</mo><mrow><msub><
   mrow><mi>Z</mi></mrow><mrow><mn>1</mn></mrow></msub><mo>,</mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>Z</mi></mrow><mrow><mn>2</mn></
   mrow></msub></mrow><mo>)</mo></mrow><mo>,</mo><mi>α</mi><mo>~</mo><mi
   class="MJX-tex-caligraphic" mathvariant="script">M</mi><mi
   mathvariant="normal">Dir</mi><mrow><mo>(</mo><mrow><mi>γ</mi><mo>,</mo>
   <mspace
   width="0.16em"></mspace><mi>K</mi></mrow><mo>)</mo></mrow><mo>,</mo>
   :MATH]
   8

   where γ[j,k] = α[jk] + n[j,k]. The likelihood models for the data are
   Gaussian Random Fields, which we elaborate on in the following section.
   Hence, the conditional posterior of the allocation probabilities are
   given by
   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><mi>j</mi><mo>,</mo><
   msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</mo><mn>2</mn></mrow>
   </msub><mo>=</mo><mi>k</mi><mo>∣</mo><mi
   mathvariant="bold-italic">π</mi></mrow><mo>)</mo></mrow><mo>∝</mo><msub
   ><mrow><mi>π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><munder
   over accent="false"
   accentunder="false"><mrow><mo>∏</mo></mrow><mrow><mi>r</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>R</mi></mrow></munderover><mi>p</mi><mrow><mo>
   (</mo><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</mo><
   mn>1</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo>
   </mrow></mrow></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><mrow><m
   i>i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><mi>j</mi></mrow><m
   o>)</mo></mrow><mi>p</mi><mrow><mo>(</mo><mrow><msubsup><mrow><mi>x</mi
   ></mrow><mrow><mi>i</mi><mo>,</mo><mn>2</mn></mrow><mrow><mrow><mo>(</m
   o><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mo>∣</mo><m
   sub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</mo><mn>2</mn></mrow><
   /msub><mo>=</mo><mi>k</mi></mrow><mo>)</mo></mrow><mo>.</mo> :MATH]
   9

Likelihood model

   The model described in the previous section is presented in a general
   form, so it could be applied to many different modes of data. We
   describe the model for a single spatial proteomics experiment, since
   the same model is assumed across all spatial proteomics experiments
   that are then subsequently joined together using the approach in the
   previous section. Though the model is the same across experiments, the
   parameters are experiment-specific.

   We assume that the protein intensity x[i] at each fraction s[j] can be
   described by some regression model with unknown regression function:
   [MATH:
   <msub><mrow><mi>x</mi></mrow><mrow><mi>i</mi></mrow></msub><mrow><mo>(<
   /mo><mrow><msub><mrow><mi>s</mi></mrow><mrow><mi>j</mi></mrow></msub></
   mrow><mo>)</mo></mrow><mo>=</mo><msub><mrow><mi>μ</mi></mrow><mrow><mi>
   i</mi></mrow></msub><mrow><mo>(</mo><mrow><msub><mrow><mi>s</mi></mrow>
   <mrow><mi>j</mi></mrow></msub></mrow><mo>)</mo></mrow><mo>+</mo><msub><
   mrow><mi>ε</mi></mrow><mrow><mi>i</mi><mi>j</mi></mrow></msub><mo>,</mo
   > :MATH]
   10

   where μ[i] is some unknown deterministic function of space and ε[ij] is
   a noise variable, which we assume is
   [MATH:
   <msub><mrow><mi>ε</mi></mrow><mrow><mi>i</mi><mi>j</mi></mrow></msub><m
   o>~</mo><mi class="MJX-tex-caligraphic"
   mathvariant="script">N</mi><mrow><mo>(</mo><mrow><mn>0</mn><mo>,</mo><m
   space
   width="0.16em"></mspace><msubsup><mrow><mi>σ</mi></mrow><mrow><mi>i</mi
   ></mrow><mrow><mn>2</mn></mrow></msubsup></mrow><mo>)</mo></mrow>
   :MATH]
   . Proteins are grouped together according to their subcellular
   localisation; such that, all proteins associated to subcellular niche
   k = 1, . . . , K share the same regression model. Hence, we write
   μ[i] = μ[k] and σ[i] = σ[k]. Throughout, for clarity, we refer to
   subcellular structures, whether they are organelles, vesicles or large
   protein complexes, as components. The regression functions μ[k] are
   unknown and thus we place priors over these functions to represent our
   prior uncertainty. Protein intensities are spatially correlated and
   thus we place Gaussian random field (GRF) priors over these regression
   functions. We pedantically refer to these as GRF priors rather than
   Gaussian process (GP) priors to make the distinction between the 1D
   spatial process that separates subcellular niches and the experimental
   cellular perturbations, which are potentially temporal in nature.
   Hence, we write the following
   [MATH:
   <msub><mrow><mi>μ</mi></mrow><mrow><mi>k</mi></mrow></msub><mo>~</mo><m
   i>G</mi><mi>R</mi><mi>F</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>m</mi
   ></mrow><mrow><mi>k</mi></mrow></msub><mrow><mo>(</mo><mrow><mi
   mathvariant="bold-italic">s</mi></mrow><mo>)</mo></mrow><mo>,</mo><mspa
   ce
   width="0.16em"></mspace><msub><mrow><mi>C</mi></mrow><mrow><mi>k</mi></
   mrow></msub><mrow><mo>(</mo><mrow><mi
   mathvariant="bold-italic">s</mi><mo>,</mo><mspace
   width="0.16em"></mspace><msup><mrow><mi
   mathvariant="bold-italic">s</mi></mrow><mrow><mo>′</mo></mrow></msup></
   mrow><mo>)</mo></mrow></mrow><mo>)</mo></mrow><mo>,</mo> :MATH]
   11

   which is defined as:

   Definition 1. Gaussian Random Field

   If
   [MATH: <mi>μ</mi><mrow><mo>(</mo><mrow><mi
   mathvariant="bold-italic">s</mi></mrow><mo>)</mo></mrow><mo>~</mo><mi>G
   </mi><mi>R</mi><mi>F</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>m</mi></
   mrow><mrow><mi>k</mi></mrow></msub><mrow><mo>(</mo><mrow><mi
   mathvariant="bold-italic">s</mi></mrow><mo>)</mo></mrow><mo>,</mo><msub
   ><mrow><mi>C</mi></mrow><mrow><mi>k</mi></mrow></msub><mrow><mo>(</mo><
   mrow><mi mathvariant="bold-italic">s</mi><mo>,</mo><msup><mrow><mi
   mathvariant="bold-italic">s</mi></mrow><mrow><mo>′</mo></mrow></msup></
   mrow><mo>)</mo></mrow></mrow><mo>)</mo></mrow> :MATH]
   then for any finite dimensional collection of indices
   s[1], . . . , s[n],
   [MATH: <mfenced close="]"
   open="["><mrow><mi>μ</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>s</mi></
   mrow><mrow><mn>1</mn></mrow></msub></mrow><mo>)</mo></mrow><mo>,</mo><m
   o>.</mo><mo>.</mo><mo>.</mo><mo>,</mo><mi>μ</mi><mrow><mo>(</mo><mrow><
   msub><mrow><mi>s</mi></mrow><mrow><mi>n</mi></mrow></msub></mrow><mo>)<
   /mo></mrow></mrow></mfenced> :MATH]
   is multivariate Gaussian with mean
   [MATH: <mfenced close="]"
   open="["><mrow><mi>m</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>s</mi></
   mrow><mrow><mn>1</mn></mrow></msub></mrow><mo>)</mo></mrow><mo>,</mo><m
   o>.</mo><mo>.</mo><mo>.</mo><mo>,</mo><mi>m</mi><mrow><mo>(</mo><mrow><
   msub><mrow><mi>s</mi></mrow><mrow><mi>n</mi></mrow></msub></mrow><mo>)<
   /mo></mrow></mrow></mfenced> :MATH]
   and covariance matrix such that C[ij] = C(s[i], s[j]).

   Thus, each component is captured by a Gaussian Random Field model and
   the full complement of proteins as a finite mixture of GRF models. The
   protein intensity for each experiment might be measured in replicates.
   For a sufficiently flexible model, we allow different regression models
   across different replicates. To be more precise, consider the protein
   intensity
   [MATH:
   <msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi></mrow><mrow><mrow><mo>
   (</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mrow></msubsup> :MATH]
   for the ith protein measured in replicate r at fraction
   [MATH:
   <msubsup><mrow><mi>s</mi></mrow><mrow><mi>j</mi></mrow><mrow><mrow><mo>
   (</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mrow></msubsup> :MATH]
   , then we can write the following
   [MATH:
   <msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi></mrow><mrow><mrow><mo>
   (</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mfenced
   close=")"
   open="("><mrow><msubsup><mrow><mi>s</mi></mrow><mrow><mi>j</mi></mrow><
   mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mrow></m
   subsup></mrow></mfenced><mo>=</mo><msubsup><mrow><mi>μ</mi></mrow><mrow
   ><mi>k</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</m
   o></mrow></mrow></msubsup><mfenced close=")"
   open="("><mrow><msubsup><mrow><mi>s</mi></mrow><mrow><mi>j</mi></mrow><
   mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mrow></m
   subsup></mrow></mfenced><mo>+</mo><msubsup><mrow><mi>ε</mi></mrow><mrow
   ><mi>i</mi><mi>j</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mro
   w><mo>)</mo></mrow></mrow></msubsup><mo>,</mo> :MATH]
   12

   having assumed that the ith protein is associated to the kth component.
   The (hyper)parameters for the Gaussian Random Field priors for the rth
   replicate in experiment d are denoted by
   [MATH:
   <msubsup><mrow><mi>θ</mi></mrow><mrow><mi>k</mi><mo>,</mo><mi>d</mi></m
   row><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mro
   w></msubsup> :MATH]
   . We denote by θ the collection of all hyperparameters and the
   collection of priors for these hyperparameters by G[0](θ). The loss of
   conjugacy between the prior on the hyperparameters and likelihood is
   unavoidable.

   The GRF is used to model the uncertainty in the underlying regression
   functions; however, we have yet to consider the uncertainty that a
   protein belongs to each of these components. To capture these
   uncertainties, we can use the model in the previous section, allowing
   information to be shared across each condition. Following from the
   previous section, the conditional posterior of the allocation
   probabilities is
   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><mi>j</mi><mo>,</mo><
   msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</mo><mn>2</mn></mrow>
   </msub><mo>=</mo><mi>k</mi><mo>∣</mo><mi
   mathvariant="bold-italic">π</mi></mrow><mo>)</mo></mrow><mo>∝</mo><msub
   ><mrow><mi>π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><munder
   over accent="false"
   accentunder="false"><mrow><mo>∏</mo></mrow><mrow><mi>r</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>R</mi></mrow></munderover><mi>p</mi><mfenced
   close=")"
   open="("><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</m
   o><mn>1</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><mrow
   ><mi>i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><mi>j</mi></mrow
   ></mfenced><mi>p</mi><mfenced close=")"
   open="("><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</m
   o><mn>2</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><mrow
   ><mi>i</mi><mo>,</mo><mn>2</mn></mrow></msub><mo>=</mo><mi>k</mi></mrow
   ></mfenced><mo>,</mo> :MATH]
   13

   where, in the specific case of our likelihood model the probabilities
   in the terms of the product can be computed using the appropriate GRF.

   We assume that our GRFs are centred and that the covariance is from the
   Matern class^[304]125. The Matern covariance is specified as follows
   [MATH:
   <msub><mrow><mi>C</mi></mrow><mrow><mi>v</mi></mrow></msub><mrow><mo>(<
   /mo><mrow><mi>d</mi></mrow><mo>)</mo></mrow><mo>=</mo><msup><mrow><mi>a
   </mi></mrow><mrow><mn>2</mn></mrow></msup><mfrac><mrow><msup><mrow><mn>
   2</mn></mrow><mrow><mn>1</mn><mo>−</mo><mi>ν</mi></mrow></msup></mrow><
   mrow><mi
   mathvariant="normal">Γ</mi><mrow><mo>(</mo><mrow><mi>ν</mi></mrow><mo>)
   </mo></mrow></mrow></mfrac><msup><mrow><mfenced close=")"
   open="("><mrow><msqrt><mrow><mn>8</mn><mi>ν</mi></mrow></msqrt><mfrac><
   mrow><mi>d</mi></mrow><mrow><mi>ρ</mi></mrow></mfrac></mrow></mfenced><
   /mrow><mrow><mi>ν</mi></mrow></msup><msub><mrow><mi
   class="MJX-tex-caligraphic"
   mathvariant="script">K</mi></mrow><mrow><mi>v</mi></mrow></msub><mfence
   d close=")"
   open="("><mrow><msqrt><mrow><mn>8</mn><mi>ν</mi></mrow></msqrt><mfrac><
   mrow><mi>d</mi></mrow><mrow><mi>ρ</mi></mrow></mfrac></mrow></mfenced><
   mo>,</mo> :MATH]
   14

   where Γ is the gamma function and
   [MATH: <msub><mrow><mi class="MJX-tex-caligraphic"
   mathvariant="script">K</mi></mrow><mrow><mi>v</mi></mrow></msub> :MATH]
   denotes the modified Bessel function of the second kind of order ν > 0.
   Furthermore, a and ρ are positive parameters of the covariance. a^2 is
   interpreted as a marginal variance, while the non-standard choice of
   [MATH: <msqrt><mrow><mn>8</mn><mi>ν</mi></mrow></msqrt> :MATH]
   in the definition of the Matern covariance, allows us to interpret ρ as
   a range parameter and thus ρ is the distance at which the correlation
   is 0.1 for any ν^[305]126. The Matern covariance arises from solutions
   of the following linear fractional stochastic partial differential
   equation (SPDE):
   [MATH:
   <msup><mrow><mrow><mo>(</mo><mrow><msup><mrow><mi>κ</mi></mrow><mrow><m
   n>2</mn></mrow></msup><mo>−</mo><mi
   mathvariant="normal">Δ</mi></mrow><mo>)</mo></mrow></mrow><mrow><mi>α</
   mi><mo>/</mo><mn>2</mn></mrow></msup><mi>x</mi><mrow><mo>(</mo><mrow><m
   i>u</mi></mrow><mo>)</mo></mrow><mo>=</mo><mi
   class="MJX-tex-caligraphic"
   mathvariant="script">W</mi><mrow><mo>(</mo><mrow><mi>u</mi></mrow><mo>)
   </mo></mrow><mo>,</mo><mspace
   width="0.16em"></mspace><mi>u</mi><mo>∈</mo><msup><mrow><mi
   mathvariant="double-struck">R</mi></mrow><mrow><mi>d</mi></mrow></msup>
   <mspace width="0.25em"></mspace><mspace
   width="0.25em"></mspace><mi>α</mi><mo>=</mo><mi>ν</mi><mo>+</mo><mi>d</
   mi><mo>/</mo><mn>2</mn><mo>,</mo><mspace
   width="0.25em"></mspace><mspace width="0.25em"></mspace><mspace
   width="0.25em"></mspace><mi>κ</mi><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mn>0</mn><mo>,</mo><mspace
   width="0.16em"></mspace><mi>ν</mi><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mn>0</mn><mo>,</mo> :MATH]
   15

   where
   [MATH: <mi class="MJX-tex-caligraphic"
   mathvariant="script">W</mi><mrow><mo>(</mo><mrow><mi>u</mi></mrow><mo>)
   </mo></mrow> :MATH]
   is spatial Gaussian white noise with unit variance and Δ is the
   Laplacian. The parameter ν controls the differentiability of the
   resulting sample paths; such that, ⌈v⌉ is the number of mean-square
   derivatives. For typical applications, ν is poorly identifiable and
   fixed; ν = 1/2 recovers the exponential covariance, whereas taking the
   limit ν → ∞ one obtains the squared exponential (Gaussian) covariance.
   We fix ν = 2.

   A ridge in the marginal likelihood for the marginal variance and range
   parameters of the Matern covariance makes inference challenging.
   Indeed, different hyperparameters lead to unconditional prior
   simulations with the same spatial pattern but different
   scales^[306]127,[307]128. Furthermore, when the intrinsic dimension of
   the Gaussian random field is less than four, there is no consistent
   estimator under in-fill asymptotics for ρ and a. A principled prior,
   which allows domain expertise to be expressed, is thus desired to
   enable stable inferences. A number of works considered reference priors
   for GRFs^[308]129–[309]132. Here, we employ a recently introduced
   collection of weakly-informative penalised complexity (PC) priors,
   which we explain in the next section.

Penalised complexity priors for GRFs

   Here, we briefly described the PC priors used for the hyperparameters
   of the GRF models. Recall that ν models the smoothness and is fixed at
   2. The idea behind the PC prior is to shrink the model towards a
   simpler model of lower complexity. In the case of GRF, these are models
   that cannot excessively curve, choosing to explain high frequency
   fluctuations with a wide variance. Fuglstad et al.^[310]128 derive the
   appropriate PC prior as the following on the amplitude a and the
   length-scale/range ρ:
   [MATH:
   <mi>π</mi><mrow><mo>(</mo><mrow><mi>a</mi><mo>,</mo><mi>ρ</mi></mrow><m
   o>)</mo></mrow><mo>=</mo><mfrac><mrow><msub><mrow><mi>λ</mi></mrow><mro
   w><mn>1</mn></mrow></msub><msub><mrow><mi>λ</mi></mrow><mrow><mn>2</mn>
   </mrow></msub></mrow><mrow><mn>2</mn></mrow></mfrac><msup><mrow><mi>ρ</
   mi></mrow><mrow><mo>−</mo><mn>3</mn><mo>/</mo><mn>2</mn></mrow></msup><
   mi>exp</mi><mfenced close=")"
   open="("><mrow><mo>−</mo><msub><mrow><mi>λ</mi></mrow><mrow><mn>1</mn><
   /mrow></msub><msup><mrow><mi>ρ</mi></mrow><mrow><mo>−</mo><mn>1</mn><mo
   >/</mo><mn>2</mn></mrow></msup><mo>−</mo><msub><mrow><mi>λ</mi></mrow><
   mrow><mn>2</mn></mrow></msub><mi>a</mi></mrow></mfenced><mo>,</mo>
   :MATH]
   16

   where λ[1] and λ[2] are hyperparameters that control shrinkage towards
   the simpler model. Further details are found in the [311]Supplementary
   Methods. As a default, λ[1] = 10 and λ[2] = 60 and can be assessed by
   visual prior predictive checks^[312]133. The defaults were used
   throughout except for the simulated examples and the Ithzak dataset for
   which λ[1] = 0.05 was chosen by visual assessment.

Penalised complexity prior for the noise model

   The noise effect is distributed according to
   [MATH:
   <msub><mrow><mi>ϵ</mi></mrow><mrow><mi>i</mi><mi>j</mi></mrow></msub><m
   o>~</mo><mi class="MJX-tex-caligraphic"
   mathvariant="script">N</mi><mrow><mo>(</mo><mrow><mn>0</mn><mo>,</mo><m
   space
   width="0.16em"></mspace><msubsup><mrow><mi>σ</mi></mrow><mrow><mi>k</mi
   ></mrow><mrow><mn>2</mn></mrow></msubsup></mrow><mo>)</mo></mrow>
   :MATH]
   for k = 1, . . . , K. We additionally choose a PC prior in this
   scenario, first we reparametrize in terms of a precision
   [MATH:
   <msub><mrow><mi>τ</mi></mrow><mrow><mi>k</mi></mrow></msub><mo>=</mo><m
   n>1</mn><mo>/</mo><msubsup><mrow><mi>σ</mi></mrow><mrow><mi>k</mi></mro
   w><mrow><mn>2</mn></mrow></msubsup> :MATH]
   for k = 1, . . . , K. Then appealing to^[313]134 the PC prior is a
   type-2 Gumbel distribution:
   [MATH:
   <mi>π</mi><mrow><mo>(</mo><mrow><mi>τ</mi></mrow><mo>)</mo></mrow><mo>=
   </mo><mfrac><mrow><msub><mrow><mi>λ</mi></mrow><mrow><mn>3</mn></mrow><
   /msub></mrow><mrow><mn>2</mn></mrow></mfrac><msup><mrow><mi>τ</mi></mro
   w><mrow><mo>−</mo><mn>3</mn><mo>/</mo><mn>2</mn></mrow></msup><mi>exp</
   mi><mrow><mo>(</mo><mrow><mo>−</mo><msub><mrow><mi>λ</mi></mrow><mrow><
   mn>3</mn></mrow></msub><msup><mrow><mi>τ</mi></mrow><mrow><mo>−</mo><mn
   >1</mn><mo>/</mo><mn>2</mn></mrow></msup></mrow><mo>)</mo></mrow><mo>.<
   /mo> :MATH]
   17

   The PC prior in this case shrinks towards zero variance. More details
   are found in the [314]Supplementary Methods. As a default, λ[3] = 250
   and can be assessed by visual prior predictive checks. For the Davies
   dataset λ[3] = 200, and for simulated examples and Itzhak dataset
   λ[3] = 100, which were chosen by visual assessment.

Modelling outliers

   As shown in previous work, some proteins are not well described by any
   of the annotated components^[315]30. This could be because of
   undiscovered biological novelty, poor protein quantitation, the protein
   could reside in a yet to be described subcellular component or in
   multiple annotated compartments. To alleviate this issue we augment our
   model with an additional outlier component. We introduce a latent
   binary indicator ϕ[i,d] to denote whether protein i is better modelled
   as belonging to a known subcellular niche (ϕ[i,d] = 1) or a disperse
   outlier component (ϕ[i,d] = 0) in dataset d. Since an indicator can
   only take two values, it has a Bernoulli distribution and so we write
   p[0](ϕ[i,d] = 0) = ε[d]. As in previous work, we model the density of
   the outlier component by a student’s-t distribution with degrees of
   freedom 4, mean equal to the empirical mean and the covariance to be
   the empirical covariance of the data. We also assume the covariance
   matrix to be diagonal. Finally, we place a Beta prior on
   ϵ[d] ~ B(u[d], v[d]) allowing us to specify a prior number of outliers.
   We opt for a previously recommended weakly-informative prior u[d] = 2
   and v[d] = 10 for d = 1, 2^[316]30. The relevant conditional
   distributions for MCMC sampling are found in the [317]Supplementary
   Methods.

Bandle in hierarchical model notation

   To summarise the specification of the model, we display the bandle
   model in the following Bayesian Hierarchical model:
   [MATH:
   <msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</mo><mi>d</mi></m
   row><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mro
   w></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>
   ,</mo><mi>d</mi></mrow></msub><mo>=</mo><mi>k</mi><mo>,</mo><mi>θ</mi><
   mo>,</mo><msub><mrow><mi>ϕ</mi></mrow><mrow><mi>i</mi><mo>,</mo><mi>d</
   mi></mrow></msub><mo>=</mo><mn>1</mn><mo>~</mo><msubsup><mrow><mi>μ</mi
   ></mrow><mrow><mi>k</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></
   mrow><mo>)</mo></mrow></mrow></msubsup><mrow><mo>(</mo><mrow><msub><mro
   w><mi>s</mi></mrow><mrow><mi>j</mi></mrow></msub></mrow><mo>)</mo></mro
   w><mo>+</mo><msubsup><mrow><mi>ϵ</mi></mrow><mrow><mi>k</mi><mi>j</mi><
   /mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></m
   row></msubsup><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</mo
   ><mi>d</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</m
   o></mrow></mrow></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><mrow>
   <mi>i</mi><mo>,</mo><mi>d</mi></mrow></msub><mo>=</mo><mi>k</mi><mo>,</
   mo><mi>θ</mi><mo>,</mo><msub><mrow><mi>ϕ</mi></mrow><mrow><mi>i</mi><mo
   >,</mo><mi>d</mi></mrow></msub><mo>=</mo><mn>0</mn><mo>~</mo><mi
   class="MJX-tex-caligraphic"
   mathvariant="script">T</mi><mrow><mo>(</mo><mrow><mn>4</mn><mo>,</mo><m
   i>M</mi><mo>,</mo><mi>V</mi></mrow><mo>)</mo></mrow><msubsup><mrow><mi>
   μ</mi></mrow><mrow><mi>k</mi><mo>,</mo><mi>d</mi></mrow><mrow><mrow><mo
   >(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mo>~</
   mo><mi>G</mi><mi>R</mi><mi>F</mi><mrow><mo>(</mo><mrow><msubsup><mrow><
   mi>m</mi></mrow><mrow><mi>k</mi><mo>,</mo><mi>d</mi></mrow><mrow><mrow>
   <mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mro
   w><mo>(</mo><mrow><mi
   mathvariant="bold-italic">s</mi></mrow><mo>)</mo></mrow><mo>,</mo><msub
   sup><mrow><mi>C</mi></mrow><mrow><mi>k</mi><mo>,</mo><mi>d</mi></mrow><
   mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></mrow></m
   subsup><mrow><mo>(</mo><mrow><mi
   mathvariant="bold-italic">s</mi><mo>,</mo><msup><mrow><mi
   mathvariant="bold-italic">s</mi></mrow><mrow><mo>′</mo></mrow></msup></
   mrow><mo>)</mo></mrow></mrow><mo>)</mo></mrow><msubsup><mrow><mi>ϵ</mi>
   </mrow><mrow><mi>k</mi><mi>j</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi
   >r</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mo>~</mo><mi
   class="MJX-tex-caligraphic"
   mathvariant="script">N</mi><mrow><mo>(</mo><mrow><mn>0</mn><mo>,</mo><m
   subsup><mrow><mi>σ</mi></mrow><mrow><mi>r</mi><mo>,</mo><mi>k</mi></mro
   w><mrow><mn>2</mn></mrow></msubsup></mrow><mo>)</mo></mrow><mn>1</mn><m
   o>/</mo><msubsup><mrow><mi>σ</mi></mrow><mrow><mi>r</mi><mo>,</mo><mi>k
   </mi></mrow><mrow><mn>2</mn></mrow></msubsup><mo>~</mo><mspace
   width="0.25em"></mspace><mstyle><mtext>Type-2 Gumbel</mtext></mstyle><m
   space
   width="0.25em"></mspace><mrow><mo>(</mo><mrow><msub><mrow><mi>λ</mi></m
   row><mrow><mn>3</mn></mrow></msub></mrow><mo>)</mo></mrow><mrow><mo>(</
   mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</mo><mn>1</
   mn></mrow></msub><mo>,</mo><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi
   ><mo>,</mo><mn>2</mn></mrow></msub></mrow><mo>)</mo></mrow><mo>~</mo><m
   i>c</mi><mi>a</mi><mi>t</mi><mrow><mo>(</mo><mrow><mi
   mathvariant="bold-italic">π</mi></mrow><mo>)</mo></mrow><mi>π</mi><mo>∣
   </mo><mi>α</mi><mo>~</mo><mi class="MJX-tex-caligraphic"
   mathvariant="script">M</mi><mi>D</mi><mi>i</mi><mi>r</mi><mrow><mo>(</m
   o><mrow><mi>α</mi><mo>,</mo><mi>K</mi></mrow><mo>)</mo></mrow><msub><mr
   ow><mi>ϕ</mi></mrow><mrow><mi>i</mi><mo>,</mo><mi>d</mi></mrow></msub><
   mo>~</mo><mi>B</mi><mi>e</mi><mi>r</mi><mrow><mo>(</mo><mrow><msub><mro
   w><mi>ε</mi></mrow><mrow><mi>d</mi></mrow></msub></mrow><mo>)</mo></mro
   w><msub><mrow><mi>ε</mi></mrow><mrow><mi>d</mi></mrow></msub><mo>~</mo>
   <mi>B</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>u</mi></mrow><mrow><mi>
   d</mi></mrow></msub><mo>,</mo><msub><mrow><mi>v</mi></mrow><mrow><mi>d<
   /mi></mrow></msub></mrow><mo>)</mo></mrow><msub><mrow><mi>C</mi></mrow>
   <mrow><mi>v</mi></mrow></msub><mrow><mo>(</mo><mrow><mi>δ</mi></mrow><m
   o>)</mo></mrow><mo>=</mo><msup><mrow><mi>a</mi></mrow><mrow><mn>2</mn><
   /mrow></msup><mfrac><mrow><msup><mrow><mn>2</mn></mrow><mrow><mn>1</mn>
   <mo>−</mo><mi>ν</mi></mrow></msup></mrow><mrow><mi
   mathvariant="normal">Γ</mi><mrow><mo>(</mo><mrow><mi>ν</mi></mrow><mo>)
   </mo></mrow></mrow></mfrac><msup><mrow><mfenced close=")"
   open="("><mrow><msqrt><mrow><mn>8</mn><mi>ν</mi></mrow></msqrt><mfrac><
   mrow><mi>δ</mi></mrow><mrow><mi>ρ</mi></mrow></mfrac></mrow></mfenced><
   /mrow><mrow><mi>ν</mi></mrow></msup><msub><mrow><mi
   class="MJX-tex-caligraphic"
   mathvariant="script">K</mi></mrow><mrow><mi>v</mi></mrow></msub><mfence
   d close=")"
   open="("><mrow><msqrt><mrow><mn>8</mn><mi>ν</mi></mrow></msqrt><mfrac><
   mrow><mi>δ</mi></mrow><mrow><mi>ρ</mi></mrow></mfrac></mrow></mfenced><
   msubsup><mrow><mi>a</mi></mrow><mrow><mi>k</mi><mo>,</mo><mi>d</mi></mr
   ow><mrow><mrow><mo>(</mo><mrow><mi>d</mi></mrow><mo>)</mo></mrow></mrow
   ></msubsup><mo>,</mo><msubsup><mrow><mi>ρ</mi></mrow><mrow><mi>k</mi><m
   o>,</mo><mi>d</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>d</mi></mrow><
   mo>)</mo></mrow></mrow></msubsup><mo>~</mo><mi>P</mi><mi>C</mi><mrow><m
   o>(</mo><mrow><msub><mrow><mi>λ</mi></mrow><mrow><mn>1</mn></mrow></msu
   b><mo>,</mo><msub><mrow><mi>λ</mi></mrow><mrow><mn>2</mn></mrow></msub>
   </mrow><mo>)</mo></mrow><mo>.</mo> :MATH]
   18

   In the above equation
   [MATH: <mi class="MJX-tex-caligraphic" mathvariant="script">T</mi>
   :MATH]
   denotes the student-t distribution, Ber denotes the Bernoulli
   distribution, B the Beta distribution,
   [MATH: <msub><mrow><mi class="MJX-tex-caligraphic"
   mathvariant="script">K</mi></mrow><mrow><mi>v</mi></mrow></msub> :MATH]
   the Bessel function of the 2nd kind with parameter v and PC denotes the
   penalised complexity prior for a GRF. The other notation is as
   described in the previous section. For details of the Pólya-Gamma-based
   model, we refer the [318]Supplementary Methods and Supplementary
   Note [319]12.

Major algorithmic steps of bandle

   The complex notation in the previous section can be cumbersome and so
   here, we summarise the major steps of bandle in algorithmic steps:
    1. First, for each subcellular niche k in each dataset d and replicate
       r, learn the GRFs and corresponding hyperparameters a and ρ by
       maximum a posteriori estimation for a pre-specified λ[1], λ[2], as
       well as the variance parameter σ^2 for pre-specified λ[3].
       λ[1], λ[2] and λ[3] can be selected using prior predictive checks
       or default choices.
    2. Select α, potentially using a prior predictive check, expert
       knowledge or default choices.
    3. For T Monte-Carlo iterations perform the following steps ((a)–(i)).
       Note that in each case the currently sampled parameters are used in
       the following step and probabilities are conditioned on these
       sampled parameters. The dependence on T and these sampled
       parameters is suppressed to avoid notational clutter.
         1. Compute the likelihood of each protein i belonging to
            subcellular niche k in replicate r for dataset d. That is for
            k = 1, . . . , K, i = 1, . . . , n, d = 1, 2 and
            r = 1, . . . , R, compute

   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>x</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mi>d</mi></mrow></msub><mo>∣</mo><msub><mrow><mi>z</mi
   ></mrow><mrow><mi>i</mi><mo>,</mo><mi>d</mi></mrow></msub><mo>=</mo><mi
   >k</mi><mo>,</mo><mspace
   width="0.16em"></mspace><mi>θ</mi></mrow><mo>)</mo></mrow><mo>=</mo><mu
   nderover accent="false"
   accentunder="false"><mrow><mo>∏</mo></mrow><mrow><mi>r</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>R</mi></mrow></munderover><mi>p</mi><mrow><mo>
   (</mo><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</mo><
   mi>d</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo>
   </mrow></mrow></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><mrow><m
   i>i</mi><mo>,</mo><mi>d</mi></mrow></msub><mo>=</mo><mi>k</mi><mo>,</mo
   ><mspace
   width="0.16em"></mspace><msubsup><mrow><mi>θ</mi></mrow><mrow><msub><mr
   ow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</mo><mi>d</mi></mrow></msub><
   /mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo></mrow></m
   row></msubsup></mrow><mo>)</mo></mrow><mo>.</mo> :MATH]
            19
         2. Sample from the conditional posterior distribution of π.
            Letting Z be the set of allocation of all proteins to all
            niches and
            [MATH:
            <msup><mrow><mi>α</mi></mrow><mrow><mo>′</mo></mrow></msup>
            :MATH]
            be the matrix which tabulates the total allocation to each
            pair of niches, then:

   [MATH: <msup><mrow><mi
   mathvariant="bold-italic">π</mi></mrow><mrow><mo>′</mo></mrow></msup><m
   o>∣</mo><mi>α</mi><mo>,</mo><mi>Z</mi><mo>~</mo><mi
   class="MJX-tex-caligraphic"
   mathvariant="script">M</mi><mi>D</mi><mi>i</mi><mi>r</mi><mrow><mo>(</m
   o><mrow><mi>α</mi><mo>+</mo><msup><mrow><mi>α</mi></mrow><mrow><mo>′</m
   o></mrow></msup><mo>,</mo><mi>K</mi></mrow><mo>)</mo></mrow><mo>.</mo>
   :MATH]
            20
         3. Compute the conditional posterior of each protein i belonging
            to each subcellular niche k in each dataset d = 1, 2, using
            the following equation:

   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><mi>j</mi><mo>,</mo><
   mspace
   width="0.16em"></mspace><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><m
   o>,</mo><mn>2</mn></mrow></msub><mo>=</mo><mi>k</mi><mo>∣</mo><msup><mr
   ow><mi
   mathvariant="bold-italic">π</mi></mrow><mrow><mo>′</mo></mrow></msup><m
   o>,</mo><mspace width="0.16em"></mspace><mi>θ</mi><mo>,</mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>x</mi></mrow><mrow><mi>i</mi><m
   o>,</mo><mi>d</mi></mrow></msub></mrow><mo>)</mo></mrow><mo>∝</mo><msub
   sup><mrow><mi>π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow><mrow><mo>′
   </mo></mrow></msubsup><munderover accent="false"
   accentunder="false"><mrow><mo>∏</mo></mrow><mrow><mi>r</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>R</mi></mrow></munderover><mi>p</mi><mrow><mo>
   (</mo><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</mo><
   mn>1</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo>
   </mrow></mrow></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><mrow><m
   i>i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><mi>j</mi><mo>,</mo
   ><mspace
   width="0.16em"></mspace><mi>θ</mi></mrow><mo>)</mo></mrow><mi>p</mi><mr
   ow><mo>(</mo><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>
   ,</mo><mn>2</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo
   >)</mo></mrow></mrow></msubsup><mo>∣</mo><msub><mrow><mi>z</mi></mrow><
   mrow><mi>i</mi><mo>,</mo><mn>2</mn></mrow></msub><mo>=</mo><mi>k</mi><m
   o>,</mo><mspace
   width="0.16em"></mspace><mi>θ</mi></mrow><mo>)</mo></mrow><mo>.</mo>
   :MATH]
            21
         4. Sample (z[i,1], z[i,2]) from a categorical distribution using
            the above computed conditional posterior probabilities.
         5. Sample ϵ[d] from the conditional posterior distribution. The
            formula is given by

   [MATH:
   <msubsup><mrow><mi>ϵ</mi></mrow><mrow><mi>d</mi></mrow><mrow><mo>′</mo>
   </mrow></msubsup><mo>~</mo><mi class="MJX-tex-caligraphic"
   mathvariant="script">B</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>u</mi>
   </mrow><mrow><mi>d</mi></mrow></msub><mo>+</mo><msubsup><mrow><mi>u</mi
   ></mrow><mrow><mi>d</mi></mrow><mrow><mo>′</mo></mrow></msubsup><mo>,</
   mo><msub><mrow><mi>v</mi></mrow><mrow><mi>d</mi></mrow></msub><mo>+</mo
   ><msubsup><mrow><mi>v</mi></mrow><mrow><mi>d</mi></mrow><mrow><mo>′</mo
   ></mrow></msubsup></mrow><mo>)</mo></mrow><mo>,</mo> :MATH]
            22
            where
            [MATH:
            <msubsup><mrow><mi>u</mi></mrow><mrow><mi>d</mi></mrow><mrow><
            mo>′</mo></mrow></msubsup> :MATH]
            is the current number of proteins allocated to the outlier
            component in dataset d and
            [MATH:
            <msubsup><mrow><mi>v</mi></mrow><mrow><mi>d</mi></mrow><mrow><
            mo>′</mo></mrow></msubsup> :MATH]
            is the current number of proteins not allocated to the outlier
            component in dataset d.
         6. Compute the conditional posterior of belonging to the outlier
            component, using the following equation:

   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>ϕ</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mi>d</mi></mrow></msub><mo>∣</mo><msub><mrow><mi>z</mi
   ></mrow><mrow><mi>i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><ms
   pace
   width="0.16em"></mspace><mi>k</mi><mo>,</mo><msub><mrow><mi>z</mi></mro
   w><mrow><mi>i</mi><mo>,</mo><mn>2</mn></mrow></msub><mo>=</mo><mi>j</mi
   ><mo>,</mo><mi>θ</mi><mo>,</mo><msub><mrow><mi>x</mi></mrow><mrow><mi>i
   </mi><mo>,</mo><mi>d</mi></mrow></msub><mo>,</mo><msubsup><mrow><mi>ϵ</
   mi></mrow><mrow><mi>d</mi></mrow><mrow><mo>′</mo></mrow></msubsup></mro
   w><mo>)</mo></mrow><mo>∝</mo><mrow><mo>(</mo><mrow><mn>1</mn><mo>−</mo>
   <msubsup><mrow><mi>ϵ</mi></mrow><mrow><mi>d</mi></mrow><mrow><mo>′</mo>
   </mrow></msubsup></mrow><mo>)</mo></mrow><munderover accent="false"
   accentunder="false"><mrow><mo>∏</mo></mrow><mrow><mi>r</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>R</mi></mrow></munderover><mi>F</mi><mrow><mo>
   (</mo><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</mo><
   mi>d</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo>
   </mrow></mrow></msubsup><mo>∣</mo><msubsup><mrow><mi>θ</mi></mrow><mrow
   ><mi>k</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</m
   o></mrow></mrow></msubsup></mrow><mo>)</mo></mrow><mo>+</mo><msubsup><m
   row><mi>ϵ</mi></mrow><mrow><mi>d</mi></mrow><mrow><mo>′</mo></mrow></ms
   ubsup><munderover accent="false"
   accentunder="false"><mrow><mo>∏</mo></mrow><mrow><mi>r</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>R</mi></mrow></munderover><mi>G</mi><mrow><mo>
   (</mo><mrow><msubsup><mrow><mi>x</mi></mrow><mrow><mi>i</mi><mo>,</mo><
   mi>d</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>)</mo>
   </mrow></mrow></msubsup><mo>∣</mo><msup><mrow><mi
   mathvariant="normal">Φ</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi
   ></mrow><mo>)</mo></mrow></mrow></msup></mrow><mo>)</mo></mrow><mo>,</m
   o> :MATH]
            23
            where Φ^(r) denotes the parameters of the outlier component
            for replicates r, while F and G are the densities of the
            niche-specific component and outlier component, respectively.
         7. Sample ϕ[i,d] from a Bernoulli distribution for all i and all
            d from the conditional posterior probabilities given by the
            above equation.
         8. Optional: Sample new GRF hyperparameters from the conditional
            posterior distribution using Metropolis-Hastings or
            Hamiltonian Monte-Carlo. Otherwise use precomputed values.
            This conditional posterior is given by, for each dataset d,
            replicate r and niche k:

   [MATH: <mi>p</mi><mfenced close=")"
   open="("><mrow><msubsup><mrow><mi>a</mi></mrow><mrow><mi>r</mi><mo>,</m
   o><mi>k</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>d</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup><mo>,</mo><msubsup><mrow><mi>ρ</mi></mrow><m
   row><mi>r</mi><mo>,</mo><mi>k</mi></mrow><mrow><mrow><mo>(</mo><mrow><m
   i>d</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mo>,</mo><msubsup><mr
   ow><mi>σ</mi></mrow><mrow><mi>r</mi><mo>,</mo><mi>k</mi></mrow><mrow><m
   n>2</mn></mrow></msubsup><mo>∣</mo><mi>X</mi><mo>,</mo><mi>λ</mi></mrow
   ></mfenced><mo>∝</mo><msub><mrow><mi>p</mi></mrow><mrow><mn>0</mn></mro
   w></msub><mfenced close=")"
   open="("><mrow><msubsup><mrow><mi>a</mi></mrow><mrow><mi>r</mi><mo>,</m
   o><mi>k</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>d</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup><mo>,</mo><msubsup><mrow><mi>ρ</mi></mrow><m
   row><mi>r</mi><mo>,</mo><mi>k</mi></mrow><mrow><mrow><mo>(</mo><mrow><m
   i>d</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mo>,</mo><msubsup><mr
   ow><mi>σ</mi></mrow><mrow><mi>r</mi><mo>,</mo><mi>k</mi></mrow><mrow><m
   n>2</mn></mrow></msubsup><mo>∣</mo><mi>λ</mi></mrow></mfenced><mi>p</mi
   ><mfenced close=")"
   open="("><mrow><mi>X</mi><mo>∣</mo><msubsup><mrow><mi>a</mi></mrow><mro
   w><mi>r</mi><mo>,</mo><mi>k</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>
   d</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mo>,</mo><msubsup><mrow
   ><mi>ρ</mi></mrow><mrow><mi>r</mi><mo>,</mo><mi>k</mi></mrow><mrow><mro
   w><mo>(</mo><mrow><mi>d</mi></mrow><mo>)</mo></mrow></mrow></msubsup><m
   o>,</mo><msubsup><mrow><mi>σ</mi></mrow><mrow><mi>r</mi><mo>,</mo><mi>k
   </mi></mrow><mrow><mn>2</mn></mrow></msubsup></mrow></mfenced> :MATH]
            24
            Note that
            [MATH:
            <mi>p</mi><mrow><mo>(</mo><mrow><mi>X</mi><mo>∣</mo><msubsup><
            mrow><mi>a</mi></mrow><mrow><mi>r</mi><mo>,</mo><mi>k</mi></mr
            ow><mrow><mrow><mo>(</mo><mrow><mi>d</mi></mrow><mo>)</mo></mr
            ow></mrow></msubsup><mo>,</mo><msubsup><mrow><mi>ρ</mi></mrow>
            <mrow><mi>r</mi><mo>,</mo><mi>k</mi></mrow><mrow><mrow><mo>(</
            mo><mrow><mi>d</mi></mrow><mo>)</mo></mrow></mrow></msubsup><m
            o>,</mo><msubsup><mrow><mi>σ</mi></mrow><mrow><mi>r</mi><mo>,<
            /mo><mi>k</mi></mrow><mrow><mn>2</mn></mrow></msubsup></mrow><
            mo>)</mo></mrow> :MATH]
            is given by a Gaussian density, with equation given in the
            supplement. The prior on the GRF hyperparameters is given by
            the penalised complexity priors. The lack of conjugacy between
            the prior and likelihood means a Metropolis-Hastings or
            Hamiltonian Monte-Carlo move is required.
         9. Update the GRF distributions for each subcellular niche k in
            replicate r for dataset d. That is, sample
            [MATH:
            <msubsup><mrow><mi>μ</mi></mrow><mrow><mi>k</mi><mo>,</mo><mi>
            d</mi></mrow><mrow><mrow><mo>(</mo><mrow><mi>r</mi></mrow><mo>
            )</mo></mrow></mrow></msubsup> :MATH]
            from the conditional posterior GRFs. Equations are given in
            Rasmussen and Williams^[320]127 and ref. [321]36.

Calibration of Dirichlet prior

   The following section describes how to calibrate the prior based on
   expert information and prior predictive checks. Recall the prior on the
   allocation probabilities is the following
   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><mi>k</mi><mo>,</mo><
   msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</mo><mn>2</mn></mrow>
   </msub><mo>=</mo><msup><mrow><mi>k</mi></mrow><mrow><mo>′</mo></mrow></
   msup><mo>∣</mo><mi
   mathvariant="bold-italic">π</mi></mrow><mo>)</mo></mrow><mo>=</mo><msub
   ><mrow><mi>π</mi></mrow><mrow><mi>k</mi><msup><mrow><mi>k</mi></mrow><m
   row><mo>′</mo></mrow></msup></mrow></msub><mo>.</mo> :MATH]
   25

   The matrix π has π[jk] has its (j, k)^th entry and π[jk] is the prior
   probability that a protein belongs to organelle j in dataset 1
   (control) and k in dataset 2 (contrast). The diagonal terms represent
   the probability that the protein was allocated to the same organelle in
   each dataset. The non-diagonal terms are the prior probability that the
   protein was not allocated to the same organelle. Since the number of
   non-diagonal terms greatly exceeds the number of diagonal entries, it
   is important to specify this prior carefully. Recall that the prior is
   given a matrix Dirichlet distribution with concentration parameter α.

   Firstly, we are interested in the prior expectation of the number of
   proteins that are differentially localised; that is, proteins not
   allocated to the same organelle in both conditions. Let γ be the prior
   probability that a protein is not allocated to the same organelle. Then
   it follows that
   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mn>1</mn></mrow></msub><mspace
   width="0.16em"></mspace><mo>≠</mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><m
   o>,</mo><mn>2</mn></mrow></msub><mo>∣</mo><mi
   mathvariant="bold-italic">π</mi></mrow><mo>)</mo></mrow><mo>=</mo><mspa
   ce width="0.3em"></mspace><mo>:</mo><mspace
   width="0.25em"></mspace><mi>γ</mi><mo>=</mo><munder><mrow><mo>∑</mo></m
   row><mrow><mi>j</mi><mo>,</mo><mi>k</mi><mo>;</mo><mi>j</mi><mo>≠</mo><
   mi>k</mi></mrow></munder><msub><mrow><mi>π</mi></mrow><mrow><mi>j</mi><
   mi>k</mi></mrow></msub><mo>.</mo> :MATH]
   26

   By properties of the Dirichlet distribution, we have that the marginal
   distribution of π[jk] is given by
   [MATH:
   <msub><mrow><mi>π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><m
   o>~</mo><mi class="MJX-tex-caligraphic"
   mathvariant="script">B</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>α</mi>
   </mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><mo>,</mo><msub><mrow><m
   i>α</mi></mrow><mrow><mn>0</mn></mrow></msub><mo>−</mo><msub><mrow><mi>
   α</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub></mrow><mo>)</mo><
   /mrow><mo>,</mo> :MATH]
   27

   where α[0] = ∑[j,k]α[jk]. Thus, the expected value of γ is computed as
   follows
   [MATH: <mi
   mathvariant="double-struck">E</mi><mrow><mo>[</mo><mrow><mi>γ</mi></mro
   w><mo>]</mo></mrow><mspace
   width="0.25em"></mspace><mo>=</mo><munder><mrow><mo>∑</mo></mrow><mrow>
   <mi>j</mi><mo>,</mo><mi>k</mi><mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi><
   /mrow></munder><mi
   mathvariant="double-struck">E</mi><mrow><mo>[</mo><mrow><msub><mrow><mi
   >π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub></mrow><mo>]</mo>
   </mrow><mo>=</mo><munder><mrow><mo>∑</mo></mrow><mrow><mi>j</mi><mo>,</
   mo><mi>k</mi><mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow></munder><m
   frac><mrow><msub><mrow><mi>α</mi></mrow><mrow><mi>j</mi><mi>k</mi></mro
   w></msub></mrow><mrow><msub><mrow><mi>α</mi></mrow><mrow><mn>0</mn></mr
   ow></msub></mrow></mfrac><mo>.</mo> :MATH]
   28

   We are further interested in the probability that a certain number of
   proteins, say q, are differential localised. Letting N[U] be the number
   of unlabelled proteins in the experiment, then the distribution of the
   prior number of differential localised proteins is
   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>N</mi></mrow><mrow><mi>
   U</mi></mrow></msub><mi>γ</mi><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mi>q</mi></mrow><mo>)</mo></mrow><mo>=</mo><mi
   >p</mi><mfenced close=")"
   open="("><mrow><msub><mrow><mi>N</mi></mrow><mrow><mi>U</mi></mrow></ms
   ub><munder><mrow><mo>∑</mo></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi><
   mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow></munder><msub><mrow><mi>
   π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mi>q</mi></mrow></mfenced><mo>=</mo><mi>δ</mi>
   <mo>.</mo> :MATH]
   29

   Computing δ is not simple; however, it is straightforward to estimate δ
   using Monte-Carlo, by simply sampling from Beta distributions:
   [MATH: <mi>p</mi><mfenced close=")"
   open="("><mrow><msub><mrow><mi>N</mi></mrow><mrow><mi>U</mi></mrow></ms
   ub><munder><mrow><mo>∑</mo></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi><
   mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow></munder><msub><mrow><mi>
   π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mi>q</mi></mrow></mfenced><mo>≈</mo><mfrac><mr
   ow><mn>1</mn></mrow><mrow><mi>T</mi></mrow></mfrac><munderover
   accent="false"
   accentunder="false"><mrow><mo>∑</mo></mrow><mrow><mi>t</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>T</mi></mrow></munderover><mi
   mathvariant="double-struck">1</mi><mfenced close=")"
   open="("><mrow><msub><mrow><mi>N</mi></mrow><mrow><mi>U</mi></mrow></ms
   ub><munder><mrow><mo>∑</mo></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi><
   mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow></munder><msubsup><mrow><
   mi>π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow><mrow><mrow><mo>(</mo>
   <mrow><mi>t</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mi>q</mi></mrow></mfenced><mo>.</mo> :MATH]
   30

   Thus, we calibrate the Dirichlet prior using the above expectation and
   quantile. In some applications, calibrating several quantiles is needed
   to ensure sufficient mass is placed on desired regions of the
   probability space. For example, let q[1] < q[2], then we want that
   δ[1], below, is not so small to rule out reasonable inferences and that
   δ[2] < δ[1] is sufficiently large. These can be computed from the
   equations below:
   [MATH: <mi>p</mi><mfenced close=")"
   open="("><mrow><msub><mrow><mi>N</mi></mrow><mrow><mi>U</mi></mrow></ms
   ub><munder><mrow><mo>∑</mo></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi><
   mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow></munder><msub><mrow><mi>
   π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mi>q</mi></mrow></mfenced><mo>≈</mo><mfrac><mr
   ow><mn>1</mn></mrow><mrow><mi>T</mi></mrow></mfrac><munderover
   accent="false"
   accentunder="false"><mrow><mo>∑</mo></mrow><mrow><mi>t</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>T</mi></mrow></munderover><mi
   mathvariant="double-struck">1</mi><mfenced close=")"
   open="("><mrow><msub><mrow><mi>N</mi></mrow><mrow><mi>U</mi></mrow></ms
   ub><munder><mrow><mo>∑</mo></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi><
   mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow></munder><msubsup><mrow><
   mi>π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow><mrow><mrow><mo>(</mo>
   <mrow><mi>t</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>q</mi></mrow><mrow><mn>1</mn></
   mrow></msub></mrow></mfenced><mo>=</mo><msub><mrow><mi>δ</mi></mrow><mr
   ow><mn>1</mn></mrow></msub><mo>,</mo> :MATH]
   31
   [MATH: <mi>p</mi><mfenced close=")"
   open="("><mrow><msub><mrow><mi>N</mi></mrow><mrow><mi>U</mi></mrow></ms
   ub><munder><mrow><mo>∑</mo></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi><
   mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow></munder><msub><mrow><mi>
   π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mi>q</mi></mrow></mfenced><mo>≈</mo><mfrac><mr
   ow><mn>1</mn></mrow><mrow><mi>T</mi></mrow></mfrac><munderover
   accent="false"
   accentunder="false"><mrow><mo>∑</mo></mrow><mrow><mi>t</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>T</mi></mrow></munderover><mi
   mathvariant="double-struck">1</mi><mfenced close=")"
   open="("><mrow><msub><mrow><mi>N</mi></mrow><mrow><mi>U</mi></mrow></ms
   ub><munder><mrow><mo>∑</mo></mrow><mrow><mi>j</mi><mo>,</mo><mi>k</mi><
   mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow></munder><msubsup><mrow><
   mi>π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow><mrow><mrow><mo>(</mo>
   <mrow><mi>t</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>q</mi></mrow><mrow><mn>2</mn></
   mrow></msub></mrow></mfenced><mo>=</mo><msub><mrow><mi>δ</mi></mrow><mr
   ow><mn>2</mn></mrow></msub><mo>.</mo> :MATH]
   32

   More precise and informative prior biological knowledge can be
   specified; for example, should we suspect that some relocalisation
   events between particular organelles are more likely than others due to
   the stimuli, these can be encoded into the prior. If we expect more
   relocalisation events between organelle j and k[1] than organelle j and
   k[2], this can be encoded by ensuring
   [MATH:
   <mfrac><mrow><mn>1</mn></mrow><mrow><mi>T</mi></mrow></mfrac><munderove
   r accent="false"
   accentunder="false"><mrow><mo>∑</mo></mrow><mrow><mi>t</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>T</mi></mrow></munderover><mi
   mathvariant="double-struck">1</mi><mfenced close=")"
   open="("><mrow><msubsup><mrow><mi>π</mi></mrow><mrow><mi>j</mi><msub><m
   row><mi>k</mi></mrow><mrow><mn>1</mn></mrow></msub></mrow><mrow><mrow><
   mo>(</mo><mrow><mi>t</mi></mrow><mo>)</mo></mrow></mrow></msubsup><mspa
   ce width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><msubsup><mrow><mi>π</mi></mrow><mrow><mi>j</mi
   ><msub><mrow><mi>k</mi></mrow><mrow><mn>2</mn></mrow></msub></mrow><mro
   w><mrow><mo>(</mo><mrow><mi>t</mi></mrow><mo>)</mo></mrow></mrow></msub
   sup></mrow></mfenced><mspace width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>δ</mi></mrow><mrow><mn>3</mn></
   mrow></msub><mspace width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mn>0</mn><mo>.</mo> :MATH]
   33

   For example, suppose that for a particular experiment it is impossible
   for any protein to relocalise. Then, we are interested in ensuring
   [MATH: <mi>p</mi><mfenced close=")"
   open="("><mrow><msub><mrow><mi>N</mi></mrow><mrow><mi>U</mi></mrow></ms
   ub><munder><mrow><mo>∑</mo></mrow><mrow><mi>k</mi><mo>,</mo><mi>j</mi><
   mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow></munder><msub><mrow><mi>
   π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow></msub><mspace
   width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mn>0</mn></mrow></mfenced><mo>=</mo><mi>δ</mi>
   <mo>=</mo><mn>0</mn><mo>.</mo> :MATH]
   34

   This is only possible if π[jk] = 0 for all k ≠ j. This can be ensured
   by setting α[jj] = 1 and α[jk] = 0 for k ≠ j. Now suppose, we wish to
   relax this assumption slightly, since we believe some re-localisations
   are possible. In our experiment, we measure 1000 unlabelled proteins
   and we believe that there is roughly a 10% chance there are more than
   10 re-localisations. Then we wish to satisfy
   [MATH: <mi>p</mi><mfenced close=")"
   open="("><mrow><mn>1000</mn><munder><mrow><mo>∑</mo></mrow><mrow><mi>k<
   /mi><mo>,</mo><mi>j</mi><mo>;</mo><mi>j</mi><mo>≠</mo><mi>k</mi></mrow>
   </munder><msub><mrow><mi>π</mi></mrow><mrow><mi>j</mi><mi>k</mi></mrow>
   </msub><mspace width="0.16em"></mspace><mo>></mo><mspace
   width="0.16em"></mspace><mn>10</mn></mrow></mfenced><mo>=</mo><mn>0.1</
   mn><mo>.</mo> :MATH]
   35

   The appropriate entries of α can then be chosen using Monte-Carlo
   simulation. Alternatively, if an objective Bayesian analysis is
   preferred, the Jeffery’s prior sets α[jk] = 0.5 for every
   j, k = 1, . . , K. This approach is not generally recommended by the
   authors, because the diagonal terms of π have a different
   interpretation to the off-diagonal terms. As a default, we set
   α[jj] = 1.01 and α[jk] = 0.01 for k ≠ j. This assumes that there are
   roughly an order of magnitude fewer differentially localised proteins
   than spatially stable ones. This default was used in all simulations
   and applications except the EGF simulation dataset. In that case, we
   had prior knowledge of a differentially localisation between the plasma
   membrane and the endosome and so we set the corresponding entry of α to
   1. Additional details of setting of the priors for specific application
   is in the [322]Supplementary Material (Supplementary Note [323]11). We
   also demonstrate that our analysis remains robust to the choice of
   prior by performing a prior sensitivity analysis (Supplementary
   Note [324]11). Convergence analysis of MCMC algorithms is provided in
   Supplementary Notes [325]6, 8 and [326]10.

Differential localisation probability

   The main posterior quantity of interest is the probability that a
   protein is differentially localised. This can be approximated from the
   T Monte-Carlo samples as follows, suppressing notational dependence on
   all data and parameters for clarity
   [MATH:
   <msub><mrow><mi>χ</mi></mrow><mrow><mi>i</mi></mrow></msub><mo>=</mo><m
   i>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>i<
   /mi><mo>,</mo><mn>1</mn></mrow></msub><mspace
   width="0.16em"></mspace><mo>≠</mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>z</mi></mrow><mrow><mi>i</mi><m
   o>,</mo><mn>2</mn></mrow></msub></mrow><mo>)</mo></mrow><mspace
   width="0.16em"></mspace><mo>≈</mo><mspace
   width="0.16em"></mspace><mfrac><mrow><mn>1</mn></mrow><mrow><mi>T</mi><
   /mrow></mfrac><munderover accent="false"
   accentunder="false"><mrow><mo>∑</mo></mrow><mrow><mi>t</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>T</mi></mrow></munderover><mi
   mathvariant="double-struck">1</mi><mfenced close=")"
   open="("><mrow><msubsup><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</m
   o><mn>1</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>t</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup><mspace
   width="0.16em"></mspace><mo>≠</mo><mspace
   width="0.16em"></mspace><msubsup><mrow><mi>z</mi></mrow><mrow><mi>i</mi
   ><mo>,</mo><mn>2</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>t</mi></mro
   w><mo>)</mo></mrow></mrow></msubsup></mrow></mfenced><mo>,</mo> :MATH]
   36

   where t denotes the tth sample of the MCMC algorithm. It is important
   to note that this quantity is agnostic to the assigned subcellular
   niche. We notice that the distribution of the number of MCMC
   observations for which z[1] is not equal to z[2] is given by:
   [MATH: <munderover accent="false"
   accentunder="false"><mrow><mo>∑</mo></mrow><mrow><mi>t</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>T</mi></mrow></munderover><mi
   mathvariant="double-struck">1</mi><mfenced close=")"
   open="("><mrow><msubsup><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</m
   o><mn>1</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>t</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup><mspace
   width="0.16em"></mspace><mo>≠</mo><mspace
   width="0.16em"></mspace><msubsup><mrow><mi>z</mi></mrow><mrow><mi>i</mi
   ><mo>,</mo><mn>2</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>t</mi></mro
   w><mo>)</mo></mrow></mrow></msubsup></mrow></mfenced><mo>~</mo><mi
   class="MJX-tex-caligraphic"
   mathvariant="script">B</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>χ</mi>
   </mrow><mrow><mi>i</mi></mrow></msub><mo>,</mo><mi>T</mi></mrow><mo>)</
   mo></mrow><mo>.</mo> :MATH]
   37

   Hence, in this case, the Monte-Carlo estimator for χ is simply the
   maximum likelihood estimator of the probability parameter of the above
   binomial distribution. As T is given, uncertainty estimates, such as
   credible intervals, can be obtained from this binomial distribution
   directly.

   An alternative, but less computationally efficient approach, to perform
   uncertainty quantification on the differential localisation
   probability, could use the non-parametric bootstrap on the Monte-Carlo
   samples. More precisely, first sample uniformly with replacement from
   [MATH:
   <msubsup><mrow><mrow><mo>{</mo><mrow><msubsup><mrow><mi>z</mi></mrow><m
   row><mi>i</mi><mo>,</mo><mn>1</mn></mrow><mrow><mrow><mo>(</mo><mrow><m
   i>t</mi></mrow><mo>)</mo></mrow></mrow></msubsup></mrow><mo>}</mo></mro
   w></mrow><mrow><mi>t</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>T</mi></m
   row></msubsup> :MATH]
   and
   [MATH:
   <msubsup><mrow><mrow><mo>{</mo><mrow><msubsup><mrow><mi>z</mi></mrow><m
   row><mi>i</mi><mo>,</mo><mn>2</mn></mrow><mrow><mrow><mo>(</mo><mrow><m
   i>t</mi></mrow><mo>)</mo></mrow></mrow></msubsup></mrow><mo>}</mo></mro
   w></mrow><mrow><mi>t</mi><mo>=</mo><mn>1</mn></mrow><mrow><mi>T</mi></m
   row></msubsup> :MATH]
   to total of T samples. This produces a bootstrap sample indexed by
   B[1]. Then compute our statistic of interest:
   [MATH:
   <msubsup><mrow><mi>χ</mi></mrow><mrow><mi>i</mi><mo>,</mo><msub><mrow><
   mi>B</mi></mrow><mrow><mn>1</mn></mrow></msub></mrow><mrow><mo>*</mo></
   mrow></msubsup><mo>≈</mo><mfrac><mrow><mn>1</mn></mrow><mrow><mo>∣</mo>
   <msub><mrow><mi>B</mi></mrow><mrow><mn>1</mn></mrow></msub><mo>∣</mo></
   mrow></mfrac><munder><mrow><mo>∑</mo></mrow><mrow><mi>t</mi><mo>∈</mo><
   msub><mrow><mi>B</mi></mrow><mrow><mn>1</mn></mrow></msub></mrow></mund
   er><mi
   mathvariant="double-struck">1</mi><mrow><mo>(</mo><mrow><msubsup><mrow>
   <mi>z</mi></mrow><mrow><mi>i</mi><mo>,</mo><mn>1</mn></mrow><mrow><mrow
   ><mo>(</mo><mrow><mi>t</mi></mrow><mo>)</mo></mrow></mrow></msubsup><ms
   pace width="0.16em"></mspace><mo>≠</mo><mspace
   width="0.16em"></mspace><msubsup><mrow><mi>z</mi></mrow><mrow><mi>i</mi
   ><mo>,</mo><mn>2</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>t</mi></mro
   w><mo>)</mo></mrow></mrow></msubsup></mrow><mo>)</mo></mrow><mo>.</mo>
   :MATH]
   38

   This process is then repeated to obtain a set of bootstrap samples
   [MATH: <mi
   mathvariant="double-struck">B</mi><mo>=</mo><mrow><mo>{</mo><mrow><msub
   ><mrow><mi>B</mi></mrow><mrow><mn>1</mn></mrow></msub><mo>,</mo><mo>.</
   mo><mo>.</mo><mo>.</mo><mo>,</mo><mspace
   width="0.16em"></mspace><msub><mrow><mi>B</mi></mrow><mrow><mi>b</mi></
   mrow></msub></mrow><mo>}</mo></mrow> :MATH]
   , for some large b, say 1000. For each
   [MATH:
   <msub><mrow><mi>B</mi></mrow><mrow><mi>r</mi></mrow></msub><mo>∈</mo><m
   i mathvariant="double-struck">B</mi> :MATH]
   , we compute
   [MATH:
   <msubsup><mrow><mi>χ</mi></mrow><mrow><mi>i</mi><mo>,</mo><msub><mrow><
   mi>B</mi></mrow><mrow><mi>r</mi></mrow></msub></mrow><mrow><mo>*</mo></
   mrow></msubsup> :MATH]
   for r = 1, . . . , b, obtaining a sampling distribution for χ[r] from
   which we can compute functionals of interest.

Posterior localisation probabilities

   A further quantity of interest is the posterior probability that a
   protein belongs to each of the K subcellular niches present in the
   data. For the control, this is given by the following Monte-Carlo
   average
   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mn>1</mn></mrow></msub><mo>=</mo><mi>k</mi><mo>∣</mo><
   mi mathvariant="normal">Θ</mi></mrow><mo>)</mo></mrow><mspace
   width="0.16em"></mspace><mo>≈</mo><mspace
   width="0.16em"></mspace><mfrac><mrow><mn>1</mn></mrow><mrow><mi>T</mi><
   /mrow></mfrac><munderover accent="false"
   accentunder="false"><mrow><mo>∑</mo></mrow><mrow><mi>t</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>T</mi></mrow></munderover><mi>p</mi><mfenced
   close=")"
   open="("><mrow><msubsup><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</m
   o><mn>1</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>t</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup><mo>=</mo><mi>k</mi><mo>∣</mo><mi
   mathvariant="normal">Θ</mi></mrow></mfenced><mo>,</mo> :MATH]
   39

   where Θ denotes all other quantities in the model. A corresponding
   formula also holds for the second dataset
   [MATH:
   <mi>p</mi><mrow><mo>(</mo><mrow><msub><mrow><mi>z</mi></mrow><mrow><mi>
   i</mi><mo>,</mo><mn>2</mn></mrow></msub><mo>=</mo><mi>k</mi><mo>∣</mo><
   mi mathvariant="normal">Θ</mi></mrow><mo>)</mo></mrow><mspace
   width="0.16em"></mspace><mo>≈</mo><mspace
   width="0.16em"></mspace><mfrac><mrow><mn>1</mn></mrow><mrow><mi>T</mi><
   /mrow></mfrac><munderover accent="false"
   accentunder="false"><mrow><mo>∑</mo></mrow><mrow><mi>t</mi><mo>=</mo><m
   n>1</mn></mrow><mrow><mi>T</mi></mrow></munderover><mi>p</mi><mfenced
   close=")"
   open="("><mrow><msubsup><mrow><mi>z</mi></mrow><mrow><mi>i</mi><mo>,</m
   o><mn>2</mn></mrow><mrow><mrow><mo>(</mo><mrow><mi>t</mi></mrow><mo>)</
   mo></mrow></mrow></msubsup><mo>=</mo><mi>k</mi><mo>∣</mo><mi
   mathvariant="normal">Θ</mi></mrow></mfenced><mo>.</mo> :MATH]
   40

   The posterior distribution of these quantities and uncertainty
   estimates can be computed and visualised in standard ways.

The BANDLE package

   The BANDLE package
   ([327]https://bioconductor.org/packages/release/bioc/html/bandle.html)
   is implemented as part of the Bioconductor suite^[328]135, with
   documentation covering a typical analysis workflow, including the
   analysis of the spatio-temporal proteome of THP-1^[329]136. The package
   includes utility functions to set priors, as well as data
   visualisations of the outputs of BANDLE. BANDLE is designed to
   accompany the MSnbase, pRoloc, pRolocGUI and pRolocdata
   packages^[330]25,[331]27, as part of an integrated Bioconductor suite.
   Our pipeline offers a modular and extensible approach that includes all
   steps of the spatial proteomics workflow. This includes aggregation of
   peptide-spectrum matches (PSMs) to proteins, assessment of data
   quality^[332]40, imputation, normalisation, unsupervised analysis,
   clustering, supervised machine learning, transfer learning^[333]24,
   semi-supervised learning^[334]22,[335]23,[336]36, differential
   localisation analysis, data management and dissemination, as well as
   analysis-specific visualisation. This framework also allows for
   seamless deployment into Shiny applications as metadata is stored in a
   consistent manner, for example see
   [337]https://proteome.shinyapps.io/thp-lopit/. The package also
   provides an implementation of the MR method.

Mass-spectrometry data processing

   For the EGF stimulated HeLa cells data found in section: Characterising
   differential localisation upon EGF stimulation, quantitation was
   performed using SILAC. Protein level intensities were exported from
   MaxQuant. Proteins with any missing values were removed from the
   analysis. We refer to Itzhak et al.^[338]17 for further information.
   For the AP-4 knockout dataset in section: BANDLE obtains deeper
   insights into AP-4 dependent localisation, quantitation was performed
   using SILAC and protein level intensities were exported from MaxQuant.
   We refer to Davies et al.^[339]4 for more information. For the HCMV
   application in section: Rewiring the proteome under cytomegalovirus
   infection, quantitation was performed using TMT and protein level
   intensities were exported from MaxQuant for further details see Beltran
   et al.^[340]16.

Reporting summary

   Further information on research design is available in the [341]Nature
   Research Reporting Summary linked to this article.

Supplementary information

   [342]Supplementary Information^ (12.3MB, pdf)
   [343]Peer Review File^ (4.5MB, pdf)
   [344]Reporting Summary^ (364.5KB, pdf)

Acknowledgements