Abstract Molecularly targeted therapeutics hold promise of revolutionizing treatments of advanced malignancies. However, a large number of patients do not respond to these treatments. Here, we take a systems biology approach to understand the molecular mechanisms that prevent breast cancer (BC) cells from responding to lapatinib, a dual kinase inhibitor that targets human epidermal growth factor receptor 2 (HER2) and epidermal growth factor receptor (EGFR). To this end, we analysed temporal gene expression profiles of four BC cell lines, two of which respond and the remaining two do not respond to lapatinib. For this analysis, we developed a Gaussian process based algorithm which can accurately find differentially expressed genes by analysing time course gene expression profiles at a fraction of the computational cost of other state-of-the-art algorithms. Our analysis identified 519 potential genes which are characteristic of lapatinib non-responsiveness in the tested cell lines. Data from the Genomics of Drug Sensitivity in Cancer (GDSC) database suggested that the basal expressions 120 of the above genes correlate with the response of BC cells to HER2 and/or EGFR targeted therapies. We selected 27 genes from the larger panel of 519 genes for experimental verification and 16 of these were successfully validated. Further bioinformatics analysis identified vitamin D receptor (VDR) as a potential target of interest for lapatinib non-responsive BC cells. Experimentally, calcitriol, a commonly used reagent for VDR targeted therapy, in combination with lapatinib additively inhibited proliferation in two HER2 positive cell lines, lapatinib insensitive MDA-MB-453 and lapatinib resistant HCC 1954-L cells. 1 Introduction BC is the most common type of cancer in women accounting for 25% of all cases [[36]1]. Clinically, BC is classified based on tumour progression, histopathology, and the expression status of HER2, oestrogen, and progesterone receptors. The most suitable treatment for a patient is decided based on these parameters. HER2-positive BC, in which the HER2 receptor is either overexpressed or amplified, accounts for approximately 20–25% of human BC cases [[37]2] and is associated with poor prognosis [[38]3]. Standard treatment options such as radiation, surgery and chemotherapy as well as more targeted approaches are used to treat these types of patients. The monoclonal antibody trastuzumab [[39]4] and the dual tyrosine kinase inhibitor lapatinib are among the targeted therapies currently in clinical use [[40]5]. Here, we focused on lapatinib, which inhibits both HER2 and EGFR and prevents activation of important downstream pathways, such as ERK/MAPK (extracellular-signal-regulated kinase/mitogen-activated protein kinase) and PI3K (phosphatidylinositol 3-kinase) which can drive cancer progression [[41]6, [42]7]. Lapatinib is currently approved for treatment of metastatic BC in combination with capecitabine [[43]8], and experimentally may synergise with trastuzumab [[44]9]. However, a significant proportion of HER2-positive tumours do not respond to lapatinib. Recent clinical studies suggest lapatinib has a success rate of 12.4–24.7% depending on whether it is administered alone or in combination with capecitabine or trastuzumab [[45]8, [46]10, [47]11]. These low response rates underline the clinical urgency to understand and overcome the molecular mechanisms that prevent BC from responding to lapatinib treatment. Here, we address this question by reanalysing a previously published dataset consisting of gene expression changes in a panel of lapatinib responsive and non-responsive cell lines over a period of time following different doses of lapatinib treatment [[48]12]. We sought to compare the temporal patterns of gene expression between the responsive and non-responsive cell lines in order to identify genes which are characteristic of lapatinib non-responsiveness [[49]12]. Comparing temporal patterns of gene expression across different cell lines/experimental conditions is not straightforward, mainly because of the different types of variabilities in these measurements [[50]13, [51]14]. Firstly, genotypic differences within different cell populations and measurement errors add random variability to the data [[52]13, [53]14]. Secondly, gene expression is subjected to various levels of regulation, including epigenetic control, efficiency of transcriptional initiation and extension, and post-transcriptional processing, such as mRNA splicing and degradation. These processes add systematic time dependent variabilities to the measured expression [[54]13, [55]14]. Standard statistical tools such as t-test, rank sum test etc. which are commonly used to compare gene expression, account for the random variabilities, but fail to capture the systematic temporal variabilities in gene expression data [[56]13, [57]14]. Several methods have been proposed to model and compare temporal gene expression patterns [[58]15–[59]21]. Gaussian Process (GP) based methods are particularly attractive, since they stem from widely studied, tried and tested theories and are well known to accurately model time course data [[60]16, [61]17, [62]19, [63]21, [64]22]. In this paper, we developed a fast and accurate GP based differential time course gene expression analysis tool, called GEAGP (Gene Expression Analysis using Gaussian Process), to compare temporal gene expression patterns. We first tested this tool on a simulated dataset and compared its performance with other similar statistical tools. Subsequently, we used this tool to identify potential genes which characterize non-responsiveness of BC cells to lapatinib. Some of the identified genes were then validated by quantitative RT-PCR. Further bioinformatics analysis led to the discovery of a potentially new target for treating lapatinib insensitive cells, the vitamin D receptor (VDR). The potential of the VDR as a new treatment option was then examined in cell line models using concurrent treatment with varying doses of the VDR agonist calcitriol in combination with different doses of lapatinib in lapatinib insensitive and resistant cell lines. 2 Materials and methods 2.1 Imputing missing values in the microarray data In the microarray dataset most experiments were performed in quadruplicates, except in a few cases where only three replicates were available. In these cases, we calculated the sample mean (μ[s]) and standard deviations (σ[s]) using the three available replicates and then generated a random number by sampling a normal distribution ( [MATH: N(μs ,σs2) :MATH] ) with the same mean and standard deviation. This random number was then used to represent the missing fourth replicate value [[65]23]. 2.2 Empirical estimation of the GP parameters μ[i](t), Σ[i] We assumed that the expression profile of an mRNA is a GP with mean function and covariance matrix μ[i](t), Σ[i] respectively, i.e. m[i](t)∼N(μ[i](t), Σ[i]). We further assumed that the mean function itself is a GP, i.e. it has the following prior distribution: μ[i](t)∼N(μ[0](t), Σ[0]) where μ[0](t), Σ[0] are hyperparameters. Since the distribution of the mean function is unknown a priori, μ[0](t) was assumed to be zero, and the Σ[0] was assumed to be defined by a Ornstein–Uhlenbeck function ( [MATH: Σ0t, t=ψi exp(|tt|δi ) :MATH] ) [[66]24, [67]25] which is an exponential functional form that describes the correlation between values of μ[i](t) at different time points (μ, μ[i]). The parameters δ[i], ψ[i], d[i], v[i], and [MATH: σi2 :MATH] of the covariance matrices Σ[0] and Σ[i] can only take positive values. Therefore, they were assumed to have gamma prior distributions with location and scale parameters α = 0.1 and b = 100 respectively. The choices of a, b ensures that the corresponding gamma distributions are flat, thereby allowing a wide range of values for the above parameters to be considered during inference/estimation of μ[i](t), Σ[i]. Multiplying the likelihood (see [68]Eq 2 in the “[69]Results” section) by the prior distributions of μ[i](t), δ[i], ψ[i], d[i], v[i], and [MATH: σi2 :MATH] , and integrating the resulting joint distributions with respect to μ[i](t) one arrives at the following marginal posterior. [MATH: p(δi, ψi, < msub>di, vi, σ i2|mir(t))=1(2π)< mrow>Nr* NT2 |Σi|Nr2r=1< /mn>Nrexp( μmr(t)¯T Σi1μmr(t)¯2)×Γ(σ2 |a,b)Γ(δi< mo>|a,b)Γ(σi| a,b)Γ( di|a,b)Γ(vi|a,b ) :MATH] Where [MATH: μmr(t)¯=mir(t) μi(t)¯ :MATH] and [MATH: μi(t)¯=(Σ01+NrΣi1)1< mtext> NrΣi1 ma :MATH] (1) We used active set algorithm implemented in MATLAB’s fmincon function (see [70]http://uk.mathworks.com/help/optim/ug/constrained-nonlinear-optimiz ation-algorithms.html for details) to find the values of δ[i], ψ[i], d[i], v[i], and [MATH: σi2 :MATH] which maximizes the above posterior probability. The likelihood of an observed mRNA expression profile was calculated by replacing the maximum a posteriori estimates of δ[i], ψ[i] in [71]Eq 1, replacing μ[i](t) in the likelihood function (see [72]Eq 2 in the “[73]Results” section) by [MATH: μi(t)¯ :MATH] in [74]Eq 1, replacing d[i], v[i], and [MATH: σi2 :MATH] in [75]Eq 2 by their maximum a posteriori values. 2.3 Simulating time course gene expression profiles The temporal profiles m[ij](t) of the i^th gene in the j^th condition were generated by sampling a hierarchical GP with mean and covariance matrix μ[ij](t), Σ[ij] respectively. The covariance matrix (Σ[ij]) has two components [MATH: Σijt,t :MATH] , [MATH: ΣijN :MATH] to account for systematic and random variabilities, respectively. For simulation, we used a Gaussian kernel function ( [MATH: Σijt,t=exp( tt) 2l ij :MATH] ) to model the systematic variabilities [MATH: (Σit,t) :MATH] where l[ij] is the smoothing parameter. The random noise is Gaussian with 0 mean and variance [MATH: σij2 :MATH] , i.e. [MATH: ΣijN=σij2I :MATH] , I is an identity matrix. The smoothing parameter l[ij] was sampled from a gamma distribution with shape and scale parameters a[ij] and b[ij], respectively. The mean function μ[ij](t) itself was randomly generated by sampling a GP with mean 0 and covariance matrix [MATH: Σijt,t :MATH] . If a gene (i) is not differentially expressed between two conditions (j = 1, 2), then μ[i1](t) = μ[i2](t) and a[i1] = a[i2] = 15, b[ij] = 1. In the opposite case, i.e. when a gene (i) is differentially expressed between two conditions (j = 1, 2), we assumed μ[i1](t) ≠ μ[i2](t) and these were generated by sampling two corresponding GPs. The smoothing parameters l[i1],l[i2] were generated by sampling Gamma distributions with the parameters a[i1] = 15, b[i2] = 1; a[i2] = 25, b[i2] = 1 respectively. 2.4 Gene expression data The gene expression dataset, which has been described previously, was obtained from GSK in the form of raw data files (.cel files) [[76]12]. Hegde et al., treated four cell lines (BT-474, SKBR-3, T47D and MDA-MB-468) with 0.1% DMSO (control), 0.1 μM lapatinib and 1 μM lapatinib at 0, 2, 6, 12, 24 hours. Not all samples that were taken had data for all time-points (see [77]Table 1). All measurements were performed using the U133A Affymetrix human 22,000-element microarray. There were 36 arrays for each cell line, so a total of 144 arrays were used in our analysis. We performed a probe centric analysis, i.e. the intensity values of each Affymetrix probe was modelled using the aforementioned GP model. Once a probe was identified as a potential marker, the corresponding gene was identified using BioMart ([78]http://www.biomart.org/) mapping service. Table 1. Treatment timepoints available from the Hedge et al.’s dataset. Cell lines 0.1 μM lapatinib 1 μM lapatinib 0.1% DMSO SKBR-3 6, 12 and 24 hours 6 and 12 hours 0, 2, 6, 12, 24 hours BT-474 2, 6, 12, 24 hours T-47D 6, 12 and 24 hours MDA-MB-468 [79]Open in a new tab 2.5 Cell-lines and reagents SKBR-3, BT-474, MDA-MB-453, HCC 1954, MDA-MB-468, and T47D were obtained from ATCC and the JIMT-1 cells were obtained from the German Tissue Repository DSMZ. HCC 1954-L was developed in-house through continuous exposure to lapatinib [[80]26]. BC cell lines were maintained in RPMI 1640 medium supplemented with 10% fetal bovine serum (Gibco). HCC 1954-L were maintained in culture in 1μM lapatinib, however, cells were removed from drug 7 days prior to all assays. All cell lines were maintained at 37°C in a 5% CO[2] incubator. Calcitriol and lapatinib were obtained from Sequoia Research Products (Pangbourne UK). Cell culture media, DMSO, ethanol were obtained from Sigma-Aldrich (Dublin, Ireland). Lapatinib stock solution (10 mM) was prepared in DMSO. Calcitriol stock solution (10 mM) was prepared in ethanol and stored at -20°C. 2.6 Lapatinib treatment and RNA extraction Triplicate RNA samples were prepared from SKBR-3, BT-474, MDA-MB-468, T47D, MDA-MB-453 and JIMT-1. The cell lines were grown to approximately 75% confluency. Test samples were treated with 1 μM lapatinib for 16 hours. Control samples were treated with equivalent concentrations of DMSO. After the 16 hour incubation, RNA was extracted from the control and treated samples using the RNeasy mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s protocol and were treated with RNase-free DNase (Qiagen). RNA was isolated from MDA-MB-453 and JIMT-1 cells using TriReagent (Sigma, Ireland). The RNA was extracted using a liquid-liquid extraction procedure with chloroform (0.2 mL) and iso-propanol. The RNA was washed with 75% ethanol and resuspended in 40 μL RNase free water. cDNA was prepared from 2 μg of total RNA using a high capacity RNA to cDNA kit (Applied Biosystems, Foster City, CA, USA). 2.7 Taqman RT PCR Taqman Array 96 well fast plates (Applied Biosystems, Dublin Ireland) were pre-coated with primers for the specific genes for validation (ALDH3A2, AMD1, ANKRD10, ATP2B4, COPS3, EIF2S2, EIF4B, GAPDH, GTPBP4, GUSB, IGBP1, ING4, KLHDC2, KLHL24, MPZL1, NUDC, PPIC, PPP2CA, PTP4A1, PTP4A2, RB1CC1, RNF13, SOCS3, SPAG1, TFAP2C, TIMM17A, TJP2, VDR, WIPI2, ZDHHC17). 40 ng of cDNA and 5 μL of Taqman Fast Universal Master Mix (2x), no AmpErase UNG (Applied Biosystems) were dispensed into each well. The following thermal cycling specifications were performed on the ABI 7900 Fast Real-Time PCR system (Applied Biosystems, Foster City, CA, USA); 20 s at 95°C and 40 cycles each for 3 s at 95°C and 30 s at 60°C. Expression values were calculated using the comparative cycle threshold (Ct) method [[81]27]. 18S ribosomal RNA was selected as the endogenous control. The cycle threshold (Ct) indicates the cycle number by which the amount of amplified target reaches a fixed threshold. The Ct data for 18S was used to create ΔCt values [ΔCt = Ct (target gene)-Ct (18S)]. ΔΔCt values were calculated by subtracting ΔCt of the calibrator (DMSO controls) from the ΔCt value of each target. Relative quantification (RQ) values were calculated using the equation 2-ΔΔCt. Differences in the mRNA expression level between untreated and drug treated samples were assessed using the Students t test. A VDR gene expression assays (Hs01045840_m1) was performed with MDA-MB-453 and JIMT-1 cDNA (5 ng/well), using 18S as (Hs99999901_S2) as endogenous control. 2.8 Proliferation assay in vitro Proliferation was measured using an acid phosphatase assay [[82]28]. 2.5 to 5 x 10^3 cells per well were seeded in 96 plates and incubated for 24 hours prior to the addition of drug. After 5 days of drug treatment cells were washed with PBS. 10 mM paranitrophenol phosphate substrate (Sigma-Aldrich) in 0.1 M sodium acetate buffer with 0.1% Triton X (Sigma Aldrich) was added to each well and incubated at 37°C for 2 hours. The reaction as stopped with 50 μL of 1 M NaOH and the absorbance was read at 405 nM (reference—620 nM). Growth of drug treated cells was calculated relative to control untreated cells in biological triplicate. 3 Results 3.1 Comparing time course gene expression profiles using GP Time course gene expression data consists of noisy measurements of mRNA levels at different time points. As discussed before, these measurements contain both systematic and random variabilities. It is necessary to take both types of variabilities into account when comparing two sets of time course measurements of mRNA expression. We developed an empirical GP model of time course mRNA profiles that captures (a) average mRNA levels at different points in time, (b) the extent of random uncertainty caused by cell to cell variability and experimental measurement noise; and (c) the temporal variability in mRNA profiles caused by internal mechanisms of gene regulation. In our formulation, a time course mRNA profile (m[i](t)) is assumed to have a Gaussian distribution with a mean (μ[i](t)) which represents the average mRNA levels at different time points (t), and a covariance matrix (Σ[i]) which captures the overall variability in the mRNA profile. The covariance matrix (Σ[i]) has two components ( [MATH: Σi=Σit,t+ΣiN :MATH] ), one ( [MATH: Σit, t :MATH] ) accounts for systematic variability between mRNA expressions at any two instants of time (t, t') and the other ( [MATH: ΣiN :MATH] ) accounts for random noise. We used a modified Ornstein–Uhlenbeck function ( [MATH: Σit, t=vi exp(|tt|di ) :MATH] ) [[83]24, [84]25, [85]29], to model the systematic variabilities [MATH: (Σit,t) :MATH] in mRNA expressions between two instants in time (t, t'), where v[i] and d[i] are two parameters [[86]30]. The random noise is assumed to be Gaussian with 0 mean and variance [MATH: σi2 :MATH] (i.e. [MATH: ΣiN=σi2I :MATH] , I is an identity matrix). Under these assumptions, the likelihood of a set of observed mRNA profiles ( [MATH: mir( t) :MATH] ) is given by [MATH: p(mi r(t)|μi(t),Σi)< mo>=1(2 π) Nr*NT< /mrow>2 | Σi|Nr2r=1< /mn>Nrexp(μmr( t)TΣ< mi>i1μ mr(t)2) :MATH] (2) where N[r] is the number of replicates and [MATH: μmr( t)=mir (t)μi(t) :MATH] . μ[i](t) and Σ[i] are estimated from observed data as shown in Materials and Methods ([87]M&M). We designed a statistical hypothesis test to see whether two sets of ( [MATH: mic1(t),mic2< /mrow>(t) :MATH] ) expression profiles of an mRNA, measured under two experimental conditions (c[1], c[2]), are significantly different from each other. In this test we compared the likelihoods ([88]Eq 2) of two hypotheses: (a) both sets of mRNA profiles ( [MATH: mic1(t),mic2< /mrow>(t) :MATH] ) have the same GP model ( [MATH: mic1(t),mic2< /mrow>(t)~GP(μi(t),Σ i) :MATH] ); and (b) they come from two different sets of GP models ( [MATH: mic1(t)~GP(μic1(t),Σic1) ,mic2(t)~GP(μic2(t),Σ< mi>ic2) :MATH] )). The comparisons were performed using a likelihood ratio test. The test was repeated for each mRNA in the dataset, producing a set of p-values. Subsequently, the Benjamini-Hochberg method was used to correct for multiple testing, and differentially expressed mRNAs were selected within a 5% false discovery rate (FDR). These mRNAs are the ones whose temporal expression patterns are significantly different between the two experimental conditions under investigation. An implementation of the above pipeline, named GEAGP, in MATLAB is freely available for download from [89]https://github.com/SBIUCD/GEAGP.git. 3.2 Comparing the performance of GEAGP with other methods To evaluate and compare the performance of the GEAGP framework with other methods we generated 100 simulated datasets. Each simulated dataset consists of the expression profiles of 1000 genes measured at 10 equal intervals over a period of 100 hours in two different conditions. Four replicates were generated for each expression profile. Approximately half of the genes in each dataset were randomly chosen to have differential expression patterns between the two conditions. For the simulation, we assumed that the genes which are not differentially expressed have the same underlying pattern (average expression at each time point) with the same level of added systematic and random variabilities in each replicate across the two conditions. On the other hand, the genes which are differentially expressed were assumed to have different underlying patterns with different levels of systematic variabilities across the two conditions. The random variability, which represents biological variability and measurement errors, was kept at the same level in both conditions. Details of the data simulation process are described in the methods section. Random noise was kept at a moderate level (standard deviation = 0.2). Then, we used four different methods to find differentially expressed genes in each simulated dataset. Two of these methods, ttest & ranksum test, are widely used to analyse static gene expression data. The other two methods, GEAGP and GPREGE [[90]17], use GP to analyse temporal expression profiles. The main differences between GEAGP and GPREGE [[91]17] are the following: GEAGP directly estimates the mean functions of the GP from data, whereas GPREGE assumes that the mean function is a linear combination of a set of basis functions, and attempts to infer the coefficients of these linear equations. The two methods also use different mathematical formulations for parameter estimation and optimization process (see [92]M&M and [[93]17] for details). Prediction accuracy of each method was evaluated by calculating the area under the receiver operating characteristic (AUROC) curve (for details see [[94]31]) based on the results produced by these methods on each dataset. AUROCs can have values between 0 and 1, and the closer these values are to 1 the better is the accuracy of the inferred networks, with AUROC = 1 being the ideal case. The average AUROCs across all 100 datasets represents the overall prediction accuracy of a method and the corresponding standard deviation represents the confidence intervals surrounding these estimates. The results of the above analysis suggest that GEAGP and GPREGE have comparable accuracies (AUROCs = 0.946 ± 0.0074 and 0.952 ± 0.0046 respectively), whereas both t-test and ranksum test were less accurate than the GP based methods (average AUROCs = 0.7538± 0.0087 and 0.7432 ± 0.0082 respectively). Although the accuracies of GPREGE and GEAGP are comparable, GEAGP has the distinct advantage of speed. In the above dataset GEAGP and GPREGE took on an average ≈ 0.5 and ≈ 7.5 seconds, respectively, to execute each differential expression test on a Intel Core i7 (3rd Gen) 3720QM based laptop with 20Gb RAM, i.e. GEAGP is on an average about 15 times faster than the GPREGE. To illustrate what it means for a medium scale bioinformatic analysis, consider a dataset consisting 20,000 genes and 10 experimental conditions. To find differentially expressed genes between each pair of conditions one needs to perform [MATH: (10< /mtd>2)=45 :MATH] differential expression tests for each gene. Using a single core of the Intel Core i7 (3rd Gen) 3720QM processor, GEAGP will take approximately 5 days to do this analysis, whereas GPREGE will take more than 78 days. Using all four cores of the processor GEAGP will finish the task in about 1 days and 8 hours, whereas GPREGE will take more than 19 days. Thus, GEAGP provides similar accuracy as GPREGE at a fraction of the computational cost, making it more suitable for the analysis of medium to large scale time course data sets. Encouraged by the above results, we used GEAGP to analyse real gene expression datasets. 3.3 Discovering markers of lapatinib insensitivity in BC cells using GEAGP GEAGP was used to analyse time course mRNA profiles in BC cell lines which were treated with different doses of lapatinib and DMSO [[95]12]. In this experiment the mRNA expression profiles were measured in two lapatinib responsive (BT-474, SKBR-3) and two lapatinib non-responsive (MDA-MB-468, T47D) BC cell lines. HER2 and EGFR, the two receptor kinases targeted by lapatinib, have different expression patterns in these cell lines ([96]Table 2). Table 2. Histological-subtypes, HER2 and EGFR receptor expressions and mutation status of SKBR-3, BT-474, T47D and MDA-MB-468 cell lines. Protein Expression Gene Mutation Status Cell line Subtype EGFR HER2 KRAS BRAF PI3K RB1 p53 PTEN ERBB2 SKBR-3 HER2 +ve - - R175H BT-474 HER2 +ve 38.65 97.99 K111N E285K AMPL T47D Luminal A 44.35 77.55 H1047R L194F MDA-MB-468 Basal Like 99.66 39.05 DEL R273H E545A [97]Open in a new tab The percentile scores indicate the percentage of cell lines in GDSC database which had similar or lower expressions of the same genes. The first column contains cell-line name. Second column contains histological subtypes, third and fourth columns contain expression of EGFR and HER2 receptors. The expressions are given in percentiles, i.e. percent of cell-lines in the GDSC database which have similar or lower expressions of HER2 or EGFR than the cell line corresponding to the percentile value. Fifth till eleventh columns contain mutation status. Empty cells represent no mutation, DEL represents deletion, AMPL represents amplification, and the remaining non-empty cells contains amino acid substitution information. HER2 is over-expressed in both lapatinib responsive cell lines SKBR-3 and BT-474 [[98]32]. EGFR is overexpressed in SKBR3 [[99]33], but has relatively low expression in BT-474 (~62% of 1018 cancer cell lines in the GDSC database had higher EGFR expression than BT-474) [[100]34]. The non-responsive cell lines, T47D and MDA-MB-468 are classified as luminal A and basal-like BC cells in literature [[101]32]. Both types of BC cells have low HER2 expression. However, T47D has high HER2 expression (more than 77.55% of GDSC cell lines [[102]34]) and moderate low EGFR expression (lower than ~56% of GDSC cell lines [[103]34]). MDA-MB-468 cells have low HER2 expression (lower than ~61% of GDSC cell lines [[104]34])) and very high level of EGFR expression (higher than ~99.6% of all GDSC cell lines [[105]34]). We also looked at the mutation spectrum of these cell lines using the COSMIC database (cancer.sanger.ac.uk). Data on some of the mutations which were previously implicated in lapatinib resistance are shown in [106]Table 2. Among the responsive cell lines, SKBR3 has a p53 mutation, whereas, BT-474 has PI3K, ERBB2 and p53 mutations. Among the non-responsive cell lines, T47D has a p53 and PI3KCA mutation, whereas MDA-MB-468 has p53, PTEN and RB1 mutations. None of the cell lines have KRAS or BRAF mutation. Therefore, there is no particular set of mutations that separate the above responsive and non-responsive cell lines. This suggest that the expressions of EGFR and HER2 receptors and/or the mutation statuses do not explain the responsiveness of the above cell lines to lapatinib treatment. Hence, these cell lines are reasonable choices for studying lapatinib response in BC cells. The cells were treated with 0.1 μM DMSO, 0.1 μM or 1 μM lapatinib[[107]12]. The measurements were taken at 0, 2, 6, 12 and 24 hours for DMSO treated cells; at 2, 6, 12, 24 hours for cells treated with low doses (0.1μM) of lapatinib (except for SKBR-3 cells for which no measurements were available at 2 hours); and at 2, 6 and 12 hours for cells treated with high doses (1.0 μM) of lapatinib (except for BT-474 and SKBR-3 cell lines for which measurements were not available at 2 hours). Each experiment was replicated 3–4 times. Missing replicates (where only 3 replicates were available instead of 4) were imputed based on the intensities observed in the other replicate experiments (see [108]M&M for details). The resulting dataset was then used to identify potential markers of lapatinib insensitivity in BC cells. We assumed that genes which satisfy all of the following criteria are potential markers for lapatinib insensitivity: * Have similar expression profiles within the responsive or non-responsive groups of cell lines, but are differentially expressed between these two groups. * Have altered expression in response to both doses of lapatinib treatment in responsive cells but not in non-responsive cells. To find genes which satisfy the above criteria we performed a two stage analysis. In the first stage each mRNA expression profile was subjected to the following hypothesis tests ([109]Fig 1): Fig 1. A schematic diagram of the GP based analysis performed on the gene expression dataset of [[110]12]. [111]Fig 1 [112]Open in a new tab * Responsive vs non-responsive groups (Test 1): The objective of this test was to identify genes which have differential expression patterns between DMSO treated responsive (SKBR-3, BT-474) and non-responsive cell lines (MDA468, T47D). Here we tested the hypothesis that a single GP model is sufficient to represent the mRNA expression profiles of a gene in all four cell lines when the cells are treated with DMSO, vs the alternative hypothesis that two GP models are needed, one for the responsive cell lines and another for the non-responsive cell lines. * Responds to 0.1μM of lapatinib in responsive cells (Test 2): This test aimed to identify genes which have differential expression patterns between DMSO treated and 0.1 μM lapatinib treated responsive cells (BT-474, SKBR-3). Here we tested the hypothesis that one GP model is sufficient to model the mRNA expression profiles of a gene in both DMSO and lapatinib (0.1 μM) treated responsive cell-lines (BT-474, SKBR-3) against the alternative hypothesis that two separate GP models are needed, one for DMSO treated cells and the other for lapatinib (0.1 μM) treated cells. * Responds to 1 μM of lapatinib in responsive cells (Test 3): The goal of this test was to identify genes which have differential expression patterns between DMSO and 1 μM lapatinib treated responsive cells (BT-474, SKBR-3). Here, we tested the hypothesis that the one GP model is sufficient to model the mRNA expression profiles in DMSO and lapatinib (1 μM) treated responsive cells (BT-474, SKBR-3) against the alternative hypothesis that two separate GP models are needed, one for DMSO treated cells and the other for lapatinib (0.1 μM) treated cells. * Responds to 0.1 μM of lapatinib in non-responsive cells (Test 4): This test identified genes whose expression is altered in response to low doses (0.1 μM) of lapatinib in non-responsive cells (MDA468 and T47D). Here, we tested the hypothesis that one GP model is sufficient to model the mRNA expression profiles in DMSO and lapatinib (0.1 μM) treated non-responsive cells (MDA468 and T47D) against the hypothesis that two separate GP models are needed, one for DMSO treated cells and the other for lapatinib (0.1 μM) treated cells. * Responds to 1μM of lapatinib in non-responsive cells (Test 5): This test identified the genes whose expression in altered in response to high doses (1 μM) of lapatinib in non-responsive cells (MDA468 and T47D). This tested the hypothesis that only one GP model is sufficient to model the mRNA expression profiles in DMSO and lapatinib (1 μM) treated non-responsive cells (MDA468 and T47D) against the hypothesis that two separate GP models are need, one for DMSO treated cells and the other for lapatinib (0.1 μM) treated cells. The genes which passed the first test and at least one of the second and third tests, but fail both of the last two tests within a 5% false discovery rate were assumed to satisfy the criteria of genes selectively regulated by lapatinib in responsive, but not in non-esponsive cells. This analysis selected 519 genes out of 22000 genes as potential markers of lapatinib response (Table A in [113]S1 File). In a second step, these genes were further analysed using established bioinformatics methods as described below. 3.4 Bioinformatics analysis of markers of lapatinib insensitivity in BC cells First, we performed gene ontology enrichment analysis on the 519 selected genes using DAVID [[114]35, [115]36]. The enriched gene ontology terms were then summarized and visualized by REVIGO ([116]Fig 2A) [[117]37]. The most enriched gene ontology terms fall into seven main categories, i.e. (i) cell cycle, (ii) apoptosis or programmed cell death, (iii) DNA damage, (iv) RNA splicing, (v) intracellular transport, (vi) signal transduction and (vii) chromosome assembly. These biological processes are usually found altered in cancer cells, and therefore it is not surprising to see genes which participate in these process responding differentially to lapatinib treatment in responsive and non-responsive BC cells. We also performed pathway enrichment analysis using DAVID [[118]35, [119]36] and employing KEGG [[120]38] and PANTHER pathway databases [[121]39]. Several signalling pathways that play key roles in cancer, e.g. EGFR pathway, FGF pathway, spliceosome, proteasome, FcγR mediated signalling, were also found to be enriched in the selected genes ([122]Fig 2B). FGF and FcγR share several components with EGFR pathway which is activated by the EGFR family of receptors, two of which are targets of lapatinib. Ten genes, RAC1, STAT1, YWHAZ, PPP2CB, MAP2K1, MAPK1, PPP2R5B, PLCG1, PPP2CA, STAT5B, belonging to the EGFR pathway were in the list of potential markers of lapatinib insensitivity. Fig 2. Bioinformatics analysis of the selected genes. [123]Fig 2 [124]Open in a new tab (A) Enriched gene ontology terms are summarized and visualized by REVIGO. Here, each circle represents a gene ontology term and the size of the circle represents the extent of enrichment. (B) Pathway enrichment analysis of the selected genes. Pathways are shown in X-axis and the enrichment scores are shown in Y-axis. (C) Transcriptional module found in the identified genes. (D) Transcriptional activity of VDR. (E) PPI network induced by the identified genes. (F,G) PPIs of VDR and EIF2S2. We then looked for known transcription regulatory interactions (TRIs) among the selected genes. Human genome wide known TRIs were retrieved from two sources, the ENCODE database containing ChIP (chromatin immunoprecipitation) on ChIP data [[125]40–[126]42] and the HTRIB database [[127]43] which contains experimentally validated TRIs. All TRIs, where both the transcription regulator and its target gene were identified as potential markers for lapatinib insensitivity in the aforementioned GEAGP analysis, were retrieved. The ensemble of these TRIs ([128]Fig 2C) represents part of the known human transcription regulatory network which is dysregulated in lapatinib non-responsive BC cells. Interestingly, this dysregulated network consists of a transcriptional module formed by eight transcription factors STAT1, CTCF, EP300, FOXA1, VDR, MAPK1, TFAP2C, E2F1 ([129]Fig 2C) which regulate each other and a large number of the genes selected by the GEAGP analysis. Among these transcription factors, STAT1, MAPK1, FOXA1, TFAP2C, E2F1 are associated with resistance or insensitivity to several anticancer therapies including tyrosine kinase inhibitors such as lapatinib in BC cells [[130]44–[131]51]. CTCF and EP300 also have been linked to BC. The putative tumour suppressor gene EP300 is frequently mutated in BC [[132]52, [133]53], while CTCF overexpression is correlated with resistance to apoptosis in BC [[134]52, [135]53]. Polymorphism of the VDR (Vitamin D Receptor) was previously associated with BC [[136]54], but its expression has no known association either with BC or with lapatinib insensitivity. Since it was found to be part of the same transcriptional module whose constituents are associated with lapatinib insensitivity, VDR could be a potential target for modulating lapatinib response in BC cells which do not respond to lapatinib. We further investigated the protein-protein interaction (PPI) network formed by the protein products of the selected genes. The STRING database [[137]55] was used to find PPIs between the products of the selected genes ([138]Fig 2E). Interestingly, many of the above transcription factors (MAPK1, TFAP2C, EP300) act as large hubs in the PPI network. They were also found to interact with each other, not only at the transcriptional level as found in the TRI network, but also at the protein level ([139]Fig 2E–2G). In the PPI networks, VDR interacts with relatively few proteins ([140]Fig 2F). However, its interactors include STAT1, MAPK1 and PPP2CA, all of which have been associated with lapatinib response [[141]44–[142]46, [143]56]. This strengthens the case for VDR as a potential target for influencing lapatinib insensitivity. Another interesting protein is EIF2S2. It interacts with several large hubs such as MAP2K1, ELAVL1 and PAPOLA1 in the PPI network ([144]Fig 2G), and therefore may play important roles in insensitivity of BC cells to cancer drugs. The above analysis suggests that many of the genes which were identified by the GEAGP algorithm have association with BC and lapatinib insensitivity. Additionally, it also demonstrates that some of these genes play important role in the transcriptional and protein interaction networks that are dysregulated in lapatinib insensitive BC cells, and therefore can be potential targets for treating these cells. 3.5 Validating the markers of lapatinib insensitivity using GDSC data We analysed the GDSC database [[145]34] to see if the expression of the genes selected by the GEAGP algorithm correlates with responses of BC cells to HER2 and/or EGFR targeted therapies. Any such correlation will strengthen their potential as markers for BC cell response to HER2 and/or EGFR targeted therapies. GDSC database [[146]34] contains basal gene expression and drug response data for fifty BC cell lines belonging to different histological subtypes and five drugs (lapatinib, erlotinib, EKB-569, afatinib and CP724714) that target HER2 and/or EGFR tyrosine kinases. Expression data was available for 458 of 519 genes selected by the GEAGP algorithm. We calculated pearson correlation between the mRNA expressions of each gene and the area under the dose response curves of each cell-line for each drug. mRNA expressions of 120 genes out of 458 had statistically significant (p-value<0.05) correlation with the responses of BC cells to at least one of the five drugs (Table B in [147]S1 File). When corrected for false discovery rate (FDR), 32 of the 120 genes were selected at 10% FDR (Table B in [148]S1 File). Most of these genes’ expressions correlated with response to afatinib which, like lapatinib, targets HER2 and EGFR. There were relatively little correlation between the expressions of these genes and lapatinib response in GDSC dataset [[149]34]. This is most likely due to lack of lapatinib response data which is available for only 13 out of 50 BC cell lines. Nevertheless, the above analysis suggests that many of the genes that were identified as markers of lapatinib insensitivity in T47D and MDA-MB-468 cells, can indeed be generally used as markers of BC cell response to a variety of HER2 and/or EGFR targeted therapies. 3.6 Experimental validation of selected genes It is not feasible to experimentally validate all 519 genes which were identified in the above analysis due to time and cost factors. To select a smaller subset for experimental validation, we first compared our list of potential lapatinib insensitivity markers to that determined in a previous study by O'Neill et al.[[150]57]. O’Neill et al. used multiple parallel t-tests to analyse part of the same microarray dataset used here for identifying potential lapatinib biomarkers. 25 genes were found to be common between our list and that of the O’Neill et al.‘s study. These common genes include * VDR, EIF2S2 and SOCS3 (regulator of STAT1) which were found to play important roles in the PPI and transcriptional networks in the bioinformatics analysis * ANKRD10, PTP4A1, PTP4A2, VDR, ATP2B4 which had statistically significant correlation with BC cell responses to HER2 and/or EGFR targeted therapies Therefore, this list includes a mix of genes identified in the bioinformatics analysis, GDSC data [[151]34] analysis, and a set of new lapatinib insensitivity markers. We further added two more genes to this list, phosphatase PPP2CA and transcription factor TFAP2C, both of which was identified to be important in the bioinformatics analysis and the later also has statistically significant correlation with BC cell responses to HER2 and/or EGFR targeted therapies (Table B in [152]S1 File). The differential gene expression levels of the resulting 27 genes (see [153]Table 1) and an endogenous control gene (18S ribosomal RNA) were measured in mRNA from SKBR-3, BT-474, T47D and MDA-MB-468 cell lines treated with DMSO or lapatinib (1 μM) for 16 hours, using real time quantitative polymerase chain reaction (qRT-PCR). The data obtained from the above experiment were then statistically analysed to examine whether expression of these genes was differentially altered in response to lapatinib in responsive and non-responsive cell lines. Relative expression values were calculated using the comparative threshold cycle (C[T]) method [[154]27]. First, we subtracted the cycle threshold (C[T]) values of the housekeeping gene (18S) from those of the selected genes [[155]58]. The resulting values (ΔC[T]) represent the responses of each gene to DMSO/ lapatinib and were analysed using ANCOVA (Analysis of Covariance) analysis [[156]59]. For the ANCOVA analysis, the ΔC[T] values of each gene were used as response variables, the dose of lapatinib was used as predictor variable (1 μM lapatinib for treated cell, 0 μM for DMSO treated control cells), and the responsive or non-responsive (to lapatinib) status of the cell lines were used as covariates. The p-values calculated by the ANCOVA analysis were then subjected to correction for multiple testing. For this purpose, a false discovery rate (FDR) corresponding to each p-value was calculated using the Benjamini Hochberg method. Genes with FDR lower than 0.1 (representing 10% false discovery rate) were considered to pass the validation test. We used MATLAB functions aoctools and mafdr for the ANCOVA and FDR analysis, respectively. 16 out of 28 genes passed the test and their changes (in terms of average log-fold change with respect to DMSO treated cells) to lapatinib treatment in all four cells lines are shown in [157]Fig 3. Fig 3. Real time qPCR based validation of a set of selected potential biomarkers. [158]Fig 3 [159]Open in a new tab Each panel consists of the log fold changes (lapatinib treated vs untreated, shown in Y-axis) of the corresponding gene (shown in the header of each panel) in all four cell lines (X-axis). Plots are based on N = 3 biological replicates. 3.7 Identifying potential therapeutic targets for treating lapatinib insensitive BC cells To find potential targets, we focussed on the genes which satisfy the following criteria: * identified by GEAGP as potential lapatinib insensitivity markers * have statistically significant correlation (at <10% FDR) with BC cell response to HER2 and/or EGFR targeted therapy. 32 genes (Table B in [160]S1 File) satisfy these conditions. To further narrow down the list of potential targets we investigated whether the expressions of any of these genes correlate with BC patient survival. Ideally, we would like to analyse data from BC patients who was treated with HER2 and/or EGFR targeted therapies. But, due to lack of sufficient details on treatment history of patients, we analysed data from HER2 negative, and basal type BC patients who typically do not respond to lapatinib therapies [[161]60, [162]61]. In some cases, HER2 positive patients do not respond to these types of therapies due to lack of EGFR expression or acquired resistance [[163]62]. Therefore, we also included HER2 positive patients in our survival data analysis. We used Kaplan-Meier plot to find association between expression of each gene in the list (Table B in [164]S1 File) and recurrence free survivals of HER2 negative, basal-like and HER2 positive patient cohorts. We used kmplot [[165]63] web tool which integrate data from several public sources to perform this analysis. 26, 21 and 23 genes (out of total 32) were associated at 10% FDR with the survival of HER2 negative, basal and HER2 positive patient cohorts (Table C in [166]S1 File). 14 genes, SDS, UCP2, KPNB1, REEP5, ALAS3, ZNF10, SDC1, GPAA1, RHOB, EXOSC4, FOXA1, VDR, GLANT7, MLPH, had significant association with the survival of all three patient cohorts. We did not find any reference to REEP5, ALAS3, ZNF10, EXOSC4 and GLANT7 in existing BC literature. Therefore association of these genes to BC, especially lapatinib insensitivity of BC cells, may be novel finds of this study. SDS, UCP2, KPNB1, SDC1, GPAA1, and MLPH had been sporadically studied in the context of BC [[167]64–[168]69]. But, to the best of our knowledge, their role in lapatinib insensitivity of BC cells are not known. VDR, FOXA1 and RHOB are well studied in the context of BC. However, to the best of our knowledge, their potential as a therapeutic target for treating lapatinib insensitive BC cells was not previously investigated. We selected VDR for further experimental testing. 3.8 Investigation of VDR as a therapeutic target Vitamin D Receptor (VDR), a member of the nuclear receptor family of proteins, acts as a receptor for active vitamin D as well as a DNA-binding transcription factor. In that respect, it is similar to more familiar cancer targets such as oestrogen receptor, androgen receptor and progesterone receptor [[169]70]. Patient survival analysis using data from kmplot database [[170]63] suggests that VDR expression does not correlate with recurrence free survival of BC patients in general ([171]Fig 4A). However, HER2 negative BC patients (typically do not respond to lapatinib) with high VDR expressions are more likely to survive longer than those with low VDR expressions ([172]Fig 4B). Interestingly, basal-like BC patients, most of whom have low or no HER2 expression show the opposite trend, that is, patients with low VDR expression are more likely to survive longer than patients with high VDR expression ([173]Fig 4C). HER2 positive patients exhibit similar trend as the basal-like patients ([174]Fig 4D). Furthermore, using Regulome explorer ([175]http://explorer.cancerregulome.org/) we investigated whether VDR expression has any significant association with other genomic features of BC patients. It was found to have significant correlations with protein expression of Estrogen receptor and the expressions of twenty two miRNAs ([176]Fig 4E), most of which had been previously implicated in proliferation, apoptosis, senescence, resistance to drugs including HER2 targeted therapies [[177]71, [178]72]. This further strengthens VDR’s potential as therapeutic target for lapatinib insensitive BC patients. One of the most commonly used commercially available reagents for VDR targeted therapy is Calcitriol, the active hormonal form of Vitamin D[3] [[179]73]. It primarily functions by binding to and activating VDR leading to signalling and transcriptomic changes [[180]73]. However, the effect of calcitriol depends on dose, frequency and time lapsed following treatment. For instance, low levels of calcitriol have been shown to possess pro-tumorigenic effects, whereas high doses of calcitriol, especially when applied in pulses, were shown to have anti-proliferative and anti-tumorigenic effects in several types of cancer cells [[181]74–[182]76]. Although at the protein level calcitriol activates VDR signalling by binding to it, it was also shown to reduce the mRNA expression level of VDR [[183]77–[184]79]. Recently, combinations of calcitriol and antiestrogen or gefitinib (a tyrosine kinase inhibitor), respectively, were found to reduce proliferation and increase apoptosis of BC cells more effectively than any of these compounds alone [[185]80, [186]81]. Fig 4. Interrogation of VDR as potential therapeutic targets for sensitizing lapatinib insensitive breast cancer cell lines to lapatinib treatment. [187]Fig 4 [188]Open in a new tab (A-D) Association between recursion free survival of all, HER2 negative, triple negative, HER2 positive breast cancer patients with VDR expression. (E) Association between VDR expression and the expressions of 22 miRNAs in the TCGA breast cancer patient cohort. (F) ΔΔC[T] values for Vitamin D Receptor in HER2 positive lapatinib insensitive cell lines (MDA-MB-453 and JIMT-1) showing changes between untreated cells and cells expose to lapatinib for 16 hours. (G-J) Combined treatment with lapatinib and calcitriol in MDA-MB-453, JIMT-1, HCC 1954 and HCC 1956-L cells respectively (* denotes p<0.05).Plots are based on N = 3 biological replicates. We investigated whether the combination of calcitriol and lapatinib can be used as an effective treatment strategy for lapatinib insensitive BC patients. We selected four cell lines for our experiment. Two of these, MDA-MB-453 and JIMT-1 are HER2 positive but innately insensitive to lapatinib. Whereas HCC 1954-L cell lines, which was developed by exposing HER2 positive HCC 1954 cells to lapatinib for a long period of time, has acquired resistance to lapatinib. The clinico-pathological subtypes and the expression levels of HER2, EGFR and VDR which are targets of lapatinib and calcitriol are shown in [189]Table 3. Table 3. Clinico-pathological subtypes of and receptor expressions in MDA-MB-453 and JIMT1 cell lines. Cell line Clinico pathological subtype HER2 (percentile) EGFR (percentile) VDR (percentile) MDA-MB-453 Triple negative[[190]82], HER2 positive [[191]83] 97.50 79.03 76.18 JIMT-1 HER2 positive 97.30 2.90 95.33 HCC 1954 HER2 positive 99.21 76.20 87.92 [192]Open in a new tab The receptor expression data were collected from the GDSC database [[193]34]. The expression data are shown in percentile, here X percentile means X percent of all cell lines in the GDSC database [[194]34] had equal or less expression than the above cell lines. Among the innately lapatinib insensitive cell lines, clinico-pathological subtype of MDA-MB-453 is fiercely debated in literature, while some suggest it is a triple negative BC cell with no or low HER2 expression [[195]82], others reports it to be HER2 positive with overexpressed HER2 [[196]83]. We compared HER2 expression in MDA-MB-453 cells with other cancer cell lines in GDSC database [[197]34] and found that HER2 has higher mRNA expression in MDA-MB-453 than ~97.5% of all (1018) cell lines in this database. JIMT-1 is known to be HER2 positive [[198]84] which is corroborated by the GDSC data [[199]34] ([200]Table 3). While EGFR is over expressed in MDA-MB-453, it is significantly under-expressed in JIMT-1 ([201]Table 3). VDR has above average expressions in MDA-MB-453 and JIMT-1 cell lines and have higher expression in JIMT1 than in MDA-MB-453 according to the GDSC database. However, in our qRT-PCR analysis VDR has higher expression in MDA-MB-453 than in JIMT-1 cells (Figure A in [202]S1 File). Despite overexpressing at least one of HER2 or EGFR neither of these cell lines respond to lapatinib. Following 16 hours of treatment with 1 μM lapatinib neither cell-line showed a significant change in VDR gene expression ([203]Fig 4F), reflecting the same behaviour as the lapatinib insensitive T47D and MDA-MB-468 cell lines, which were used to generate the time course gene expression data. HCC1954, which was used to develop HCC 1954-L with acquired lapatinib resistace, is known to be HER2 positive and has overexpressed HER2, EGFR, and VDR ([204]Table 3). The cells were treated with calcitriol (6 μM for MDA-MB-453, JIMT-1, HCC 1954 and 500nM for HCC 1954-L) and lapatinib (0.25 μM, 0.5 μM, 1.0 μM for all four cell lines), individually and in combination (calcitriol + lapatinib) for 5 days before measuring cell proliferation. In MDA-MB-453, JIMT-1, HCC1954 and HCC1954-L cells calcitriol alone resulted in 19%, 11%, 21% and 30% growth inhibition respectively ([205]Fig 4G–4J), whereas, different doses of lapatinib inhibited the growth of MDA-MB-453, JIMT-1, HCC1954 and HCC1954-L cells by 10–19%, 1–5%, 50–64% and 25–44% respectively ([206]Fig 4G–4J). However, the combination of calcitriol + lapatinib resulted in 36–38%, 13–17% and 43–52%, 49–70% growth inhibition in MDA-MB-453, JIMT-1, HCC1954 and HCC1954-L cells respectively depending on the doses of lapatinib ([207]Fig 4G–4J). Thus, among the innately insensitive cells, treatment with lapatinib and calcitriol showed a significant additive effect in MDA-MB-453 but not in JIMT-1 cells. The lower expression levels of VDR expression in the JIMT-1 cells compared to MDA-MB-453 cells may plausibly explain why these cells are less sensitive to calcitriol treatment. On the other hand, among the isogenic HCC1954 and HCC1954-L cells, only the latter displayed an additive effect of the lapatinib and calcitriol combination ([208]Fig 4I and 4J). Based on qRT-PCR analysis, VDR mRNA expression is increased in the HCC1954-L cells (1.7 fold) (Figure B in [209]S1 File), which may contribute to benefit observed with calcitriol treatment. 4 Discussion The rapid advent of omics technology made it possible to investigate the molecular networks of living cells in unprecedented detail. Consequently, it is now possible to track the behaviours of thousands of genes simultaneously over a period of time. However, identifying genes which show different temporal behaviour under different experimental conditions is not straightforward. Here, we developed a statistical tool that addresses this question and applied it to study the gene expression changes associated with lapatinib resistance in BC cells. Using existing datasets, our study identified several potential biomarkers of lapatinib resistance in BC, some of which were experimentally validated. We identified more than 500 genes as potential lapatinib resistance markers, many of which can also be potential targets for treating lapatinib and, more generally, HER2-targeted TKI resistant BC cells. One such target identified in this study is VDR. We experimentally verified whether VDR can be a viable target for combination therapy for treating HER2 positive lapatinib resistant (innate and acquired) BC patients. We selected two cell lines for the investigation of innate insensitivity, MDM-MB-453 where both HER2 and VDR are relatively highly expressed, and JIMT-1 where HER2 is highly expressed but VDR has significantly lower mRNA expression compared to MDA-MB-453. The combination of lapatinib and calcitriol inhibited proliferation more effectively than individual treatment by either reagents in MDA-MB-453, but not in JIMT-1. In the acquired resistance setting we used the HCC1954-L cells which were developed by exposing HCC1954 cells to lapatinib for 6 months. The combination of calcitriol and lapatinib inhibited proliferation of the lapatinib resistant HCC1954-L more effectively than either agent alone. These results suggest that the combination of calcitriol and lapatinib can be a potential combination therapy regimen for patients who express HER2 and VDR but do not respond to lapatinib treatment. However, these are preliminary results and need further experimental validation in a larger panel of cell lines and animal models to assess the clinical viability of this drug combination. Drug resistance is a complex process which involves dynamic adaptations of the interplay between signalling and transcriptional networks of cancer cells induced by drug treatments. A system level understanding of transcriptional and signalling networks is required to design therapeutic strategies targeted at overcoming such resistance. Such insight can be gained by reconstructing quantitative models of transcriptional and signalling network from experimental data using network reconstruction methods [[210]50, [211]85–[212]90]. The reconstructed models can then be used to simulate the effects of drug treatments on signalling and transcriptional networks of drug resistant cancer cells. We are currently pursuing this avenue to identify potent combination therapy regimens for treating BC patients. Supporting information S1 File. Supplementary information for this manuscript. Contains supplementary Tables A-C and supplementary Figures A,B. (DOCX) [213]Click here for additional data file.^ (65.1KB, docx) Data Availability The dataset was previously published with the following article: Hegde, P.S., et al., Delineation of molecular mechanisms of sensitivity to lapatinib in breast cancer cell lines using global gene expression profiles. Mol Cancer Ther, 2007. 6(5): p. 1629-40. Funding Statement This work was supported by the Science Foundation Ireland under Grant No. 06/CE/B1129 (CSET) and the Irish Cancer Society Collaborative Cancer Research Centre BREAST-PREDICT (CCRC13GAL). References