Abstract PAM50 gene expression subtypes represent a cornerstone in the molecular classification of breast cancer and are included in risk prediction models to guide therapy. We aimed to illustrate the impact of included genes and biological processes on subtyping while considering a tumor’s underlying clinical subgroup defined by ER, PR, and HER2 status. To do this we used a population-representative and clinically annotated early-stage breast tumor cohort of 6233 samples profiled by RNA sequencing and applied a perturbation strategy of excluding co-expressed genes (gene sets). We demonstrate how PAM50 nearest-centroid classification depends on biological processes present across, but also within, ER/PR/HER2 subgroups and PAM50 subtypes themselves. Our analysis highlights several key aspects of PAM50 classification. Firstly, we demonstrate the tight connection between a tumor’s nearest and second-nearest PAM50 centroid. Additionally, we show that the second-best subtype is associated with overall survival in ER-positive, HER2-negative, and node-negative disease. We also note that ERBB2 expression has little impact on PAM50 classification in HER2-positive disease regardless of ER status and that the Basal subtype is highly stable in contrast to the Normal subtype. Improved consciousness of the commonly used PAM50 subtyping scheme will aid in our understanding and interpretation of breast tumors that have seemingly conflicting PAM50 classification when compared to clinical biomarkers. Finally, our study adds further support in challenging the common misconception that PAM50 subtypes are distinct classes by illustrating that PAM50 subtypes in tumors represent a continuum with prognostic implications. Subject terms: Breast cancer, Cancer genomics Introduction Breast cancer is the most frequent malignancy in women^[32]1. Today most patients are diagnosed with early-stage breast cancer and are candidates for (neo)adjuvant systemic treatment with curative intent. Treatment decisions and prognostication are routinely based on clinical and pathological assessments of different factors such as menopausal status, disease burden, Nottingham histological grade, and immunohistochemical measurements of estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (ERBB2/HER2) (including copy number assessment of ERBB2 by in situ hybridization), and the proliferation marker protein Ki67^[33]2. The ER, PR, and ERBB2/HER2 markers also define four major clinical subgroups of breast cancer: (1) ER-positive and HER2-negative tumors (ERpHER2n), (2) ER-positive and HER2-positive tumors (ERpHER2p), (3) ER-negative and HER2-positive (ERnHER2p), and (4) triple-negative breast cancer (TNBC, negative status for ER, PR, and HER2). More recently, multigene expression-based assays have been included in modern treatment guidelines based on evidence that they can aid particularly in selecting patients with ERpHER2n disease that benefit from adjuvant chemotherapy in addition to endocrine treatment^[34]3–[35]5. Clinical use of such gene expression-based signatures has largely been restricted to commercial implementations using targeted assays^[36]6. An example of a clinical multigene test is the Prosigna assay, which uses the nearest-centroid classification for PAM50 molecular subtype assignment^[37]7. The PAM50 classification scheme by Parker et al.^[38]7 builds upon the seminal work by Perou et al.^[39]8 and defines five molecular subtypes in breast cancer: (1) basal-like (Basal), (2) HER2-enriched (HER2E), (3) luminal A (LumA), (4) luminal B (LumB), and (5) normal-like (Normal). The prognostic value of these molecular subtypes has repeatedly been demonstrated^[40]9–[41]14. The subtypes are associated with specific transcriptional patterns that may also be interpreted as molecular processes including, e.g., low expression of ER-status-related genes (such as GATA3, CA12, XBP1, and FOXA1 in Basal tumors^[42]15), high expression of basal cell keratins (e.g., KRT5, KRT14, and KRT17 in Basal and Normal subtypes^[43]15), high expression of genes in the 17q12 amplicon (ERBB2/GRB7) in ERBB2-amplified tumors, and overall higher expression of proliferation-related genes in LumB, HER2E, and Basal subtypes compared to mainly LumA tumors^[44]7. Importantly, the PAM50 gene centroid values used for subtyping new samples reflect these transcriptional patterns/molecular processes in the samples (cohort) from which they were originally derived. Typical PAM50 subtyping includes measuring the distance (usually correlation-based distance) in relative gene expression space from a sample to the reported PAM50 subtype centroids and selecting the nearest one (highest correlation), i.e., nearest-centroid (NC) classification. The need for relative gene expression for subtyping typically requires normalization to transform gene expression values of samples to be subtyped relative to a reference. This step is important as inadequate normalization can result in erroneous classification^[45]16–[46]21. Consequently, single sample predictors based on, e.g., gene rules have been reported recently to try to circumvent this issue^[47]14,[48]16. Specific PAM50 subtypes have been shown to be enriched in different clinical subgroups of breast cancer, with the respective characteristic association of the Basal subtype with TNBC, the HER2E subtype with ERnHER2p tumors, and the LumA and LumB subtypes with the ERpHER2n clinical subgroup (see ref. ^[49]22). Still, less typical subtype patterns appear when applying the PAM50 NC classification to large population-representative cohorts, as shown by Vallon-Christersson et al.^[50]22. In this large study of >3500 population-representative primary breast cancers profiled by RNA sequencing, it was observed that essentially all subtypes are represented, albeit often in small proportions, in clinical treatment groups defined by the combination of ER, PR, and HER2 status and the administered therapy. This observation is expected based on the nature of NC classification and its dependency on gene centering as shown in both breast and lung cancer^[51]16,[52]20. Nevertheless, it raises the question of how to interpret for instance a clinically defined ERpHER2n tumor classified as PAM50 Basal or HER2E, or a TNBC tumor classified as LumA or LumB. For such tumors, it may be questioned whether PAM50 subtypes are clinically or molecularly relevant, if they merely reflect the nature of NC classification, or if other biological processes and gene expression patterns that correlate with the original prototypical subtype samples come into play. In addition, while the Risk Of Recurrence (ROR) score—a read-out of the commercial Prosigna test and used for risk stratification of patients—includes additional components related to tumor size and tumor proliferation, PAM50 subtype correlations contribute a major part of the final score^[53]7. In the present study, we wanted to investigate which biological processes/genes represented in the PAM50 centroids drive tumor subtyping while considering the clinical subgroups: TNBC, ERnHER2p, ERpHER2p, ERpHER2n, and if these impact the prognostic association of PAM50 subtypes. The rationale behind this investigation is that the processes and genes represented in PAM50 may have different influences on subtyping depending on the clinical subgroup, which may explain unexpected results such as ERpHER2n tumors classified as PAM50 Basal. To achieve this, we used a recently reported population-based cohort of uniformly accrued early-stage primary breast cancers comprising 6233 patients analyzed by whole transcriptome RNA sequencing with available PAM50 NC subtypes and a presented rigorous NC classification strategy^[54]14. We illustrate that PAM50 subtyping is dependent on different biological processes in different clinical breast cancer subgroups, but also within subgroups and PAM50 subtypes themselves. In combination with the hardcoded interrelationship of gene expression patterns in the actual PAM50 centroids themselves, these factors can account for the observation of subtypes conceptually not in line with clinical marker-defined subgroups. In the end, an improved understanding of the PAM50 subtyping scheme may aid in the interpretation of tumors with seemingly disparate molecular-clinical classifications. Moreover, the results presented in this study further illustrate that PAM50 subtypes in breast cancer represent a continuum rather than distinct classes, and acknowledging this has prognostic implications. Results PAM50[NC] classification correlation strength and second-best subtype vary between clinical subgroups In the total cohort of 6233 tumors, 645 were TNBC (10.3%), 254 ERnHER2p (4.1%), 548 ERpHER2p (8.8%), and 4786 ERpHER2n (76.8%). PAM50[NC] subtype proportions in these clinical subgroups were: (1) TNBC—73.3% Basal[NC], 15.4% HER2E[NC], 3.4% LumA[NC], 0.8% LumB[NC], and 7.1% Normal[NC], (2) ERnHER2p—13.0% Basal[NC], 77.2% HER2E[NC], 2.8% LumA[NC], 0.8% LumB[NC], and 6.3% Normal[NC], (3) ERpHER2p—1.5% Basal[NC], 35.2% HER2E[NC], 24.3% LumA[NC], 34.0% LumB[NC], and 5.1% Normal[NC], and (4) ERpHER2n—1.5% Basal[NC], 1.8% HER2E[NC], 62.6% LumA[NC], 27.8% LumB[NC], and 6.4% Normal[NC]. In PAM50 classification the subtype assigned to a tumor sample is typically the subtype centroid with the highest correlation (NC classification) to that sample. Consequently, there is always a second-best PAM50 subtype (PAM50[NC_2nd]) as correlation is computed for all centroids. To investigate the relationship between PAM50[NC] and PAM50[NC_2nd] subtypes in primary invasive breast cancer, we cross-tabulated subtypes for the complete cohort as well as separately for the TNBC, ERnHER2p, ERpHER2p, and ERpHER2n groups (Fig. [55]1a–e). A consistent pattern was observed across all tumor groups. Briefly, the PAM50[NC_2nd] subtype for Basal[NC] tumors was mainly HER2E[NC_2nd] or Normal[NC_2nd], for HER2E[NC] tumors mainly LumB[NC_2nd] or Basal[NC_2nd], for LumA[NC] mainly LumB[NC_2nd] or Normal[NC_2nd], for LumB[NC] mainly LumA[NC_2nd] or HER2E[NC_2nd], and for Normal[NC] tumors mainly LumA[NC_2nd] or Basal[NC_2nd]. This pattern corresponded perfectly to positive correlation patterns between PAM50 centroids (Fig. [56]1f). Next, we investigated the difference in Spearman correlation (delta) between the best (PAM50[NC]) and second-best (PAM50[NC_2nd]) subtype for all tumors as well as separately for the four clinical subgroups, as it represents to some extent a measurement of how distinct a sample’s best subtype is from its second-best alternative (boxplots in Fig. [57]1a–e). Overall, the Basal[NC] subtype showed the largest delta in the correlation between PAM50[NC] and PAM50[NC_2nd], while the Normal[NC] subtype generally showed the smallest. In the ERnHER2p subgroup, but also somewhat in ERpHER2p tumors, the HER2E[NC] subtype showed a larger delta. This is consistent with especially ERnHER2p tumors being generally considered to have a strong overrepresentation of the HER2E[NC] subtype. Still, there is a clear difference in the distinctiveness of HER2E[NC] correlation in HER2-positive tumors depending on ER status (regarding both the actual delta and the distribution/tightness of these values). Specifically, ERnHER2p HER2E[NC] tumors showed larger delta between the PAM50[NC] and PAM50[NC_2nd] subtype compared to ERpHER2p HER2E[NC] tumors. LumA[NC] and LumB[NC] subtypes appeared to have equal differences within tested groups. Here it should be noted that despite the discrete subtype assignment of a tumor in these analyses, it is still evident from the actual centroid correlation values that LumA[NC] vs. LumB[NC] subtyping represents a continuum rather than distinct subsets of samples, as illustrated in Fig. [58]1g for the 1599 tumors labeled LumA[NC] – LumB[NC_2nd] and LumB[NC] – LumA[NC_2nd]. Fig. 1. Patterns of PAM50[NC] versus PAM50[NC_2nd] subtype. Fig. 1 [59]Open in a new tab In panels (a–e), the left panels show the cross-tabulated PAM50[NC] subtype versus the PAM50[NC_2nd] subtype for separate tumor subsets, whereas the right panels show the corresponding difference (delta) in Spearman correlation between PAM50[NC] and PAM50[NC_2nd] subtype based on the average Spearman correlation of the 100 NC classifications for each case. In the cross tables, colored boxes highlight consistent subtype patterns between PAM50[NC] and PAM50[NC_2nd] subtypes. Of all 6233 tumors, 6228 had an unambiguous second-best subtype based on NC classification. a All SCAN-B tumors. b TNBC tumors. c ERnHER2p tumors. d ERpHER2p tumors. e ERpHER2n tumors. f Heatmap of Pearson correlations between PAM50 centroids. Heatmap cells marked with colored boxes show centroid correlation patterns consistent with the PAM50[NC] and PAM50[NC_2nd] subtype patterns shown in panels (a–e). g Scatter plot of LumA correlation values versus LumB correlation values for tumors classified as LumA[NC] – LumB[NC_2nd] or LumB[NC] – LumA[NC_2nd] (n = 1599). The red line corresponds to a 1:1 relationship between correlation estimates. Boxplot elements correspond to: (1) center line = median, (2) box limits = upper and lower quartiles, (3) whiskers = 1.5x interquartile range. Evaluating the prognostic value of the second-best PAM50 subtype in adjuvant endocrine-treated lymph node-negative ERpHER2n patients Next, we set out to determine if the PAM50[NC_2nd] subtype was of prognostic value. To this end, we focused analyses on adjuvant endocrine-treated lymph node-negative ERpHER2n patients (ERpHER2nLNn) older than 50 years at diagnosis, similar to ref. ^[60]14. This patient subgroup is typically a main target for gene signature testing as recommended by St Gallen panelists^[61]23. Using overall survival as a clinical endpoint we observed that a LumA[NC] – LumB[NC_2nd] phenotype showed a poorer prognosis compared to the LumA[NC] – Normal[NC_2nd] group (Fig. [62]2a). For PAM50[NC] tumors subtyped as LumB[NC], we found that the LumB[NC] – LumA[NC_2nd] patient group showed better prognosis compared to LumB[NC] – HER2E[NC_2nd] group (Fig. [63]2b). For distant recurrence-free interval as clinical endpoint there was, however, no statistical differences in prognosis for neither LumA[NC] nor LumB[NC] tumors when stratified similarly (log-rank p > 0.05 for both comparisons). To further investigate the causes of the overall survival differences we compared patient age and tumor size between the groups and the frequency of lobular histology, ROR scores, and proliferation metagene scores (Fig. [64]2c, d). In the LumA[NC] – Normal[NC_2nd] group, 25% of tumors were of lobular type compared to 11% in the LumA[NC] – LumB[NC_2nd] group (Fisher’s exact test, p < 0.0001). For LumB[NC] – LumA[NC_2nd] tumors and LumB[NC] – HER2E[NC_2nd] tumors lobular histology percentages were 10.4% and 8.4%, respectively (Fisher’s exact test p > 0.05). Consistent with a better outcome, LumA[NC] – Normal[NC_2nd] tumors showed lower ROR scores than LumA[NC] – LumB[NC_2nd] tumors (Fig. [65]2c), and the former group was drastically enriched for ROR-low risk categorized tumors based on data from ref. ^[66]14 (Fisher’s exact p = 2e-117). The ROR-score patterns may be expected as the PAM50 subtype is one component in the ROR-score calculation. However, the ROR-score patterns were mimicked by comparisons of proliferation metagene scores (mitotic checkpoint), where LumA[NC] – LumB[NC_2nd] showed significantly higher scores than LumA[NC] – Normal[NC_2nd] tumors (Wilcoxon’s test p = 3e-39). Consistent patterns for the proliferation metagene were also observed in LumB[NC] – LumA[NC_2nd] (lower) versus LumB[NC] – HER2E[NC_2nd] (higher) tumors (Wilcoxon’s test p = 4e-33). No statistical differences were observed in tumor size between the LumA[NC] – LumB[NC_2nd] group and the LumA[NC] – Normal[NC_2nd] group (Wilcoxon’s test p = 0.06), nor between the LumB[NC] – LumA[NC_2nd] group and the LumB[NC] – HER2E[NC_2nd] group (Wilcoxon’s test p = 0.52) (Fig. [67]2c, d). No difference was observed for patient age at diagnosis between the LumB[NC] – LumA[NC_2nd] versus LumB[NC] – HER2E[NC_2nd] group (Wilcoxon’s test p = 0.43) (Fig. [68]2d). While there was a statistically significant difference in age at diagnosis between the LumA[NC] – LumB[NC_2nd] versus LumA[NC] – Normal[NC_2nd] group (Wilcoxon’s test p < 0.001) it should be noted that the two groups had similar median age (70 years) (Fig. [69]2c). Thus, the statistical significance between distributions might be due to the large sample sizes compared. Fig. 2. Association of PAM50[NC_2nd] subtype with patient outcome. [70]Fig. 2 [71]Open in a new tab a Overall survival (OS) for endocrine-treated ERpHER2nLNn patients >50 years of age that were PAM50[NC] subtyped as LumA[NC]. Patients are stratified by their PAM50[NC_2nd] subtype. b Overall survival for endocrine-treated ERpHER2nLNn patients >50 years of age that were PAM50[NC] subtyped as LumB[NC]. Patients are stratified by their PAM50[NC_2nd] subtype. c Distributions for age at diagnosis (left), tumor size (center), and ROR T0 scores (right) obtained from ref. ^[72]14 in endocrine-treated ERpHER2nLNn patients >50 years of age comparing cases subtyped as LumA[NC] – LumB[NC_2nd] versus LumA[NC] – Normal[NC_2nd]. d Distributions for age at diagnosis (left), tumor size (center), and ROR T0 scores (right) obtained from ref. ^[73]14 in endocrine-treated ERpHER2nLNn patients >50 years of age comparing cases subtyped as LumB[NC] – HER2E[NC_2nd] versus LumB[NC] – LumA[NC_2nd]. e Overall survival for endocrine-treated ERpHER2nLNn patients >50 years of age that were PAM50[NC] subtyped as LumA[NC] and as ROR-low risk category according to ref. ^[74]14. Patients are stratified by their PAM50[NC_2nd] subtype. f Overall survival for endocrine-treated ERpHER2nLNn patients >50 years of age that were PAM50[NC] subtyped as LumB[NC] and as ROR-high risk category according to ref. ^[75]14. Patients are stratified by their PAM50[NC_2nd] subtype. Boxplot elements correspond to: (1) center line = median, (2) box limits = upper and lower quartiles, (3) whiskers = 1.5x interquartile range. In addition, we analyzed whether the LumA[NC] – Normal[NC_2nd] vs LumA[NC] – LumB[NC_2nd] phenotypes differed with respect to patient outcome within an ROR risk category (low, intermediate, high as determined in ref. ^[76]14 based on RNA sequencing data). While not reaching statistical significance, we did observe that in the ROR-low group of ERpHER2nLNn adjuvant endocrine-treated group of patients >50 years at diagnosis, patients with LumA[NC] – Normal[NC_2nd] tumors showed a trend toward better overall survival (log-rank p = 0.17, Fig. [77]2e). In ROR-high tumors, patients with LumB[NC] – LumA[NC_2nd] tumors showed a non-significant trend toward better overall survival compared to patients with tumors subtyped as LumB[NC] – HER2E[NC_2nd] (log-rank p = 0.12, Fig. [78]2f). We also compared the inverse, whether ROR groups were associated with outcome within LumA[NC] – LumB[NC_2nd] patients or within LumA[NC] – Normal[NC_2nd] patients separately, however no such associations were observed (log-rank p = 0.6 and p = 0.98, respectively). Overall impact of the leave-oneGeneCluster-out centroid perturbation strategy on PAM50 classification Considering the different patterns of PAM50[NC] versus PAM50[NC_2nd] subtype across the four ER and HER2 defined subgroups, we next investigated what happened to PAM50 classification when applying a leave-oneGeneCluster-out perturbation strategy where groups of co-expressed genes (gene sets) were excluded and samples reclassified based on the remaining values. The PAM50 reclassification overview and the methodology for this strategy are presented in Fig. [79]3a. We first utilized SRIQ clustering of the 50 PAM50 genes in 9206 SCAN-B RNA sequencing profiles to define seven core gene clusters of varying size (Fig. [80]3b, Supplementary Table [81]2). Fig. 3. Study overview and PAM50 reclassification results for the leave-oneGeneCluster-out strategy. [82]Fig. 3 [83]Open in a new tab a Study outline, perturbation methodology, and subtype switch concept. A sample is called as having a subtype switch if the PAM50[NC] subtype is observed in ≤50% of the 100 PAM50[perturb] reclassifications (right panel). b Left panel, size of identified SRIQ core gene clusters defined from 9206 RNA sequencing profiles from ref. ^[84]14. Center panel, heatmap of average PAM50 centroid value for each gene set for each PAM50 centroid subtype. Right panel, Spearman correlation of average SRIQ FPKM gene cluster expression for each gene set combination in all 9206 RNA sequencing profiles. c Spearman correlation matrix of average SRIQ FPKM gene cluster expression versus rank-based scores for eight reported biological metagenes from Fredlund et al.^[85]15 for the 6233 tumors included in this study. d Heatmap of scaled FPKM expression for PAM50 genes stratified by SRIQ gene cluster definition and ordered by clinical group and PAM50[NC] subtype for the 6233 included tumors. e Percent of tumors switching subtype (i.e., a different PAM50[perturb] subtype compared to PAM50[NC]) by the leave-oneGeneCluster-out strategy on a whole cohort level stratified by PAM50[NC] subtypes for the 6233 included tumors. f Percent of tumors switching subtype by the leave-oneGeneCluster-out strategy on a whole cohort level stratified by tumors’ ER, PR, and HER2 status. Correlations of the average expression for each gene set (gene set scores) with eight proposed biological metagenes in breast cancer^[86]15 showed that three of the gene sets were strongly correlated with different described metagenes. Gene set 1 (proliferation) was correlated to proliferation metagenes and includes, e.g., MKI67, CENPF, CCNE1, TYMS, and KIF2C. Gene set 2 (steroid response) was correlated to the steroid response metagene and includes ESR1, FOXA1, PGR, BCL2, SLC39A6, MAPT, NAT1, MLPH, and CXXC5. Gene set 3 (basal keratins) was correlated to a basal cell metagene and included CDH3, KRT17, KRT5, KRT14, MIA, SFRP1, EGFR, and FOXC1 (Fig. [87]3c). Gene sets 1–3 included enough genes for meaningful functional enrichment analysis that confirmed metagene associations for gene sets 1 and 2 (Supplementary Table [88]2). In support of the above analyses, an expression heatmap of the 6233 samples ordered by the four clinical subgroups (TNBC, ERnHER2p, ERpHER2p, ERpHER2n) followed by PAM50[NC] subtype and genes grouped by the seven gene clusters clearly showed gene cluster co-expression and distinct transcriptional differences (low/high expression) across both clinical subgroups and PAM50[NC] subtypes (Fig. [89]3d). Moreover, patterns of individual gene set scores across PAM50[NC] subtypes in TNBC, ERnHER2p, ERpHER2p, and ERpHER2n tumors further illustrate the association of specific gene sets with different PAM50[NC] subtypes (Supplementary Fig. [90]1). Performing a leave-oneGeneCluster-out reclassification strategy on the complete cohort (n = 6233) showed that excluding specific gene sets had different impact on the proportion of samples switching subtype depending on the PAM50[NC] subtype (Fig. [91]3e) or clinical subgroup (Fig. [92]3f). The largest effects for the Normal[NC] subtype were observed when excluding gene set 2 (steroid response, ~40% switching) or gene set 3 (basal keratins, ~80% switching). For the LumB[NC] subtype, the largest effect was seen after the exclusion of gene set 1 (proliferation) or 3 (basal keratins), both causing >40% of tumors to switch subtype. For the LumA[NC] subtype, the greatest effect was observed when excluding gene set 2 (steroid response), while the proportions of samples switching subtype in the HER2E[NC] and Basal[NC] subtypes were comparably lower, with the Basal[NC] subtype showing highest stability in line with the high subtype distinctiveness as shown in Fig. [93]1a–e. Interestingly, excluding gene set 4 (ERBB2/GRB7) caused only a moderate proportion (<15%) of all HER2E[NC] tumors to change subtype, less than excluding either gene sets 1 or 2. For the clinical subgroups as a whole, the largest perturbation effects were observed for gene sets 1–3 in mainly ER-positive groups (Fig. [94]3f). Impact of the leave-oneGeneCluster-out centroid perturbation strategy on PAM50 classification when stratified by molecular and clinical subgroups We next aimed to address whether the effect of the leave-oneGeneCluster-out strategy was different for PAM50[NC] subtypes within specific clinical subgroups (e.g., TNBC tumors classified as Basal[NC]). To address this question, we evaluated its impact separately for each clinical subgroup (TNBC, ERnHER2p, ERpHER2p, and ERpHER2n) stratified by the PAM50[NC] subtype. A summary overview of the key results is provided in Fig. [95]4, with detailed plots for each clinical subgroup provided in Supplementary Figs. [96]2–[97]5. In addition, we also for each gene set perturbation and combination of PAM50[NC] and clinical group computed the delta in correlation before and after leave-oneGeneCluster-out reclassification (PAM50[NC] subtype vs. PAM50[perturb] subtype) (Supplementary Figs. [98]6–[99]9). Generally, the correlation delta between the subtype calls varied depending on: (1) the excluded gene set, (2) the PAM50[NC] subtype, and (3) if a tumor switched or not, resulting in both negative and positive correlation differences. This suggests varying importance/relevance of gene sets for classification, although it should be noted that the median difference in Spearman correlation values between PAM50[NC] versus PAM50[perturb] values was for most comparisons small (e.g., <±0.2), and that for many comparisons the number of tumors per group is low. Fig. 4. PAM50 reclassification results for the leave-oneGeneCluster-out strategy when stratified for molecular and clinical subgroup. [100]Fig. 4 [101]Open in a new tab a Heatmap showing the proportion of tumors that switched subtype after gene set exclusion stratified by molecular and clinical subgroup. Numbers represent total group sizes per row. b Top panel shows the proportion of ERpHER2n LumA[NC] tumors with a PAM50[perturb] subtype different from their PAM50[NC] subtype, i.e., switching subtype, when excluding a specific gene set in leave-oneGeneCluster-out reclassification. Lower panel shows the distribution of the PAM50[perturb] subtypes in tumors that switched in the top panel, numbers on top represent the total number of samples that switched subtype. c The same illustration as in (b), but for ERpHER2n LumB[NC] tumors. d Heatmap showing the proportion of tumors that switched subtype after gene set exclusion that had a PAM50[perturb] subtype similar to the PAM50[NC_2nd] subtype. Numbers represent total group sizes per row. e Summary bar plots of the percentage of tumors in each clinical group further stratified by their PAM50[NC] subtype that never switch subtype across all gene set perturbations, i.e., the PAM50[perturb] subtype is the same as the PAM50[NC] subtype in all perturbations. These cases are hereon referred to as K0 cases. Several key findings are illustrated in the heatmap of PAM50[perturb] subtype switch proportions (Fig. [102]4a). Overall, the gene sets appear to affect PAM50[NC] and clinical groups differently, with both high and low switch proportions inferred depending on subtype, clinical group, and gene set excluded. One observation was that the prototypical Basal[NC] subtype in TNBC, and the prototypical HER2E[NC] subtype in ERnHER2p tumors remained stable with <10–15% of tumors switching for all gene set exclusions (Supplementary Figs. [103]2 and [104]3 for details). Moreover, we observe that for the smaller gene sets (gene sets 4–7) the proportion of tumors switching subtype is typically low, indicating a relatively low impact on PAM50 classification stability if these genes are present/absent. This observation is especially interesting for gene sets 4 (17q12 amplicon: ERBB2/GRB7) and 7 (FGFR4) that both include genes specifically selected for overexpression in HER2E tumors in the original PAM50 centroids^[105]7 (see also Supplementary Fig. [106]1). The largest impact of excluding gene set 4 was seen in ERpHER2p tumors subtyped as HER2E[NC], with 25.9% of tumors affected in the PAM50[perturb] classification (Supplementary Fig. [107]4). In contrast, when gene set 4 and 7 was excluded in ERnHER2p, only 7.7% and 3.6%, respectively of HER2E[NC] tumors were affected (Supplementary Fig. [108]3). Another observation from Fig. [109]4a is the sensitivity of the Normal[NC] subtype to the exclusion of gene set 3 (basal keratins) irrespective of the clinical subgroup. Exclusion of this gene set caused a high proportion of Normal[NC] tumors (up to 80%) to switch indicating the importance of this expression module for the subtyping (see also Supplementary Figs. [110]2–[111]5 for details and exact proportions). Finally, we observe a strong effect of gene set 1 (proliferation) and gene set 3 (basal keratins) exclusion in LumB[NC] tumors across clinical groups, somewhat in contrast to LumA[NC] (Supplementary Figs. [112]2–[113]5 for details). These observations are further exemplified in detail in Fig. [114]4b, c for ERpHER2n tumors, also showing to which PAM50[perturb] subtype a tumor switches when excluding the specific gene sets in the respective group. Next, we asked how often a tumor that switched subtype did so to a PAM50[perturb] subtype similar to its PAM50[NC_2nd] subtype across clinical and molecular groups. The heatmap in Fig. [115]4d summarizes the results for this question (details are provided in Supplementary Figs. [116]2–[117]5), showing that for the smaller gene sets PAM50[perturb] subtypes in tumors switching were most often of the same label as PAM50[NC_2nd] across subgroups. For the larger gene sets the proportions varied more but were still most often >50%. Finally, we also investigated the proportion of tumors in the molecular and clinical subgroups that were never affected by the leave-oneGeneCluster-out perturbations. As seen in Fig. [118]4e, the Basal[NC] subtype in TNBC and the HER2E[NC] subtype in ERnHER2p tumors were particularly stable in that >80% of tumors never switched subtype irrespective of perturbation. In contrast, the Normal[NC] was particularly unstable with very low number of completely unaffected tumors, followed by LumB[NC] and LumA[NC]. Impact of the leave-oneGeneCluster-out strategy on PAM50 classification with respect to prognosis To investigate whether the leave-oneGeneCluster-out reclassification had any impact on the patient outcome we performed univariate Cox regression (using DRFI as clinical endpoint) for each gene set perturbation in each clinical subgroup, using tumors that were not affected by a perturbation as a reference in the model. In the ERnHER2p and ERpHER2p groups, there were no statistically significant associations for any gene set, whereas in the TNBC, ERpHER2n, and endocrine-treated ERpHER2n groups significant hazard ratios were observed for gene sets 1–3 (Fig. [119]5a–c). We further stratified endocrine-treated ERpHER2n tumors into LumA[NC] and LumB[NC] to illustrate the prognostic associations of the PAM50[perturb] subtypes in these specific subgroups when excluding gene set 1 (proliferation) in LumA[NC] (Fig. [120]5d) or gene set 3 (basal keratins) in LumB[NC] tumors (Fig. [121]5e). Importantly, the PAM50[perturb] subtypes in these two endocrine-treated tumor groups displayed biological metagene rank scores related to proliferation, steroid response, and basal keratins in line with what would be expected when compared to tumors that did not change subtype (Fig. [122]5f, g). Fig. 5. Leave-oneGeneCluster-out perturbation and association to patient outcome. [123]Fig. 5 [124]Open in a new tab Forest plot of hazard ratios with 95% confidence intervals from univariate Cox regression, using DRFI as clinical endpoint, for tumors that switched subtype versus tumors that did not switch subtype (reference) after exclusion of a gene set in a TNBC tumors, b ERpHER2n tumors, and c endocrine-treated ERpHER2n tumors only. d Kaplan–Meier plot of DRFI for PAM50[perturb] subtypes in endocrine-treated ERpHER2n LumA[NC] tumors after exclusion of gene set 1 (proliferation). e Kaplan–Meier plot of DRFI for PAM50[perturb] subtypes in endocrine-treated ERpHER2n LumB[NC] tumors after exclusion of gene set 3 (basal keratins). f Boxplots of rank-based scores for the mitotic progression, basal, steroid response, and lipid metagenes for endocrine-treated ERpHER2n LumA[NC] tumors in panel (d). g Boxplots of rank-based scores for the mitotic progression, basal, steroid response, and lipid metagenes for endocrine-treated ERpHER2n LumB[NC] tumors in panel (e). Note that not all included cases in the study have DRFI outcome data, thus the difference in sample numbers between boxplots and survival plots. Boxplot elements correspond to: (1) center line = median, (2) box limits = upper and lower quartiles, (3) whiskers = 1.5x interquartile range. Samples that were stable during leave-oneGeneCluster-out perturbation as basis for a refined true single sample PAM50 subtype classification in ERpHER2n tumors Considering the impact of PAM50[perturb] subtype switches on patient outcome shown in Fig. [125]5, we next hypothesized that leave-oneGeneCluster-out perturbation stable tumors (portrayed in Fig. [126]4e) could represent core PAM50 subtype cases within each clinical group and used to refine PAM50 subtyping in a way that could also affect prognostic associations. To test this hypothesis, we focused on the largest clinical group, ERpHER2n tumors (n = 4786). As outlined in Fig. [127]6a, we first identified the 1934 ERpHER2n tumors that never switched subtype in any gene set perturbation (referred to as K0 tumors). Based on these tumors’ PAM50[NC] subtypes, we created new centroid values for each PAM50 gene using mean FPKM across tumors in respective subtype, thus constructing FPKM-based centroids specific for the ERpHER2n group. The entire ERpHER2n cohort was next reclassified using Spearman correlation in a single sample mode (referred to as PAM50[K0] subtypes), without any offset, log2 transformation, or gene centering, instead only by correlating each tumor’s FPKM profile to the PAM50[K0] centroids. As seen in Fig. [128]6b, the major subtype changes between PAM50[NC] and PAM50[K0] were a set of LumA[NC] tumors shifting to LumB[K0] and some LumA[NC] tumors shifting to the Normal[K0] subtype. In the group of endocrine-treated ERpHER2n tumors, the PAM50[K0] subtypes were notably associated with different DRFI (Fig. [129]6c). Moreover, in the same patient group, analysis of biological metagene rank scores showed marked expression patterns for PAM50[K0] subtypes for the key metagenes representing proliferation, steroid response, and basal gene expression (Fig. [130]6d). To challenge our hypothesis further, we next selected only the endocrine-treated ERpHER2n LumA[NC] patients and stratified these by their PAM50[K0] subtypes. While there was no statistical difference observed for LumA[K0] versus LumB[K0] using DRFI as clinical endpoint (Fig. [131]6e), we observed a marked difference in overall survival between the PAM50[K0] subtypes in the LumA[NC] cohort (Fig. [132]6f). Importantly, in this endocrine-treated ERpHER2n LumA[NC] group, the PAM50[K0] subtypes showed an expected biological metagene expression pattern, including slightly elevated proliferation in LumB[K0] versus LumA[K0], markedly lower basal expression in LumB[K0] and higher basal expression in Normal[K0] versus LumA[K0] tumors (Fig. [133]6g). These patterns are consistent with the general trends for PAM50[NC] subtypes in the full cohort (see, e.g., heatmap in Fig. [134]3d). Fig. 6. Refined single sample PAM50 subtyping in ERpHER2n tumors based on leave-oneGeneCluster-out perturbation stable tumors. [135]Fig. 6 [136]Open in a new tab a Outline of the scheme to create refined ERpHER2n PAM50 centroids (termed PAM50[K0]) used for single sample classification by Spearman correlation based on FPKM values only (i.e., no gene centering). b Sankey plot of subtype change for ERpHER2n tumors when performing PAM50[K0] classification as outlined in (a). c Kaplan–Meier plot of DRFI for PAM50[K0] subtypes in endocrine-treated ERpHER2n tumors. d Boxplots of rank-based scores for the mitotic checkpoint, steroid response, and basal metagenes for endocrine-treated ERpHER2n tumors stratified by PAM50[K0] subtypes. e Kaplan–Meier plot of DRFI for PAM50[K0] subtypes in endocrine-treated ERpHER2n LumA[NC] tumors. HER2E[K0] and Normal[K0] groups excluded due to size. f Left panel, Kaplan–Meier plot of OS for PAM50[K0] subtypes in all endocrine-treated ERpHER2n LumA[NC] tumors. HER2E[K0] and Normal[K0] groups excluded due to size. Right panel, same plot but only for non-K0 tumors (i.e., tumors not included in the PAM50[K0] centroid creation). g Boxplots of rank-based scores for the mitotic checkpoint, steroid response, and basal metagenes for endocrine-treated ERpHER2n LumA[NC] tumors stratified by PAM50[K0] subtypes. Note that not all included cases in the study have DRFI outcome data, thus the difference in sample numbers between boxplots and survival plots. Boxplot elements correspond to: (1) center line = median, (2) box limits = upper and lower quartiles, (3) whiskers = 1.5x interquartile range. Discussion In the current study, we set out to chart the gene expression drivers of PAM50 classification in primary invasive breast cancer. To this end, we applied a gene set centroid perturbation strategy (leave-oneGeneCluster-out) to gene expression data from RNA sequencing of 6233 primary breast cancers. Our hypothesis was that the effects of a perturbation to PAM50 classification differed depending on the molecular background. Therefore, we stratified classification effects by underlying molecular clinical subgroups defined by tumor ER, PR, and HER2 status. There are two important methodological strengths of the current study compared to previous reports: (1) the unbiased patient cohort that is representative of population-based disease in South Sweden during 2010–2018, and (2) our rigorous NC classification strategy involving classifying each sample 100 times using 100 different reference sets for normalization/gene centering that are balanced to mimic the original cohort composition of Parker et al.^[137]7 (see ref. ^[138]14 for full details). The typical PAM50 subtype assignment for a tumor is done through the selection of the nearest of five-subtype centroids using a gene expression correlation-based distance metric, i.e., 1-correlation meaning higher correlation equals smaller distance. While there will always be a nearest centroid (with the highest correlation and therefore the assigned subtype), there will also be a second-best subtype for a tumor as distance to each centroid is evaluated. The discrete calling of a PAM50 subtype in a tumor is a pragmatic but simplistic approach as a tumor’s second-best correlation will occasionally be very close to the highest correlation making the subtype call arbitrary in extreme cases as illustrated by Fig. [139]1g and also noted by Kuilman et al. for the BluePrint molecular subtyping test^[140]24. As such, tumor subtypes could at times be viewed as the combination of centroid correlations rather than a single nearest centroid much like the ROR score, which is partly calculated by a weighted combination of centroid correlations. This view can be further exemplified by considering the distinctiveness of the best versus second-best subtype, naively conceptualized in our study as the difference in centroid correlation between the two (Fig. [141]1). Here, it is apparent that the Basal[NC] subtype is in general the most distinct (largest separation from second-best) subtype across all tested tumor subsets, whereas the Normal[NC] subtype is typically the least distinct in line with Paquet et al.^[142]16. For the other PAM50[NC] subtypes the distinctiveness varies depending on molecular subgroup analyzed. For instance, for HER2E[NC] the distinctiveness is equivalent to Basal[NC] in ERnHER2p tumors (i.e., high), lower in ERpHER2p tumors (equivalent to, e.g., LumA[NC] in this group), while together with Normal[NC] among the lowest in ERpHER2n tumors. However, although the extent of borderline cases varies between subtypes, all of them contain some cases with insignificant separation between the nearest and second-nearest centroid. As such, the distinctiveness analysis illustrates that a one-class PAM50[NC] subtype is in many cases a conceptual oversimplification and that tumors instead could be considered as placed somewhere on a continuum between subtypes. Likewise, it illustrates that the support (distinctiveness) for certain subtype calls in typically disparate molecular subgroups can be low (like for HER2E[NC] in ERpHER2n tumors or LumA[NC] in TNBC). The latter may then question the relevance of these subtype calls in these molecular subgroups altogether, especially if erroneous sampling or misclassification by conventional pathology markers can be ruled out. Regarding the pattern of the PAM50[NC_2nd] subtype in breast cancer, we illustrate that it appears connected to the PAM50[NC] subtype irrespective of clinical subgroup (Fig. [143]1). Thus, the PAM50[NC_2nd]subtype pattern appears as an inherent consequence of how centroids are constructed and inter-correlated (Fig. [144]1f), which is expected as centroid correlations define the subtypes. For LumA[NC] and LumB[NC] tumors the PAM50[NC_2nd] subtype combinations were also associated with differences in overall survival, but interestingly not in distant recurrence-free interval, in the typical patient target group of current commercial gene expression assays (Fig. [145]2). As expected, given the nature of the ROR formula (see ref. ^[146]7), the PAM50[NC_2nd] subtype in these tumors was associated with differences in ROR-scores, but intriguingly not with obvious differences in tumor size nor patient age that could help explain survival differences. Perhaps more clinically interesting was that within ROR-low and ROR-high the second-best subtype combinations showed trends of different overall survival, but not distant recurrence-free interval (possibly due to follow-up length), for patients. If validated in larger cohorts, this may be useful to further improve clinical risk management as ROR risk groups are a weighted estimate of subtype correlations, proliferation, and tumor size. To further dissect PAM50[NC] subtyping we employed a centroid perturbation strategy that excluded sets of co-expressed genes from the PAM50 centroids (leave-oneGeneCluster-out). In this strategy, we first identified seven gene clusters, i.e., the gene sets, based on co-expression that showed different expression patterns across subgroups/subtypes and low to moderate correlation to each other (Fig. [147]3, Supplementary Fig. [148]1). Gene expression for the three largest gene sets (gene sets 1–3) correlated strongly with proliferation, steroid response, and a basal cell gene expression pattern respectively, as expected given included genes^[149]15. Identification of PAM50 co-expressed gene sets and the association of these with major biological processes in breast cancer is in line with repeatedly reported mRNA expression patterns of molecular breast cancer phenotypes as already illustrated by Perou et al.^[150]8. These larger transcriptional themes are naturally reflected in the PAM50 genes in addition to more subtype-specific selected genes (outlined in ref. ^[151]7). Interestingly, aside from gene sets 1–3, the other gene sets were much smaller (1–2 genes) but still showed distinct gene set scores for certain subtypes consistent with the PAM50 gene selection process. One such example is gene set 7, involving only the FGFR4 gene, with elevated expression in HER2E[NC] tumors. FGFR4 was specifically selected as a gene overexpressed in HER2E[NC]^[152]7, but it displays a lesser correlation to ERBB2, a prototypical HER2E[NC] gene, in SCAN-B data but also TCGA breast cancers when analyzed through the cBioPortal online tool (Spearman rho = 0.24). In fact, of the genes included in the other gene sets only ESR1 and SLC39A6 showed an absolute correlation to FGFR4 > 0.3 (Spearman rho −0.31 and −0.323, respectively) in the TCGA cohort. Correspondingly, the GPR160 and TMEM45B genes that comprise gene set 6 were selected as genes with reduced expression in Basal tumors^[153]7. Similar to FGFR4/ERBB2, the overall correlation between GPR160 and TMEM45B expression in TCGA breast cancers appeared in the lower spectrum (Spearman rho = 0.35). Together, these observations show that the selection of subtype-specific PAM50 genes is not necessarily functionally motivated based on typical co-expression across breast cancers in general. Another observation is that no gene set showed a marked correlation to a stroma, lipid, or immune response associated metagene that is likely more reflective of expression patterns associated with the tumor microenvironment, in line with the original aims of selecting an intrinsic gene list^[154]8. The leave-oneGeneCluster-out results in the full SCAN-B cohort showed that the Basal[NC] subtype was resistant to centroid perturbations (Fig. [155]3e). This repeated observation is consistent with results from the study by Paquet et al. reporting that random perturbations applied not only to NC classification but also to a true PAM50 single sample predictor algorithm resulted in the least number of subtype switches for Basal tumors^[156]16. Altogether, across TNBC, ERnHER2p, and ERpHER2p tumors it appeared that the most expected (and dominant) PAM50[NC] subtype (Basal[NC], HER2E[NC], and HER2E[NC], respectively) showed the highest classification stability in the leave-oneGeneCluster-out strategy (Fig. [157]4). In the greater context, this is likely explained by that tumors with these subtypes in their respective clinical subgroup are most representative of the prototypical tumors that once formed the actual centroid values (hence stronger correlations may be expected). Considering this fact, it then becomes crucial to acknowledge from which type of prototypical tumors each centroid was computed when applying and interpreting PAM50 subtyping to all types of breast cancer. In contrast, the same observation was not true for ERpHER2n tumors subtyped as LumA[NC] or LumB[NC]. The Normal[NC] subtype presented a special case, as it was the most unstable of all PAM50[NC] subtypes across clinical subgroups particularly for gene set 3 (basal keratins) suggesting that the expression of these genes is crucial for the subtype. The true nature of the PAM50 Normal[NC] subtype is debated as it includes both ER-positive and ER-negative tumors and was originally defined by including normal breast tissue samples^[158]7,[159]22. The Normal[NC] subtype has been described as being the result of high normal cell content in analyzed bulk tumor specimens or as representing specific histological types of breast cancer, like lobular cancer, or as an additional intrinsic subtype referred to as claudin-low^[160]7,[161]25,[162]26. Furthermore, the leave-oneGeneCluster-out strategy showed that PAM50[NC]subtypes had different robustness for the exclusion of different gene sets overall and in specific clinical subgroups. Overall, perturbation of smaller gene sets (gene sets 4–7) caused less tumors to switch, but when it happened the PAM50[perturb] subtype was typically of the same label as the PAM50[NC_2nd] (Fig. [163]4). Detailed analysis of PAM50[perturb] subtype switching patterns suggests consistency with the intrinsic centroid to centroid correlation and could likely be viewed as a mere illustration of the former (Fig. [164]1f). For the two ERBB2/HER2-amplified groups (ERnHER2p and ERpHER2p) the perhaps most interesting observations from the leave-oneGeneCluster-out analyses were: (1) the somewhat overall higher stability in ER-negative cases, (2) the general stability of the expected HER2E[NC] subtype to almost any gene set perturbation, and (3) the lack of direct importance of both the ERBB2/GRB7 amplicon genes and the FGFR4 gene in perturbed centroids for classification robustness (Fig. [165]4 and Supplementary Figs. [166]3 and [167]4). Further, while ERBB2, GRB7, and FGFR4 are included in the centroids as highly expressed in HER2E^[168]7, their actual importance in PAM50[NC] subtyping could be viewed as limited based on our perturbation results. This suggests that the HER2E[NC] subtype classification in ERBB2-amplified disease is likely predominantly dictated by the interplay/interrelationship between other included gene sets, like those capturing proliferation, steroid response, and basal keratin expression, for which HER2E[NC] tumors often display an intermediate expression pattern compared to Basal[NC], LumA[NC], and LumB[NC] tumors. In the ERpHER2n group, LumA[NC] and LumB[NC] tumors accounted for 90.4% of all tumors. For these tumors, gene sets 1–3 had the greatest impact in the leave-oneGeneCluster-out strategy. This finding may be expected considering the well-established role of proliferation as a key divider between LumA[NC] and LumB[NC], as well as the importance of ER-signaling for these subtypes^[169]27. Consistently, the exclusion of gene set 1 (proliferation) caused 49.5% of LumB[NC] tumors to have a different PAM50[perturb] subtype that was almost exclusively LumA[perturb]. It might be noted that for these LumB[NC] tumors, LumA was their PAM50[NC_2nd] subtype in 69.4% of the cases. A similar pattern was observed when gene set 3 (basal keratins) was excluded in LumB[NC] tumors. Gene set 3 includes a set of keratin genes (KRT17, KRT5, KRT14) as well as EGFR, genes that have repeatedly been shown to be expressed in basal-like tumor cells by in situ analyses^[170]28, but also in the Normal[NC] subtype by mRNA profiling (e.g., ref. ^[171]25). As seen in Fig. [172]3b, gene set 3 has the highest average PAM50 centroid values for the Normal and Basal centroids, while intermediate for LumA and lower for LumB in line with previous reports^[173]25. Thus, while luminal tumor cells are likely not expressing these specific keratin markers, they will still have an impact on PAM50 classification due to the reliance on classification on relative mRNA expression through the process of gene centering, particularly for LumB[NC]. In comparison, in recent rule-based PAM50 classifiers (based on intrinsic gene pairs and not relative mRNA expression) reported by us, KRT17, KRT5, KRT14, and EGFR were selected in gene rules for the LumB and Normal (KRT5 and KRT14) subtypes, but not at all for LumA^[174]14. While the leave-oneGeneCluster-out strategy can inform about which gene sets appear important for classification and which tumors appear stable to any perturbation, an obvious question is whether this has any clinical impact or can be used to refine classification in the future. To address the first question, we performed survival analysis for each gene set in each clinical group comparing patients whose tumors switched to those that did not. This analysis showed that gene sets 1–3 switches had prognostic value in TNBC and ERpHER2n patients but not in any of the HER2-positive groups (Fig. [175]5a–e). In this context, implementing the leave-oneGeneCluster-out strategy as an add-on to PAM50[NC] classification is a straightforward computational exercise that could label a PAM50[NC] tumor as “stable/prototypical”. To address whether the leave-oneGeneCluster-out could be used to refine PAM50 classification we tested the hypothesis of creating new ERpHER2n specific PAM50 centroids based on perturbation stable tumors, for which classification would subsequently rely only on Spearman correlation without gene centering, a step demonstrated to be a frail trait of centroid prediction^[176]16,[177]20. This exercise demonstrated that this could be possible and that the inferred new PAM50 subtypes (PAM50[K0]) could refine the existing LumA[NC] subtype in ERpHER2n tumors with respect to both gene expression patterns but also patient overall survival after endocrine therapy (Fig. [178]6). A limitation of the current study lies in the gene set size of the PAM50 centroids. Intuitively, removing a large proportion of genes from the centroids will inevitably increase subtype switching as shown by Paquet et al.^[179]16. This limits the interpretation potential of excluding combinations of gene sets (e.g., gene sets 1 and 2, representing 60% of the gene content) as correlations could drop below meaningful levels. Moreover, the interpretation of correlations to perturbed centroids should also be done with caution as omitting genes certainly changes subtype centroids to something other than what they originally are. However, here it is worth noting that centroid correlations for tumors that either had the same PAM50[NC] and PAM50[perturb] subtype or different when excluding a particular gene set did not consistently decrease or increase in the clinical subgroups (Supplementary Figs. [180]6–[181]9). Instead, a highly variable pattern of increased and decreased correlations between the PAM50[NC] and the PAM50[perturb] subtype after reclassification was observed. Examples are even seen where tumors with the same PAM50[NC] and PAM50[perturb] subtype show a higher correlation to the reduced PAM50[perturb] centroid. Moreover, we note that while not always the case, when a tumor had a different PAM50[perturb] subtype in the leave-oneGeneCluster-out strategy, it was often the same subtype as the corresponding PAM50[NC_2nd] (Fig. [182]4d, Supplementary Figs. [183]2–[184]5). Together these observations support that the taken perturbation approaches can bring meaningful insights. A final limitation of mRNA-based bulk tissue analysis, which we cannot properly address, lies in the sampling procedure and the tissue heterogeneity and cellularity. This has repeatedly been reported to affect PAM50 subtyping in both frozen tissue (see ref. ^[185]26), but also for the Prosigna assay itself which is based on macro-dissected tumor tissue^[186]29. In summary, in the current study, we have analyzed features of PAM50 subtype classification in the context of molecular-clinical subgroups. This task is complicated by the tight interrelationships between gene/gene sets in the original PAM50 centroids that define classification when applied to tumors using relative expression and correlation. As illustrated in Fig. [187]1g, for many tumors the discrete subtype assignment of LumA[NC] or LumB[NC] is clearly not represented by distinct proximity to a sole single centroid. Hence, a tumor could be viewed as either one of the subtypes or perhaps better as a combination of all subtypes. On the other hand, many of the results presented in this study appear as logical illustrations of the framework set by the centroids and the usage of correlation as a similarity metric. Moreover, an innate discrete subtype of a tumor may also be challenged considering the continuum of expression patterns governing subtype calls (like expression of proliferation-related genes). In this study, we show that perturbations to the PAM50[NC] classification have a different impact depending on the underlying ER, PR, and HER2 status of the tumor and the excluded gene set. Moreover, we show that the PAM50[perturb] pattern is different between PAM50[NC] subtypes within a clinical subgroup, but also within the PAM50[NC] subtypes themselves depending on the excluded gene set, and that this can have prognostic associations. The reasons for this are likely several, including the underlying biological processes in breast cancer that are at least partly reflected in the PAM50 gene set, but also presumably the specific selection of certain PAM50 genes to represent particular subtypes, and that each centroid was created from a set of prototypical tumors (e.g., for the Basal subtype these were mainly TNBC tumors). Interestingly, the smaller gene sets typically do not correlate with major transcriptional programs in breast cancer and are not apparently biologically co-expressed either. Deconstructing their importance and relevance in subtyping remains a topic for further investigation, as they clearly are of importance to classification in certain situations and are potentially more vulnerable to technical failures due to a lack of redundant co-expressed genes. An improved understanding of the commonly used PAM50 subtyping scheme in breast cancer and systematic illustrations of correlation interplay such as those presented here will aid the interpretation of tumors with seemingly disparate classifications, like clinically defined ERpHER2n tumors that are subtyped as PAM50 Basal[NC], and whether these represent true biological entities. Moreover, the trends we observed in overall survival in ROR groups for the combination of PAM50[NC] and PAM50[NC_2nd] subtypes, as well as the demonstrated prognostic differences based on gene set perturbation, and the demonstrated potential to refine PAM50 subtyping based on tumors stable to gene set perturbations may be of interest for future clinical management. Importantly, studies such as this one challenge the conception that PAM50 subtypes are individual discrete classes and call for a shift in the way we approach the results of this classification. Methods Unselected population-based breast cancer cohort A total of 6233 patients diagnosed with primary invasive breast tumors and enrolled in the Sweden Cancerome Analysis Network–Breast (SCAN-B) study^[188]30,[189]31 (ClinicalTrials.gov ID [190]NCT02306096) from 2010 to 2018 with curated RNA sequencing data and complete clinicopathological and PAM50 data (specifically PAM50 classification as Basal, HER2E, LumA, LumB, or Normal, ER, PR, HER2, and nodal status, treatment indication, and patient follow-up) available in Staaf et al.^[191]14 were included. The included cohort is hereafter referred to as SCAN-B. The 6233 patients comprise 93.6% of the 6660-sample early-stage follow-up cohort (one patient – one tumor RNA sequencing profile) defined in ref. ^[192]14 from the total set of 9206 RNA sequencing profiles in ref. ^[193]14. Clinicopathological and molecular characteristics for the 6233 patients’ tumors are detailed in Supplementary Table [194]1. Specific patient inclusion and exclusion criteria for the SCAN-B cohort are reported in the original publication^[195]14. Patients in this cohort have previously been shown to be representative of the underlying breast cancer population of the healthcare region in which they were enrolled^[196]14,[197]22. The PAM50 classification used in this study is based on the five-subtype system (Basal, HER2E, LumA, LumB, Normal) using the NC classification methodology reported in ref. ^[198]14 (therein termed NCN). In the classification approach described in ref. ^[199]14, each tumor is subtyped 100 times using 100 different reference sets for centering, resulting in 100 correlations to each PAM50 centroid from which a majority subtype vote is determined. The majority subtype is hereon referred to as a tumor’s PAM50[NC] subtype (Basal[NC], HER2E[NC], LumA[NC], LumB[NC], Normal[NC]). For the calculation of a tumor’s correlation to the best (PAM50[NC]) and second-best (PAM50[NC_2nd]) NC subtype, the average correlation per centroid of the 100 correlation values was used. This average value was also used to determine the PAM50[NC_2nd] subtype for a tumor. Patients were divided into four clinically relevant subgroups (with different therapy options) according to ER, PR, and HER2 status (p = positive, n = negative) available from the clinical cancer registry: (1) TNBC, (2) ERnHER2p, (3) ERpHER2p, and (4) ERpHERn. Ethical approval All SCAN-B enrolled patients provided written informed consent prior to study inclusion as described in Staaf et al.^[200]14. Ethical approval was given for the SCAN-B study (approval numbers 2009/658, 2010/383, 2012/58, 2013/459, 2015/277) by the Regional Ethical Review Board in Lund, Sweden, governed by the Swedish Ethical Review Authority, Box 2110, 750 02 Uppsala, Sweden. PAM50 gene set clustering To identify co-expressed gene clusters among the PAM50 genes we used SRIQ clustering^[201]32 of FPKM data from all RNA sequencing profiles reported in ref. ^[202]14 (n = 9206). SRIQ is an unsupervised clustering method that incorporates concepts from random forest machine learning as well as quality threshold- and k-nearest neighbor clustering to identify a core cluster of samples or genes that share common patterns without requiring prior knowledge of the data or a predefined number of clusters. The rationale behind using the larger set of 9206 RNA sequencing profiles was to have as many breast cancer expression profiles as possible for the gene clustering, acknowledging that replicates exist among the 9206 profiles as described in ref. ^[203]14. SRIQ analysis identified six core gene clusters comprising 45 of 50 PAM50 genes (see ref. ^[204]32 for details about core clustering). The other five genes, MYC, MMP11, BAG1, MDM2, and BLVRA, were not included in any SRIQ core cluster. One of the six SRIQ clusters comprised ERBB2 (17q12), GRB7 (17q12), and FGFR4 (5q35.2) and was manually split into two clusters (ERBB2/GRB7 and FGFR4, respectively). The decision to split this particular SRIQ cluster was based on: (1) the genomic proximity of ERBB2 and GRB7 in the same minimally amplified region (17q12, see ref. ^[205]33) versus FGFR4, (2) the key role of the ERBB2 locus for clinical management of breast cancer, (3) the specific addition of FGFR4 as a HER2E prototype gene to the original PAM50 centroids^[206]7, and (4) the expression correlation of the three genes where ERBB2/GRB7 are tightly correlated (Pearson correlation of 0.9 across 9206 RNA sequencing profiles) whereas FGFR4/ERBB2 showed only a Pearson correlation of 0.285 across the 9206 assays. This split resulted in seven final gene clusters to be evaluated. Gene set scores for each tumor were calculated as the average log2 (FPKM+0.1 offset) value of genes included in the respective gene cluster (i.e., no gene centering). Functional analysis of the gene clusters was performed by: (1) pathway enrichment analysis using Enrichr (v3.1)^[207]34,[208]35 accessing the KEGG pathway^[209]36,[210]37 and Gene Ontology Consortium databases^[211]38,[212]39 with an adjusted p-value cut-off of p < 0.05, and (2) correlation across samples of gene set scores and rank scores for eight biological metagenes reported by Fredlund et al.^[213]15 calculated as defined by Nacer et al.^[214]40. PAM50 reclassification following a leave-oneGeneCluster-out strategy PAM50 reclassification was performed by modifying the method outlined in ref. ^[215]14 following a leave-oneGeneCluster-out strategy where individual SRIQ-derived gene sets were sequentially excluded. Three different gene-matched data matrices were used: (1) the PAM50 centroids from Parker et al.^[216]7, (2) a matrix including 100 defined reference sets for gene centering as defined in ref. ^[217]14, and (3) a matrix for the samples to be classified. The strategy was based on excluding one of the SRIQ-derived PAM50 gene clusters at a time from the centroid, reference set, and expression matrices, creating reduced gene matrices. For each sample NC classification using the remaining genes was performed 100 times using the 100 defined reference sets for gene centering as defined in ref. ^[218]14. The genes specifically included in the seven SRIQ core gene sets formed the background centroid gene content for these analyses. The resulting subtype, PAM50[perturb], was called Basal[perturb], HER2E[perturb], LumA[perturb], LumB[perturb], and Normal[perturb]. A sample was called as having a subtype switch if the PAM50[NC] subtype was observed in ≤50% of the 100 PAM50[perturb] classifications and we used no minimum correlation cut-off for subtyping. cBioPortal analyses For gene-gene comparisons in TCGA breast cancers, we used the cBioPortal online tool ([219]www.cbioportal.org). For correlation analyses, the option of log-transformed mRNA expression z-scores compared to the expression distribution of all samples (RNA Seq V2 RSEM) was used for 1082 tumors. Statistical methods All p-values reported are two-sided and were compared to a level of significance of 0.05 unless otherwise specified. Boxplot elements correspond to: (1) center line = median, (2) box limits = upper and lower quartiles, (3) whiskers = 1.5x interquartile range. Correlations were computed using Spearman correlation unless otherwise specified. Survival analysis Survival analyses were performed in R (v4.2.2) using the survival (v3.4.0) and survminer (v0.4.9) packages with overall survival (OS) and distant recurrence-free interval (DRFI) as primary endpoints obtained from ref. ^[220]14. Survival curves were estimated using the Kaplan–Meier method and compared using the log-rank test. Cox proportional hazard ratios were computed using the coxph function in R. Reporting summary Further information on research design is available in the [221]Nature Research Reporting Summary linked to this article. Supplementary information [222]SupplementaryInformation^ (1.1MB, pdf) [223]Supplementary Table 1^ (1.5MB, xlsx) [224]Supplementary Table 2^ (54.2KB, xlsx) [225]Reporting Summary^ (2.7MB, pdf) Acknowledgements