Abstract

Background

   Bromodomain-containing protein (BRD) play a pivotal role in the
   development and progression of malignant tumours. This study aims to
   identify prognostic genes linked to BRD-related genes (BRDRGs) in
   patients with triple-negative breast cancer (TNBC) and to construct a
   novel prognostic model.

Methods

   Data from TCGA-TNBC, [40]GSE135565, and [41]GSE161529 were retrieved
   from public databases. [42]GSE161529 was used to identify key cell
   types. The BRDRGs score in TCGA-TNBC was calculated using single-sample
   Gene Set Enrichment Analysis (ssGSEA). Differential expression analysis
   was performed to identify differentially expressed genes (DEGs): DEGs1
   in key cells, DEGs2 between tumours and controls and DEGs3 in high and
   low BRDRGs score subgroups in TCGA-TNBC. Differentially expressed
   BRDRGs (DE-BRDRGs) were determined by overlapping DEGs1, DEGs2 and
   DEGs3. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes
   (KEGG) pathway analysis, and protein-protein interaction (PPI) network
   analysis were conducted to investigate active pathways and molecular
   interactions. Prognostic genes were selected through univariate Cox
   regression and least absolute shrinkage and selection operator (LASSO)
   regression analyses to construct a risk model and calculate risk
   scores. TNBC samples from TCGA-TNBC were classified into high and
   low-risk groups based on the median risk score. Additionally,
   correlations with clinical characteristics, Gene Set Enrichment
   Analysis (GSEA), immune analysis, and pseudotime analysis were
   performed.

Results

   A total of 120 DE-BRDRGs were identified by overlapping 605 DEGs1 from
   four key cell types, 10,776 DEGs2, and 4,497 DEGs3. GO analysis
   revealed enriched terms such as ‘apoptotic process,’ ‘immune response,’
   and ‘regulation of the cell cycle,’ while 56 KEGG pathways, including
   the ‘MAPK signaling pathway,’ were associated with DE-BRDRGs. A risk
   model comprising six prognostic genes (KRT6A, PGF, ABCA1, EDNRB, CTSD
   and GJA4) was constructed. A nomogram based on independent prognostic
   factors was also developed. Immune cell abundance was significantly
   higher in high-risk group. In both risk groups, TP53 exhibited the
   highest mutation frequency. The expression of KRT6A, ABCA1, EDNRB, and
   CTSD went decreased progressively in pseudotime.

Conclusion

   A novel prognostic model for TNBC associated with BRDRGs was developed
   and validated, providing fresh insights into the relationship between
   BRD and TNBC.

Supplementary Information

   The online version contains supplementary material available at
   10.1186/s12935-025-03648-7.

   Keywords: Triple-negative breast cancer, Bromodomain-containing
   protein, Immune microenvironment, Prognosis

Introduction

   Breast cancer (BC) is one of the most prevalent malignant tumors in
   women, with an annual incidence increase of 0.5% [[43]1]. According to
   the International Agency for Research on Cancer (IARC), BC accounts for
   11.7% of all cancer cases, making it one of the most widespread cancers
   globally [[44]2]. Characterized by significant heterogeneity, BC
   presents a broad spectrum of pathological features and molecular
   subtypes [[45]3]. Treatment strategies for BC vary depending on the
   subtype, including surgical intervention, radiotherapy, chemotherapy,
   endocrine therapy, and targeted therapy [[46]1].

   Recent advancements in novel targets, drugs, and medical diagnostic and
   imaging technologies have been made for BC. However, metastatic BC
   remains the cause of over 90% of BC-related deaths [[47]4].
   Triple-negative breast cancer (TNBC), an aggressive BC subtype, is
   primarily treated with chemotherapy, but patient prognosis remains poor
   due to the limited treatment options. Current biomarkers and diagnostic
   tools are insufficient for accurately predicting TNBC prognosis
   [[48]3]. Consequently, the development of novel biomarkers and
   predictive models is crucial.

   Bromodomain-containing protein (BRD), key members of the bromodomain
   and extra-terminal domain (BET) family, influence gene transcription by
   binding to acetylated histones. BET plays a crucial role in the
   development of various malignancies, including liver, endometrial, and
   breast cancer. It is integral to the expression of tumour-related
   genes, which are essential for tumor cell survival. BET acts as a
   central regulator of tumor-related gene expression, sustaining tumor
   cell viability [[49]5–[50]7]. Recent studies have highlighted the
   significant role of BRD in regulating the cell cycle, proliferation and
   apoptosis, as well as in tumor cell infiltration, metastasis, and the
   malignant progression. In TNBC cells, reduced protein phosphatase 2 A
   activity leads to increased phosphorylation in the acidic region of
   BRD, facilitating tumor progression [[51]8]. Research by Noblejas-López
   et al. has demonstrated that BET inhibitors can effectively
   downregulate Brd4 expression [[52]9], offering a promising therapeutic
   approach for TNBC. However, the prognostic value of BRD-related genes
   (BRDRGs) in TNBC has not been explored.

   This study aims to identify BRD-associated biomarkers in TNBC through
   bioinformatics approaches and develop a risk model to predict patient
   survival, providing potential therapeutic targets and a reliable
   prognostic tool for TNBC management.

Materials and methods

Data source

   The Cancer Genome Atlas (TCGA)-TNBC dataset was retrieved from the TCGA
   database ([53]https://portal.gdc.cancer.gov/) and used as the training
   set. It includes transcriptomic data and clinical and survival
   information from 116 TNBC samples and 113 control samples. The
   validation set [54]GSE135565 ([55]GPL570), comprising 84 TNBC samples,
   was sourced from the Gene Expression Omnibus (GEO) database
   ([56]https://www.ncbi.nlm.nih.gov/geo/). Additionally, the single-cell
   dataset [57]GSE161529 ([58]GPL18573) was obtained from GEO, containing
   data from 4 TNBC tumors and 13 controls. All sample types in the
   datasets were tissue-derived. From published literature, 39 BRD-related
   genes (BRDRGs) were identified [[59]10]. In addition, the technical
   approach of this study was shown in Additional File [60]1.

Single cell data analysis

   For [61]GSE161529, cells with fewer than 200 genes and fewer than 3
   covered cells were excluded using Seurat (v 4.0.5) [[62]11]. Quality
   control (QC) criteria were applied, selecting genes with nFeature_RNA
   between 300 and 6000, nCount_RNA < 25,000, and percent.mt < 20% for
   further analysis. The FindVariableFeatures function was used to
   identify highly variable genes after log normalization. The ScaleData
   function was then applied for normalization, followed by principal
   component analysis (PCA) to assess the variance contributed by
   principal components (PCs). Stable signals were selected for PCA
   downscaling analysis, and clustering was visualized using uniform
   manifold approximation and projection (UMAP). For cell type annotation,
   the SingleR package (v 1.0.6) [[63]12] was used, referencing the
   HumanPrimaryCellAtlasData [[64]13, [65]14]. Cell clusters and
   corresponding marker genes were identified, and the annotated cell
   types were visualized for subsequent analyses. Samples were separated
   based on TNBC and Control groups, and after annotation with marker
   genes, the proportions of different cell types between the two groups
   were compared. Cell types with significant differences (p < 0.05)
   between the two groups were identified using the Wilcoxon test, and
   were thus determined to be key cell types. Cellular communication and
   interactions were analyzed using CellChat (v 1.6.1) [[66]15], with
   detailed interaction networks visualized through shell diagrams.

Identifying and analyzing differential expressed genes (DEGs)

   Differentially expressed genes (DEGs1) between TNBC and control samples
   were selected using the FindMarkers function (|log[2]FC| > 0.25) based
   on key cells. DEGs2 were identified using the DESeq2 package (v 1.34.0)
   with |log[2]FC| > 0.5 [[67]16]. The BRDRGs score for TCGA-TNBC dataset
   was calculated using single-sample Gene Set Enrichment Analysis
   (ssGSEA) from the GSVA package (v 1.44.5) [[68]17], and samples were
   stratified into high and low score subgroups based on median score to
   derive DEGs3 (|log[2]FC| > 0.5). The intersection of DEGs1, DEGs2, and
   DEGs3 was analyzed using ggvenn (v 0.1.9) [[69]18] to identify
   differentially expressed BRDRGs (DE-BRDRGs). To further investigate the
   functions of DE-BRDRGs, Gene Ontology (GO) analysis, including
   Biological Process (BP), Molecular Functions (MF) and Cellular
   Components (CC), as well as Kyoto Encyclopedia of Genes and Genomes
   (KEGG) pathway enrichment analysis, were performed using the
   clusterProfiler package (v 4.2.2) [[70]19]. A protein-protein
   interaction (PPI) network was constructed based on DE-BRDRGs with a
   confidence score of 0.4 using the STRING database.

Construction and validation of risk model

   In TCGA-TNBC dataset, prognostic genes were identified through
   univariate Cox regression and least absolute shrinkage and selection
   operator (LASSO) regression analysis using the glmnet package (v 4.0–2)
   based on DE-BRDRGs [[71]20]. It is worth noting that Lasso regression
   is a statistical method that first performed ten-fold cross-validation
   on genes selected through univariate COX analysis based on the λ value,
   and then selected the genes whose regression coefficients were not
   penalized to zero as prognostic genes. A risk model was then
   constructed according to the expression of prognostic genes and overall
   survival (OS). Risk scores were calculated using the following formula.
   graphic file with name M1.gif

   In this formula, β and x represent the coefficients and gene expression
   levels, respectively. The TNBC samples were then stratified into high-
   and low-risk groups based on the median risk score. Kaplan-Meier (K-M)
   survival curves for both risk groups were generated using the Survival
   package (v 3.3-1) [[72]21]. To further validate the risk model,
   receiver operating characteristic (ROC) curves were constructed for 3,
   5, and 7 years, and the area under the curve (AUC) values were
   calculated using the survivalROC package(v 1.0.3) [[73]22].
   [74]GSE135565 was used as an external validation set to assess the
   model’s performance. Based on the optimal cutoff value of the risk
   score, TNBC patients were divided into high- and low-risk groups. The
   survival differences between the two risk groups were analyzed using
   the survival package (v 3.3-1) (p < 0.05). In addition, to further
   validate the risk model, ROC curves for 3, 5, and 7 years were
   constructed using the survivalROC package (v 1.0.3), with an AUC > 0.6
   indicating good predictive performance of the risk model.

Independent prognostic analysis and correlation analysis of risk score with
clinical characteristics

   First, risk score, age, stage, pathological T, pathological N, and
   pathological M stage were included in univariate Cox, proportional
   hazards (PH) and multivariate Cox regression analyses. Clinical
   clinical factors identified in the multivariate Cox model were then
   used to construct a nomogram predicting OS at 3, 5, and 7 years. The
   predictive capability of the nomogram was assessed using calibration,
   ROC curves, and decision curve analysis (DCA). Differences in risk
   scores among the five clinical characteristics were also analyzed.

Gene set enrichment analysis (GSEA) and immune microenvironment analysis

   Inter-group differences were analyzed using the DESeq2 package, and the
   log[2]FC values from this analysis were used as sorting criteria for
   GSEA with |NES| ≥ 1, based on the KEGG gene set. The ssGSEA algorithm
   was applied to calculate scores for 12 immune function pathways in TNBC
   samples, and differences in immune function scores between the high-
   and low-risk groups were tested. Additionally, 28 immune cell types
   were scored by ssGSEA, and variations in immune cell infiltration
   between the two risk groups were evaluated.

Gene mutation, tumour immune dysfunction and exclusion (TIDE), and drug
sensitivity analysis

   Tumour mutational burden (TMB) analysis for two risk groups was
   performed using the maftools package [[75]23], and the top 20 mutant
   genes in both groups were visualized. TNBC samples were divided into
   high- and low-TMB groups based on the median TMB score, and K-M curves
   for the four (H-TMB + High Risk, H-TMB + Low Risk, L-TMB + High Risk
   and L-TMB + Low Risk) were plotted. Gene expression data from 115
   normalized tumor samples were imported into the TIDE website to
   calculate TIDE, dysfunction, and exclusion scores, which were used to
   evaluate potential responses to immunotherapy. Additionally, the 50%
   inhibitory concentration (IC[50]) values for 138 chemotherapy or
   targeted drugs in TNBC were inferred from the CGP programs and compared
   between the two risk groups using the pRRophetic package (v 0.5)
   [[76]24]. Finally, the DGidb database ([77]https://dgidb.org/) was used
   to predict small-molecule drugs targeting prognostic genes, and their
   interactions were visualized.

Expression of prognostic genes and pseudotime analysis

   Expression levels of prognostic genes across various cell types were
   visualized using Seurat’s DotPlot function. The differentiation
   trajectories of key cells were simulated and analyzed using the
   Monocle3 package.

Real-time quantitative PCR (RT-qPCR) detection of prognostic genes

   This study used 10 tissue samples (5 from the TNBC group and 5 from the
   control group), all of which were collected from Fujian Provincial
   Hospital. The study was approved by the Fujian Provincial Hospital
   Ethics Committee (Approval No: K2021-04-069), and all patients provided
   written informed consent. Subsequently, reverse transcription was
   performed with the help of the RevertaidTM First Strand cDNA Synthesis
   Kit (Thermo, USA) to generate cDNA. The reaction consisted of 2µL of
   cDNA, 10µL of TaqMan^® Universal PCR Master Mix, 1µL primer, and 7µL
   dH2O (Additional File [78]2), with the RT-qPCR analysis conducted over
   40 cycles (Additional File [79]3). Finally, the data were analyzed
   using the 2^−△△Ct method [[80]25], with GAPDH serving as the reference
   gene. The results were plotted and p-values calculated using GraphPad
   Prism (v 5.0.0) [[81]26].

Statistical analysis

   All analyses were performed in R, and differential analysis was
   conducted using the Wilcoxon test. Unless otherwise specified, a
   P-value < 0.05 was considered statistically significant.

Results

Cell clustering

   The distribution of nFeature_RNA, nCount_RNA and percent.mt before and
   after QC is illustrated in violin plots (Fig. [82]1a-b). Following QC,
   2,000 highly variable genes were selected for subsequent analysis
   (Fig. [83]1c). PCA of the top 20 principal components, based on the top
   50 PCs contributing to variance, revealed distinct clustering between
   TNBC and control cohorts (Fig. [84]1d-e). UMAP visualization identified
   17 unique cell populations in both TNBC and control samples, with
   certain populations absent from either group (Fig. [85]1f-g).

Fig. 1.

   [86]Fig. 1
   [87]Open in a new tab

   Cell clustering results. (a) Distribution of nFeature_RNA, nCount_RNA,
   and percent.mt prior to quality control. (b) Distribution of
   nFeature_RNA, nCount_RNA, and percent.mt after quality control. (c)
   Identification of the 2000 most variable genes. (d) Inflection point
   chart for selecting optimal dimensions. (e) PCA-based distribution of
   cells across sample groups. (f) UMAP-based cell clustering for all
   samples. (g) UMAP-based cell clustering for TNBC and control samples

Acquisition of key cells

   Combining single-cell dataset annotations with singleR (8 cell
   clusetrs, Fig. [88]2a) revealed 7 cell types and their respective
   marker genes: epithelial cell (KRT5, KRT8, EPCAM, CDH1), Endothelial
   cell (VWF, PECAM1, ACKR1, SPARCL1, HLA-E, GNG11, A2M), Fibroblast
   (COL1A1, COL1A2, PDGFRA, PDGFRB), T cell (CD8A, CD4, FOXP3, PDCD1), B
   cell (CD19, CD79A), Myeloid cell (CD86, CD14, ITGAX, CD68, CX3CR1) and
   Plasma cell (JCHAIN) (Fig. [89]2b-c). Cell type annotation indicated
   the highest abundance of epithelial cells, while immune cells,
   including B cells, T cells, myeloid cells, and plasma cells, were also
   prevalent. Fibroblasts and endothelial cells represented additional
   cell clusters (Fig. [90]2d). Notably, all 7 cell types were detected in
   both cohorts (Fig. [91]2e). Epithelial cells dominated both TNBC and
   control samples, while immune cells, including B cells, T cells,
   myeloid cells, and plasma cells, were more abundant in the TNBC cohort
   (Fig. [92]2f). A significant difference was observed between TNBC and
   control samples in endothelial cells, fibroblasts, myeloid cells, and
   plasma cells, which were identified as key cell populations
   (Fig. [93]2g).

Fig. 2.

   [94]Fig. 2
   [95]Open in a new tab

   Key cell identification. (a) SingleR cell type annotation. (b)
   Expression levels of marker genes across cellular taxa. (c) Bubble
   plots depicting expression of marker genes across cellular taxa. (d)
   Marker gene-based cell type annotation. (e) Marker gene-based cell type
   annotation for TNBC and control samples. (f) Proportion of each cell
   type in TNBC and control samples. (g) Differential cell type
   proportions between TNBC and control samples

   Ligand-receptor network analysis of 7 cell types revealed cellular
   communication patterns (Additional files 4a-b), with the MIF signaling
   pathway being the most prominent contributor to signaling across all
   populations (Additional files [96]4c). Five primary efferent and
   afferent communication modes were identified (Additional files
   [97]4e-g).

DE-BRDRGs screening and function analysis

   In TCGA-TNBC dataset, a total of 605 DEGs1 (290 upregulated and 315
   downregulated), 10,776 DEGs2 (6,546 upregulated and 4,230
   downregulated) between tumors and controls, and 4,497 DEGs3 (2,138
   upregulated and 2,359 downregulated) between high and low subgroups
   were identified (Fig. [98]3a-c). A total of 120 DE-BRDRGs were obtained
   by intersecting DEGs1, DEGs2 and DEGs3 (Fig. [99]4a). To investigate
   the functions of DE-BRDRGs, pathway enrichment analysis was performed.
   GO enrichment revealed 363 BPs, 22 CCs, and 35 MFs. Among the enriched
   BPs, key functions included ‘cytokine production’, ‘myeloid leukocyte
   migration’, and ‘positive regulation of apoptotic signaling pathway’,
   suggesting that DE-BRDRGs are not only related to BRDs but may also be
   involved in other cellular processes (Fig. [100]4b). KEGG pathway
   analysis identified 177 pathways, of which 21 were statistically
   significant (Fig. [101]4c-d). The PPI network displayed 95 interacting
   protein pairs, such as AKR1C1 and CBR3 (Fig. [102]4e).

Fig. 3.

   [103]Fig. 3
   [104]Open in a new tab

   Identification of differentially expressed genes. (a) Volcano plot of
   differentially expressed genes in key cells between TNBC and control
   samples. (b) Heatmaps and volcano plots illustrating differentially
   expressed genes between TNBC and control samples. (c) Heatmap and
   volcano plot of differentially expressed genes between high- and
   low-risk subgroups

Fig. 4.

   [105]Fig. 4
   [106]Open in a new tab

   Key genes related to BRDGs and functional enrichment analysis. (a)
   Screening of key genes associated with BRDGs. (b) Directed acyclic
   graphs and chord diagrams of GO enrichment results. (c) Classification
   of KEGG enrichment results. (d) Circular diagram of KEGG enrichment
   results. (e) Construction of PPI networks. Each node in the network
   represents a protein, and the connections between nodes represent
   interactions between the proteins

Constructed, evaluated, and validated a risk model

   Six prognostic genes (KRT6A, PGF, ABCA1, EDNRB, CTSD and GJA4) were
   selected through univariate cox regression analysis and LASSO based on
   120 DE-BRDRGs (Fig. [107]5a-b). A risk model was then constructed using
   the expression levels of these six genes in the training set, and risk
   scores were calculated (Table [108]1). Risk curve and genes expression
   profiles in high- and low-risk groups were plotted based on risk scores
   (Fig. [109]5c). Kaplan-Meier survival analysis revealed significant
   differences in survival between the two risk groups (P < 0.0001)
   (Fig. [110]5d). The AUC values at 3, 5, and 7 years were 0.739, 0.856,
   and 0.798, respectively, demonstrating the potential of these six
   prognostic genes as valuable predictors of survival (Fig. [111]5e).

Fig. 5.

   [112]Fig. 5
   [113]Open in a new tab

   Construction and evaluation of the risk model. (a) Forest plot from
   univariate Cox analysis. (b) Ten-fold cross-validation and coefficient
   spectrum from Lasso analysis. (c) Risk curves, scatter plots, and
   heatmaps depicting model gene expression for high- and low-risk TNBC
   subgroups. (d) Kaplan-Meier survival curves for high- and low-risk
   groups. (e) ROC curves for 3-, 5-, and 7-year survival prediction

Table 1.

   Gene coefficients for six prognostic genes
   Gene  Coef
   KRT6A 0.088900692
   PGF   0.219005689
   ABCA1 0.210816453
   EDNRB 0.0638788218063777
   CTSD  0.258686967
   GJA4  0.06757986
   [114]Open in a new tab

   The risk model was further validated using the [115]GSE135565 dataset.
   Distribution of risk scores, survival times, and a heatmap of
   prognostic genes expression were plotted for both risk groups
   (Fig. [116]6a). Consistent with the training set, Kaplan-Meier curves
   demonstrated that patients with TNBC in the low-risk group had a
   significantly higher survival probability (Fig. [117]6b). ROC analysis
   revealed AUC values exceeding 0.6 at 3, 5, and 7 years, indicating the
   robustness of the risk model in predicting survival outcomes
   (Fig. [118]6c).

Fig. 6.

   [119]Fig. 6
   [120]Open in a new tab

   Validation of the risk model in the [121]GSE135565 dataset. (a) Risk
   curves, scatter plots, and heatmaps of model gene expression for high-
   and low-risk TNBC subgroups in the [122]GSE135565 validation set. (b)
   Kaplan-Meier survival curves for high- and low-risk groups in the
   [123]GSE135565 validation set. (c) ROC curves for 3-, 5-, and 7-year
   survival prediction in the [124]GSE135565 validation set

Construction of the nomogram, and the correlation between risk score and
clinical features

   Independent prognostic factors were identified through univariate Cox
   regression, the PH test, and multivariate Cox regression, resulting in
   the inclusion of risk score and clinical stage in the nomogram
   (Fig. [125]7a-c). Calibration, ROC (AUC > 0.6), and DCA further
   confirmed the accuracy of the nomogram in predicting survival
   (Fig. [126]7d-f). Correlation analysis between risk score and five
   clinical characteristics revealed significant differences in stage, T
   stage, and M stage (P < 0.05, Fig. [127]8).

Fig. 7.

   [128]Fig. 7
   [129]Open in a new tab

   Development of an independent prognostic model. (a) Forest plot from
   univariate Cox analysis. (b) Forest plot from multivariate Cox model
   analysis. (c) Construction of a nomogram. (d) Evaluation of prognostic
   models via calibration curves. (e) ROC curve for independent prognostic
   factors. (f) Decision curve analysis (DCA) for the predictive model

Fig. 8.

   [130]Fig. 8
   [131]Open in a new tab

   Box plots of risk scores versus clinical indicators. (a) Age. (b) M
   Stage. (c) N Stage. (d) Stage. (e) T Stage

GSEA enrichment analysis in TCGA-TNBC and immune microenvironment landscape
for two risk teams

   GSEA identified KEGG pathways enriched for the sorted genes
   (Fig. [132]9). Activation pathways included cytokine-cytokine receptor
   interaction, autoimmune thyroid disease, Leishmania infection,
   allograft rejection, and graft-versus-host disease, while inhibitory
   pathways were associated with cytochrome P450 drug metabolism and
   xenobiotic metabolism (Additional files [133]5). Moreover, marked
   differences were observed between TNBC and control cohorts in seven
   immune functions, with significantly higher scores for these functions
   in the high-risk group (Fig. [134]10a). A total of 17 immune cells
   showed significant differences in abundance, with higher levels found
   in the high-risk group (Fig. [135]10b). Correlation analysis between
   risk score, immune cells, and immune function-related pathways revealed
   that risk score was most strongly positively correlated with five
   immune cell types and the APC co-inhibition pathway (Fig. [136]10c).

Fig. 9.

   [137]Fig. 9
   [138]Open in a new tab

   GSEA enrichment analysis results

Fig. 10.

   [139]Fig. 10
   [140]Open in a new tab

   Immune microenvironmental analysis results. (a) Differences in Immune
   Function Scores between high- and low-risk groups. (b) Differences in
   immune cell scores between high- and low-risk groups. (c) Correlation
   between risk scores, immune cells, and immune function pathways

Mutation, TIDE, and drug sensitivity analysis of two risk teams

   Visualization of somatic mutations revealed that the top 20 mutated
   genes in the high-risk group were identified in 25 samples, while in
   the low-risk group, the top 20 genes were mutated in 77 samples
   (Fig. [141]11a-b). In both risk groups, TP53 exhibited the highest
   mutation frequency, predominantly in the form of missense mutations,
   with the other 19 genes also primarily exhibiting missense mutations.
   Notable differences in TMB were observed between the two risk groups,
   as evidenced by survival curves (Fig. [142]11c). Survival analysis
   further highlighted significant differences within the four subgroups
   (Fig. [143]11d). Additionally, significant disparities in tumor immune
   dysfunction and TIDE scores were found between the high- and low-risk
   groups (Fig. [144]11e), with a notably lower probability of
   immunotherapy response in the high-risk group (Fig. [145]11f). A total
   of 118 drugs were identified with significantly different responses
   between the two risk groups (Fig. [146]12a-b). In the high-risk group,
   the IC50 values were higher for 46 drugs (e.g. ABT.263, ABT.888,
   AG.014699), while 66 drugs exhibited lower IC50 values in the high-risk
   group compared to the low-risk group (e.g. Bicalutamide, Bleomycin,
   Bortezomib). A drug network incorporating four prognostic genes and 18
   drugs was constructed (Fig. [147]12c). Notably, EDNRB was associated
   with 8 small molecule drugs, ABCA1 with 6 drugs, PGF with 3 drugs, and
   CTSD with 1 drug (streptozocin).

Fig. 11.

   [148]Fig. 11
   [149]Open in a new tab

   Gene mutation and tumor immune dysfunction and exclusion (TIDE)
   analysis. (a) Waterfall plot of mutation analysis in the high-risk
   group. (b) Waterfall plot of mutation analysis in the low-risk group.
   (c) Kaplan-Meier survival curves for high- and low-TMB score groups.
   (d) Kaplan-Meier survival analysis of the four subgroups: H-TMB + High
   Risk, H-TMB + Low Risk, L-TMB + High Risk, and L-TMB + Low Risk. The
   p < 0.05 indicates that the survival differences among the four
   subgroups are statistically significant. (e) Box plots of risk score
   versus tumor immune rejection, tumor immune dysfunction, and TIDE
   score. (f) Probability of response to immunotherapy

Fig. 12.

   [150]Fig. 12
   [151]Open in a new tab

   Drug sensitivity analysis results. (a) Drugs with higher IC50 in the
   high-risk group compared to the low-risk group. (b) Drugs with lower
   IC50 in the high-risk group compared to the low-risk group. (c) Network
   maps of prognostic genes and corresponding small molecule drugs

Expression levels for prognostic genes and pseudotime analysis for key cells

   The expression levels of the six prognostic genes across different cell
   types revealed that ABCA1 and CTSD were significantly upregulated in
   myeloid cells, EDNRB and GJA4 in fibroblasts, KRT6A in epithelial
   cells, and PGF in endothelial cells (Fig. [152]13a-b). Furthermore,
   simulation of cell differentiation trajectories for key cell types
   showed that the expression of GJA4 followed a pattern of low to high
   and then decreasing expression, PGF showed a continuous increase, while
   KRT6A, ABCA1, EDNRB, and CTSD exhibited a high-to-low expression
   pattern in pseudotime. (Fig. [153]13c-d). The RT-qPCR results showed
   that the prognostic genes (KRT6A, PGF, ABCA1, EDNRB, CTSD, and GJA4)
   exhibited differential expression between the control and TNBC groups.
   Specifically, compared to the control group, the expression of ABCA1,
   EDNRB, CTSD, and PGF was significantly elevated in the TNBC group
   (p < 0.05) (Fig. [154]13e). This confirmed the reliability of the
   bioinformatics analysis and supported the utility of the prognostic
   gene screening. These findings suggested that these four prognostic
   genes played a crucial role in determining the prognosis of TNBC.

Fig. 13.

   [155]Fig. 13
   [156]Open in a new tab

   Expression levels for prognostic genes and pseudotime analysis for key
   cells. (a) Expression levels of prognostic genes across different
   cellular taxa. (b) Bubble plot of prognostic gene expression in
   different cell types. (c) Map of immune cell differentiation pathways
   with time-based differentiation differences. (d) Expression of
   prognostic genes in the proposed chronology. (e) The RT-qPCR results of
   prognostic genes. *p < 0.05, **p < 0.01, ns indicates p > 0.05

Discussion

   BC is a malignancy resulting from the cancerous mutation of breast
   cells, influenced by various carcinogenic factors, and remains one of
   the leading causes of death among women. Within BCs, TNBC has garnered
   increasing attention due to its challenging treatment and poor
   prognosis. Jing et al. demonstrated that down-regulation of BRD4
   expression could be a promising therapeutic strategy for TNBC, as
   evidenced by in vivo experiments in rats [[157]9, [158]27]. Thus,
   investigating BRDRGs in TNBC is of paramount importance. This study
   developed a prognostic model associated to BRD by integrating
   single-cell sequencing data with transcriptome sequencing data and
   performed an in-depth analysis of prognostic genes, which provides new
   insights into TNBC research.

   Single-cell sequencing analysis identified seven distinct cell types,
   with endothelial cells and fibroblasts were more prevalent in controls,
   while immune cells such as myeloid cells, plasma cells were more
   abundant in TNBC samples. Fluid shear stress, known to influence tumour
   metastasis may facilitate the migrate of cancer cells through
   endothelial junctions and the endothelium, promoting metastatic
   processes [[159]28]. In patients with TNBC, endothelial cells exhibited
   abnormal responses to fluid shear stress, highlighting their critical
   role in TNBC metastasis [[160]28, [161]29]. Cancer-associated
   fibroblasts (CAFs) are pivotal in the tumor microenvironment, with CAFs
   enhancing the immunosuppressive effects of regulatory T cells (Tregs)
   by inhibiting T effector cell proliferation in BC, thereby promoting
   tumor progression [[162]30]. Myeloid cells, commonly expanded in TNBC,
   are associated with poor clinical prognosis. A key mechanism by which
   TNBC evades apoptosis involves the upregulation of the anti-apoptotic
   protein myeloid leukemia 1 (MCL1) [[163]31]. Additionally, the density
   of CD38 + plasma cells within tumors serves as an independent
   prognostic marker. Collectively, these studies suggest that endothelial
   cells, fibroblasts, myeloid cells, and plasma cells play pivotal roles
   in the pathogenesis and progression of TNBC.

   KRT6A, PGF, ABCA1, EDNRB, CTSD, and GJA4 represent six
   prognosis-related genes identified in TNBC. PGF, a family of angiogenic
   growth factors, has been shown to exhibit significantly elevated PLGF-1
   transcript levels in patients with TNBC experiencing poor prognosis
   [[164]32]. PGF is associated with apoptosis and GPCR pathways in
   synovial fibroblasts, which are the primary cell type in synovial
   tissue. Notably, synovial fibroblasts have been found to be related to
   various tumor types in several studies [[165]33–[166]35]. These
   fibroblasts promote tumor growth, invasion, and metastasis by secreting
   various cytokines and matrix metalloproteinases (MMPs) [[167]36].
   Additionally, GPCR play a crucial role in tumor cell migration. GPCR
   not only affect the tumor cells themselves but also regulate other cell
   types in the tumor microenvironment, such as fibroblasts and immune
   cells [[168]37, [169]38]. Therefore, PGF may modulate cells via GPCR
   pathways, thereby activating pro-tumor factors that influence the
   progression of TNBC. ABCA1, a TNBC-specific marker, is down-regulated
   in TNBC, modulating cholesterol transport and correlating with reduced
   proliferation and increased metastatic potential [[170]39].
   Furthermore, ABCA1’s involvement in cholesterol metabolism and the
   significant lipid metabolism disorders observed in TNBC suggest its
   role in disease progression via these metabolic pathways [[171]40].
   Studies have shown that lipid metabolism reprogramming, particularly
   fatty acid oxidation (FAO), has become a major driver of BC metastasis
   [[172]41]. Furthermore, a study combining spatial multi-omics also
   highlighted that changes in lipid metabolism play a key role in both
   the breast cancer cells themselves and the cells within the lung
   metastatic niche [[173]42]. These studies suggest a complex connection
   between lipid metabolism and the progression of TNBC. ABCA1 can
   downregulate the expression of histone deacetylase 9 (HDAC9) and
   inhibit lipid accumulation in macrophages [[174]43], indicating a
   potential mechanism of action between ABCA1 and lipid metabolism. This
   suggests that the ABCA1 gene may influence the development and
   progression of TNBC by modulating the expression of histone
   deacetylases in the system, thereby activating lipid metabolic
   pathways. The upregulation of EDNRB, a receptor in the endothelin axis,
   is linked to the development and metastasis of various solid tumors,
   establishing it as a novel prognostic biomarker for TNBC [[175]44]. Xi
   Gu et al. first demonstrated that silencing EDNRB expression inhibits
   TNBC progression, positioning EDNRB as both a prognostic biomarker and
   a potential therapeutic target [[176]45]. CTSD, a lysosomal protease,
   serves as a poor prognosis marker in BC, with CTSD knockout experiments
   revealing that its deficiency in mammary epithelium can inhibit tumor
   progression autonomously [[177]46]. Additionally, GJA4 (Gap Junction
   Protein Alpha 4) has been associated with an increased risk of
   secondary lymphedema in patients with BC [[178]47]. GJA4, also known as
   Connexin 37 (CX37) [[179]48], is a gap junction protein that plays a
   crucial role in normal breast development and function. Studies have
   shown that connexins (Cxs) are critical for breast development, disease
   progression, therapeutic resistance, and antitumor immune responses,
   making Cxs a promising target at the crossroads of BC development and
   treatment [[180]49, [181]50]. Additionally, the paralog of GJA4, GJA1,
   has been found to be downregulated and associated with the progression
   of TNBC and poor prognosis [[182]49]. These studies highlight the
   importance of connexins in normal breast development, and suggest that
   either overexpression or underexpression of connexins may affect the
   prognosis of TNBC. Moreover, the discovery of KRT6A as a potential
   biomarker for TNBC warrants further investigation to elucidate its
   precise role in disease progression. A study on the senescence
   mechanisms of TNBC cancer cells revealed that KRT6A is significantly
   upregulated in TNBC cell lines and is considered a risk gene for TNBC
   [[183]51]. Another study focusing on the prognostic mechanisms of TNBC
   also highlighted KRT6A as a valuable prognostic gene for TNBC
   [[184]52]. These findings are consistent with our results, where KRT6A
   expression shows an upregulation trend in tumor samples. This suggests
   that KRT6A may have significant research value in improving the
   prognosis of TNBC patients. Collectively, these studies underscore the
   involvement of PGF, ABCA1, EDNRB, KRT6A, CTSD, and GJA4 in TNBC
   pathogenesis, supporting the findings of the present study.

   Immune infiltration analysis revealed significant differences in the
   distribution of macrophages, natural killer (NK) cells, neutrophils,
   Tregs, and other immune cells between cancerous and paraneoplastic
   tissues in TNBC. Tumor-associated macrophages (TAMs), which are major
   components of the tumor microenvironment (TME), are closely linked to
   malignant tumor progression. TNBC cells and TAMs facilitate tumor
   advancement via the IL-6-TGF-β1 axis [[185]53]. NK cells, known for
   their ability to autonomously kill target cells, are key effector cells
   in innate immunity and exhibit considerable heterogeneity within the
   TME. NK cell-based therapeutic strategies are currently under extensive
   investigation for TNBC treatment [[186]54]. Neutrophil infiltration and
   the formation of neutrophil extracellular traps also promote breast
   cancer lung metastasis [[187]55]. These findings emphasize the critical
   role of immune cells in TNBC progression and treatment. Therefore,
   exploring the relationship between prognostic genes and immune cells
   helps to further understand the immune microenvironment of TNBC, which
   in turn facilitates a deeper understanding of the mechanisms underlying
   TNBC progression. Notably, mutations in KRT6A have been shown to
   mediate the activity of tumor-associated macrophages (TAMs). KRT6A can
   replace JAK2V617F as a more prominent disease driver in MPN-SC
   [[188]56]. Moreover, KRT6A can influence TAMs through various proteins
   and pathways, thereby affecting tumor prognosis [[189]57]. In addition,
   the prognostic gene ABCA1 is also associated with macrophages in tumor
   tissue [[190]58]. Another study pointed out that ABCA1 expression is
   upregulated in tumor monocytes/macrophages, and the upregulation of
   ABCA1 is positively correlated with decreased cellular cholesterol
   content and elevated serum cholesterol levels [[191]59]. This suggests
   that the upregulation of ABCA1 gene expression may regulate cholesterol
   metabolism in the serum, thereby influencing the immune response of
   macrophages in the immune microenvironment and affecting the
   development of TNBC. Furthermore, other prognostic genes, such as PGF,
   CTSD, and GJA4, are also correlated with macrophages [[192]60–[193]62].
   EDNRB, on the other hand, plays a role in T cell subpopulations,
   particularly in immune-inflammatory responses, and may influence T cell
   migration and activation [[194]63, [195]64]. These studies suggest that
   prognostic genes can mediate immune cell responses, through biological
   pathways such as lipid metabolism, thereby impacting the progression of
   TNBC.

   Due to the lack of targeted therapies for TNBC and the fact that it is
   often diagnosed at advance stages [[196]65], accurate prognostic
   assessment is particularly important. Prognostic models enable
   clinicians to identify TNBC patients at high risk for recurrence and
   metastasis, thereby guiding the decision on whether more aggressive
   treatment strategies are necessary, such as intensified chemotherapy or
   the inclusion of new immunotherapies or targeted therapies.
   Specifically, patient relapse risks can be evaluated based on data such
   as gene expression profiles, tumor size, and lymph node status
   [[197]66], which helps clinicians determine whether adjuvant
   chemotherapy or the use of immune checkpoint inhibitors and other novel
   therapies should be considered. This personalized and precise approach
   to treatment helps avoid both overtreatment and undertreatment,
   optimizing therapeutic outcomes. Furthermore, based on risk assessments
   from prognostic models, physicians can adjust the frequency of
   follow-up and monitoring. For high-risk patients, more frequent imaging
   and biomarker testing may be required to detect recurrence or
   metastasis early and adjust treatment plans accordingly. On the other
   hand, low-risk patients may benefit from less frequent monitoring,
   reducing unnecessary testing and alleviating the patient’s financial
   burden and psychological stress.

   This study constructed a prognostic model for TNBC associated with BRD
   through bioinformatics analysis, exploring the mechanisms of the
   identified prognostic genes and immune cell interactions. The results
   offer new insights into potential diagnostic and therapeutic approaches
   for TNBC. However, there are still some limitations. Firstly, the
   underlying mechanisms of these genes warrant additional experimental
   validation. In future studies, we plan to use BC cell lines (such as
   MDA-MB-231 or MCF-7) to construct cell models for overexpression or
   knockdown experiments of prognostic genes, aiming to further validate
   the molecular mechanisms of these prognostic genes and their expression
   patterns in tumor cells. Secondly, this study primarily relies on
   sequencing data from the TCGA and GEO databases for analysis, and the
   prognostic analysis in different populations is somewhat limited. In
   future work, we plan to explore additional datasets to validate the
   findings across different populations. This will help strengthen the
   overall generalizability of our model and improve the effectiveness and
   applicability of the prognostic model.

Electronic supplementary material

   Below is the link to the electronic supplementary material.
   [198]12935_2025_3648_MOESM1_ESM.pdf^ (265.7KB, pdf)

   Additional file 1: Research methods and technical approach
   [199]12935_2025_3648_MOESM2_ESM.xlsx^ (9.9KB, xlsx)

   Additional file 2: Primer sequences for RT-qPCR
   [200]12935_2025_3648_MOESM3_ESM.xlsx^ (14KB, xlsx)

   Additional file 3: RT-qPCR amplification reaction protocol
   [201]12935_2025_3648_MOESM4_ESM.xls^ (56KB, xls)

   Additional file 4: Cellular communication analysis for seven cell
   types. (a) Cellular communication circle and intensity maps. (b)
   Ligand-receptor network relationships across seven cell types. (c)
   Efferent and afferent signaling associated with each cell population.
   The x-axis represents different cell types, and the y-axis represents
   the signaling pathways that are outgoing or incoming to the cell
   population, while the bar chart at the top indicates the signal
   strength. (d) Identification and visualization of outward cellular
   communication patterns. (e) Outward communication pattern mulberry
   diagram. (f) Identification and visualization of afferent communication
   patterns in cells. (g) Incoming communication pattern mulberry diagram
   [202]12935_2025_3648_MOESM5_ESM.tif^ (5.6MB, tif)

   Additional file 5: KEGG gene set GSEA enrichment analysis results

Acknowledgements