Abstract Background Bromodomain-containing protein (BRD) play a pivotal role in the development and progression of malignant tumours. This study aims to identify prognostic genes linked to BRD-related genes (BRDRGs) in patients with triple-negative breast cancer (TNBC) and to construct a novel prognostic model. Methods Data from TCGA-TNBC, [40]GSE135565, and [41]GSE161529 were retrieved from public databases. [42]GSE161529 was used to identify key cell types. The BRDRGs score in TCGA-TNBC was calculated using single-sample Gene Set Enrichment Analysis (ssGSEA). Differential expression analysis was performed to identify differentially expressed genes (DEGs): DEGs1 in key cells, DEGs2 between tumours and controls and DEGs3 in high and low BRDRGs score subgroups in TCGA-TNBC. Differentially expressed BRDRGs (DE-BRDRGs) were determined by overlapping DEGs1, DEGs2 and DEGs3. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, and protein-protein interaction (PPI) network analysis were conducted to investigate active pathways and molecular interactions. Prognostic genes were selected through univariate Cox regression and least absolute shrinkage and selection operator (LASSO) regression analyses to construct a risk model and calculate risk scores. TNBC samples from TCGA-TNBC were classified into high and low-risk groups based on the median risk score. Additionally, correlations with clinical characteristics, Gene Set Enrichment Analysis (GSEA), immune analysis, and pseudotime analysis were performed. Results A total of 120 DE-BRDRGs were identified by overlapping 605 DEGs1 from four key cell types, 10,776 DEGs2, and 4,497 DEGs3. GO analysis revealed enriched terms such as ‘apoptotic process,’ ‘immune response,’ and ‘regulation of the cell cycle,’ while 56 KEGG pathways, including the ‘MAPK signaling pathway,’ were associated with DE-BRDRGs. A risk model comprising six prognostic genes (KRT6A, PGF, ABCA1, EDNRB, CTSD and GJA4) was constructed. A nomogram based on independent prognostic factors was also developed. Immune cell abundance was significantly higher in high-risk group. In both risk groups, TP53 exhibited the highest mutation frequency. The expression of KRT6A, ABCA1, EDNRB, and CTSD went decreased progressively in pseudotime. Conclusion A novel prognostic model for TNBC associated with BRDRGs was developed and validated, providing fresh insights into the relationship between BRD and TNBC. Supplementary Information The online version contains supplementary material available at 10.1186/s12935-025-03648-7. Keywords: Triple-negative breast cancer, Bromodomain-containing protein, Immune microenvironment, Prognosis Introduction Breast cancer (BC) is one of the most prevalent malignant tumors in women, with an annual incidence increase of 0.5% [[43]1]. According to the International Agency for Research on Cancer (IARC), BC accounts for 11.7% of all cancer cases, making it one of the most widespread cancers globally [[44]2]. Characterized by significant heterogeneity, BC presents a broad spectrum of pathological features and molecular subtypes [[45]3]. Treatment strategies for BC vary depending on the subtype, including surgical intervention, radiotherapy, chemotherapy, endocrine therapy, and targeted therapy [[46]1]. Recent advancements in novel targets, drugs, and medical diagnostic and imaging technologies have been made for BC. However, metastatic BC remains the cause of over 90% of BC-related deaths [[47]4]. Triple-negative breast cancer (TNBC), an aggressive BC subtype, is primarily treated with chemotherapy, but patient prognosis remains poor due to the limited treatment options. Current biomarkers and diagnostic tools are insufficient for accurately predicting TNBC prognosis [[48]3]. Consequently, the development of novel biomarkers and predictive models is crucial. Bromodomain-containing protein (BRD), key members of the bromodomain and extra-terminal domain (BET) family, influence gene transcription by binding to acetylated histones. BET plays a crucial role in the development of various malignancies, including liver, endometrial, and breast cancer. It is integral to the expression of tumour-related genes, which are essential for tumor cell survival. BET acts as a central regulator of tumor-related gene expression, sustaining tumor cell viability [[49]5–[50]7]. Recent studies have highlighted the significant role of BRD in regulating the cell cycle, proliferation and apoptosis, as well as in tumor cell infiltration, metastasis, and the malignant progression. In TNBC cells, reduced protein phosphatase 2 A activity leads to increased phosphorylation in the acidic region of BRD, facilitating tumor progression [[51]8]. Research by Noblejas-López et al. has demonstrated that BET inhibitors can effectively downregulate Brd4 expression [[52]9], offering a promising therapeutic approach for TNBC. However, the prognostic value of BRD-related genes (BRDRGs) in TNBC has not been explored. This study aims to identify BRD-associated biomarkers in TNBC through bioinformatics approaches and develop a risk model to predict patient survival, providing potential therapeutic targets and a reliable prognostic tool for TNBC management. Materials and methods Data source The Cancer Genome Atlas (TCGA)-TNBC dataset was retrieved from the TCGA database ([53]https://portal.gdc.cancer.gov/) and used as the training set. It includes transcriptomic data and clinical and survival information from 116 TNBC samples and 113 control samples. The validation set [54]GSE135565 ([55]GPL570), comprising 84 TNBC samples, was sourced from the Gene Expression Omnibus (GEO) database ([56]https://www.ncbi.nlm.nih.gov/geo/). Additionally, the single-cell dataset [57]GSE161529 ([58]GPL18573) was obtained from GEO, containing data from 4 TNBC tumors and 13 controls. All sample types in the datasets were tissue-derived. From published literature, 39 BRD-related genes (BRDRGs) were identified [[59]10]. In addition, the technical approach of this study was shown in Additional File [60]1. Single cell data analysis For [61]GSE161529, cells with fewer than 200 genes and fewer than 3 covered cells were excluded using Seurat (v 4.0.5) [[62]11]. Quality control (QC) criteria were applied, selecting genes with nFeature_RNA between 300 and 6000, nCount_RNA < 25,000, and percent.mt < 20% for further analysis. The FindVariableFeatures function was used to identify highly variable genes after log normalization. The ScaleData function was then applied for normalization, followed by principal component analysis (PCA) to assess the variance contributed by principal components (PCs). Stable signals were selected for PCA downscaling analysis, and clustering was visualized using uniform manifold approximation and projection (UMAP). For cell type annotation, the SingleR package (v 1.0.6) [[63]12] was used, referencing the HumanPrimaryCellAtlasData [[64]13, [65]14]. Cell clusters and corresponding marker genes were identified, and the annotated cell types were visualized for subsequent analyses. Samples were separated based on TNBC and Control groups, and after annotation with marker genes, the proportions of different cell types between the two groups were compared. Cell types with significant differences (p < 0.05) between the two groups were identified using the Wilcoxon test, and were thus determined to be key cell types. Cellular communication and interactions were analyzed using CellChat (v 1.6.1) [[66]15], with detailed interaction networks visualized through shell diagrams. Identifying and analyzing differential expressed genes (DEGs) Differentially expressed genes (DEGs1) between TNBC and control samples were selected using the FindMarkers function (|log[2]FC| > 0.25) based on key cells. DEGs2 were identified using the DESeq2 package (v 1.34.0) with |log[2]FC| > 0.5 [[67]16]. The BRDRGs score for TCGA-TNBC dataset was calculated using single-sample Gene Set Enrichment Analysis (ssGSEA) from the GSVA package (v 1.44.5) [[68]17], and samples were stratified into high and low score subgroups based on median score to derive DEGs3 (|log[2]FC| > 0.5). The intersection of DEGs1, DEGs2, and DEGs3 was analyzed using ggvenn (v 0.1.9) [[69]18] to identify differentially expressed BRDRGs (DE-BRDRGs). To further investigate the functions of DE-BRDRGs, Gene Ontology (GO) analysis, including Biological Process (BP), Molecular Functions (MF) and Cellular Components (CC), as well as Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, were performed using the clusterProfiler package (v 4.2.2) [[70]19]. A protein-protein interaction (PPI) network was constructed based on DE-BRDRGs with a confidence score of 0.4 using the STRING database. Construction and validation of risk model In TCGA-TNBC dataset, prognostic genes were identified through univariate Cox regression and least absolute shrinkage and selection operator (LASSO) regression analysis using the glmnet package (v 4.0–2) based on DE-BRDRGs [[71]20]. It is worth noting that Lasso regression is a statistical method that first performed ten-fold cross-validation on genes selected through univariate COX analysis based on the λ value, and then selected the genes whose regression coefficients were not penalized to zero as prognostic genes. A risk model was then constructed according to the expression of prognostic genes and overall survival (OS). Risk scores were calculated using the following formula. graphic file with name M1.gif In this formula, β and x represent the coefficients and gene expression levels, respectively. The TNBC samples were then stratified into high- and low-risk groups based on the median risk score. Kaplan-Meier (K-M) survival curves for both risk groups were generated using the Survival package (v 3.3-1) [[72]21]. To further validate the risk model, receiver operating characteristic (ROC) curves were constructed for 3, 5, and 7 years, and the area under the curve (AUC) values were calculated using the survivalROC package(v 1.0.3) [[73]22]. [74]GSE135565 was used as an external validation set to assess the model’s performance. Based on the optimal cutoff value of the risk score, TNBC patients were divided into high- and low-risk groups. The survival differences between the two risk groups were analyzed using the survival package (v 3.3-1) (p < 0.05). In addition, to further validate the risk model, ROC curves for 3, 5, and 7 years were constructed using the survivalROC package (v 1.0.3), with an AUC > 0.6 indicating good predictive performance of the risk model. Independent prognostic analysis and correlation analysis of risk score with clinical characteristics First, risk score, age, stage, pathological T, pathological N, and pathological M stage were included in univariate Cox, proportional hazards (PH) and multivariate Cox regression analyses. Clinical clinical factors identified in the multivariate Cox model were then used to construct a nomogram predicting OS at 3, 5, and 7 years. The predictive capability of the nomogram was assessed using calibration, ROC curves, and decision curve analysis (DCA). Differences in risk scores among the five clinical characteristics were also analyzed. Gene set enrichment analysis (GSEA) and immune microenvironment analysis Inter-group differences were analyzed using the DESeq2 package, and the log[2]FC values from this analysis were used as sorting criteria for GSEA with |NES| ≥ 1, based on the KEGG gene set. The ssGSEA algorithm was applied to calculate scores for 12 immune function pathways in TNBC samples, and differences in immune function scores between the high- and low-risk groups were tested. Additionally, 28 immune cell types were scored by ssGSEA, and variations in immune cell infiltration between the two risk groups were evaluated. Gene mutation, tumour immune dysfunction and exclusion (TIDE), and drug sensitivity analysis Tumour mutational burden (TMB) analysis for two risk groups was performed using the maftools package [[75]23], and the top 20 mutant genes in both groups were visualized. TNBC samples were divided into high- and low-TMB groups based on the median TMB score, and K-M curves for the four (H-TMB + High Risk, H-TMB + Low Risk, L-TMB + High Risk and L-TMB + Low Risk) were plotted. Gene expression data from 115 normalized tumor samples were imported into the TIDE website to calculate TIDE, dysfunction, and exclusion scores, which were used to evaluate potential responses to immunotherapy. Additionally, the 50% inhibitory concentration (IC[50]) values for 138 chemotherapy or targeted drugs in TNBC were inferred from the CGP programs and compared between the two risk groups using the pRRophetic package (v 0.5) [[76]24]. Finally, the DGidb database ([77]https://dgidb.org/) was used to predict small-molecule drugs targeting prognostic genes, and their interactions were visualized. Expression of prognostic genes and pseudotime analysis Expression levels of prognostic genes across various cell types were visualized using Seurat’s DotPlot function. The differentiation trajectories of key cells were simulated and analyzed using the Monocle3 package. Real-time quantitative PCR (RT-qPCR) detection of prognostic genes This study used 10 tissue samples (5 from the TNBC group and 5 from the control group), all of which were collected from Fujian Provincial Hospital. The study was approved by the Fujian Provincial Hospital Ethics Committee (Approval No: K2021-04-069), and all patients provided written informed consent. Subsequently, reverse transcription was performed with the help of the RevertaidTM First Strand cDNA Synthesis Kit (Thermo, USA) to generate cDNA. The reaction consisted of 2µL of cDNA, 10µL of TaqMan^® Universal PCR Master Mix, 1µL primer, and 7µL dH2O (Additional File [78]2), with the RT-qPCR analysis conducted over 40 cycles (Additional File [79]3). Finally, the data were analyzed using the 2^−△△Ct method [[80]25], with GAPDH serving as the reference gene. The results were plotted and p-values calculated using GraphPad Prism (v 5.0.0) [[81]26]. Statistical analysis All analyses were performed in R, and differential analysis was conducted using the Wilcoxon test. Unless otherwise specified, a P-value < 0.05 was considered statistically significant. Results Cell clustering The distribution of nFeature_RNA, nCount_RNA and percent.mt before and after QC is illustrated in violin plots (Fig. [82]1a-b). Following QC, 2,000 highly variable genes were selected for subsequent analysis (Fig. [83]1c). PCA of the top 20 principal components, based on the top 50 PCs contributing to variance, revealed distinct clustering between TNBC and control cohorts (Fig. [84]1d-e). UMAP visualization identified 17 unique cell populations in both TNBC and control samples, with certain populations absent from either group (Fig. [85]1f-g). Fig. 1. [86]Fig. 1 [87]Open in a new tab Cell clustering results. (a) Distribution of nFeature_RNA, nCount_RNA, and percent.mt prior to quality control. (b) Distribution of nFeature_RNA, nCount_RNA, and percent.mt after quality control. (c) Identification of the 2000 most variable genes. (d) Inflection point chart for selecting optimal dimensions. (e) PCA-based distribution of cells across sample groups. (f) UMAP-based cell clustering for all samples. (g) UMAP-based cell clustering for TNBC and control samples Acquisition of key cells Combining single-cell dataset annotations with singleR (8 cell clusetrs, Fig. [88]2a) revealed 7 cell types and their respective marker genes: epithelial cell (KRT5, KRT8, EPCAM, CDH1), Endothelial cell (VWF, PECAM1, ACKR1, SPARCL1, HLA-E, GNG11, A2M), Fibroblast (COL1A1, COL1A2, PDGFRA, PDGFRB), T cell (CD8A, CD4, FOXP3, PDCD1), B cell (CD19, CD79A), Myeloid cell (CD86, CD14, ITGAX, CD68, CX3CR1) and Plasma cell (JCHAIN) (Fig. [89]2b-c). Cell type annotation indicated the highest abundance of epithelial cells, while immune cells, including B cells, T cells, myeloid cells, and plasma cells, were also prevalent. Fibroblasts and endothelial cells represented additional cell clusters (Fig. [90]2d). Notably, all 7 cell types were detected in both cohorts (Fig. [91]2e). Epithelial cells dominated both TNBC and control samples, while immune cells, including B cells, T cells, myeloid cells, and plasma cells, were more abundant in the TNBC cohort (Fig. [92]2f). A significant difference was observed between TNBC and control samples in endothelial cells, fibroblasts, myeloid cells, and plasma cells, which were identified as key cell populations (Fig. [93]2g). Fig. 2. [94]Fig. 2 [95]Open in a new tab Key cell identification. (a) SingleR cell type annotation. (b) Expression levels of marker genes across cellular taxa. (c) Bubble plots depicting expression of marker genes across cellular taxa. (d) Marker gene-based cell type annotation. (e) Marker gene-based cell type annotation for TNBC and control samples. (f) Proportion of each cell type in TNBC and control samples. (g) Differential cell type proportions between TNBC and control samples Ligand-receptor network analysis of 7 cell types revealed cellular communication patterns (Additional files 4a-b), with the MIF signaling pathway being the most prominent contributor to signaling across all populations (Additional files [96]4c). Five primary efferent and afferent communication modes were identified (Additional files [97]4e-g). DE-BRDRGs screening and function analysis In TCGA-TNBC dataset, a total of 605 DEGs1 (290 upregulated and 315 downregulated), 10,776 DEGs2 (6,546 upregulated and 4,230 downregulated) between tumors and controls, and 4,497 DEGs3 (2,138 upregulated and 2,359 downregulated) between high and low subgroups were identified (Fig. [98]3a-c). A total of 120 DE-BRDRGs were obtained by intersecting DEGs1, DEGs2 and DEGs3 (Fig. [99]4a). To investigate the functions of DE-BRDRGs, pathway enrichment analysis was performed. GO enrichment revealed 363 BPs, 22 CCs, and 35 MFs. Among the enriched BPs, key functions included ‘cytokine production’, ‘myeloid leukocyte migration’, and ‘positive regulation of apoptotic signaling pathway’, suggesting that DE-BRDRGs are not only related to BRDs but may also be involved in other cellular processes (Fig. [100]4b). KEGG pathway analysis identified 177 pathways, of which 21 were statistically significant (Fig. [101]4c-d). The PPI network displayed 95 interacting protein pairs, such as AKR1C1 and CBR3 (Fig. [102]4e). Fig. 3. [103]Fig. 3 [104]Open in a new tab Identification of differentially expressed genes. (a) Volcano plot of differentially expressed genes in key cells between TNBC and control samples. (b) Heatmaps and volcano plots illustrating differentially expressed genes between TNBC and control samples. (c) Heatmap and volcano plot of differentially expressed genes between high- and low-risk subgroups Fig. 4. [105]Fig. 4 [106]Open in a new tab Key genes related to BRDGs and functional enrichment analysis. (a) Screening of key genes associated with BRDGs. (b) Directed acyclic graphs and chord diagrams of GO enrichment results. (c) Classification of KEGG enrichment results. (d) Circular diagram of KEGG enrichment results. (e) Construction of PPI networks. Each node in the network represents a protein, and the connections between nodes represent interactions between the proteins Constructed, evaluated, and validated a risk model Six prognostic genes (KRT6A, PGF, ABCA1, EDNRB, CTSD and GJA4) were selected through univariate cox regression analysis and LASSO based on 120 DE-BRDRGs (Fig. [107]5a-b). A risk model was then constructed using the expression levels of these six genes in the training set, and risk scores were calculated (Table [108]1). Risk curve and genes expression profiles in high- and low-risk groups were plotted based on risk scores (Fig. [109]5c). Kaplan-Meier survival analysis revealed significant differences in survival between the two risk groups (P < 0.0001) (Fig. [110]5d). The AUC values at 3, 5, and 7 years were 0.739, 0.856, and 0.798, respectively, demonstrating the potential of these six prognostic genes as valuable predictors of survival (Fig. [111]5e). Fig. 5. [112]Fig. 5 [113]Open in a new tab Construction and evaluation of the risk model. (a) Forest plot from univariate Cox analysis. (b) Ten-fold cross-validation and coefficient spectrum from Lasso analysis. (c) Risk curves, scatter plots, and heatmaps depicting model gene expression for high- and low-risk TNBC subgroups. (d) Kaplan-Meier survival curves for high- and low-risk groups. (e) ROC curves for 3-, 5-, and 7-year survival prediction Table 1. Gene coefficients for six prognostic genes Gene Coef KRT6A 0.088900692 PGF 0.219005689 ABCA1 0.210816453 EDNRB 0.0638788218063777 CTSD 0.258686967 GJA4 0.06757986 [114]Open in a new tab The risk model was further validated using the [115]GSE135565 dataset. Distribution of risk scores, survival times, and a heatmap of prognostic genes expression were plotted for both risk groups (Fig. [116]6a). Consistent with the training set, Kaplan-Meier curves demonstrated that patients with TNBC in the low-risk group had a significantly higher survival probability (Fig. [117]6b). ROC analysis revealed AUC values exceeding 0.6 at 3, 5, and 7 years, indicating the robustness of the risk model in predicting survival outcomes (Fig. [118]6c). Fig. 6. [119]Fig. 6 [120]Open in a new tab Validation of the risk model in the [121]GSE135565 dataset. (a) Risk curves, scatter plots, and heatmaps of model gene expression for high- and low-risk TNBC subgroups in the [122]GSE135565 validation set. (b) Kaplan-Meier survival curves for high- and low-risk groups in the [123]GSE135565 validation set. (c) ROC curves for 3-, 5-, and 7-year survival prediction in the [124]GSE135565 validation set Construction of the nomogram, and the correlation between risk score and clinical features Independent prognostic factors were identified through univariate Cox regression, the PH test, and multivariate Cox regression, resulting in the inclusion of risk score and clinical stage in the nomogram (Fig. [125]7a-c). Calibration, ROC (AUC > 0.6), and DCA further confirmed the accuracy of the nomogram in predicting survival (Fig. [126]7d-f). Correlation analysis between risk score and five clinical characteristics revealed significant differences in stage, T stage, and M stage (P < 0.05, Fig. [127]8). Fig. 7. [128]Fig. 7 [129]Open in a new tab Development of an independent prognostic model. (a) Forest plot from univariate Cox analysis. (b) Forest plot from multivariate Cox model analysis. (c) Construction of a nomogram. (d) Evaluation of prognostic models via calibration curves. (e) ROC curve for independent prognostic factors. (f) Decision curve analysis (DCA) for the predictive model Fig. 8. [130]Fig. 8 [131]Open in a new tab Box plots of risk scores versus clinical indicators. (a) Age. (b) M Stage. (c) N Stage. (d) Stage. (e) T Stage GSEA enrichment analysis in TCGA-TNBC and immune microenvironment landscape for two risk teams GSEA identified KEGG pathways enriched for the sorted genes (Fig. [132]9). Activation pathways included cytokine-cytokine receptor interaction, autoimmune thyroid disease, Leishmania infection, allograft rejection, and graft-versus-host disease, while inhibitory pathways were associated with cytochrome P450 drug metabolism and xenobiotic metabolism (Additional files [133]5). Moreover, marked differences were observed between TNBC and control cohorts in seven immune functions, with significantly higher scores for these functions in the high-risk group (Fig. [134]10a). A total of 17 immune cells showed significant differences in abundance, with higher levels found in the high-risk group (Fig. [135]10b). Correlation analysis between risk score, immune cells, and immune function-related pathways revealed that risk score was most strongly positively correlated with five immune cell types and the APC co-inhibition pathway (Fig. [136]10c). Fig. 9. [137]Fig. 9 [138]Open in a new tab GSEA enrichment analysis results Fig. 10. [139]Fig. 10 [140]Open in a new tab Immune microenvironmental analysis results. (a) Differences in Immune Function Scores between high- and low-risk groups. (b) Differences in immune cell scores between high- and low-risk groups. (c) Correlation between risk scores, immune cells, and immune function pathways Mutation, TIDE, and drug sensitivity analysis of two risk teams Visualization of somatic mutations revealed that the top 20 mutated genes in the high-risk group were identified in 25 samples, while in the low-risk group, the top 20 genes were mutated in 77 samples (Fig. [141]11a-b). In both risk groups, TP53 exhibited the highest mutation frequency, predominantly in the form of missense mutations, with the other 19 genes also primarily exhibiting missense mutations. Notable differences in TMB were observed between the two risk groups, as evidenced by survival curves (Fig. [142]11c). Survival analysis further highlighted significant differences within the four subgroups (Fig. [143]11d). Additionally, significant disparities in tumor immune dysfunction and TIDE scores were found between the high- and low-risk groups (Fig. [144]11e), with a notably lower probability of immunotherapy response in the high-risk group (Fig. [145]11f). A total of 118 drugs were identified with significantly different responses between the two risk groups (Fig. [146]12a-b). In the high-risk group, the IC50 values were higher for 46 drugs (e.g. ABT.263, ABT.888, AG.014699), while 66 drugs exhibited lower IC50 values in the high-risk group compared to the low-risk group (e.g. Bicalutamide, Bleomycin, Bortezomib). A drug network incorporating four prognostic genes and 18 drugs was constructed (Fig. [147]12c). Notably, EDNRB was associated with 8 small molecule drugs, ABCA1 with 6 drugs, PGF with 3 drugs, and CTSD with 1 drug (streptozocin). Fig. 11. [148]Fig. 11 [149]Open in a new tab Gene mutation and tumor immune dysfunction and exclusion (TIDE) analysis. (a) Waterfall plot of mutation analysis in the high-risk group. (b) Waterfall plot of mutation analysis in the low-risk group. (c) Kaplan-Meier survival curves for high- and low-TMB score groups. (d) Kaplan-Meier survival analysis of the four subgroups: H-TMB + High Risk, H-TMB + Low Risk, L-TMB + High Risk, and L-TMB + Low Risk. The p < 0.05 indicates that the survival differences among the four subgroups are statistically significant. (e) Box plots of risk score versus tumor immune rejection, tumor immune dysfunction, and TIDE score. (f) Probability of response to immunotherapy Fig. 12. [150]Fig. 12 [151]Open in a new tab Drug sensitivity analysis results. (a) Drugs with higher IC50 in the high-risk group compared to the low-risk group. (b) Drugs with lower IC50 in the high-risk group compared to the low-risk group. (c) Network maps of prognostic genes and corresponding small molecule drugs Expression levels for prognostic genes and pseudotime analysis for key cells The expression levels of the six prognostic genes across different cell types revealed that ABCA1 and CTSD were significantly upregulated in myeloid cells, EDNRB and GJA4 in fibroblasts, KRT6A in epithelial cells, and PGF in endothelial cells (Fig. [152]13a-b). Furthermore, simulation of cell differentiation trajectories for key cell types showed that the expression of GJA4 followed a pattern of low to high and then decreasing expression, PGF showed a continuous increase, while KRT6A, ABCA1, EDNRB, and CTSD exhibited a high-to-low expression pattern in pseudotime. (Fig. [153]13c-d). The RT-qPCR results showed that the prognostic genes (KRT6A, PGF, ABCA1, EDNRB, CTSD, and GJA4) exhibited differential expression between the control and TNBC groups. Specifically, compared to the control group, the expression of ABCA1, EDNRB, CTSD, and PGF was significantly elevated in the TNBC group (p < 0.05) (Fig. [154]13e). This confirmed the reliability of the bioinformatics analysis and supported the utility of the prognostic gene screening. These findings suggested that these four prognostic genes played a crucial role in determining the prognosis of TNBC. Fig. 13. [155]Fig. 13 [156]Open in a new tab Expression levels for prognostic genes and pseudotime analysis for key cells. (a) Expression levels of prognostic genes across different cellular taxa. (b) Bubble plot of prognostic gene expression in different cell types. (c) Map of immune cell differentiation pathways with time-based differentiation differences. (d) Expression of prognostic genes in the proposed chronology. (e) The RT-qPCR results of prognostic genes. *p < 0.05, **p < 0.01, ns indicates p > 0.05 Discussion BC is a malignancy resulting from the cancerous mutation of breast cells, influenced by various carcinogenic factors, and remains one of the leading causes of death among women. Within BCs, TNBC has garnered increasing attention due to its challenging treatment and poor prognosis. Jing et al. demonstrated that down-regulation of BRD4 expression could be a promising therapeutic strategy for TNBC, as evidenced by in vivo experiments in rats [[157]9, [158]27]. Thus, investigating BRDRGs in TNBC is of paramount importance. This study developed a prognostic model associated to BRD by integrating single-cell sequencing data with transcriptome sequencing data and performed an in-depth analysis of prognostic genes, which provides new insights into TNBC research. Single-cell sequencing analysis identified seven distinct cell types, with endothelial cells and fibroblasts were more prevalent in controls, while immune cells such as myeloid cells, plasma cells were more abundant in TNBC samples. Fluid shear stress, known to influence tumour metastasis may facilitate the migrate of cancer cells through endothelial junctions and the endothelium, promoting metastatic processes [[159]28]. In patients with TNBC, endothelial cells exhibited abnormal responses to fluid shear stress, highlighting their critical role in TNBC metastasis [[160]28, [161]29]. Cancer-associated fibroblasts (CAFs) are pivotal in the tumor microenvironment, with CAFs enhancing the immunosuppressive effects of regulatory T cells (Tregs) by inhibiting T effector cell proliferation in BC, thereby promoting tumor progression [[162]30]. Myeloid cells, commonly expanded in TNBC, are associated with poor clinical prognosis. A key mechanism by which TNBC evades apoptosis involves the upregulation of the anti-apoptotic protein myeloid leukemia 1 (MCL1) [[163]31]. Additionally, the density of CD38 + plasma cells within tumors serves as an independent prognostic marker. Collectively, these studies suggest that endothelial cells, fibroblasts, myeloid cells, and plasma cells play pivotal roles in the pathogenesis and progression of TNBC. KRT6A, PGF, ABCA1, EDNRB, CTSD, and GJA4 represent six prognosis-related genes identified in TNBC. PGF, a family of angiogenic growth factors, has been shown to exhibit significantly elevated PLGF-1 transcript levels in patients with TNBC experiencing poor prognosis [[164]32]. PGF is associated with apoptosis and GPCR pathways in synovial fibroblasts, which are the primary cell type in synovial tissue. Notably, synovial fibroblasts have been found to be related to various tumor types in several studies [[165]33–[166]35]. These fibroblasts promote tumor growth, invasion, and metastasis by secreting various cytokines and matrix metalloproteinases (MMPs) [[167]36]. Additionally, GPCR play a crucial role in tumor cell migration. GPCR not only affect the tumor cells themselves but also regulate other cell types in the tumor microenvironment, such as fibroblasts and immune cells [[168]37, [169]38]. Therefore, PGF may modulate cells via GPCR pathways, thereby activating pro-tumor factors that influence the progression of TNBC. ABCA1, a TNBC-specific marker, is down-regulated in TNBC, modulating cholesterol transport and correlating with reduced proliferation and increased metastatic potential [[170]39]. Furthermore, ABCA1’s involvement in cholesterol metabolism and the significant lipid metabolism disorders observed in TNBC suggest its role in disease progression via these metabolic pathways [[171]40]. Studies have shown that lipid metabolism reprogramming, particularly fatty acid oxidation (FAO), has become a major driver of BC metastasis [[172]41]. Furthermore, a study combining spatial multi-omics also highlighted that changes in lipid metabolism play a key role in both the breast cancer cells themselves and the cells within the lung metastatic niche [[173]42]. These studies suggest a complex connection between lipid metabolism and the progression of TNBC. ABCA1 can downregulate the expression of histone deacetylase 9 (HDAC9) and inhibit lipid accumulation in macrophages [[174]43], indicating a potential mechanism of action between ABCA1 and lipid metabolism. This suggests that the ABCA1 gene may influence the development and progression of TNBC by modulating the expression of histone deacetylases in the system, thereby activating lipid metabolic pathways. The upregulation of EDNRB, a receptor in the endothelin axis, is linked to the development and metastasis of various solid tumors, establishing it as a novel prognostic biomarker for TNBC [[175]44]. Xi Gu et al. first demonstrated that silencing EDNRB expression inhibits TNBC progression, positioning EDNRB as both a prognostic biomarker and a potential therapeutic target [[176]45]. CTSD, a lysosomal protease, serves as a poor prognosis marker in BC, with CTSD knockout experiments revealing that its deficiency in mammary epithelium can inhibit tumor progression autonomously [[177]46]. Additionally, GJA4 (Gap Junction Protein Alpha 4) has been associated with an increased risk of secondary lymphedema in patients with BC [[178]47]. GJA4, also known as Connexin 37 (CX37) [[179]48], is a gap junction protein that plays a crucial role in normal breast development and function. Studies have shown that connexins (Cxs) are critical for breast development, disease progression, therapeutic resistance, and antitumor immune responses, making Cxs a promising target at the crossroads of BC development and treatment [[180]49, [181]50]. Additionally, the paralog of GJA4, GJA1, has been found to be downregulated and associated with the progression of TNBC and poor prognosis [[182]49]. These studies highlight the importance of connexins in normal breast development, and suggest that either overexpression or underexpression of connexins may affect the prognosis of TNBC. Moreover, the discovery of KRT6A as a potential biomarker for TNBC warrants further investigation to elucidate its precise role in disease progression. A study on the senescence mechanisms of TNBC cancer cells revealed that KRT6A is significantly upregulated in TNBC cell lines and is considered a risk gene for TNBC [[183]51]. Another study focusing on the prognostic mechanisms of TNBC also highlighted KRT6A as a valuable prognostic gene for TNBC [[184]52]. These findings are consistent with our results, where KRT6A expression shows an upregulation trend in tumor samples. This suggests that KRT6A may have significant research value in improving the prognosis of TNBC patients. Collectively, these studies underscore the involvement of PGF, ABCA1, EDNRB, KRT6A, CTSD, and GJA4 in TNBC pathogenesis, supporting the findings of the present study. Immune infiltration analysis revealed significant differences in the distribution of macrophages, natural killer (NK) cells, neutrophils, Tregs, and other immune cells between cancerous and paraneoplastic tissues in TNBC. Tumor-associated macrophages (TAMs), which are major components of the tumor microenvironment (TME), are closely linked to malignant tumor progression. TNBC cells and TAMs facilitate tumor advancement via the IL-6-TGF-β1 axis [[185]53]. NK cells, known for their ability to autonomously kill target cells, are key effector cells in innate immunity and exhibit considerable heterogeneity within the TME. NK cell-based therapeutic strategies are currently under extensive investigation for TNBC treatment [[186]54]. Neutrophil infiltration and the formation of neutrophil extracellular traps also promote breast cancer lung metastasis [[187]55]. These findings emphasize the critical role of immune cells in TNBC progression and treatment. Therefore, exploring the relationship between prognostic genes and immune cells helps to further understand the immune microenvironment of TNBC, which in turn facilitates a deeper understanding of the mechanisms underlying TNBC progression. Notably, mutations in KRT6A have been shown to mediate the activity of tumor-associated macrophages (TAMs). KRT6A can replace JAK2V617F as a more prominent disease driver in MPN-SC [[188]56]. Moreover, KRT6A can influence TAMs through various proteins and pathways, thereby affecting tumor prognosis [[189]57]. In addition, the prognostic gene ABCA1 is also associated with macrophages in tumor tissue [[190]58]. Another study pointed out that ABCA1 expression is upregulated in tumor monocytes/macrophages, and the upregulation of ABCA1 is positively correlated with decreased cellular cholesterol content and elevated serum cholesterol levels [[191]59]. This suggests that the upregulation of ABCA1 gene expression may regulate cholesterol metabolism in the serum, thereby influencing the immune response of macrophages in the immune microenvironment and affecting the development of TNBC. Furthermore, other prognostic genes, such as PGF, CTSD, and GJA4, are also correlated with macrophages [[192]60–[193]62]. EDNRB, on the other hand, plays a role in T cell subpopulations, particularly in immune-inflammatory responses, and may influence T cell migration and activation [[194]63, [195]64]. These studies suggest that prognostic genes can mediate immune cell responses, through biological pathways such as lipid metabolism, thereby impacting the progression of TNBC. Due to the lack of targeted therapies for TNBC and the fact that it is often diagnosed at advance stages [[196]65], accurate prognostic assessment is particularly important. Prognostic models enable clinicians to identify TNBC patients at high risk for recurrence and metastasis, thereby guiding the decision on whether more aggressive treatment strategies are necessary, such as intensified chemotherapy or the inclusion of new immunotherapies or targeted therapies. Specifically, patient relapse risks can be evaluated based on data such as gene expression profiles, tumor size, and lymph node status [[197]66], which helps clinicians determine whether adjuvant chemotherapy or the use of immune checkpoint inhibitors and other novel therapies should be considered. This personalized and precise approach to treatment helps avoid both overtreatment and undertreatment, optimizing therapeutic outcomes. Furthermore, based on risk assessments from prognostic models, physicians can adjust the frequency of follow-up and monitoring. For high-risk patients, more frequent imaging and biomarker testing may be required to detect recurrence or metastasis early and adjust treatment plans accordingly. On the other hand, low-risk patients may benefit from less frequent monitoring, reducing unnecessary testing and alleviating the patient’s financial burden and psychological stress. This study constructed a prognostic model for TNBC associated with BRD through bioinformatics analysis, exploring the mechanisms of the identified prognostic genes and immune cell interactions. The results offer new insights into potential diagnostic and therapeutic approaches for TNBC. However, there are still some limitations. Firstly, the underlying mechanisms of these genes warrant additional experimental validation. In future studies, we plan to use BC cell lines (such as MDA-MB-231 or MCF-7) to construct cell models for overexpression or knockdown experiments of prognostic genes, aiming to further validate the molecular mechanisms of these prognostic genes and their expression patterns in tumor cells. Secondly, this study primarily relies on sequencing data from the TCGA and GEO databases for analysis, and the prognostic analysis in different populations is somewhat limited. In future work, we plan to explore additional datasets to validate the findings across different populations. This will help strengthen the overall generalizability of our model and improve the effectiveness and applicability of the prognostic model. Electronic supplementary material Below is the link to the electronic supplementary material. [198]12935_2025_3648_MOESM1_ESM.pdf^ (265.7KB, pdf) Additional file 1: Research methods and technical approach [199]12935_2025_3648_MOESM2_ESM.xlsx^ (9.9KB, xlsx) Additional file 2: Primer sequences for RT-qPCR [200]12935_2025_3648_MOESM3_ESM.xlsx^ (14KB, xlsx) Additional file 3: RT-qPCR amplification reaction protocol [201]12935_2025_3648_MOESM4_ESM.xls^ (56KB, xls) Additional file 4: Cellular communication analysis for seven cell types. (a) Cellular communication circle and intensity maps. (b) Ligand-receptor network relationships across seven cell types. (c) Efferent and afferent signaling associated with each cell population. The x-axis represents different cell types, and the y-axis represents the signaling pathways that are outgoing or incoming to the cell population, while the bar chart at the top indicates the signal strength. (d) Identification and visualization of outward cellular communication patterns. (e) Outward communication pattern mulberry diagram. (f) Identification and visualization of afferent communication patterns in cells. (g) Incoming communication pattern mulberry diagram [202]12935_2025_3648_MOESM5_ESM.tif^ (5.6MB, tif) Additional file 5: KEGG gene set GSEA enrichment analysis results Acknowledgements