Abstract

Background

   E2F transcription factors are crucial in various biological processes,
   including cell proliferation, differentiation, and apoptosis. However,
   the exact role of E2F target genes in breast cancer (BC), as well as
   their influence on survival and immune response, remains poorly
   understood.

Methods

   To investigate the differential expression of E2F target genes and
   their relationship with patient prognosis and immune cell infiltration,
   transcriptomic data from the Cancer Genome Atlas database were
   analyzed. A risk model was developed to identify genes associated with
   survival. BC samples were clustered into high-expression (C1) and
   low-expression (C2) groups of E2F target genes. The correlation between
   gene expression and factors such as survival, immune cell infiltration
   (CD4+ and CD8+ T cells), and immune checkpoint ligands (PD-L1 and
   PD-L2) was analyzed. We analyzed the link between clusters and clinical
   characteristics using the chi-squared test. For further investigation,
   single-cell data from GSE243526 were utilized. For validation, the
   expression levels of JPT1 and TBRG4 were assessed using RT-qPCR in
   clinical samples.

Results

   E2F target genes, such as AURKB, JPT1, TBRG4, and KIF4A, showed
   increased expression linked to poor patient prognosis, regardless of
   clinical features. Kaplan-Meier survival analysis revealed that
   elevated expression of these genes correlated significantly with
   decreased survival rates and heightened mortality risk. Single-cell
   data confirmed that candidate genes exhibited higher expression in
   tumor-associated epithelial cells than in healthy ones. Furthermore,
   samples from group C1 exhibited a lower survival rate than C2. Immune
   cell infiltration analysis determined that high expression of E2F
   target genes in the C1 subgroup was associated with diminished T cell
   infiltration and increased PD-L1 and PD-L2 expression. A strong and
   significant association was also identified between triple-negative
   breast cancer and the C1 cluster. RT-qPCR validation confirmed that
   JPT1 and TBRG4 expression levels were significantly elevated in BC
   tissues relative to adjacent healthy tissues.

Conclusion

   These findings suggest that E2F target genes, including JPT1 and TBRG4,
   may act as prognostic biomarkers and contribute to immune evasion in
   BC. E2F target genes may also be useful for classifying and treating
   patients.

   Keywords: E2F transcription factors, Breast cancer, Prognosis, Immune
   cell infiltration, Gene expression, Single cell

Introduction

   E2F transcription factors are crucial for regulating genes essential
   for cell division, particularly during the transition from G1 to
   S-phase of the cell cycle [1]. E2F factors change from
   transcriptional activators to repressors through interactions with the
   retinoblastoma tumor suppressor protein and its partner proteins, p107
   and p130 [2]. Dysregulation of E2F function has been associated
   with oncogenesis, highlighting its significance in cancer biology
   [3].

   In addition to their established functions in the regulation of the
   cell cycle, E2Fs in breast cancer (BC) serve crucial roles as mediators
   of tumor growth and metastasis [4]. E2Fs regulate the growth and
   metastasis of tumors by promoting the expression of genes essential for
   DNA synthesis, replication, and cell cycle progression [5]. The
   CDK-RB-E2F pathway is a key regulator of E2Fs, crucial for controlling
   gene expression throughout the cell cycle. Traditionally, it is
   understood that the retinoblastoma tumor suppressor (RB) becomes
   phosphorylated and inactivated by CDK4/6-cyclin D complexes stimulated
   by mitogenic signals. When RB is phosphorylated, E2F is released,
   increasing the expression of its target genes [6]. Reports indicate
   that the expression levels of certain E2F transcription factors are
   linked to poor prognosis in BC [7]. Some E2F transcription factors
   have also been linked to immune responses in different cancers,
   including BC [8, 9]. These findings suggest that the E2F family
   of transcription factors and their target genes significantly
   contribute to cancer development.

   Research has shown that E2F and its target genes play a pivotal role in
   the development and malignancy of cancers. This study utilized both in
   silico data and ex vivo research to better illuminate the function of
   E2F target genes in BC. We further investigated how their expression
   relates to patient prognosis and immune cell infiltration.

Materials and methods

Data sources

   This study utilized transcriptomic data for BC sourced from the Cancer
   Genome Atlas (TCGA) database. It focused on identifying changes in the
   expression of E2F target genes. We downloaded the raw data and
   performed initial preprocessing following the methodologies detailed in
   our previous research [10]. The BC dataset from TCGA comprised 113
   healthy samples and 1109 cancerous samples. The most recent gene
   expression profiles and clinical data from TCGA were used to assess
   the clinical characteristics of the cancer samples, such as stage,
   TNM.T, and TNM.N. BC subtypes, including triple-negative breast cancer
   (TNBC), Luminal A, Luminal B, and human epidermal growth factor
   receptor 2 positive (HER2+), were also identified from these data.
   Additionally, single-cell data from the GSE243526 dataset were
   assessed; this dataset included 12 tumor samples and four healthy
   ones.

Single-cell data analysis

   The GSE243526 data were downloaded in raw format. The Seurat
   package (V 5.2) and other related packages were used to analyze the
   single-cell data. The mitochondrial percentage for each cell was
   calculated based on the expression of mitochondrial genes, and cells
   with a mitochondrial percentage greater than 10% were removed from the
   dataset. The data were normalized using the LogNormalize method,
   followed by scaling based on genes associated with cell proliferation.
   Significant principal components (PCs) were identified using the
   JackStraw package (V 1.3.17), and PCs with a p-value less than 0.05
   were selected for clustering. The identified clusters were visualized
   using UMAP. The SingleR package (V 3.21) was used to determine the cell
   type in each cluster. Using the FindMarkers function, markers specific
   to each cluster were identified and manually validated against the
   CellMarker and Azimuth databases. The manual annotations were
   integrated with the SingleR results. Epithelial cells were selected for
   downstream analysis because of their primary role in BC. We calculated
   the expression differences of candidate genes in epithelial cells from
   cancer samples versus those from healthy tissue.
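
   The quality-control and normalization steps above can be illustrated
   with a minimal sketch. It is not the Seurat pipeline used in the study:
   the toy counts, the "MT-" gene-name prefix, and the scale factor of
   10,000 (Seurat's documented default for LogNormalize) are assumptions
   made for illustration only.

```python
import math

# Hypothetical toy data: each cell maps gene name -> raw count.
cells = {
    "cell_a": {"MT-CO1": 5, "AURKB": 20, "KIF4A": 75},
    "cell_b": {"MT-CO1": 30, "AURKB": 10, "KIF4A": 60},
}

def percent_mito(counts):
    """Percentage of a cell's counts that come from mitochondrial genes.

    Assumes mitochondrial genes are those whose names start with 'MT-'.
    """
    total = sum(counts.values())
    mito = sum(v for gene, v in counts.items() if gene.startswith("MT-"))
    return 100.0 * mito / total

def log_normalize(counts, scale_factor=10_000):
    """Scale counts to a fixed library size, then apply log(1 + x)."""
    total = sum(counts.values())
    return {gene: math.log1p(v / total * scale_factor) for gene, v in counts.items()}

MAX_MITO = 10.0  # cells above this mitochondrial percentage are removed

kept = {
    name: log_normalize(counts)
    for name, counts in cells.items()
    if percent_mito(counts) <= MAX_MITO
}
print(sorted(kept))
```

   In this toy input, cell_b has 30% mitochondrial counts and is dropped,
   while cell_a (5%) is retained and normalized.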

Prognosis and risk assessment

   Clinical data preprocessing followed established methodologies from
   previous studies [10]. Survival analyses were conducted using the
   survival package (V 3.8). A univariate Cox regression test identified
   the association between candidate gene expression and patient
   prognosis. Furthermore, multivariate Cox regression analysis assessed
   whether the link between candidate gene expression and patient
   mortality rates remained independent of clinical characteristics. Risk
   scores based on the expression of candidate genes were calculated using
   the following formula:

     $$\text{Risk score} = \sum_{i=1}^{n} \text{Exp}(\text{gene}_i) \times \beta(\text{gene}_i)$$

     where Exp(gene_i) is the expression level of candidate gene i and
     β(gene_i) is its Beta value.
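
   As a worked illustration of this formula, the sketch below computes
   risk scores from the Beta values reported for the four genes in
   Table 2. The patient expression values are hypothetical, and splitting
   patients at the median score is an assumption; the source does not
   state how the high- and low-risk groups were defined.

```python
from statistics import median

# Beta values for the four genes, as reported in Table 2.
BETAS = {"AURKB": 0.38, "JPT1": 0.31, "TBRG4": 0.29, "KIF4A": 0.27}

def risk_score(expression, betas=BETAS):
    """Weighted sum of expression values: sum of Exp(gene) * Beta(gene)."""
    return sum(expression[gene] * beta for gene, beta in betas.items())

# Hypothetical normalized expression values for three patients.
patients = {
    "P1": {"AURKB": 2.0, "JPT1": 1.0, "TBRG4": 1.0, "KIF4A": 1.0},
    "P2": {"AURKB": 0.0, "JPT1": 0.0, "TBRG4": 0.0, "KIF4A": 0.0},
    "P3": {"AURKB": 1.0, "JPT1": 1.0, "TBRG4": 1.0, "KIF4A": 1.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}

# Assumed grouping rule: scores above the cohort median are "high risk".
cutoff = median(scores.values())
high_risk = sorted(pid for pid, s in scores.items() if s > cutoff)
print(scores, high_risk)
```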

   Kaplan-Meier survival curve analysis confirmed the link between
   candidate gene expression, especially the elevated levels of E2F target
   genes, and patient mortality rates.
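
   For readers unfamiliar with the Kaplan-Meier estimator, the following
   minimal sketch shows how the survival curve is built from follow-up
   times and event indicators. The study itself used the R survival
   package; the function name and the toy data here are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with a death.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve

# Hypothetical follow-up (months) for five patients.
curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
print(curve)
```

   Censored patients reduce the number at risk without lowering the
   survival estimate, which is why the curve only steps down at times 5
   and 12 in this example.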

Clustering and differential expression

   BC samples from the TCGA database were divided into two categories
   based on E2F target gene expression: high-expression (C1) and
   low-expression (C2). Clustering and related analyses were conducted
   using the cluster (V 2.1.8) and NbClust (V 3.0.1) packages, with the
   k-means algorithm used for this categorization. The optimal number of
   clusters was determined using the elbow method. The number of k-means
   iterations per run was set to 20, and the maximum number of
   replications per run was set to 500; increasing these parameters did
   not affect the clustering. Clustering quality was evaluated using
   silhouette scores calculated via the silhouette function from the
   cluster package in R. The silhouette value reflects the consistency of
   each sample within its cluster, balancing intra- and inter-cluster
   distances. The average silhouette width was used to summarize
   clustering performance. To examine the differential expression of E2F
   target genes, clinical data were used to separate samples into
   cancerous and healthy groups. Expression differences for all genes in
   group C1 relative to C2 were also computed. A linear model assessed
   differential expression, and a false discovery rate (FDR) threshold
   was applied to control for multiple testing.
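
   The clustering and silhouette evaluation described above can be
   sketched as follows. This is a simplified illustration rather than the
   R workflow used in the study: it runs k-means once from fixed starting
   centers instead of 500 replications, and the two-dimensional points
   are hypothetical stand-ins for per-sample expression profiles.

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, centers, max_iter=20):
    """Lloyd's algorithm from given starting centers; returns cluster labels."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(len(centers)), key=lambda k: dist(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def mean_silhouette(points, labels):
    """Average silhouette width; assumes at least two clusters."""
    widths = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            widths.append(0.0)  # singleton clusters get width 0
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == k) / labels.count(k)
            for k in set(labels) if k != labels[i]
        )
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)

# Hypothetical samples: three low-expression and three high-expression.
points = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8), (3.0, 3.1), (3.2, 2.9), (2.8, 3.0)]
labels = kmeans(points, [points[3], points[0]])  # cluster 0 starts at a "high" sample
print(labels, round(mean_silhouette(points, labels), 2))
```

   Values of the average silhouette width close to 1 indicate compact,
   well-separated clusters, whereas values near 0 indicate overlap.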

Candidate genes, pathway enrichment, and immune cell infiltration

   To identify E2F target genes, we used the MSigDB database
   (https://www.gsea-msigdb.org/gsea/msigdb) to extract the E2F target
   gene set. The KEGG database was used for enrichment analysis to
   identify pathways linked with differentially expressed genes. We used
   the Estimation of Proportions of Immune and Cancer Cells (EPIC)
   algorithm to estimate each sample’s immune cell infiltration levels
   from the RNA-seq data. Transcriptomic data were analyzed in TPM format.
   We applied the Wilcoxon test to evaluate the significance of
   infiltration differences between the C1 and C2 groups. We also assessed
   the expression levels of two key T-cell inhibitory ligands, Programmed
   Death-Ligand 1 (PD-L1) and Programmed Death-Ligand 2 (PD-L2), in both
   the C1 and C2 groups.
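
   The group comparison by Wilcoxon test can be illustrated with the
   sketch below, which implements the two-sample rank-sum form using a
   normal approximation without tie correction. Whether the study used
   this exact variant is an assumption, and the infiltration fractions
   are hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns the U statistic for group x and the approximate p-value.
    """
    pooled = sorted((value, group) for group, values in ((0, x), (1, y))
                    for value in values)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        for k in range(i, j):
            ranks[k] = (i + 1 + j) / 2  # average of ranks i+1 .. j for ties
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p

# Hypothetical CD8+ T cell fractions in each cluster.
c1 = [0.02, 0.03, 0.04, 0.05]
c2 = [0.06, 0.07, 0.08, 0.09]
u, p = rank_sum_test(c1, c2)
print(u, round(p, 3))
```

   With small groups such as these, an exact test would normally be
   preferred over the normal approximation; the approximation is used
   here only to keep the sketch short.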

Sample collection

   For this study, 35 BC samples and their corresponding healthy tissues
   were sourced from the Iran Tumor Bank. Table 1 summarizes the
   participants’ clinical information. All ethical guidelines and
   protocols established by the Iranian Ministry of Health and Medical
   Education were meticulously followed. The samples were gathered at Imam
   Khomeini Hospital in Tehran, where the hospital’s ethics review board
   approved all ethical protocols. Informed consent was secured from every
   participant, and all samples were preserved in liquid nitrogen until
   needed.

Table 1.

   Summary of clinical information of BC patients participating in this
   study

   Subgroup    Number of samples   Stage I   Stage II   Stage III   Stage IV
   HER2+        6                  0         3          2           1
   Luminal A   13                  3         5          4           1
   Luminal B   11                  2         8          1           0
   TNBC         5                  0         2          3           0
   Healthy     35                  -         -          -           -

cDNA synthesis, primer design, and RT-qPCR

   RNA was extracted from samples using TRIzol reagent per the
   manufacturer’s protocol. RNA quality was assessed by measuring
   absorbance at 260 and 280 nm. DNA contamination was removed using DNase
   I (SinaClon, Iran) treatment. cDNA synthesis (Genius Gene, Iran)
   utilized oligo-dT, random hexamer primers, and reverse transcriptase.
   Primers were designed using the Primer-BLAST tool. The primer sequences
   were as follows: JPT1 (F: 5’-GCAGAGGAAGGCTTGGATGT-3’ and R:
   5’-GAAGACCCGCTTCAGTGTGA-3’); TBRG4 (F: 5’-AGTACAAGCACCTGGCCTTC-3’ and
   R: 5’-AGGCGGTTCATTAGTGGCTC-3’); and β-actin (F: 5’-CGAGCACAGAGCCTCGC-3’
   and R: 5’-GCGGCGATATCATCATCCAT-3’). Target gene expression levels were
   evaluated using specific primers and SYBR Green dye. β-actin expression
   was used as an internal control for normalization. Gene expression
   levels in each sample were calculated using the 2^−ΔCT method.
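
   As a worked example of the 2^−ΔCT calculation, the sketch below
   normalizes a target gene's CT value to the β-actin CT value from the
   same sample. The CT values are hypothetical and chosen only to make
   the arithmetic easy to follow.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression by the 2^-dCT method.

    dCT = CT(target gene) - CT(reference gene), both from the same sample.
    """
    delta_ct = ct_target - ct_reference
    return 2 ** -delta_ct

# Hypothetical CT values for JPT1 and beta-actin in a paired sample.
tumor = relative_expression(ct_target=24.0, ct_reference=18.0)    # dCT = 6
healthy = relative_expression(ct_target=26.0, ct_reference=18.0)  # dCT = 8

print(tumor, healthy, tumor / healthy)
```

   Because each CT unit corresponds to a doubling of template, a ΔCT that
   is two cycles lower in the tumor sample corresponds to a four-fold
   higher relative expression.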

Statistics and software

   All preprocessing and statistical analyses were conducted using the R
   programming language (V 4.4.2). False discovery rate (FDR) correction
   was used to assess statistical significance within the TCGA and
   single-cell data. For survival-related analyses, the log-rank test
   assessed significance levels. The sensitivity and specificity of
   candidate genes in group C1 against group C2 were evaluated using
   receiver operating characteristic (ROC) curves, calculating the area
   under the curve (AUC) for assessment. The analysis of expression
   differences, ROC, and visualization of ex vivo data were conducted
   using GraphPad Prism (V 8.4). A chi-squared test was used to examine
   the association of identified clusters with clinical characteristics.
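
   The chi-squared test of association can be illustrated with the Stage
   counts reported in Table 3. The sketch below computes the Pearson
   statistic directly; the function name is hypothetical, and the
   calculation is an illustration rather than the authors' own code.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Rows: Stage I-IV; columns: clusters C1 and C2 (counts from Table 3).
stage_counts = [[64, 115], [339, 276], [128, 117], [9, 7]]
stat, dof = chi_squared(stage_counts)
print(f"chi-squared = {stat:.2f}, df = {dof}")
```

   For these counts the statistic comes to approximately 21.15 with three
   degrees of freedom, in line with the value listed for Stage in
   Table 3.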

Results

Significantly elevated levels of E2F target genes in BC and their link to poor
prognosis

   Figure 1 illustrates a flowchart outlining the study design and its
   sequential steps. Using the MSigDB database, we identified 200 genes
   that may serve as E2F targets. Of these, 87 genes were overexpressed
   more than two-fold in cancerous samples compared with healthy samples
   in the TCGA data (Fig. 2A, logFC > 1 and FDR < 0.01). Additionally,
   Cox regression analysis indicated that the expression levels of 31 of
   the 200 genes were linked to unfavorable patient prognosis (Fig. 2B,
   HR > 1 and log-rank P < 0.05). Notably, 24 of these genes were both
   significantly overexpressed and linked to poor prognosis (Fig. 2C).
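
   The selection thresholds used here (logFC > 1, that is, more than a
   two-fold change on a log2 scale, and FDR < 0.01) can be illustrated
   with a small sketch. It assumes the FDR values are Benjamini-Hochberg
   adjusted p-values, which the source does not state explicitly; the
   gene labels, fold changes, and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical differential expression results.
genes = ["gene_1", "gene_2", "gene_3", "gene_4"]
log_fc = [2.1, 0.5, 1.4, 1.8]
pvalues = [0.001, 0.008, 0.039, 0.041]

fdr = benjamini_hochberg(pvalues)
selected = [g for g, fc, q in zip(genes, log_fc, fdr) if fc > 1 and q < 0.01]
print(fdr, selected)
```

   Only gene_1 passes both thresholds in this toy input: gene_2 fails the
   fold-change cutoff, and gene_3 and gene_4 fail the FDR cutoff.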

Fig. 1.

   A flowchart of the overall study process is shown

Fig. 2.

   (A) Differential expression of E2F target genes in BC samples versus
   healthy tissues, according to TCGA data. A total of 87 genes exhibited
   significant overexpression (fold change > 2) in cancerous tissues. (B)
   The link between E2F target genes and patient prognosis is established.
   (C) Genes from E2F targets that were overexpressed and linked to
   unfavorable prognosis in BC patients were identified

   Subsequently, the independent prognostic value of the 24 identified
   genes was evaluated using multivariate analysis, considering clinical
   characteristics. Among these, AURKB, JPT1, TBRG4, and KIF4A expression
   levels were independently associated with unfavorable prognosis
   (Table 2, HR > 1 and log-rank P < 0.05). A risk model was created
   using the expression data of these four genes. The model indicated
   that higher expression correlates with an elevated mortality rate
   (Fig. 3A and B, log-rank P = 0.0002). The Kaplan-Meier survival curve
   analysis further validated the link between increased expression of
   these genes and elevated mortality rates (Fig. 4A-D,
   log-rank P < 0.05). Thus, these results indicate that E2F target genes
   could act as potential prognostic biomarkers in BC.

Table 2.

   Univariate and multivariate Cox regression analysis of the candidate
   E2F target genes, conducted to investigate their associations with
   patient prognosis while taking into account the impact of clinical
   parameters

   Parameter                                            Univariate                      Multivariate
                                                        HR     P value     95% CI       HR     P value   95% CI      Beta value
   Pathological Stage (Stage III, IV vs. Stage I, II)   2.54   < 0.00001   1.65–3.71    1.82   0.00002   1.43–2.93   0.73
   TNM.T (T3,4 vs. T1,2)                                1.72   < 0.00001   1.13–2.66    1.24   0.16      0.93–1.24   0.13
   TNM.N (N0 vs. N1,2,3)                                2.21   < 0.00001   1.62–3.13    1.52   0.001     1.21–1.96   0.65
   Subtype (TNBC vs. HER2+, lumA and lumB)              0.99   0.24        0.64–1.51    -      -         -           -
   Age (> 60 vs. < 60)                                  0.97   0.63        0.83–1.13    -      -         -           -
   AURKB                                                1.47   0.001       1.14–1.73    1.37   0.01      1.09–1.33   0.38
   JPT1                                                 1.43   0.001       1.08–1.71    1.32   0.02      1.02–1.28   0.31
   TBRG4                                                1.32   0.006       1.03–1.64    1.24   0.03      1.01–1.26   0.29
   KIF4A                                                1.36   0.004       1.01–1.82    1.23   0.04      1.01–1.24   0.27

Fig. 3.

   (A and B) Outcomes of the risk model developed using AURKB, JPT1,
   TBRG4, and KIF4A expression. An increase in risk score was associated
   with an increase in patient mortality rates

Fig. 4.

   (A-D) Kaplan-Meier curves for the candidate genes. Elevated AURKB,
   JPT1, TBRG4, and KIF4A expression levels were associated with higher
   mortality rates

Increased expression of AURKB, JPT1, TBRG4, and KIF4A in epithelial cells
derived from cancer samples

   Single-cell data provided additional validation. After
   preprocessing the GSE243526 dataset, we identified nine distinct cell
   types (Fig. 5A). Subsequently, we focused solely on epithelial
   cells since these are the primary source of BC cells. As displayed in
   Fig. 5B, the expression levels of candidate genes such as AURKB,
   JPT1, TBRG4, and KIF4A were markedly elevated in epithelial cells from
   cancer samples compared to those from healthy samples (logFC > 1 and
   FDR < 0.01). These data indicate a marked increase in the expression of
   E2F target genes in primary cancer-related cells.

Fig. 5.

   (A) Clustering results are shown for single-cell data and different
   cell subgroups based on GSE243526 data. (B) Candidate gene
   expression levels in epithelial cells from cancer samples were assessed
   relative to healthy samples

Differential gene expression related to cell proliferation and p53 pathways
is evident in samples with high E2F target gene expression

   All 1,109 TCGA BC samples were clustered according to the 87 E2F target
   genes identified during the initial analysis. Figure 6A illustrates
   that the samples in group C1 (N = 568) displayed high expression of
   these E2F target genes, while group C2 (N = 541) samples showed reduced
   expression levels. The silhouette analysis for the 2-cluster solution
   yielded an average silhouette width of 0.34, indicating acceptable
   clustering quality (Fig. 6B). While a subset of samples
   demonstrated strong cluster membership (silhouette > 0.4), others
   exhibited lower scores, suggesting partial overlap between clusters.
   These results reflect the inherent heterogeneity of the dataset and
   support the choice of k = 2 as the most stable partitioning.

Fig. 6.

   (A) BC samples were categorized into two groups based on E2F target
   gene expression: high (C1, N = 568) and low (C2, N = 541). (B) The
   silhouette score for the clustered samples is shown

   Patient survival rates in group C1 were significantly lower than in
   group C2 (Fig. 7A, log-rank P < 0.0001). Furthermore, 435 genes had a
   logFC > 1 and an FDR < 0.01, indicating differential expression between
   the C1 and C2 groups (Fig. 7B). The pathway enrichment analysis for
   these 435 genes highlighted their participation in pathways related to
   cell proliferation, DNA replication, and the p53 signaling pathway
   (Fig. 7C, FDR < 0.01). These findings indicate that E2F target
   genes could play a significant role in these crucial pathways,
   emphasizing their possible involvement in the progression of BC. The
   chi-squared test results suggested a significant association between
   the C1 cluster samples and clinical features such as Stage, TNM.T, and
   subtype (Table 3, P < 0.0001). TNM.T4 showed a strong association
   with cluster C1, occurring at a rate of 64% (Table 3). Among the
   various BC subtypes, TNBC, HER2+, and Luminal B showed the highest
   association with the C1 cluster, with frequencies of 96%, 89%, and 88%,
   respectively (Table 3, P < 0.0001). These findings indicate that the
   elevated expression of E2F target genes is particularly pronounced in
   the TNBC subgroup, making them potentially more appropriate therapeutic
   targets in this group.

Fig. 7.

   (A) The C1 group exhibited significantly lower survival rates than the
   C2 group, highlighting the prognostic value of E2F target gene
   expression. (B) A volcano plot displays all differentially expressed
   genes from the C1 group compared to C2. (C) The pathway enrichment
   analysis of the differentially expressed genes in the C1 group has
   identified significant pathways, including cell proliferation, DNA
   replication, and the p53 signaling pathway. These pathways demonstrate
   enrichment, suggesting their potential role in cancer progression

Table 3.

   The association of clinical features with identified clusters related
   to the expression levels of target genes is presented. A chi-squared
   test was utilized for the analysis

   Clinical feature   Number in Cluster C1   Number in Cluster C2   χ²      P value
                      (Frequency)            (Frequency)
   Stage                                                            21.15   0.0001
     Stage I          64 (35%)               115 (65%)
     Stage II         339 (55%)              276 (45%)
     Stage III        128 (52%)              117 (48%)
     Stage IV         9 (56%)                7 (44%)
   TNM.T                                                            32.54   0
     T1               100 (37%)              168 (63%)
     T2               353 (58%)              263 (42%)
     T3               66 (48%)               72 (52%)
     T4               21 (64%)               12 (36%)
   Subtype                                                          545     0
     TNBC             182 (96%)              8 (4%)
     Luminal A        105 (19%)              442 (81%)
     Luminal B        178 (88%)              24 (12%)
     HER2+            68 (89%)               9 (11%)
   TNM.N                                                            8       0.7
     N0               258 (51%)              256 (49%)
     N1               181 (51%)              172 (49%)
     N2               68 (53%)               58 (47%)
     N3               33 (46%)               39 (54%)

Elevated expression of PD-L1 and PD-L2 in group C1 and decreased T-cell
infiltration

   The expression levels of T-cell inhibitory genes, such as PD-L1 and
   PD-L2, were analyzed in groups C1 and C2. Group C1 showed significantly
   higher levels of both genes than group C2 (Fig. 8A and B,
   FDR < 0.01). The EPIC algorithm was used to assess immune cell
   infiltration in the two subgroups. Cancer-associated fibroblast (CAF)
   infiltration was significantly lower in the C1 group (Fig. 8C,
   P < 0.0001). CD4+ T cell and CD8+ T cell infiltration were notably
   lower in samples from C1 than in those from C2 (Fig. 8D and E,
   P < 0.001). NK cell infiltration also decreased in the C1 group
   (Fig. 8F, P < 0.01). These results imply a potential correlation
   between E2F target gene expression, increased expression of immune
   checkpoint ligands, and decreased immune cell infiltration in the
   tumor microenvironment.

Fig. 8.

   (A and B) Expression levels of the immune checkpoint ligands PD-L1 and
   PD-L2 were analyzed in the C1 (N = 568) and C2 (N = 541) groups. (C-F)
   Immune cell infiltration in the two groups. The C1 group displayed a
   decrease in CD4+ and CD8+ T cell infiltration, with increased
   expression of PD-L1 and PD-L2, highlighting immune evasion in samples
   with high E2F target expression

Overexpression of JPT1 and TBRG4 in BC

   The expression levels of JPT1 and TBRG4 were validated in BC samples
   relative to adjacent healthy tissues using RT-qPCR. These genes were
   chosen specifically because they are less explored in BC and have been
   linked to poorer patient outcomes in prior studies. As illustrated in
   Fig. 9A and B, the expression of both genes was notably higher in
   group C1, demonstrating a twofold increase compared to group C2. ROC
   analysis revealed that JPT1 and TBRG4 expression levels effectively
   distinguished between groups C1 and C2, exhibiting strong sensitivity
   and specificity (Fig. 9C and D, P < 0.0001). Moreover, substantial
   overexpression of JPT1 and TBRG4 was detected in BC samples relative to
   adjacent healthy tissues, as indicated by RT-qPCR results (Fig. 9E
   and F, P < 0.001). These findings underscore their potential role as
   biomarkers for BC prognosis and classification.

Fig. 9.

   (A and B) The expression levels of JPT1 and TBRG4 were compared
   between clusters C1 and C2 based on TCGA data. (C and D) ROC curves and
   AUC values for JPT1 and TBRG4 expression in distinguishing the C1
   subgroup from C2. (E and F) JPT1 and TBRG4 overexpression in BC
   tissues (N = 35) was confirmed in comparison with adjacent healthy
   tissues (N = 35), as reflected in RT-qPCR results

Discussion

   This study identified 200 potential E2F target genes from the MSigDB
   database. TCGA data showed that 87 of these genes were significantly
   overexpressed in cancer samples compared to healthy tissues, with a
   fold change greater than 2. Moreover, Cox regression analysis
   revealed that the expression levels of 31 genes were associated with a
   poor prognosis in BC patients. From this subset, 24 genes were selected
   for further study due to their substantial overexpression and strong
   association with adverse patient outcomes.

   Multivariate analysis of clinical features indicated that the
   expression levels of the AURKB, JPT1, TBRG4, and KIF4A genes were
   independently linked to poor prognosis in patients. A risk model using
   the expression levels of these four genes was developed, showing that
   their concurrent upregulation significantly increased the mortality
   rate among patients. Furthermore, Kaplan-Meier survival analysis
   confirmed that elevated expression of these genes was associated with
   higher mortality rates.

   These results are consistent with earlier research, including a study
   by Huang et al., who reported that overexpression of AURKB is
   associated with poor survival in BC [11]. AURKB is a
   serine/threonine kinase that plays essential roles in cell cycle
   regulation. Increased expression and activity of AURKB can increase
   the proliferation and invasion of cancer cells [12]. It has also
   been shown in bladder cancer that AURKB can regulate p53 activity
   through MAD2L2 [13]. Additionally, it has been reported that the
   level of AURKB expression is associated with tumor immune responses in
   various cancers [14]. JPT1 was found to be overexpressed in a
   particular subset of BC, leading to heightened cell proliferation
   [15]. The role of JPT1 as an oncogene in endometrial and prostate
   cancers has been recognized [16, 17]. TBRG4 may influence the
   cell cycle by stabilizing regulatory proteins at the transcriptional
   level, a property attributed to its leucine zipper motif. Increasing
   its activity increases cell proliferation [18]. Elevated levels of
   TBRG4 have been observed in hepatocellular carcinoma, lung cancer, and
   pancreatic cancer, correlating with higher malignancy rates
   [18–20]. For example, in hepatocellular carcinoma, the
   knockdown of TBRG4 can decrease the proliferation, migration, and
   invasion of cancer cells through the TGF-β pathway [20]. KIF4A has
   been recognized as a prognostic and oncogenic biomarker in BC [21,
   22]. KIF4A, a kinesin-4 protein, regulates chromosome condensation
   and segregation during mitosis [23]. Research has demonstrated
   that KIF4A influences stemness and metastasis pathways in lung cancer
   and glioma, with its elevated levels correlating with greater
   malignancy [24]. Furthermore, studies have demonstrated that reducing
   KIF4A expression can inhibit BC cell proliferation, migration, and
   invasion [22]. Fujiwara et al. reported similar findings, observing
   that high nuclear levels of E2F4 were associated with lower survival
   rates in BC patients [7]. Additionally, Zhang et al. confirmed that
   higher expression of E2F1 is linked to poor prognosis in various cancer
   types, including BC [25]. These studies support our conclusion
   that E2F target genes, particularly AURKB, JPT1, TBRG4, and KIF4A,
   could serve as valuable prognostic biomarkers for BC. Single-cell data
   analyses corroborated these findings.

   In this study, we analyzed 1,109 BC samples from the TCGA database,
   stratifying them by the expression levels of 87 E2F target genes
   identified in the initial phase. As shown in Fig. 6A, the C1 group
   exhibited high expression of E2F target genes, while the C2 group
   displayed low expression. Notably, the C1 group had a lower survival
   rate than the C2 group. Subsequently, we identified 435 genes with
   significant differential expression (logFC > 1 and FDR < 0.01) between
   the C1 and C2 groups. Pathway enrichment analysis revealed that these
   genes are involved in critical cellular processes such as cell
   proliferation, DNA replication, and the p53 signaling pathway. This
   suggests that E2F target genes may play a role in these pathways and
   interact with them. The E2F family of transcription factors regulates
   genes crucial for cell cycle progression, DNA synthesis, and DNA
   replication [26]. E2F proteins recruit transcriptional activators
   to regulate the expression of these genes, thus impacting cell
   proliferation [27]. The p53 protein, an essential tumor
   suppressor, can block cell cycle progression by interacting with E2F
   proteins [28]. Upon DNA damage, p53 activates the transcription of
   p21, which inhibits cyclin-dependent kinases. This results in the
   retention of Rb in its active form and suppresses E2F-mediated
   transcription [29]. Our results are consistent with these
   established mechanisms, suggesting that the increased expression of E2F
   target genes in the C1 group could interfere with normal cell cycle
   regulation and the tumor-suppressive functions of p53, leading to a
   worse prognosis in BC patients [30, 31]. This study
   highlights the significance of E2F target genes and their therapeutic
   and prognostic potential in BC. For the first time, we showed the
   elevated expression of JPT1 and TBRG4 in BC, linking it to a worse
   patient prognosis. The findings indicated that the TNBC, HER2+, and
   Luminal B subgroups exhibited greater expression of E2F target
   genes (Cluster C1). Previous studies have also shown that the
   expression levels of genes related to cell proliferation can be higher
   in the TNBC subgroup than in other subgroups [10].

   Our results also showed that in samples from subgroup C1, the
   infiltration of T cells may be lower. While the
   involvement of E2Fs in immune responses is indicated in certain
   cancers, it remains underexplored [8, 32]. This study
   suggests that E2F target genes may affect the tumor microenvironment
   and the infiltration of immune cells, positioning E2Fs as potentially
   valuable therapeutic and diagnostic targets for BC.

   A limitation of this study is that further in vitro and in vivo
   experiments are required to clarify the precise biological roles of
   JPT1 and TBRG4 in BC. Moreover, the retrospective nature of the TCGA
   and single-cell RNA-seq datasets may introduce inherent selection
   biases, thereby limiting the generalizability of the results. In
   addition, the relatively small sample size used for ex vivo RT-qPCR
   validation (n = 35 clinical samples) may limit the statistical power of
   the findings. Future studies should address these limitations using
   larger, independent cohorts and functional assays.

Conclusion

   This study identified significant overexpression of 87 E2F target genes
   in BC, with over two-fold upregulation compared to normal tissues. Cox
   regression analysis showed that 31 genes were linked to poor prognosis,
   and 24 genes were both overexpressed and associated with adverse
   outcomes. Notably, AURKB, JPT1, TBRG4, and KIF4A were independently
   associated with poor prognosis, even after adjusting for clinical
   features. A risk model based on these genes indicated increased
   mortality rates with higher expression, which was confirmed by
   Kaplan-Meier analysis. Pathway analysis suggests that genes associated
   with high E2F target expression are involved in cell proliferation, DNA
   replication, and p53 signaling. Our results emphasize the potential of
   E2F target genes, particularly AURKB, JPT1, TBRG4, and KIF4A, as
   prognostic biomarkers for BC.

Acknowledgements