Abstract Background Breast cancer is a highly malignant disease worldwide, but there are currently no sufficient molecular biomarkers to predict patient prognosis and guide radiotherapy. The tumor microenvironment (TME) is an important factor affecting tumor biological function, and changes in its composition are equally relevant to tumor progression and prognosis during radiotherapy. Methods Here, we performed bioinformatic analyses using data obtained from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO) databases to screen for molecular biomarkers related to the TME that may influence radiotherapy sensitivity. By combining immune scores and stromal scores and performing weighted coexpression network analysis (WGCNA), we identified key modules and hub genes to construct competing endogenous RNA (ceRNA) networks. Then, key pathways and genes were identified using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. Furthermore, we explored the effect of THBS2 on the malignant phenotype of breast cancer through cellular experiments. Results Genes in the PI3K-AKT pathway in the blue module were significantly enriched. Among the hub genes in the blue module, COL1A1, COL1A2, COL6A3, THBS2 and PDGFRB were negatively associated with RT sensitivity. Cellular experiments confirmed that knockdown of THBS2 inhibited the proliferation and invasion of breast cancer cells. Conclusion These findings may provide new insights into the mechanisms of radiotherapy sensitivity in breast cancer patients, offering hope for the discovery of new therapeutic targets. Keywords: Breast cancer, Radiotherapy, WGCNA, ceRNA, TME, THBS2 1. Introduction Breast cancer is the most common malignancy among women worldwide and is currently the cause of approximately 6 % of all tumor-related deaths, which poses a major public health challenge to women worldwide [[33]1]. Radiotherapy is a multidisciplinary treatment for breast cancer and is considered to be indicated for patients in all phases of this disease. A meta-analysis by the Early Breast Cancer Trialists‵ Collaborative Group (EBCTCG) et al. reported that the 10-year locoregional or distant recurrence rates and the 15-year risk of breast cancer death in patients allocated to radiotherapy after breast-conserving surgery were 19.3 % and 21.4 %, respectively, compared with 35.0 % and 25.2 %, respectively, in patients allocated to breast-conserving surgery only, and patients who are pN0 or pN + can benefit from radiotherapy [[34]2]. For patients with positive axillary lymph nodes after mastectomy, postoperative radiotherapy can decrease locoregional recurrence and overall recurrence [[35]3]. Breast cancer is a heterogeneous disease, and the response to treatment varies from person to person. Several biomarkers have been proposed to guide chemotherapy, endocrine therapy and HER-2-targeted therapy for breast cancer. Similarly, treatment decisions for radiotherapy should be adjusted based on individual factors, tumor biology, and prognostic scores [[36]4]. However, few signatures are available for most patients to predict the benefit of radiotherapy. Therefore, exploring the association between gene expression and breast cancer radiotherapy sensitivity and constructing a radiotherapy sensitivity signature may constitute a breakthrough in personalized breast cancer radiotherapy. As a malignant solid tumor, breast cancer includes tumor cells, tumor-associated stromal cells, infiltrating immune cells and other normal epithelial cells. In recent years, studies on the stromal and immune components of the TME have guided the staging, treatment and prognosis of patients with breast cancer [[37][5], [38][6], [39][7]]. Weighted gene coexpression network analysis (WGCNA) is often used to establish relationships within the TME. It was designed to analyze microarray datasets more efficiently by quantifying not only the correlation between individual genes but also the extent to which these genes share nodes. WGCNA can be employed to identify candidate biological markers and therapeutic targets by identifying highly related gene modules, summarizing such gene clusters in terms of module features or hub genes, and correlating gene modules with each other or with certain traits in clinical samples [[40]8]. LncRNAs were once thought to be transcriptional byproducts of RNA polymerase, but an increasing number of studies have suggested that lncRNAs play important roles in almost every aspect of cell biology and contribute to each stage of tumorigenesis and progression [[41]9]. According to the ceRNA hypothesis proposed by Salmena's team in 2011, lncRNAs can inhibit the targeting of mRNAs to normal miRNAs by sharing common miRNA response elements (MREs) [[42]10]. It is reasonable to identify genes related to the TME that have effects on radiotherapy sensitivity in breast cancer patients and can aid in predicting tumor progression and prognosis, thus guiding clinical diagnosis and treatment. Herein, we screened genes associated with breast cancer radiotherapy sensitivity using various analytical methods, including calculation of immune scores and stromal scores and WGCNA, GO and KEGG enrichment analyses, thus providing new insights and evidence for the treatment of breast cancer by identifying new potential targets. 2. Materials and methods 2.1. Sample data selection and processing The genetic data of breast cancer patients (RNA-seqv2) were downloaded from the TCGA database and [43]GSE103744, which were collated into standardized raw data for subsequent analysis. Samples that were sensitive to radiotherapy (no recurrence after radiotherapy) were selected, resulting in a total of 68 samples for analysis. We used the annotation file of the [44]GPL10558 Illumina HumanHT-12 v4.0 expression beadchip to select the design sequences of the probe sets. We then compared these probe set sequences to coding genes using BLAST. If a probe set sequence could not be compared to a coding gene, we compared it to lncRNAs. Finally, we parsed the microarray detection data to obtain the lncRNA annotation information for [45]GPL10558 and determine the expression of lncRNAs. 2.2. Identification of DEGs We calculated the immune scores (ISs) and stromal scores (SSs) of 68 patients who were sensitive to radiotherapy (no recurrence after radiotherapy) using the ESTIMATE algorithm. Then, the 68 samples were divided into IS-high, IS-low, SS-high, and SS-low based on the calculated median IS and SS values. To identify differentially expressed genes (DEGs) between breast cancer samples and normal subjects, we performed differential screening analysis using the limma package of R. Specifically, we compared the IS-high vs. IS-low groups and the SS-high vs. SS-low groups to obtain differentially expressed mRNAs and lncRNAs. The difference screening parameters were set as P < 0.05 and |FC|>1.2. 2.3. WGCNA We constructed gene coexpression networks and modules using the WGCNA method. First, we calculated the Pearson correlation coefficient between any two genes and constructed a similarity matrix. Then, we transformed the similarity matrix into an adjacency matrix, where β is a soft threshold that strengthens strong correlations and weakens weak correlations to make the correlation values more consistent with the scale-free network characteristics and more biologically significant. Next, we constructed a topological overlap matrix (TOM) and a hierarchical clustering tree among genes to identify modules of lncRNAs and mRNAs. Finally, we used the phenotypic information of grouped traits in the experiment to obtain the correlation between gene modules and identify trait-related modules. 2.4. Functional and pathway enrichment analysis To further investigate the molecular mechanism, we annotated the functions of mRNAs in the modules obtained from WGCNA. The functions of all mRNAs in each module were obtained separately based on the gene function annotation information from the GO data. Similarly, the mRNAs in the modules obtained from WGCNA were annotated with signaling pathways according to the gene signaling pathway annotation information from the KEGG database to obtain all signaling pathways in which the mRNAs in each module are involved. Finally, Fisher's exact test and multiple comparison test were used to calculate the significance level (P value) and false positive rate (FDR) to filter out the significant functions and signaling pathways associated with the mRNAs in each module. 2.5. lncRNA‒mRNA pathway and PPI network Based on the module obtained by WGCNA, the Hub genes in the selected module, i.e., the genes with higher connectivity, were extracted and set as the top 100 genes, including lncRNAs and mRNAs. Hub genes with high connectivity are usually regulatory factors and are upstream in the regulatory network, while genes with low connectivity are usually downstream in the regulatory network (e.g., transporter proteins and catalase). After extracting the hub genes, the coexpression relationship weights between Hub genes were calculated, and the lncRNA‒mRNA and mRNA‒mRNA coexpression regulatory relationships among them were selected. Based on the regulatory relationships of lncRNAs and mRNAs and the significant pathways involved in the regulation of mRNAs, a network of pathways that may be regulated by lncRNAs through the coexpression of mRNAs was constructed, and the lncRNA‒mRNA-pathway network was constructed with Cytoscape 3.7.0. The main purpose of this analysis was to reveal the signaling pathways regulated by lncRNAs in the model. Similarly, according to these mRNAs from the lncRNA‒mRNA network, protein‒protein interaction (PPI) network analysis was performed via the String database, and a protein regulatory network was constructed via Cytoscape 3.7.0. 2.6. Survival analysis High-throughput genetic testing data for recurrence-free survival (RFS) information after radiotherapy for breast cancer patients were obtained from the TCGA database. Using the mRNAs and lncRNAs described in section [46]2.5, we constructed Cox regression analyses via the survival package in R and generated corresponding Kaplan‒Meier curves. We performed RFS survival analysis for these mRNAs and lncRNAs based on their optimal expression thresholds. 2.7. Correlation analysis Pearson's correlation was used to analyze the correlation between the selected mRNAs and lncRNAs and clinical features. The differences between the selected mRNAs and lncRNAs were validated using previously described genetic test data downloaded from the TCGA database for breast cancer. The analysis included comparisons between tumor tissue and normal tissue and molecular typing based on PAM50, followed by box plots generated using R ggplot2. 2.8. miRNA prediction and ceRNA analysis miRNA‒mRNA target gene prediction was performed via miRanda ([47]http://www.microrna.org/) and TargetScan ([48]http://www.targetscan.org/vert_72/), and the corresponding target gene prediction results were obtained. miRNA-lncRNAs were predicted to target genes via miRanda ([49]http://www.microrna.org/) and PITA ([50]https://genie.weizmann.ac.il/pubs/mir07/mir07_exe.html) to obtain miRNAs that regulate target lncRNAs. Then, an endogenous competitive binding ceRNA regulatory network (lncRNA‒miRNA-mRNA network) was constructed based on the predicted miRNA target-binding miRNA‒mRNA and miRNA‒lncRNA relationship pairs. 2.9. Cell culture and THBS2 knockdown We purchased human breast cancer cell lines, including MDA-MB-231 and BT549 cell lines, from Procell (Wuhan, China). The cells were cultured in DMEM (Bological, Israel) supplemented with 10 % heat-inactivated FBS, and the culture conditions were set at 37 °C and saturated with carbon dioxide. To prevent contamination by stray bacteria, we added 1 % penicillin. Transient transfection of si-THBS2 was performed using a Lipo3000 kit (Invitrogen, USA) following the manufacturer's instructions. We used two siRNAs targeting THBS2 with the following sequences: siTHBS2-1: forward, ACATCTGCTTCTCCACCAGC; reverse, GGTCCCCAACCTCATCCTTG siTHBS2-2: forward, AGACCGACGTGGACAATGAC; reverse, GTCCCGGTCAGTGTTTACGT. The cells were incubated for 48 h after infection with si-THBS2, and the transfection efficiency was assessed via Western blotting (WB) with antibodies against THBS2 (ab112543 from Abcam) and β-actin (81115-1-RR from Proteintech). 2.10. Cell proliferation and migration assays First, we used a colony formation assay to assess the effect of reducing THBS2 expression on the proliferative capacity of breast cancer cells. Forty-eight hours after transfection, single-cell suspensions of breast cancer cells were collected and added to the wells of a six-well plate at a ratio of approximately 1000 cells/well. After 10 days of incubation, single-cell colony formation was observed. Subsequently, the cells were fixed using 4 % paraformaldehyde and stained with 0.4 % crystal violet. Finally, the cells were counted using light microscopy to assess their proliferative capacity. To assess the effect of reduced THBS2 expression on the invasive and migratory abilities of breast cancer cells, a Transwell kit (Corning, USA) was used according to the manufacturer's instructions. To perform the scratch healing assay, breast cancer cells with reduced THBS2 expression and control cells were treated to assess their migration ability. First, breast cancer cells that had grown to 90 % confluence were collected and starved in serum-free DMEM for 24 h. The cells were mounted using a 200 μl pipette tip, and then the cell debris was removed by washing with DMEM. The plates were incubated for 48 h to allow the mounted scratches to heal. Finally, the scratch wounds were observed by microscopy, and the wound healing rate was measured using ImageJ software to assess the migration ability of the cells. 2.11. WB analysis of cell cycle indicators We extracted total proteins from breast cancer cells and subsequently separated the proteins using 10 % SDS‒PAGE (Invitrogen, USA) electrophoresis and collected the separated protein extracts through PVDF membranes. The protein-containing PVDF membrane was incubated with 5 % skim milk powder for 1 h at room temperature. Subsequently, membranes were incubated with primary antibodies against C-Myc (10828-1-AP, Proteintech), CDK4 (11026-1-AP, Proteintech), CyclinD1 (60186-1-Ig, Proteintech), CyclinE1 (11554-1-AP, Proteintech) and β-Actin overnight at 4 °C. After the incubation was completed, secondary antibody incubation was performed, and protein bands were scanned and imaged using a laser imager (LL-COR Biosciences, USA). 2.12. Statistical analysis All the data were statistically analyzed using R version 3.6.1 ([51]https://www.r-project.org) (packages: ESTIMATE, limma, clusterProfiler, WGCNA, survival) and plotted using Cytoscape and GraphPad Prism. A two-tailed p value < 0.05 was regarded as statistically significant. 3. Results 3.1. Cluster analysis of DEGs We used the above methods to identify DEGs in breast cancer samples and to screen and determine the number of differentially expressed IS and SS mRNAs and lncRNAs. Among them, there were 3959 IS and 2284 SS differentially expressed mRNAs and 257 IS and 123 SS differentially expressed lncRNAs. Based on the expression values of 344 lncRNAs and 5418 mRNAs (IS and SS differential screening and set) detected in the data sets, along with the expression information from 68 breast cancer radiotherapy-sensitive samples, we clustered the samples and constructed a heatmap ([52]Fig. 1A) and volcano plots ([53]Fig. 1B). Fig. 1. [54]Fig. 1 [55]Open in a new tab (A): Heatmap of the differentially expressed genes. (B): Volcano plot of the differentially expressed genes. The differentially expressed genes of high and low IS and SS in the TCGA breast cancer patient cohort, which is sensitive to radiotherapy, were identified. 3.2. Identification of DEG modules by WGCNA Based on the expression data of the concatenated sets, including 344 lncRNAs and 5418 mRNAs, the soft threshold was calculated so that the connections between genes in the network satisfied both the average connectivity and the scale-free network distribution, which finally yielded β = 7 ([56]Fig. 2A). A hierarchical clustering tree was then constructed to cluster the coexpressed genes into eight different color-coded gene modules (GMs) ([57]Fig. 2B). Correlations between gene modules and phenotypes were calculated according to the settings of grouped trait phenotype information of the experimental design, identification of trait-related modules, and analysis of module significance (MS) values. The higher the ranking of modules with higher correlation values, the more significant the module is. The correlation is indicated by the color depth, and the color code indicates a positive (red) or negative (blue) correlation of the module with the trait. Based on the number of lncRNAs and mRNAs in the modules and the correlation between the modules and the traits, 5 modules were ultimately identified: turquoise, blue, brown, yellow and green ([58]Fig. 2C and D). Fig. 2. [59]Fig. 2 [60]Open in a new tab WGCNA graphics. (A): Analysis of network topology for various soft-threshold powers. Check the scale-free topology; the adjacency matrix was defined using soft thresholds with β = 7. (B): Clustering dendrograms of genes, in which different branches correspond to different gene modules, and the leaves represent genes in the module. (C): Visualization of the gene network using a heatmap plot. (D): Analysis of module-trait relationships of breast cancer patients who were sensitive to radiotherapy based on TCGA data. 3.3. GO and KEGG analysis According to the results of the GO enrichment analysis, the important functions in blue are extracellular matrix organization and cell adhesion ([61]Fig. 3A). KEGG pathway enrichment analysis revealed that blue was mainly associated with focal adhesion, ECM-receptor interaction, the PI3K-AKT signaling pathway, human papillomavirus infection and pathways in cancer ([62]Fig. 3B). In this paper, we investigated the effect of this pathway and the molecules involved in it on breast cancer radiotherapy sensitivity, starting from the PI3K-AKT signaling pathway; therefore, we selected genes in the BLUE module for subsequent analysis. Fig. 3. [63]Fig. 3 [64]Open in a new tab (A): GO enrichment analysis of the blue module. (B): KEGG pathway enrichment analysis of the blue module. 3.4. lncRNA‒mRNA pathway and PPI network Based on the selected hub genes from the module, we obtained the lncRNA‒mRNA regulatory relationships and the relationships between the significant pathways and encoded proteins involved in regulation by mRNAs, constructed a network of pathways ([65]Fig. 4A) and protein interactions ([66]Fig. 4B) that lncRNAs regulate through the coexpression of mRNAs and thus may regulate, and revealed the signaling pathways regulated by lncRNAs and proteins involved in biological functions. We identified 2 lncRNAs involved in the PI3K-AKT signaling pathway (NONHSAT224573.1 and ENST00000554980.5) and 6 mRNAs (COL1A1, COL1A2, COL6A1, COL6A3, THBS2 and PDGFRB). In the blue module, the lncRNA‒mRNA-pathway network consisted of 58 nodes and 128 lines, and the PPI network consisted of 48 nodes and 300 lines. Among these nodes, the interaction degrees of key genes are shown in [67]Table 1 and [68]Table 2. Fig. 4. [69]Fig. 4 [70]Open in a new tab (A): lncRNA‒mRNA pathway network based on the blue coexpression module. (B): PPI network based on the blue coexpression module. Table 1. Hub genes involved in the PI3K-Akt signaling pathway in the blue module. Node Degree Descriptions ENST00000554980.5 18 Unknown LncRNA NONHSAT224573.1 21 Unknown LncRNA COL1A1 11 Collagen Type I Alpha 1 Chain COL1A2 11 Collagen Type I Alpha 2 Chain COL6A1 7 Collagen Type VI Alpha 1 Chain COL6A3 7 Collagen Type VI Alpha 3 Chain THBS2 8 Thrombospondin 2 PDGFRB 14 Platelet Derived Growth Factor Receptor Beta [71]Open in a new tab Table 2. Hub genes involved in the PPI network in the blue module. Node Degree Descriptions COL1A1 34 Collagen Type I Alpha 1 Chain COL1A2 30 Collagen Type I Alpha 2 Chain COL6A1 22 Collagen Type VI Alpha 1 Chain COL6A3 20 Collagen Type VI Alpha 3 Chain THBS2 24 Thrombospondin 2 PDGFRB 19 Platelet Derived Growth Factor Receptor Beta [72]Open in a new tab 3.5. Difference verification For the genes in the blue module screened above, the six mRNAs could be validated using the breast cancer gene detection data from the TCGA database, while no corresponding data were available for the two lncRNAs. We plotted box plots reflecting the differences in the expression of six mRNAs in tumor and normal tissues and molecular subtypes defined by PAM50. There were significant differences in the expression of COL1A1, COL1A2, COL6A1, COL6A3 ([73]Fig. 5A–D), and THBS2 ([74]Fig. 5F) and nonsignificant differences in the expression of PDGFRB between tumor tissues and normal tissues ([75]Fig. 5E). The expression of all the above six mRNAs significantly differed among the different molecular subtypes defined by PAM50 ([76]Fig. 6A–F). Fig. 5. [77]Fig. 5 [78]Open in a new tab Expression of COL1A1 (A), COL1A2 (B), COL6A1 (C), COL6A3 (D), PDGFRB (E) and THBS2 (F) in breast cancer tissues and normal tissues. Fig. 6. [79]Fig. 6 [80]Open in a new tab Expression of COL1A1 (A), COL1A2 (B), COL6A1 (C), COL6A3 (D), PDGFRB (E) and THBS2 (F) in different PAM50 subtypes. 3.6. miRNA prediction and ceRNA networks For the selected molecules, including 6 mRNAs (COL1A1, COL1A2, COL6A1, COL6A3, THBS2 and PDGFRB) and 2 lncRNAs (NONHSAT224573.1, ENST00000554980.5), we used miRanda, TargetScan and PITA for miRNA prediction. As shown in the Figure ([81]Fig. 7), we identified a total of 49 miRNAs that may regulate the expression of 6 key genes (COL1A1, COL6A1, THBS2, PDGFRB, NONHSAT224573.1 and ENST00000554980.5). In contrast, no potential upstream miRNAs were detected for the other two genes (COL1A2 and COL6A3). Fig. 7. [82]Fig. 7 [83]Open in a new tab CeRNA network of the blue module. 3.7. Survival analysis K‒M analysis revealed that patients with low expression of 5 mRNAs (COL1A2, COL1A1, COL6A3, THBS2, and PDGFRB) had a better prognosis, and when these 5 mRNAs were highly expressed, breast cancer patients had a relatively short RFS after RT, which may predict a lower sensitivity to RT ([84]Fig. 8A–E). Fig. 8. [85]Fig. 8 [86]Open in a new tab K‒M survival curves of breast cancer patients who were sensitive to radiotherapy. RFS according to low and high COL1A1 (A), COL1A2 (B), COL6A3 (C), PDGFRB (D) and THBS2 (E) expression. 3.8. Low THBS2 expression is associated with low proliferation and invasion of breast cancer cells THBS2 belongs to the family of platelet-responsive proteins that mediate cell‒cell and cell-matrix interactions. This protein has been shown to promote tumor growth and angiogenesis. Previous studies have shown that THBS2 is a poor prognostic marker for metastasis and proliferation in other tumor types (e.g., pancreatic and colon cancer). Therefore, our cellular experiments aimed to validate the effect of THBS2 on the proliferative and invasive phenotypes of breast cancer. We used two si-THBS2 constructs to knock down THBS2 expression in MB-231 and BT549 cells, and Western blot assays showed that both si-THBS2 constructs effectively reduced THBS2 protein expression in the cell lines ([87]Fig. 9A and B, P < 0.01, original blot in [88]Fig. S1A). We first evaluated the effect of THBS2 on the proliferative ability of breast cancer cells via a colony formation assay. The results showed that the proliferation ability of breast cancer cells was significantly reduced after THBS2 expression decreased ([89]Fig. 9C–D, P < 0.001). Transwell assays were utilized to assess whether the migration ability of breast cancer cells with low THBS2 expression was affected. The results showed that the migration ability of breast cancer cells after THBS2 knockdown was significantly lower than that of untreated breast cancer cells ([90]Fig. 9E–F, P < 0.01). Subsequently, we evaluated the effect of THBS2 on the invasion ability of breast cancer cells by adding matrix gel to Transwell plates. The results showed that the invasive ability of breast cancer cells was significantly attenuated after THBS2 knockdown ([91]Fig. 9G–H, P < 0.001). In addition, we evaluated the effect of low THBS2 expression on the migration ability of breast cancer cells using a scratch healing assay. The results showed that the migration ability of THBS2-knockdown breast cancer cells was attenuated compared with that of wild-type cells ([92]Fig. 9I–J, P < 0.001). Fig. 9. [93]Fig. 9 [94]Open in a new tab Low THBS2 expression is associated with low proliferation and invasion of breast cancer cells. (A–B) The knockdown effect of two si-THBS2 constructs was confirmed at the protein level using WB. Original blot in [95]Fig. S1A. (C–D) Representative images of colony formation in two breast cancer cell lines (MB-231 and BT549) after the reduction in THBS2 expression and the corresponding statistical results. (E–F) Representative images of two breast cancer cell lines (MB-231 and BT549) subjected to a reduction in THBS2 expression in a Transwell assay (without the addition of a matrix gel) and the corresponding statistical results. (G–H) Representative images of two breast cancer cell lines (MB-231 and BT549) in Transwell experiments (with matrix gel added) after reduction of THBS2 expression and their statistical results. (I–J) Representative images of cell scratches and corresponding statistics of two breast cancer cell lines (MB-231 and BT549) after the reduction of THBS2 expression. 3.9. Low THBS2 expression is associated with decreased expression of cell cycle proteins Cell cycle dysregulation is closely related to the proliferative activity of malignant cells, so we subsequently focused on whether low THBS2 expression affects the expression of cell cycle-related proteins in breast cancer types. We examined four classical cyclins (C-myc, CDK4, CyclinD1 and CyclinE1) and found that the expression of all four cyclins was lower in breast cancer cells with low THBS2 expression than in wild-type breast cancer cells ([96]Fig. 10 A-C, P < 0.01. Original plot in [97]Fig. S1B). Fig. 10. [98]Fig. 10 [99]Open in a new tab Low expression of THBS2 is associated with decreased expression of cell cycle proteins. (A–C) Representative images and corresponding statistics of cell cycle-related proteins in two breast cancer cell lines (MB-231 and BT549) after reduction of THBS2 expression were examined using WB assays. Original blot in [100]Fig. S1B. 4. Discussion In our study, bioinformatics was used to screen differentially expressed lncRNAs (DElncRNAs) and mRNAs (DEmRNAs) based on immune and stromal scores, and then WGCNA was performed to obtain 8 modules based on the immune score, substrate score and ER status. Based on GO and KEGG analyses of these 8 modules, DElncRNAs and DEmRNAs in the blue module related to the PI3K-AKT signaling pathway were used to select the pivotal genes and construct the ceRNA network. Finally, we identified 2 lncRNAs and 5 mRNAs (COL1A1, COL1A2, COL6A3, THBS2 and PDGFRB) associated with the TME of breast cancer, providing possible targets to improve radiotherapy sensitivity. Type I collagen is a component of the ECM and is significant for the TME, and its α1 and α2 chains are encoded by the collagen type I α1 (COL1A1) and collagen type I α2 (COL1A2) genes, respectively [[101]11]. Type VI collagen, encoded by the collagen VI alpha 3 (COL6A3) gene, is associated with type I collagen fibrils and can bind them together to form thicker collagen fibers [[102]12]. COL1A1 and ER status are positively correlated with tumor size [[103]13]. In addition, high expression of COL1A and COL6A3 was significantly associated with breast cancer metastasis and predicted shorter PFS and OS [[104]14]. Our research revealed that individuals displaying elevated expression levels of COL1A1, COL1A2, and COL6A3, particularly those characterized as ER-positive, exhibit heightened resistance to radiotherapy. Specifically, these individuals are more predisposed to experience recurrence and metastasis subsequent to treatment, consistent with earlier research findings. The THBS2 gene encodes thrombospondin-2 (THBS2), which belongs to the thrombospondin (TSP) family of glycoproteins. Previous studies have shown that THBS2 is upregulated during the transition between noninvasive and invasive breast cancer [[105]15]. Hozhabri et al. reported that THBS2 is a key gene involved in brain metastasis and could be regarded as a therapeutic target in breast cancer [[106]16]. In our study, we observed a negative correlation between THBS2 expression and RT sensitivity in breast cancer patients. This finding aligns with earlier research indicating that patients with elevated THBS expression are at greater risk of developing metastasis following radiotherapy than are those with lower THBS expression. In our study, we constructed breast cancer cell lines (MB-231 and BT549) with THBS2 knockdown via siRNA and found that THBS2 knockdown significantly inhibited the proliferation and invasion of breast cancer cell lines. In addition, the expression of several cell cycle markers, such as C-myc, CyclinD1, CyclinE1, and CDK4, was significantly reduced after THBS2 knockdown, suggesting that THBS2 may promote the proliferation of breast cancer cells by regulating the cell cycle. PDGFRB encodes a receptor for platelet-derived growth factors (PDGFs). PDGFR and PDGF act on mesenchymal cells, which are important components of the tumor stroma and can promote tumor development [[107]17]. That is, the expression level of PDGFRB may have an impact on the TME and thus be associated with tumorigenesis and progression. Our findings are similar to the previously mentioned finding that PDGFRB expression can influence the TME, resulting in an increase in the stromal component. Consequently, this alteration leads to a decrease in the sensitivity of breast cancer to radiotherapy. Thies KA et al. reported that PDGFRB/PDGFB can not only impact primary tumor growth directly but also predict brain metastases in patients with breast cancer [[108]18]. Moreover, as found previously, high expression of PDGFRB was related to high histopathological grade, estrogen receptor negativity, and high HER2 expression in breast cancer, which indicated that patients with high PDGFRB expression have poorer prognosis and shorter survival [[109]19]. The PI3K/AKT/mTOR pathway is an important intracellular signaling pathway that plays a major role in diverse cellular functions, such as cell proliferation, survival, translational regulation of protein synthesis, glucose metabolism, apoptosis and angiogenesis. Some studies have reported that mutations or amplifications of genes encoding the catalytic subunits and the regulatory subunit of the PI3K/AKT/mTOR signaling pathway are usually detected in breast cancer, especially PIK3CA, which is common in hormone receptor-positive breast cancers and predicts poor survival [[110]20,[111]21]. In addition, excessive activation of the PI3K/AKT/mTOR signaling pathway induces resistance to endocrine therapy, chemotherapy, immunotherapy and radiotherapy [[112]22]. Our study is consistent with the findings above and may provide more insight from a genetic perspective. Tumor cells can recruit endogenous stromal cells to their vicinity to form the TME. These supportive cells remodel the extracellular matrix, promote cell migration, induce angiogenesis, activate invasion, metastasis and drug resistance, and evade immune surveillance by producing various growth factors, chemokines and cytokines, as well as abnormal signaling pathways, which are predictive of clinical outcome [[113]23]. Kovács et al. reported that lower stromal tumor-infiltrating lymphocytes in the primary tumor resulted in a greater benefit from radiotherapy after breast-conserving surgery, thereby reducing the risk of ipsilateral breast tumor recurrence [[114]24]. Qayyum et al. reported that cancer-associated fibroblasts and the factors they produce can be used to evaluate breast cancer radiotherapy schedules [[115]25]. These studies suggest that the immune and stromal components of the TME influence the treatment effect of adjuvant radiotherapy in breast cancer patients. In conclusion, our study showed that COL1A1, COL1A2, COL6A3, THBS2, and PDGFRB have the potential to predict the sensitivity of breast cancer patients to radiotherapy and thus the prognosis of breast cancer patients, which provides potential new targets for improving radiotherapy sensitivity and patient prognosis. However, in our study, we selected only a limited number of breast cancer samples that were sensitive to radiotherapy (no recurrence after radiotherapy), constructed a network based on the genes of these samples, and then speculated on the possible mechanisms by which they affect breast cancer radiotherapy sensitivity based on their immune score and stromal score, as well as the biological functions and signaling pathways involved. Cellular experiments confirmed that knockdown of THBS2 inhibited the proliferation and invasion of breast cancer cells. In our study, we constructed a network based only on multiple genes and speculated on the possible mechanisms by which they affect breast cancer radiotherapy sensitivity based on their immune score and stromal score, as well as the biological functions and signaling pathways involved. However, the exact mechanism underlying the influence of each gene on breast cancer radiotherapy sensitivity is not known. Moreover, it is important to be careful when interpreting the predictive efficacy of molecules analyzed from small sample sizes. Therefore, future studies involving functional assays, in vivo models, and clinical trials with large sample sizes are necessary to validate our findings and establish the clinical utility of these genes as predictive biomarkers for radiotherapy response in breast cancer patients. 5. Conclusion Breast cancer patients with high expression of COL1A1, COL1A2, COL6A3, THBS2, and PDGFRB may have decreased sensitivity to radiotherapy. KEGG analysis revealed that these five molecules are involved in the PI3K-AKT signaling pathway. According to the WGCNA results, the blue module was also found to be associated with these genes. Therefore, we suggest that these five molecules may contribute to the alteration of the tumor microenvironment in breast cancer by activating the PI3K-AKT signaling pathway, resulting in a greater stromal component and a lower immune component, which could increase resistance to radiotherapy, particularly in ER-positive patients. Ethics approval and consent to participate Not applicable. Consent for publication Not applicable. Funding This work was supported by the Sichuan Science and Technology Program (2022YFS0212), the Chengdu Science and Technology Innovation Research and Development Project (2021-YF05-02143-SN), and the Clinical Research and Transformation Fund of Sichuan Provincial People's Hospital (2021LY25). CRediT authorship contribution statement Xiaoyue Sun: Writing – review & editing, Writing – original draft, Visualization, Validation, Methodology, Formal analysis, Data curation. Chihua Wu: Writing – review & editing, Writing – original draft, Visualization, Methodology, Formal analysis, Data curation. Yao Lin: Writing – review & editing, Writing – original draft, Validation. Shengwei Zhang: Writing – review & editing, Writing – original draft, Visualization, Validation. Xinchen Zhao: Writing – review & editing, Validation. Xiaoshan Wang: Writing – review & editing, Writing – original draft, Supervision, Project administration, Funding acquisition, Conceptualization. Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgments