Abstract

Background

Sarcopenia, an age-related syndrome characterized by a decline in muscle mass, not only affects patients’ quality of life but may also increase the risk of breast cancer recurrence and reduce survival rates. Therefore, investigating the genetic mechanisms shared between breast cancer and sarcopenia is significant for the prevention, diagnosis, and treatment of breast cancer.

Methods

Gene expression datasets and clinical data related to breast cancer and skeletal muscle aging were downloaded from the GEO database. Data preprocessing, integration, differential gene identification, functional enrichment analysis, and construction of protein-protein interaction (PPI) networks were performed in R. Subsequently, Cox proportional hazards model analysis and survival analysis were conducted, and survival curves and nomograms were generated. Gene expression levels in tissues were measured using qRT-PCR, and the RadiAnt DICOM Viewer software was used to delineate the pectoralis major muscle area in CT images.

Results

We identified 152 differentially expressed genes (P < .05) and 226 sarcopenia-related genes (r > .4) associated with skeletal muscle aging. The TCGA-BRCA dataset revealed 106 genes associated with breast cancer (P < .05, |logFC| > 1). Functional enrichment analysis indicated significant enrichment in cell proliferation and growth pathways. The PPI network identified critical molecules involved in muscle aging and tumor progression. After dimensionality reduction, a strong correlation was observed between the expression of the muscle aging-related gene set and the prognosis of breast cancer patients (P < .01). SLC38A1 expression, identified through multivariate Cox analysis, was significantly associated with poor prognosis in breast cancer patients (P = .03). The prognostic model incorporating SLC38A1 expression accurately predicted breast cancer survival (P < .01).
External validation confirmed the higher expression of the SLC38A1 gene in breast cancer tissues compared to adjacent non-cancerous tissues (P < .01). The SLC38A1 index, calculated in combination with the patient’s age and BMI, can optimize the prognostic prediction model, providing a powerful tool for personalized treatment of breast cancer.

Conclusion

High SLC38A1 gene expression was significantly associated with poor prognosis in breast cancer patients. The combination of SLC38A1 expression and the pectoralis major muscle area provided an optimized prognostic prediction model, offering a potential tool for personalized breast cancer treatment.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12885-024-13326-y.

Keywords: Breast cancer, Sarcopenia, Prognostic prediction model, SLC38A1, Multivariate analysis

Introduction

In breast cancer development, fat tissue and lean tissue play different roles [1, 2]. Research suggests that obesity and high body fat may increase the risk of breast cancer, while lean tissue, especially skeletal muscle, may be somewhat protective against breast cancer [3, 4]. In sarcopenia, muscle mass, strength, and physical function decline as people age [5]. With age, the number and size of skeletal muscle fibers in the human body gradually decrease, leading to a decline in muscle strength and function [6]. Sarcopenia is often accompanied by a chronic inflammatory state that may promote tumor growth and metastasis, thereby affecting the survival time of breast cancer patients [7, 8]. Zhang’s study showed that people with sarcopenia younger than 55 years old had a lower risk of death than those with sarcopenia older than 55 years old, and both had an increased risk of death compared with non-sarcopenia patients [5].
A retrospective study using psoas density at the L3 plane to assess breast cancer prognosis suggested that the median OS was significantly shorter in the group with low psoas density than in the group with high psoas density [9]. In addition, sarcopenia may also lead to immune dysfunction in patients, reducing the body’s ability to clear tumor cells and further increasing the risk of breast cancer recurrence [10]. On the other hand, breast cancer itself and its treatment may also have adverse effects on skeletal muscle [11]. Patients with breast cancer often experience changes in body composition, including loss of skeletal muscle mass. Chemotherapy, radiotherapy, and surgery may lead to skeletal muscle damage and accelerated aging, further affecting the quality of life and prognosis of patients [12, 13]. However, the mechanisms that breast cancer and sarcopenia share at the genetic level are unclear.

In breast cancer research, screening differential genes from public data has become an essential means of revealing the mechanisms of tumor development. Qiu et al. used 33 pyroptosis-related genes to build a risk model and analyzed the correlation between the pyroptosis risk score, immune checkpoint-related gene expression, and anticancer drug sensitivity [14]. Based on 15 immune genes, Chen et al. developed a breast cancer prognostic risk-scoring model [15]. The screening of differential genes provides the basis for follow-up research, but the dimensionality of large-scale gene expression data must be reduced effectively before further analysis. Non-negative matrix factorization (NMF), as a data dimensionality reduction method, has been widely used in bioinformatics [16, 17]. Researchers have also begun combining genetic information with clinical imaging features to predict breast cancer prognosis more fully. Bismeijer et al.
found a relationship between breast cancer MRI phenotype and its underlying molecular biology from gene expression data, suggesting that the enhancement and clarity of tumor margins are associated with ribosomes [18]. The results of the study by Christina et al. showed that combining imaging and genetic hypoxia biomarkers to classify hypoxia could improve the prediction of the effect of radiotherapy and chemotherapy in cervical cancer [19]. The application of this comprehensive approach provides a new perspective for the accurate treatment and prognosis assessment of breast cancer. In conclusion, the combination of genetic information and clinical imaging features can be used to evaluate breast cancer’s biological characteristics and clinical phenotypes comprehensively.

First, we downloaded skeletal muscle age-related datasets from public databases and performed data integration and differential gene screening. We further conducted dimensionality reduction and survival analysis of breast cancer data. Using Cox proportional hazards models, we analyzed the relationship between breast cancer prognosis and muscle aging genes. We constructed a prognostic risk prediction model based on these findings and verified its effectiveness and practicability. Finally, we conducted external data verification of the key gene SLC38A1 and explored the application of SLC38A1 gene expression and imaging data in breast cancer prognosis assessment. This study provides an important basis for understanding the role of SLC38A1 in the occurrence and development of breast cancer, and it provides a powerful tool for developing personalized treatment and prognosis assessment. Figure 1 shows the study’s flow chart.

Fig. 1. Our research workflow began with extracting gene expression profiles and clinical datasets linked to breast cancer and sarcopenia from the GEO repository.
Subsequent steps included data preprocessing using R, differential gene detection with the limma package, and functional enrichment analysis via clusterProfiler. We mapped out protein interactions using STRING and assessed patient survival rates with survminer, applying the Cox proportional hazards model. The prognostic risk prediction model was constructed using the rms package, while subgroup analysis was conducted with the NMF package. Tumor characteristics and immune cell infiltration were evaluated with the ESTIMATE and IOBR tools. Gene expression was validated through qRT-PCR, and the RadiAnt DICOM Viewer software was used to outline the pectoralis major muscle area. GraphPad handled statistical analysis, and the research was conducted with ethical approval.

Methods

Data download and preprocessing

Gene expression datasets and clinical data related to breast cancer and skeletal muscle aging were downloaded from the GEO database. GSE25941, GSE28392, and GSE28422 were used to identify differential genes through data integration. The TCGA-BRCA dataset was used to identify muscle aging genes and their association with prognosis. External validation was conducted using the skeletal muscle aging-related gene sets GSE117525, GSE47881, GSE4667, GSE674, GSE362, GSE167186, GSE38718, GSE5086, GSE8479, and GSE9103. Additionally, the breast cancer datasets GSE42568, GSE5364, GSE20711, GSE21653, GSE25066, GSE88770, and GSE97342 were also used for external validation. To mitigate batch effects and technical variations, we used the sva package in R. Principal Component Analysis (PCA) was performed using the prcomp function in R to show the main variations among samples through dimensionality reduction. Certain gene sets were corrected by data normalization so that expression levels were consistently represented in the TPKM format.
Differential gene analysis was conducted using the limma package in R. From the GeneCards website, 226 sarcopenia-related genes were identified with a correlation coefficient of at least 0.40.

Data set analysis

GO and KEGG enrichment analyses were conducted using clusterProfiler in R. The PPI network was constructed using the STRING website. Survival curves were plotted following the Cox proportional hazards model analysis and survival analysis. The rms package was employed to construct nomograms, while the cor function in R was used for correlation analysis. Non-negative matrix factorization was carried out using the NMF package to explore potential subgroups and survival differences. The R package ESTIMATE calculated stromal, immune, and ESTIMATE scores for each patient based on gene expression. The R package IOBR reassessed the infiltration scores of immune cells for each patient in every tumor based on gene expression. The ggplot2 package was used to enhance graphics.

Clinical data acquisition and processing

Tissue samples from breast cancer patients were collected at the First Affiliated Hospital of Guangxi Medical University in 2022. Inclusion criteria: pathologically confirmed primary breast cancer; survival ≥ 3 months; patients had their hands raised above their heads during CT scanning. Exclusion criteria: previous treatment for breast cancer (such as surgery, radiotherapy, chemotherapy, or endocrine therapy); the presence of other malignant tumors or severe comorbidities; inability to cooperate with the examinations, treatments, and follow-ups required for the study. Finally, 34 patients met the requirements for subsequent research. Clinical information collected from patients included age, height, weight, platelet count, neutrophil count, lymphocyte count, and monocyte count.
In the analysis of CT images of breast cancer patients, the T4 thoracic vertebral level was selected based on published studies [20, 21] to measure the pectoralis major muscle area (Figure S1A). During follow-up, patient outcomes were recorded as breast cancer recurrence, lymph node metastasis, re-admission, or death. Following previous studies, the inflammatory indices were defined as NLR = neutrophils/lymphocytes; MLR = monocytes/lymphocytes; SII = neutrophils × platelets/lymphocytes; SIRI = neutrophils × monocytes/lymphocytes; PIV = neutrophils × monocytes × platelets/lymphocytes [22]. To further optimize the prognostic prediction model, we calculated three different SLC38A1 indices, namely the ratio of SLC38A1 gene CT value to age (SLC38A1 index 1), the ratio of SLC38A1 gene CT value to BMI (SLC38A1 index 2), and the product of SLC38A1 gene CT value and pectoralis major muscle index (SLC38A1 index 3). This research protocol underwent rigorous review and approval by the hospital’s ethics committee, ensuring ethical compliance and the protection of patients’ rights and interests throughout the research process.

qRT-PCR

The expression of the SLC38A1 gene was compared between breast cancer tissue and adjacent normal tissue, with GAPDH used as the internal reference gene for standardization. qRT-PCR was performed as previously described. The primer sequences were as follows: SLC38A1-F: 5′-GCATTTGTTTGCCACCCGTC-3′; SLC38A1-R: 5′-GTCGGACTGCACGTTGTCAT-3′; GAPDH-F: 5′-ATGGGGAAGGTGAAGGTCG-3′; GAPDH-R: 5′-CTCCACGACGTACTCAGCG-3′.

Statistical analysis

Data were summarized using descriptive statistics, including mean, standard deviation, median, and range. We employed independent sample t-tests or Mann-Whitney U tests to compare differences between groups, and the results of correlation analysis were described using the Pearson correlation coefficient. The qRT-PCR results were analyzed using GraphPad software.
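As an illustration only, the inflammatory ratios and the three SLC38A1 indices defined in the clinical data section can be written out as a short Python sketch. This is not the authors' code (the study reports using R and GraphPad). The names `BloodCounts`, `inflammatory_indices`, `body_mass_index`, and `slc38a1_indices` are hypothetical. The BMI formula (weight in kg divided by height in metres squared) is the standard definition rather than something stated in the text, and the pectoralis major muscle index is assumed to be supplied already computed, since its derivation is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BloodCounts:
    """Peripheral blood counts for one patient (hypothetical container).

    All four counts are assumed to be expressed in consistent units.
    """
    neutrophils: float
    lymphocytes: float
    monocytes: float
    platelets: float


def inflammatory_indices(c: BloodCounts) -> dict:
    """Compute NLR, MLR, SII, SIRI, and PIV exactly as defined in the text."""
    if c.lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return {
        "NLR": c.neutrophils / c.lymphocytes,
        "MLR": c.monocytes / c.lymphocytes,
        "SII": c.neutrophils * c.platelets / c.lymphocytes,
        "SIRI": c.neutrophils * c.monocytes / c.lymphocytes,
        "PIV": c.neutrophils * c.monocytes * c.platelets / c.lymphocytes,
    }


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Standard BMI definition (assumption: not spelled out in the text)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def slc38a1_indices(ct_value: float, age_years: float, bmi: float,
                    pectoralis_index: float) -> dict:
    """Three SLC38A1 indices as described in the text.

    ct_value is the SLC38A1 CT value reported by the authors;
    pectoralis_index is assumed to be precomputed by the caller.
    """
    if age_years <= 0 or bmi <= 0:
        raise ValueError("age and BMI must be positive")
    return {
        "index_1": ct_value / age_years,          # CT value / age
        "index_2": ct_value / bmi,                # CT value / BMI
        "index_3": ct_value * pectoralis_index,   # CT value x muscle index
    }
```

Keeping each index as a plain function of its inputs makes the definitions easy to check by hand before they are passed to any survival model.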
The pectoralis major muscle was delineated in CT images of breast cancer patients using the RadiAnt DICOM Viewer software. We defined |logFC| > 1 as the threshold for differential gene analysis. Statistical significance was defined as a p-value less than 0.05.

Results

Data Integration and Preliminary Analysis

The PCA algorithm was employed to illustrate the effect of data integration. The results suggest that the batch effect was eliminated (Figs. 2A and B). We further used the limma package to identify genes significantly associated with skeletal muscle aging and visualized 152 of these genes in a volcano plot (Fig. 2C). Additionally, 226 genes related to sarcopenia were obtained from the GeneCards website (correlation coefficient > 0.40). After removing duplicates, 374 genes were included in subsequent studies. Among these, 106 genes were differentially expressed in the TCGA-BRCA dataset. GO enrichment analysis revealed a significant enrichment of functional pathways related to cell proliferation and growth (Fig. 2D). Meanwhile, KEGG pathway enrichment analysis further highlighted the importance of membrane protein signaling, kinase signaling, and longevity-associated signaling pathways (Fig. 2E). Furthermore, we constructed a protein-protein interaction network, identifying interacting molecules primarily comprising growth factors, cytokines, and growth regulators (Fig. 2F), which may play crucial roles in regulating muscle aging and tumor progression. To explore the potential role of muscle aging-related gene sets in breast cancer, we used these genes as features for dimensionality reduction of the TCGA breast cancer tumor data. The NMF results indicated an optimal K value of 2 (Fig. 2G), enabling us to segment breast cancer patients into two distinct subgroups (Fig. 2H). Survival analysis revealed a higher survival rate in group 2 compared to group 1 (Fig. 2I, P = .49).
When focusing on tumor patients over 60 years old, we observed a similar trend, with group 2 exhibiting a higher survival rate than group 1 (Fig. 2J, P = .25). These findings suggest that the expression of muscle aging-related gene sets in breast cancer patients may be closely associated with patient prognosis.

Fig. 2. Molecular Mechanisms Associated with Skeletal Muscle Aging and Their Potential Role in Breast Cancer. (A) and (B) Data integration and PCA results. The integrated data exhibit good clustering in the PCA plot, effectively eliminating batch-to-batch variations. (C) Volcano plot of differentially expressed genes. (D) and (E) Results of GO and KEGG enrichment analysis. (F) Protein-protein interaction network. (G) and (H) NMF analysis results indicate an optimal K value of 2, leading to the classification of breast cancer patients into two distinct subgroups. (I) and (J) Survival analysis results.

Genetics of muscle aging associated with breast cancer outcomes

To further investigate the association between breast cancer prognosis and muscle aging-related genes, we employed the Cox proportional hazards model. Through univariate analysis, we identified 12 genes significantly associated with breast cancer patients’ prognosis (Fig. 3A). Subsequently, multivariate Cox analysis identified four independent prognostic genes: CHD9, LDLRAD3, GREB1, and SLC38A1 (Fig. 3B, P < .05). Patients with high expression of SLC38A1 had significantly lower survival rates than those with low expression (Fig. 3C, P = .03), indicating that high SLC38A1 expression may be closely associated with poor prognosis in breast cancer patients. There were no significant differences in survival between high and low expression groups for CHD9, GREB1, and LDLRAD3 (Fig. 3D, E and F; P = .81, P = .07, P = .08).
Given the heterogeneity of breast cancer, we examined SLC38A1’s prognostic value in various breast cancer subtypes. The results showed that patients in the low SLC38A1 expression group exhibited better survival advantages in triple-negative breast cancer (Fig. 3G, P = .64), HER2-positive breast cancer (Fig. 3H, P = .019), Luminal A breast cancer (Fig. 3I, P = .57), and Luminal B breast cancer (Fig. 3J, P = .42).

Fig. 3. Investigating the Relationship between Breast Cancer Prognosis and Muscle Aging-Related Genes through Cox Analysis. (A) Results of univariate analysis. (B) Four independent prognostic genes were confirmed by multivariate Cox analysis: CHD9, LDLRAD3, GREB1, and SLC38A1. (C) Survival analysis results for SLC38A1. (D), (E) and (F) Survival analysis results for CHD9, GREB1, and LDLRAD3, respectively. (G), (H), (I), and (J) Survival analysis results for SLC38A1 in triple-negative breast cancer, HER2-positive breast cancer, Luminal A breast cancer, and Luminal B breast cancer, respectively.

Correlation analysis and construction of a prognostic risk prediction model

We conducted a correlation analysis to further investigate the intrinsic relationships among the muscle aging genes associated with breast cancer prognosis. The results revealed a strong correlation between the expression levels of HSD11B1 and HLA-DQA1 (Fig. 4A, R = .52, P < .001), while the correlations between other genes were relatively weak. Based on these findings, we used seven muscle aging genes associated with breast cancer prognosis to construct a prognostic risk prediction model. Based on the mean risk index, we classified breast cancer patients into high-risk and low-risk categories (Fig. 4B). Heatmaps illustrate how muscle aging genes are expressed differently between high-risk and low-risk individuals (Fig. 4C).
Figure 4D depicts the distribution of patients with varying risk scores, where red dots represent deceased patients and green dots represent surviving patients. Survival analysis revealed that high-risk patients’ survival rate was significantly lower than that of low-risk patients (Fig. 4E, P < .01). The risk model was used to develop a nomogram (Fig. 4F). This nomogram can assist clinicians in rapidly predicting the 3-year, 5-year, and 10-year survival rates of breast cancer patients. These results support the effectiveness and practicality of the prognostic risk prediction model and provide a powerful tool for developing personalized treatment plans and prognostic evaluations.

Fig. 4. Construction of a prognostic risk prediction model based on muscle aging genes associated with breast cancer prognosis. (A) Correlation analysis demonstrates the relationships between muscle aging genes and breast cancer prognosis. (B) Breast cancer patients were stratified into high-risk (red) and low-risk (green) groups using the mean risk index as the cutoff. (C) Heatmap illustrating the differential expression of prognosis-related muscle aging genes between the high- and low-risk groups. (D) Scatter plot showing patient risk distribution and survival status. Each point represents an individual patient, with red dots indicating deceased patients and green dots indicating surviving patients. The x-axis represents the patient’s risk score. (E) Survival analysis of the high- and low-risk groups. (F) Nomogram assisting clinicians in predicting breast cancer patients’ 3-year, 5-year, and 10-year survival rates.

External data validation of SLC38A1 gene

SLC38A1 gene expression decreased significantly in older participants compared to younger participants in age-related datasets (Fig. 5A, P < .05). Considering that breast cancer predominantly occurs in women, we further investigated whether there are gender differences in SLC38A1 gene expression.
Analysis of the GSE117525 and GSE8479 datasets revealed no significant difference in SLC38A1 gene expression between males and females (Fig. 5B, P > .5), suggesting the functional universality of this gene across genders. Next, we focused on the relationship between the SLC38A1 gene and physical health status. The analysis showed that SLC38A1 expression was lower in overweight patients and those with sarcopenia compared to healthy individuals (Fig. 5C, P < .01). Notably, SLC38A1 gene expression exhibited an increasing trend when patients began physical training (Fig. 5D), implying a potential regulatory role of physical training on SLC38A1 expression. The GSE42568 and GSE5364 breast cancer datasets were used to validate the expression of the SLC38A1 gene. Consistently, breast cancer tissues expressed significantly more SLC38A1 than normal tissues (Fig. 5E, P < .01), further supporting the crucial role of SLC38A1 in tumorigenesis and development. Finally, we conducted external survival validation to assess the impact of SLC38A1 gene expression on survival outcomes in breast cancer patients. In the GSE20711 and GSE88770 datasets, patients with low SLC38A1 expression appeared to have a better survival advantage (Fig. 5F, P > .1). However, in the GSE42568 dataset, we observed opposite results (Fig. 5F, P < .01). Additionally, the results suggested a weak negative correlation between the SLC38A1 gene and breast cancer immunity (Figure S1B). These inconsistent findings may reflect the heterogeneity of breast cancer and differences in patient characteristics across datasets.

Fig. 5. The validation results of the SLC38A1 gene across multiple external datasets demonstrate consistent representation of expression levels in the TPKM format. (A) Comparison of SLC38A1 gene expression levels in age-related datasets.
(B) Analysis of SLC38A1 gene expression differences between genders. (C) Relationship between the SLC38A1 gene and physical health status. (D) Impact of physical training on SLC38A1 gene expression. (E) SLC38A1 gene expression in breast cancer datasets. (F) External survival validation of the effect of SLC38A1 gene expression on survival outcomes in breast cancer patients.

SLC38A1 gene prognostic relationship in breast cancer

To understand the role of the SLC38A1 gene in the prognosis of breast cancer patients, we analyzed multiple datasets and validated our results using clinical samples. Cox analysis revealed significant heterogeneity among the GSE20711, GSE21653, GSE25066, GSE88770, and GSE97342 datasets. Specifically, in the GSE97342 dataset, estrogen receptor (ER) and progesterone receptor (PR) positivity were unexpectedly identified as protective factors for breast cancer prognosis, while in the GSE21653 dataset, a higher T-stage was considered a protective factor (Figure S1C). These findings contradicted conventional clinical knowledge. Although the SLC38A1 gene was considered a risk factor in the GSE88770, GSE20711, and GSE25066 datasets, these associations were not statistically significant (Figure S1C, P > .1). To explore the clinical significance of SLC38A1 more directly, we included an independent sample set comprising 34 breast cancer patients. These patients were pathologically diagnosed with non-specific invasive ductal carcinoma grade II/III, with a mean age of 54.59 ± 12.85 years (range 37–81 years). Through qRT-PCR analysis, we again confirmed that SLC38A1 gene expression was significantly higher in tumors than in adjacent tissues (Fig. 6A, P = .001). Univariate Cox analysis indicated that SLC38A1 gene expression had prognostic significance for breast cancer (Fig. 6B, P < .01).
Although the SLC38A1 indices had some reference value in breast cancer prognosis assessment, they did not reach statistical significance (Fig. 6B, P = .093, P = .063, P = .272). We then stratified patients into two groups based on the mean SLC38A1 expression. Survival analysis suggested that differences in SLC38A1 gene expression were helpful in prognosis assessment (Fig. 6D, P < .01). However, the pectoralis major index did not significantly contribute to prognosis (Fig. 6C, P = .5). The SLC38A1 indices showed some promise in prognosis assessment (Fig. 6E, F, and G, P = .29, P = .22, P = .19). These findings suggest that gene expression levels and nutritional indicators may aid in the prognosis assessment of breast cancer patients.

Fig. 6. Comprehensive exploration of the prognostic relationship between the SLC38A1 gene and breast cancer patients. (A) PCR results confirm the high expression pattern of SLC38A1 in breast cancer. (B) Cox analysis results from external datasets. (C) The pectoralis major index did not significantly contribute to prognosis. (D) Survival analysis suggested that differences in SLC38A1 gene expression were helpful in prognosis assessment. (E), (F) and (G) Optimized prognostic prediction model results. This model was obtained by calculating three different SLC38A1 indices and conducting a Cox analysis.

Discussion

Breast cancer prognosis is not only related to the characteristics of the tumor itself but also closely associated with the overall physiological status of the patient. Among various physiological indicators, muscle mass has received widespread attention in recent years as an essential factor. In this study, we conducted an integrated analysis of multiple datasets to examine the potential role of skeletal muscle aging-related molecules in breast cancer. Based on this, we further explored the application value of these molecules in the prognostic assessment of breast cancer.
In addition, we paid particular attention to the importance and clinical application value of the SLC38A1 gene in breast cancer prognosis evaluation. Using the Cox proportional hazards model, we identified four independent prognostic genes: CHD9, LDLRAD3, GREB1, and SLC38A1, among which the high expression of SLC38A1 is closely associated with poor prognosis in breast cancer patients. Furthermore, we constructed a prognostic risk prediction model that effectively predicts the survival rate of breast cancer patients, providing a powerful tool for personalized treatment. Subsequently, we examined the relationship between SLC38A1 gene expression and age, gender, physical health status, and breast cancer. To further optimize the prognostic prediction model, we calculated three different SLC38A1 indices: the ratio of SLC38A1 gene CT value to age (SLC38A1 Index 1), the ratio of SLC38A1 gene CT value to BMI (SLC38A1 Index 2), and the product of SLC38A1 gene CT value and pectoralis major muscle index (SLC38A1 Index 3). Overall, this study aims to explore the application value of muscle aging genes in breast cancer prognosis assessment and to develop more accurate prognostic prediction models that provide personalized treatment guidance and improve patients’ quality of life.

Enrichment analysis of genes significantly associated with skeletal muscle aging revealed that they are primarily involved in cell proliferation and growth, membrane protein signaling, kinase signaling, and longevity-related signaling pathways. This result suggests that these pathways may play crucial roles in muscle aging. Analysis of protein-protein interaction networks uncovered the central roles of molecules such as growth factors, cytokines, and growth regulatory factors in regulating muscle aging and tumor progression. These molecules have far-reaching effects on cell signaling, proliferation regulation, and microenvironment shaping.
Growth factors and cytokines are particularly affected by muscle aging [23, 24], which affects the muscle’s regenerative capacity and metabolic status [25]. Notably, their role in breast cancer has been confirmed by multiple studies. EGF binds to EGFR on the cell surface, triggering the classical EGF/EGFR signaling pathway, leading to upregulation of ERK1/2 and AKT phosphorylation levels and promoting tumor cell proliferation and survival [26, 27]. Tumor necrosis factor mediates muscle atrophy in cancer cachexia or inflammation [28]. The IL6/STAT3 pathway has been identified as a possible contributor to obesity-driven ER+ breast cancer metastasis [29]. Abnormal activation of the IGF signaling pathway is closely associated with poor prognosis in breast cancer [30, 31]. The combined effect of these factors forms a complex regulatory network that drives the progression of both breast cancer and sarcopenia.

The results of our survival analysis show differences in survival among groups of breast cancer patients with different molecular characteristics of muscle aging. There is mounting evidence that sarcopenia reduces patients’ quality of life and substantially increases the risk of death in breast cancer patients [12, 32]. In addition, previous studies found that patients with sarcopenia were less tolerant of chemotherapy and were more likely to experience dose reductions or treatment interruptions, which affected the effectiveness of chemotherapy and patient outcomes [33, 34]. These studies support our finding that skeletal muscle age-related molecules have important applications in evaluating breast cancer prognosis.

Our study reveals the potential role of muscle aging-related genes in breast cancer prognosis and specifically highlights the SLC38A1 gene as an important prognostic marker.
SLC38A1 is a crucial transporter belonging to the SLC38 family. It performs a variety of key functions in the human body, particularly in amino acid transport and metabolism. The expression of SLC38A1 is abnormal in many types of cancer, including breast, lung, and colon cancers [35–37]. In breast cancer, SLC38A1 expression was significantly higher in tumors than in normal tissues, consistent with previous reports [38]. Furthermore, research indicates that SLC38A1 may regulate the polarization of tumor-associated macrophages [38]. Our results suggest a negative correlation between SLC38A1 expression and immune score in breast cancer. It is worth noting that some analyses in this study yielded interesting but inconsistent results. In exploring the relationship of the SLC38A1 gene to the prognosis of breast cancer patients, we found strong heterogeneity among different data sets. These results may be related to the heterogeneity of breast cancer, differences in patient characteristics between data sets, and limitations of statistical methods. In addition to cancer, SLC38A1 may also be involved in the occurrence and development of other diseases. Some studies have found that SLC38A1 also plays a vital role in nervous and metabolic diseases [39]. This study found that SLC38A1 expression in skeletal muscle may decline with age, and that SLC38A1 expression is lower in overweight patients and patients with sarcopenia than in healthy people. Notably, when patients began to receive physical training, the expression of the SLC38A1 gene in skeletal muscle showed an upward trend. These results suggest that physical training may have a regulatory effect on the expression of SLC38A1 in muscles.
These results indicate that SLC38A1 expression is heterogeneous across tissues and that high SLC38A1 expression positively affects the development of both skeletal muscle cells and breast cancer cells. However, the specific mechanism of action of SLC38A1 in these diseases is not yet fully understood, and further studies are needed to clarify it.
The area of the pectoralis major muscle at different thoracic levels is often used as an indicator of muscle mass [21, 40]. This measurement provides an effective means of quantifying muscle mass and underscores its importance in the assessment of breast cancer prognosis. In our results, SLC38A1 expression in tumors was identified as a risk factor for breast cancer recurrence, whereas the pectoralis major index showed no significant effect. We then combined the genetic information and imaging data to assess the risk of breast cancer recurrence. Although the resulting hazard ratio (HR) showed a clear advantage, its statistical significance was not strong; the combined assessment nevertheless provided valuable information for prognostic prediction.
This study has several limitations. Although the prognostic prediction model constructed here achieved some predictive value, it needs to be further verified and optimized in a larger number of patients. In addition, the choice of pectoralis major muscle area at the T4 level as a reference for patient muscle mass warrants further evaluation. In future studies, we will continue to explore the relationship between SLC38A1 expression in skeletal muscle, muscle mass, and the prognosis of breast cancer patients, further expand the sample size, and include more prognostic factors to improve the prognostic value of the SLC38A1 index.
In conclusion, this study provides a new perspective and biomarker for the prognosis assessment of breast cancer.
Future studies should further explore the mechanism of muscle mass loss and its role in the occurrence and development of breast cancer, to provide new ideas and methods for prognosis prediction and treatment of breast cancer.
Electronic supplementary material
Supplementary Material 1 (302.8 KB, PDF)
Author contributions
Y.W. and H.Y. are responsible for study conceptualization. Y.W. generated most of the data, assisted by P.Z. CJ.W. checked the statistical method. WJ.H., P.Z., and Y.W. prepared the figures. Y.W. wrote the manuscript. H.Y. provided resources, supervision, and editing. All authors contributed to the article and approved the submitted version.
Funding
This study was funded by the Youth Science Foundation of Guangxi Medical University (GXMUYSF202327), the National Natural Science Foundation of China (82160336), and the Natural Science Foundation of Guangxi (2023GXNSFDA026013, 2020GXNSFDA238005).
Data availability
Data sources and handling of the publicly available datasets used in this study are described in Sect. 2. The raw data that support the findings of this study are available from the corresponding author upon request.
Declarations
Ethics approval and consent to participate
The study was conducted in accordance with the principles of the Declaration of Helsinki and approved by the Medical Ethics Committee of the First Affiliated Hospital of Guangxi Medical University, with approval number 2021 (KY-E-034). All patients were fully informed about the study and provided written consent before participating.
Conflict of interest
The authors declare no conflicts of interest.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References