Abstract
Background
Breast cancer survivors experience more long-term sequelae than the general population, suggesting altered metabolic profiles after breast cancer. We used metabolomics approaches to investigate the metabolic differences between breast cancer patients and women in the general population, aiming to characterize metabolic changes among breast cancer patients and identify potential targets for clinical interventions to mitigate long-term sequelae.
Methods
Serum samples were retrieved from 125 breast cancer cases recruited from the Chicago Multiethnic Epidemiologic Breast Cancer Cohort (ChiMEC), and 125 healthy controls selected from the Chicago Multiethnic Prevention and Surveillance Study (COMPASS). We used liquid chromatography-high resolution mass spectrometry to obtain untargeted metabolic profiles and partial least squares discriminant analysis (PLS-DA) combined with fold change to select metabolic features associated with breast cancer. Pathway analyses were conducted using Mummichog to identify differentially enriched metabolic pathways among cancer patients. We included age, marital status, tobacco smoking, alcohol drinking, type 2 diabetes, and area deprivation index in our model as potential confounders. A random intercept for residence was also included in the model. We further conducted subgroup analyses by treatment timing (chemotherapy/radiotherapy/surgery), lymph node status, and cancer stage.
Results
All study participants were African American. The average age was 57.1 years for cases and 58.0 years for controls. We extracted 15,829 features in total, of which 507 were selected by our criteria. Pathway enrichment analysis of these 507 features identified three differentially enriched metabolic pathways related to prostaglandin, leukotriene, and glycerophospholipid. The three pathways demonstrated inconsistent patterns.
Metabolic features in the prostaglandin and leukotriene pathways exhibited increased abundances among cancer patients. In contrast, metabolic intensity in the glycerophospholipid pathway was deregulated among cancer patients. Subgroup analysis yielded consistent results. However, changes in these pathways were stronger when the analysis was restricted to cases with positive lymph nodes, and attenuated when it was restricted to cases with stage I disease.
Conclusion
Breast cancer in African American women is associated with an increase in serum metabolites involved in the prostaglandin and leukotriene pathways, but with a decrease in serum metabolites in the glycerophospholipid pathway. Positive lymph nodes and advanced cancer stage may strengthen changes in these pathways.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12885-023-10656-1.
Keywords: breast cancer, metabolome, metabolomics, prostaglandin, leukotriene, glycerophospholipid
Introduction
Breast cancer survival has improved since the 1990s due to advances in early detection and cancer therapy [1]. Over 90% of all female breast cancer patients survive at least 5 years after their initial diagnosis [2]. Currently, more than 3.8 million women in the US are estimated to have a history of breast cancer, comprising the largest group of cancer survivors [1]. Meanwhile, breast cancer patients generally report poorer health compared to the general population even after successful treatment [3, 4]. Higher risks for cardiovascular disease (CVD) [5–7], pulmonary disease [8], fatigue [9], chronic pain [10], and cognitive decline [11] are documented among breast cancer survivors. Studies have suggested that CVD is the largest single cause of death among breast cancer survivors, exceeding cancer-related causes [12].
The long-term sequelae of breast cancer are usually attributed to the side effects of cancer therapies, because radiotherapy, chemotherapy, and trastuzumab therapy can induce cardiotoxicity and have been linked to higher risks for cardiovascular and pulmonary diseases [13–23]. However, not all sequelae can be explained by cancer therapies. Genetic susceptibility [24, 25] and shared risk factors, including age, obesity, and physical inactivity [26–30], also contribute to the development of long-term sequelae after breast cancer diagnosis. Among all potential risk factors, biological changes induced by breast cancer among survivors should not be overlooked. Given that these long-term sequelae appear to manifest nearly 5 years after initial diagnosis of breast cancer [5, 7], there is a potential window for interventions to mitigate disease burden. Meanwhile, minority groups such as African Americans face a disproportionately large burden of both breast cancer and chronic diseases [31]. Therefore, understanding the etiology of these sequelae among breast cancer patients will be crucial for the development of such medical interventions, especially among minority groups. As disturbances in metabolic activities underlie most diseases, the study of the metabolome can provide important insight into the etiology of breast cancer sequelae as well as offer the potential to identify metabolic pathways for clinical interventions [32]. Recent metabolomics technologies make it possible to measure thousands of metabolites in biological samples, assisting researchers in the investigation of metabolic changes. Metabolomics has demonstrated an emerging and promising role in the diagnosis and prognosis of chronic diseases and clinical interventions [33].
Within this context, we aim to investigate the metabolic profile of breast cancer patients soon after diagnosis and identify potential biological pathways using state-of-the-art metabolomics technologies in a case-control study of 125 breast cancer cases and 125 controls frequency-matched by age. All the participants were African American women residing in Chicago, offering the opportunity to address the larger burden of both breast cancer and chronic disease in this group [31].
Method
Study population
The Chicago Multiethnic Epidemiologic Breast Cancer Cohort (ChiMEC) was initiated as a hospital-based case-control study to facilitate research on the effects of high-penetrance susceptibility genes, common genetic variants, and environmental risk factors for breast cancer [34–37]. Breast cancer cases were followed for survival, disease recurrence, and other outcomes to form the ChiMEC cohort, with a current sample size of 5097 patients [38]. Patients diagnosed or treated at the University of Chicago Hospitals were ascertained through the cancer risk clinic and breast center. Clinical, pathological, and treatment data were collected via electronic medical records. Epidemiological risk factor data were collected via a questionnaire. A biobank was established, collecting blood and tumor samples. For this analysis, we randomly selected female African American patients who were ≥ 18 years of age at diagnosis, lived in Chicago, enrolled in the ChiMEC study between 2012 and 2018, had histologically diagnosed non-metastatic invasive breast cancer, and had serum samples available. Patients with stage IV (distant metastatic) disease were excluded because they might have tumor cells in circulation and experience dramatic metabolic changes.
The healthy controls were selected from the Chicago Multiethnic Prevention and Surveillance Study (COMPASS), a large-scale, longitudinal cohort study with a current sample size of 7728 participants from 72 of the 77 Chicago community areas [39]. Residents of the greater Chicago area were eligible for COMPASS if they were: (1) 18 or older at the time of enrollment; (2) able to give consent and provide survey data in English or Spanish; (3) willing to provide blood, urine, and saliva samples and access to medical records. Recruitment strategies to increase minority enrollment have included a predominantly minority interviewer team and a focus on recruitment in census tracts with minority and diverse populations as the primary sampling unit. Enrollment entails the completion of a 1-hour survey, consent for access to past and future medical records from all sources, the collection of clinical and physical measurement data, and the on-site collection of biological samples including blood, urine, and saliva. On collection, all biological samples are processed and aliquoted within 24 h before long-term storage and subsequent analysis. Participants completed an extensive survey providing information on medical history, socioeconomic status, psychosocial variables, lifestyle behaviors, social environment, immune status, use of medical services, and medication use, among other covariates. Blood collection occurred at the same time as consent and the in-person interview. Informed consent was obtained from all ChiMEC and COMPASS participants. For the current analysis, 125 African American participants were randomly selected from each of ChiMEC and COMPASS, frequency-matched by age. In addition to data collection through questionnaire interview and electronic medical records, we also conducted geocoding analysis based on residence addresses of participants in both studies to collect neighborhood-level characteristics, and we calculated the area deprivation index (ADI) [40].
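As an illustration of frequency matching by age, the selection of controls can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure actually used in this study: the 5-year age bands, the function names `age_band` and `frequency_match`, the random seed, and the toy ages are all hypothetical. The idea is to count cases per age band and then randomly draw the same number of controls from each band of the eligible pool.

```python
import random
from collections import Counter, defaultdict


def age_band(age, width=5):
    """Return the lower bound of the age band (band width is an assumption)."""
    return (age // width) * width


def frequency_match(case_ages, control_pool, width=5, seed=0):
    """Randomly draw controls so their age-band counts mirror those of the cases.

    case_ages:    list of ages of the selected cases
    control_pool: list of (participant_id, age) tuples for eligible controls
    Returns a list of selected (participant_id, age) tuples.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Number of controls required in each age band equals the number of cases.
    needed = Counter(age_band(a, width) for a in case_ages)

    # Group the eligible controls by age band.
    pool_by_band = defaultdict(list)
    for pid, age in control_pool:
        pool_by_band[age_band(age, width)].append((pid, age))

    selected = []
    for band, n in sorted(needed.items()):
        candidates = pool_by_band.get(band, [])
        if len(candidates) < n:
            raise ValueError(
                f"only {len(candidates)} controls in band {band}, need {n}"
            )
        # Sample without replacement within the band.
        selected.extend(rng.sample(candidates, n))
    return selected


# Toy data (hypothetical ages and IDs, for illustration only).
cases = [42, 44, 57, 58, 59, 63]
pool = [(f"C{i}", age) for i, age in enumerate(range(30, 80))]
controls = frequency_match(cases, pool)
```

Unlike individual (pair) matching, this approach only guarantees that the age distribution of controls matches that of cases at the band level, which is why age can still be included as a covariate in downstream models.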
High-resolution metabolomics
Protein precipitation was performed on each serum sample using the Ostro 96-well plate, following the manufacturer's protocol. In brief, 100 µL of serum was placed in the well with 100 µL of the surrogate standard loratadine. Then 200 µL of cold solvent (acetonitrile:formic acid 99:1) was added, followed by gentle mixing before filtration with the manifold processor. The procedure was repeated with 400 µL of cold solvent (acetonitrile/water/formic acid 3:1:1%). The resulting solution was dried under nitrogen flow and reconstituted in 200 µL of solvent (acetonitrile/water 1:1) with spiked-in internal standards (celecoxib and 4-aminobiphenyl). To test the serum extraction procedure, a quality control sample was prepared by mixing equal volumes of all serum samples. The same extraction protocol was performed on these quality control samples, which were randomly placed in the well plate. The quality control extracts, together with the extraction blanks, were injected during the sample analysis. Liquid chromatography mass spectrometry (LC-MS) analysis of the metabolite extracts was performed using an Agilent 6545 Q-TOF and 1290 UPLC system controlled by the Agilent MassHunter acquisition software. The mass spectrometer was operated in 2 GHz extended dynamic range mode employing precursor ion analysis for relative quantification experiments in positive/negative ion modes. Internal references were