Abstract
Background
Graft-versus-host disease (GVHD) and relapse are major complications following allogeneic hematopoietic stem cell transplantation (allo-HSCT). Metabolites play crucial roles in immune regulation, but their causal relationships with GVHD and relapse remain unclear.
Methods
We used genetic variants from genome-wide association studies (GWAS) of 309 known metabolites as instrumental variables to evaluate their causal effects on acute GVHD (aGVHD), gut GVHD, chronic GVHD (cGVHD), and relapse in different populations. Multiple causal inference methods, heterogeneity assessments, and pleiotropy tests were conducted to assess the robustness of the results. Multivariable Mendelian randomization (MR) analysis was performed to adjust for potential confounders, and validation MR analysis further confirmed key findings. Mediation MR analysis was employed to explore indirect causal pathways.
Results
After correction for multiple testing, we identified elevated pyridoxate and proline levels as protective factors against grade 3–4 aGVHD (aGVHD[3]) and relapse, respectively. Conversely, glycochenodeoxycholate increased the risk of aGVHD[3], whereas 1-stearoylglycerophosphoethanolamine had a protective effect. The robustness and stability of these findings were confirmed by multiple causal inference approaches and by heterogeneity and horizontal pleiotropy analyses. Multivariable MR analysis further excluded potential confounding pleiotropic effects. Validation MR analyses supported the causal roles of pyridoxate and 1-stearoylglycerophosphoethanolamine, while mediation MR revealed that pyridoxate influences GVHD directly and indirectly via CD39^+ Tregs. Pathway analyses highlighted critical biochemical alterations, including disruptions in bile acid metabolism and the regulatory roles of vitamin B6 derivatives. Finally, clinical metabolic analyses, including direct fecal metabolite measurements, confirmed the protective role of pyridoxate against aGVHD.
Conclusions
Our findings provide novel insights into the metabolic mechanisms underlying GVHD and relapse after allo-HSCT. Identified metabolites, particularly pyridoxate, may serve as potential therapeutic targets for GVHD prevention and management.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12916-025-04026-w.
Keywords: Allogeneic hematopoietic cell transplantation, Graft-versus-host disease, Metabolite, Immune cell, Mediation, Mendelian randomization
Key points
1. Strong evidence supports a protective effect of pyridoxate against aGVHD, and this effect was validated in patients.
2. Mediation MR analysis suggests that pyridoxate may reduce aGVHD risk by increasing CD39^+ Treg levels.
Background
Allogeneic hematopoietic stem cell transplantation (allo-HSCT) is a critical treatment modality for various hematologic malignancies. Addressing both post-transplant relapse and graft-versus-host disease (GVHD) is essential to improving the prognosis of transplant recipients [1]. Recent research suggests that allogeneic immune responses are influenced by the host microbiota [2, 3] and metabolome [4]. Blood metabolites can enter the circulation and have a direct impact on the host’s metabolic landscape [5–7]. Distinct alterations in blood metabolites have been identified among post-transplant patients who develop GVHD. In acute GVHD (aGVHD), these alterations involve reduced tryptophan and arginine pathway components, along with diminished levels of lipids such as lysoplasmalogens, plasmalogens, and phospholipids. Concurrently, various complex lipid products are elevated, including medium- and long-chain fatty acids and polyunsaturated fatty acids [6].
In chronic GVHD (cGVHD), significant metabolic alterations related to lipid, fatty acid, and bile acid metabolism were observed 1 year after transplantation [8]. Moreover, changes in short-chain fatty acids (SCFAs), namely propionate and butyrate, manifested 100 days post-transplant [9]. In recent years, several studies using clinical cohorts have explored the relationship between metabolic pathways and the development of aGVHD [10–13]. For instance, Tyszka et al. identified a potential link between lipid metabolism, including bile acid transformation and cholesterol synthesis, and the risk of aGVHD. While these observational studies provide valuable insights, they are inherently limited in their ability to infer causal relationships because of confounding factors and the relatively small sample sizes of clinical cohorts. In addition, experimental evidence highlights the pivotal role of certain metabolites in influencing the progression of GVHD [4, 14, 15]. Recent research has investigated the role of microbiome-derived metabolites in modulating clinical outcomes in patients undergoing allo-HSCT, revealing how specific bacterial families and their associated bacteriophages contribute to protective immunomodulatory metabolites that influence survival and relapse rates. To date, no large-scale metabolomic risk factor screening studies for GVHD have been reported, and a significant knowledge gap remains regarding the causal relationship between metabolites and post-transplant outcomes. Furthermore, traditional studies exploring risk factors for allo-HSCT outcomes are often confounded by variables such as donor types and preventive strategies, making it challenging to identify potential biomarkers accurately [16].
Mendelian randomization (MR) has emerged as a powerful tool for exploring causal relationships between exposures and diseases [17–19], such as osteoarthritis [20] and psychiatric disorders [21, 22]. Unlike observational studies, MR can provide less biased estimates because genotypes are determined at conception and are generally unaffected by confounders. This approach also enables exposures and outcomes from different populations to be linked, for example connecting exposures in healthy populations to secondary outcomes such as postpartum depression [23] and post-traumatic stress disorder [24]. Recent advances in GWAS have extended to the metabolome, yielding a comprehensive map of genetically determined metabotypes (GDMs). These GDMs allow evaluation of the causal effects of genetically determined, untargeted metabolites on complex traits [21, 25, 26]. To address the limitations of observational studies, our study adopts an MR approach, leveraging large-scale GWAS datasets to systematically assess the causal effects of hundreds of metabolites on GVHD and relapse outcomes. This method reduces the influence of confounding factors and supports causal inference, providing a more robust and comprehensive understanding of the metabolic pathways underlying GVHD.
Using the GDMs and the latest GWAS findings on GVHD and relapse after allo-HSCT, we conducted a two-sample MR analysis, in which genetic associations with the exposure and the outcome are measured in different cohorts, to achieve the following objectives: (1) assess the causal effects of human blood metabolites on major allo-HSCT outcomes, including aGVHD, cGVHD, and relapse; (2) identify common metabolites that have causal effects across multiple allo-HSCT outcomes; (3) identify metabolic pathways that might contribute to the development of GVHD and relapse; (4) estimate both the direct and indirect effects of metabolites on outcomes through immune traits by employing a two-step MR approach. Additionally, fecal samples were included in the analysis to assess, from another perspective, the role of gut metabolites in modulating immune responses in the context of GVHD. Previous studies have suggested a link between blood metabolites and fecal-derived metabolites [27], and this complementary approach provides insight into how metabolites relate to systemic immune regulation. The workflow is presented in Fig. 1.
Fig. 1. Workflow of our research
Methods
Genome-wide association study (GWAS) data sources
Metabolite profiles
We obtained summary association data from the most extensive genetic study of human metabolism, which is publicly accessible via the Metabolomics GWAS Server (http://metabolomics.helmholtz-muenchen.de/gwas/) [28]. This study of 7824 adults from two European population studies identified 145 significant genetic loci associated with human metabolism. Metabolite levels were measured by liquid chromatography (LC–MS) and gas chromatography (GC–MS) separation coupled with tandem mass spectrometry and expressed as −log10 (relative abundance) to ensure standardized, normalized data across samples.
We focused on the 309 known metabolites, defined as those with identified chemical structures, categorized into 60 subclasses and 8 broad classes based on KEGG pathways (e.g., amino acids, carbohydrates, cofactors and vitamins, energy, lipids, nucleotides, peptides, xenobiotic metabolism). The remaining 196 metabolites (37%) were classified as “unknown” because their chemical identities were undetermined at the time of analysis [29]. We extracted the effect size, standard error, P value, and sample size for each SNP from the full GWAS summary statistics. The GWAS summary statistics for validation MR analysis were selected for specific metabolites and included 8089 individuals for pyridoxine levels, glycochenodeoxycholate, and proline [30], and 4959 individuals for 1-stearoylglycerophosphoethanolamine [31].
GVHD and relapse risk
In this study, we used GWAS data from previous research [32] to identify genetic factors influencing aGVHD, cGVHD, gastrointestinal GVHD (gut GVHD), and relapse. aGVHD was classified by peak severity into three categories: grade 2 to 4 (aGVHD[2]), grade 2b to 4 (aGVHD[2b]), and grade 3 to 4 (aGVHD[3]). Additional outcomes were stage 2 to 4 gut GVHD, cGVHD, and recurrent or progressive malignancy, referred to as relapse. The aGVHD[2b] category, as defined in the original study, specifically excluded isolated stage 1 gut GVHD to enhance specificity and focus on cases with more significant systemic or organ involvement, addressing a condition frequently observed at the research center. We considered only the genotype of the recipient in subsequent analyses, as genetically determined risk is carried from birth. Donor types include (1) all kinds of donors (ALL); (2) matched siblings (FS); and (3) matched unrelated donors (URD). Detailed definitions of allo-HSCT outcomes are presented in Additional file 1: Supplemental Methods and elsewhere [32].
Detailed information on the GWAS summary statistics is provided in Additional file 2: Table S1.
Untargeted metabolite profiling of fecal samples
Fecal samples collected at the Department of Hematology of the First Affiliated Hospital of Soochow University were selected for LC–MS analysis. This study, titled “Alterations in Intestinal Microbiota, Metabolites, and Immune Cells in Allo-HSCT,” is an ongoing cohort study (2020–present) that includes 38 patients diagnosed with aGVHD and 52 patients without aGVHD (NCT06143501). Samples were collected at multiple time points: before transplantation (Pre), 1–2 weeks post-transplantation (P1), 2–4 weeks post-transplantation (P2), and 4–6 weeks post-transplantation (P3). The Ethics Committee of the First Affiliated Hospital of Soochow University approved sample collection (Approval No. 2023369). Extracted metabolites were analyzed using a Dionex U3000 UHPLC system coupled with a QE Plus high-resolution mass spectrometer (Thermo Fisher). Quality control samples were used to evaluate the pre-treatment, sample loading, and stability of the mass spectrometry system.
Mixed lymphocyte reaction
To induce T cell activation, a mixed lymphocyte reaction (MLR) was performed using BALB/c (H2-Kd) and C57BL/6 (H2-Kb) mouse strains. Spleens from BALB/c mice were harvested and treated with 10 µg/ml mitomycin C for 30 min at 37 °C to inhibit cellular proliferation. The mitomycin C-treated BALB/c spleen cells were used as allogeneic stimulators. T cells were isolated from the spleens of C57BL/6 mice using the EasySep™ Mouse T Cell Isolation Kit (19851, Stemcell Technologies), according to the manufacturer’s instructions. Isolated T cells were then co-cultured with BALB/c spleen cells in a 1:1 ratio at a concentration of 4 × 10^5 cells per well in 96-well plates, with RPMI-1640 medium supplemented with 10% FBS, 1% penicillin–streptomycin, and 2 mM L-glutamine.
Pyridoxate (P464801, Aladdin) was dissolved in dimethyl sulfoxide (DMSO) and added to the co-culture system at gradient concentrations. DMSO alone was added to the control group. The cultures were incubated for 72 h at 37 °C with 5% CO2.
Flow cytometry analysis
After 72 h of co-culture, cells were harvested and T cell markers were analyzed by flow cytometry. The cells were stained with surface antibodies: anti-CD4-BV421 (100438, Biolegend), anti-CD8-PE (100708, Biolegend), anti-H2-Kb-PerCP-Cy5.5 (116516, Biolegend), anti-CD69-FITC (104506, Biolegend), anti-CD44-PE-Cy7 (25-0441-82, eBioscience), anti-CD62L-APC-Cy7 (104428, Biolegend), and anti-PD-1-Alexa Fluor 647 (566715, BD Biosciences). Data acquisition was performed on a BD FACS Canto flow cytometer (BD Biosciences), and the data were analyzed using FlowJo software. Statistical analysis and data presentation were performed using GraphPad Prism 9 (v9.5.0).
Statistical analysis
Two-sample MR analysis
In line with the methodology outlined in [21, 26], we employed the clumping procedure in PLINK (version v1.90 b3.38) [33] to choose instrumental variables (IVs) for each metabolite, using a relatively lenient criterion (P threshold of 1 × 10^−5, r^2 threshold of 0.01, and a 500 kb window). We employed various methods, including fixed-effect IVW [34], random-effect IVW, the weighted median-based approach [35], D-IVW [36], MR-RAPS [37], and MR-Egger regression [38], to estimate causal effects and to scrutinize heterogeneity and potential pleiotropic effects. Detailed information on the MR analysis is presented in Additional file 1: Supplementary Methods [21, 26, 33–35, 38–46]. To account for the multiple tests inherent in metabolomic analyses, we applied false discovery rate (FDR) correction using the Benjamini–Hochberg procedure. Metabolites with FDR q < 0.10 were considered significantly causal metabolites.
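As an illustration of the fixed-effect IVW estimator named among the MR methods above, the following is a minimal sketch in Python. It is not the authors' code (the study used R packages), the function name `ivw_fixed_effect` is hypothetical, and the three instruments are invented toy values rather than study data. It assumes uncorrelated instruments aligned to the same effect allele and ignores uncertainty in the SNP-exposure estimates.

```python
import math


def ivw_fixed_effect(instruments):
    """Pool per-SNP summary statistics into a fixed-effect IVW estimate.

    instruments: list of (beta_exposure, beta_outcome, se_outcome) tuples,
    one per SNP, all aligned to the same effect allele.
    """
    # Weighted regression of SNP-outcome effects on SNP-exposure effects
    # through the origin, with weights 1 / se_outcome^2.
    numerator = sum(bx * by / se ** 2 for bx, by, se in instruments)
    denominator = sum(bx ** 2 / se ** 2 for bx, by, se in instruments)
    beta = numerator / denominator
    se_beta = math.sqrt(1.0 / denominator)
    z = beta / se_beta
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P value
    return beta, se_beta, p_value


# Hypothetical toy instruments (NOT data from the study).
toy_instruments = [
    (0.10, -0.050, 0.020),
    (0.20, -0.100, 0.030),
    (-0.15, 0.075, 0.025),
]

beta, se_beta, p_value = ivw_fixed_effect(toy_instruments)
odds_ratio = math.exp(beta)  # log-odds scale converted to an odds ratio
print(f"beta={beta:.3f}  OR={odds_ratio:.3f}  SE={se_beta:.3f}  P={p_value:.2e}")
```

In the toy data every SNP-outcome effect is exactly −0.5 times the SNP-exposure effect, so the pooled estimate recovers a log odds ratio of −0.5. Real instruments scatter around the regression line, which is what the heterogeneity and pleiotropy tests examine.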
Suggestive associations between metabolites and allo-HSCT outcomes were defined at the nominal significance level (P < 0.05) and were mainly used for the subsequent metabolic pathway analyses. We classified the metabolites into three categories: (1) causal metabolites, FDR q < 0.10; (2) suggestive causal metabolites, P < 0.05; (3) other metabolites, P > 0.05. Scatter plots were used to identify genetic variants that did not match the expected relationship, which were marked as outliers. Funnel plots were used to assess the presence of horizontal pleiotropy; in these plots, the effect estimate of each genetic variant on the outcome is plotted against its precision (the inverse of the standard error). Asymmetry in the funnel plot may indicate pleiotropy or other biases. We also conducted reverse MR analysis to estimate the causal effect of allo-HSCT outcomes on the identified metabolites and thereby rule out possible bi-directional associations.
Multivariable and mediation MR analysis
Using multivariable MR (MVMR), we evaluated the independent causal effects of the identified metabolites on GVHD and relapse risk after adjusting for common complex traits [47, 48]. Detailed information on the GWAS summary statistics used in MVMR is presented in Additional file 1: Supplementary Methods [47–49]. Additionally, based on the MVMR framework, we estimated the direct and indirect effects of metabolites on allo-HSCT outcomes via immune traits [47, 50, 51] using a two-step MR approach [48]. First, we used SNPs associated with metabolites to estimate the causal effects of metabolites on immune cells and on allo-HSCT outcomes. Then, SNPs associated with metabolites and with immune cells were used together to estimate the marginal effect sizes of metabolites and immune cells on allo-HSCT outcomes. Detailed information is presented in Additional file 1: Supplementary Methods [52–55].
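The three-tier classification of metabolites described earlier in this section can be sketched as follows. This is a minimal, dependency-free Python illustration of the Benjamini–Hochberg adjustment followed by the q < 0.10 and P < 0.05 cut-offs. The function names and the ten P values are hypothetical, and the study itself performed the adjustment in R.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values (q values), in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, ascending P
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def classify(p, q, q_cut=0.10, p_cut=0.05):
    """Assign a metabolite to one of the three categories used in the text."""
    if q < q_cut:
        return "causal"
    if p < p_cut:
        return "suggestive"
    return "other"


# Hypothetical P values for ten metabolite-outcome tests (NOT study data).
pvalues = [0.0001, 0.04, 0.03, 0.5, 0.8, 0.6, 0.9, 0.7, 0.2, 0.35]
qvalues = benjamini_hochberg(pvalues)
labels = [classify(p, q) for p, q in zip(pvalues, qvalues)]
for p, q, label in zip(pvalues, qvalues, labels):
    print(f"P={p:<7} q={q:.4f}  {label}")
```

With these toy values only the first test survives FDR correction, while the tests with P = 0.03 and P = 0.04 remain nominally significant and fall into the suggestive tier.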
Metabolic pathway analysis
To explore the biological functions and pathways in which the identified GVHD-related and relapse-related metabolites may be involved, we performed metabolic pathway analyses separately for the metabolites suggestively associated with each allo-HSCT outcome (without differentiating between populations) using MetaboAnalyst 4.0 (https://www.metaboanalyst.ca/) [56]. The pathway analyses used two libraries, the Small Molecule Pathway Database (SMPDB) [57] and the KEGG [29] database, and pathways were considered significant at the nominal level (P < 0.05).
Metabolite differential analysis
Differential analysis of metabolites between patients with and without aGVHD was performed using the Wilcoxon rank-sum test, with the significance level (α) set at 0.05. Line plots were used to show the distribution of the metabolites in patients across time points. Data are presented as the mean ± standard error of the mean.
Phenome-wide association studies (PheWAS) for identified metabolites
We obtained the GWAS meta-analysis summary datasets of FinnGen R10 and UK Biobank from the FinnGen website (https://r10.finngen.fi/), which covered 652 complex diseases of the blood, digestive, and cardiovascular systems, cancers, and other categories. We used the two-sample MR method (mainly IVW) to comprehensively evaluate the causal associations between pyridoxine and these systemic diseases.
Software
All statistical analyses were performed in R (version 3.5.1). MR analysis was conducted with the “MendelianRandomization” (version 0.4.3) [58] and “mr.raps” (version 0.2) packages, and MR-PRESSO was performed with the MR-PRESSO package [43]. Multiple test adjustments were performed with the “fdrtool” package (version 1.2.17) [59], and figures were drawn with the “ggplot2” package [60].
Results
Overview of our results
The IVW method revealed 188 suggestive metabolites that showed associations at the nominal significance level (P < 0.05) with various allo-HSCT outcomes (Fig. 2 and Additional file 2: Table S2). The distribution and percentage of metabolites identified across the different phenotypes are summarized in Additional file 1: Fig. S1. After multiple test adjustment (FDR q < 0.10), four causal metabolite features were identified: one for aGVHD[3] FS (pyridoxate), two for aGVHD[3] URD (1-stearoylglycerophosphoethanolamine and glycochenodeoxycholate), and one for relapse ALL (proline) (Fig. 3 and Additional file 2: Table S3). Next, we conducted sensitivity analysis, multivariable MR, mediation MR analysis, and external validation for the four identified causal metabolites (FDR q < 0.10). No sample overlap exists between the GWAS summary statistics for exposures and outcomes. Metabolic pathway enrichment analysis was performed on the suggestive metabolites to investigate potential mechanisms. Finally, we validated the protective effects of pyridoxine and 1-stearoylglycerophosphoethanolamine in different GWAS datasets. In addition, lower concentrations of pyridoxine were found in aGVHD patients than in non-aGVHD patients.
Fig. 2. Heatmap of causal relationship between metabolites and allo-HSCT outcomes. Note: Only metabolites with significant associations in at least three allo-HSCT outcomes are shown in the figure. Red blocks represent high risk and blue blocks represent low risk. In addition, six different colors represent the pathways to which each metabolite belongs. Abbreviations: GVHD, graft-versus-host disease; FS, matched siblings; URD, matched unrelated donors
Fig. 3. Significant causal associations identified for GVHD and relapse risk.
A The forest plot shows the OR values and CIs of pyridoxate on aGVHD[3] FS risk estimated using different MR methods; B scatterplot of causal association between pyridoxate and aGVHD[3] FS risk; C funnel plot of causal associations between pyridoxate and aGVHD[3] FS risk; D the forest plot shows the OR values and CIs of glycochenodeoxycholate on aGVHD[3] URD risk estimated using different MR methods; E scatterplot of causal association between glycochenodeoxycholate and aGVHD[3] URD risk; F funnel plot of causal associations between glycochenodeoxycholate and aGVHD[3] URD risk; G the forest plot shows the OR values and CIs of 1-stearoylglycerophosphoethanolamine on aGVHD[3] URD risk estimated using different MR methods; H scatterplot of causal association between 1-stearoylglycerophosphoethanolamine and aGVHD[3] URD risk; I funnel plot of causal associations between 1-stearoylglycerophosphoethanolamine and aGVHD[3] URD risk; J the forest plot shows the OR values and CIs of proline on relapse ALL risk estimated using different MR methods; K scatterplot of causal association between proline and relapse ALL risk; L funnel plot of causal associations between proline and relapse ALL risk. Abbreviations: GVHD, graft-versus-host disease; FS, matched siblings; URD, matched unrelated donors
Causal effects of metabolites on allo-HSCT outcomes
In the discovery MR analysis, pyridoxate was found to potentially reduce the risk of aGVHD[3] FS. By the IVW method, each 0.1-unit increase in pyridoxate levels was associated with an odds ratio (OR) of 0.593 for developing aGVHD[3] FS (95% CI: 0.458 ~ 0.768, P = 7.49 × 10^−5, FDR q = 0.034). This protective effect was supported by five other methods (P < 0.05), namely weighted mode, weighted median, DIVW, MR-RAPS, and MR-PRESSO (Fig. 3A). Glycochenodeoxycholate may be a risk factor for aGVHD[3] URD. The IVW method estimated an OR of 1.276 (95% CI: 1.124 ~ 1.449, P = 1.68 × 10^−4, FDR q = 0.039).
This finding was supported by the weighted mode, weighted median, DIVW, MR-RAPS, and MR-PRESSO methods (P < 0.05, Fig. 3D). Elevated levels of 1-stearoylglycerophosphoethanolamine could decrease the risk of aGVHD[3] URD, with an OR estimated by IVW of 0.624 (95% CI = 0.494 ~ 0.787, P = 6.94 × 10^−5, FDR q = 0.032). This finding is consistent with the results from weighted mode, weighted median, DIVW, MR-RAPS, and MR-PRESSO (P < 0.05, Fig. 3G). Elevated proline levels had a protective effect against relapse ALL. The OR for this association was estimated by IVW (fixed) to be 0.746 (95% CI = 0.642 ~ 0.866, P = 1.23 × 10^−4, FDR q = 0.055). This estimate aligns with the findings from weighted median, DIVW, MR-RAPS, and MR-PRESSO (P < 0.05, Fig. 3J). Comprehensive sensitivity analyses supported the robustness of the identified associations; detailed information is presented in Supplementary Results, Fig. 3, and Additional file 2: Table S4. The metabolite signal distributions shared between different GVHD and relapse phenotypes are shown in Additional file 1: Figs. S2 (all participants), S3 (sibling participants), and S4 (unrelated participants). The four causal metabolites showed associations across different outcomes. Pyridoxate decreased aGVHD[2b], aGVHD[2], aGVHD[3], and gut GVHD risk but increased relapse risk in the FS cohort. 1-Stearoylglycerophosphoethanolamine decreased relapse risk in the FS cohort and aGVHD[2b] and aGVHD[3] risk in the URD cohort. Glycochenodeoxycholate increased aGVHD[3] and gut GVHD risk in both the ALL and FS cohorts, and aGVHD[2b] and aGVHD[3] risk in the URD cohort. Proline increased gut GVHD risk in the FS cohort and decreased relapse risk in the whole cohort. Metabolites such as taurodeoxycholate, pyridoxate, glycochenodeoxycholate, alpha-hydroxyisovalerate, and trans-4-hydroxyproline were significant in multiple outcomes.
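The four IVW estimates reported above can be checked for internal consistency. The sketch below, written in Python purely for illustration, backs out the log odds ratio and its standard error from each reported OR and 95% CI and recomputes a two-sided Wald P value. The helper name `wald_stats_from_or` is hypothetical. The check assumes a normal approximation and a symmetric CI on the log scale, so small differences from the published P values are expected because the inputs are rounded to three decimals.

```python
import math

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_stats_from_or(odds_ratio, ci_low, ci_high):
    """Recover log-OR, SE, z, and two-sided P from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_975)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p_value


# (OR, CI lower, CI upper, reported P) as stated in the text.
REPORTED = {
    "pyridoxate -> aGVHD[3] FS": (0.593, 0.458, 0.768, 7.49e-5),
    "glycochenodeoxycholate -> aGVHD[3] URD": (1.276, 1.124, 1.449, 1.68e-4),
    "1-stearoylglycerophosphoethanolamine -> aGVHD[3] URD": (0.624, 0.494, 0.787, 6.94e-5),
    "proline -> relapse ALL": (0.746, 0.642, 0.866, 1.23e-4),
}

recomputed = {}
for label, (or_value, low, high, p_reported) in REPORTED.items():
    beta, se, z, p_value = wald_stats_from_or(or_value, low, high)
    recomputed[label] = p_value
    print(f"{label}: beta={beta:.3f} SE={se:.3f} z={z:.2f} "
          f"P={p_value:.2e} (reported {p_reported:.2e})")
```

Under these assumptions, all four recomputed P values fall within a few percent of the reported ones, which suggests that the ORs, CIs, and P values in this section are mutually consistent.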
Bile acid derivatives, particularly taurodeoxycholate and glycochenodeoxycholate, showed repeated significance, suggesting a role of bile acids in GVHD onset.
MVMR analysis excluded the pleiotropic effect of common traits on the associations between identified metabolites and GVHD
As shown in Additional file 1: Fig. S5 and Additional file 2: Table S5, the four metabolites remained significantly associated with the allo-HSCT outcomes in almost all analyses after adjustment for various factors, indicating that they have independent and stable causal effects on the allo-HSCT outcomes. For pyridoxate, the association lost significance after adjusting for confounding factors including body mass index (BMI), blood pressure (BP), and different lipid levels (such as high-density lipoprotein cholesterol and low-density lipoprotein cholesterol). However, the association between metabolites and allo-HSCT outcomes kept a consistent direction: adjustment for confounding factors only weakened part of the effect and did not reverse the overall direction of association.
Validation of four identified causal metabolites
To verify the reliability of our results, we obtained data on the four metabolites from an independent metabolome GWAS for validation (Additional file 2: Table S6). Additionally, we performed a GWAS meta-analysis of these four metabolites from the two sources to further validate the findings and to conduct subsequent mediation tests (Additional file 1: Fig. S6). In the validation MR analysis, metabolite levels were again associated with varying risks of GVHD and relapse in patients (Additional file 1: Fig. S7). Specifically, higher proline levels were associated with an increased risk of aGVHD[2b] in ALL patients. Higher pyridoxate levels were consistently associated with a significantly reduced risk across various outcomes and donor types, indicating a protective role against aGVHD, cGVHD, and relapse.
For 1-stearoylglycerophosphoethanolamine, there was a significant reduction in the risk of aGVHD[2b] and aGVHD[3] in URD patients, and a consistent protective effect against cGVHD in the ALL and FS donor types. Glycochenodeoxycholate levels, however, showed fewer significant associations, and none with GVHD risk (Fig. 4).
Fig. 4. Causal effects of four identified metabolites on GVHD in validation MR analysis. Note: Blue represents validation MR analysis and red represents meta (discovery + validation) MR analysis. Abbreviations: GVHD, graft-versus-host disease; FS, matched siblings; URD, matched unrelated donors
Mediated MR analysis explains possible immune mechanisms in the metabolite-GVHD pathway
To identify the possible causal role of the allo-HSCT-related metabolites in the immunological process (Fig. 5A, Additional file 1: Fig. S8), we estimated the causal effects of the four metabolites on immune cell traits (Additional file 2: Table S7). At the nominal significance level (P < 0.05), we identified causal effects of the GVHD-related metabolites on 171 immune cell traits using the IVW method. After adjusting for multiple testing (FDR q < 0.10), we identified significant causal effects of proline on seven immune cell populations, including CD39^+ CD4^+ AC and CD39^+ CD4^+ %T cells, as well as an effect of glycochenodeoxycholate on one immune cell population (CD45 on CD33dim HLA-DR^+ CD11b^−). We also observed causal effects of pyridoxate on three immune cell traits at FDR q < 0.20 (CD45 on granulocyte, CD39^+ secreting Treg %CD4 Treg, CD39^+ secreting Treg %secreting Treg) (Fig. 5B and Additional file 2: Table S8).
Fig. 5. Immune cells mediated a causal association between metabolites and GVHD risk.
A Diagram of mediation MR analysis; B causal associations between metabolites on immune cells (FDR q < 0.10); C immune cells mediated causal associations between pyridoxate and GVHD risk; D immune cells mediated causal associations between 1-stearoylglycerophosphoethanolamine and GVHD risk; E effects of causal mediation MR analysis between two metabolites and GVHD risk. Note: The contribution of mediators to the association between metabolites and GVHD risk is the product of indirect effect 1 and indirect effect 2. Abbreviations: IDE, indirect effect; DE, direct effect; TE, total effect; GVHD, graft-versus-host disease; FS, matched siblings; URD, matched unrelated donors
Causal mediation analyses (Fig. 5A) based on the MVMR framework revealed four immune cell traits (CD19 on IgD^− CD38^dim, CD39^+ activated Treg %activated Treg, CD28^− CD25^++ CD8^br %CD8^br, CD39^+ resting Treg AC) that significantly mediated the effects of causal metabolites on outcomes (Fig. 5C, D, Additional file 2: Table S9). All mediation effects were consistent with the direction of the main causal effect. Two CD39^+ Treg-related immune indicators were identified as mediating the causal effects of pyridoxate on aGVHD[3] URD: CD39^+ activated Treg %activated Treg, accounting for 23% of the total effect (Beta = −0.088, 95% CI = −0.193 ~ −0.018), and CD39^+ resting Treg AC, accounting for 17.4% of the total effect (Beta = −0.067, 95% CI = −0.177 ~ −0.001). Another two immune cell traits mediated the causal associations between 1-stearoylglycerophosphoethanolamine and aGVHD[3] FS: CD19 on IgD^− CD38^dim, accounting for 40.7% of the total effect (Beta = −0.262, 95% CI = −0.619 ~ −0.018), and CD28^− CD25^++ CD8^br %CD8^br, accounting for 48.7% of the total effect (Beta = −0.313, 95% CI = −0.730 ~ −0.002) (Fig. 5E).
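The mediated proportions reported above relate the indirect effect to the total effect: proportion mediated = indirect effect / total effect. As a rough illustration in Python (not the authors' code, and with the hypothetical helper name `implied_total_effect`), the total effect implied by each mediator can be recovered by dividing the reported indirect Beta by the reported proportion. The direct effect shown is the single-mediator remainder, total minus indirect, which is an assumption of this sketch and not a figure from the paper.

```python
def implied_total_effect(indirect_beta, proportion_mediated):
    """Total effect implied by an indirect effect and its mediated share."""
    return indirect_beta / proportion_mediated


# (mediator, indirect Beta, proportion of total effect) as reported in the text.
REPORTED_MEDIATION = {
    "pyridoxate": [
        ("CD39+ activated Treg %activated Treg", -0.088, 0.230),
        ("CD39+ resting Treg AC", -0.067, 0.174),
    ],
    "1-stearoylglycerophosphoethanolamine": [
        ("CD19 on IgD- CD38dim", -0.262, 0.407),
        ("CD28- CD25++ CD8br %CD8br", -0.313, 0.487),
    ],
}

implied_totals = {}
for metabolite, mediators in REPORTED_MEDIATION.items():
    totals = []
    for name, indirect, proportion in mediators:
        total = implied_total_effect(indirect, proportion)
        direct = total - indirect  # single-mediator remainder (assumption)
        totals.append(total)
        print(f"{metabolite} via {name}: indirect={indirect:.3f} "
              f"total={total:.3f} direct={direct:.3f}")
    implied_totals[metabolite] = totals
```

For each metabolite, the two mediators imply nearly the same total effect (about −0.38 for pyridoxate and about −0.64 for 1-stearoylglycerophosphoethanolamine), so the reported Betas and percentages agree with each other up to rounding.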
Pyridoxate suppresses T cell function in mixed lymphocyte reactions
In the MLR, pyridoxate treatment significantly reduced the proportions of CD69^+ CD4^+ and CD44^+ CD62L^− CD4^+ T cells, indicating that pyridoxate may impair the activation and function of CD4^+ T cells (Fig. 6). Additionally, pyridoxate increased the expression of PD-1 on CD8^+ T cells, suggesting a potential promotion of CD8^+ T cell exhaustion (Fig. 6). These effects were concentration dependent. Collectively, these findings suggest that pyridoxate may attenuate immune cell function during allogeneic antigen stimulation.
Fig. 6. Pyridoxate suppresses T cell function in mixed lymphocyte reactions. Percentages of CD69^+ CD4^+ T cells (A) and CD44^+ CD62L^− CD4^+ T cells (B) at different concentrations of pyridoxate. C Percentages of PD-1^+ CD8^+ T cells at varying concentrations of pyridoxate. Data are represented as mean ± SEM, with n = 4 per group. *P < 0.05; **P < 0.01
Metabolic pathway analysis revealed potential biological functions
Several metabolic pathways recurred across allo-HSCT outcomes, pointing to their roles in the underlying pathophysiology (Fig. 7A and Additional file 2: Table S10). In relapse, key metabolic pathways include porphyrin and chlorophyll metabolism, butanoate metabolism, and D-glutamine and D-glutamate metabolism. aGVHD[2] is marked by aminoacyl-tRNA biosynthesis, caffeine metabolism, and arginine and proline metabolism. aGVHD[2b] primarily involves arginine and proline metabolism, and phenylalanine, tyrosine, and tryptophan biosynthesis. aGVHD[3] is characterized by caffeine metabolism, arginine biosynthesis, and primary bile acid biosynthesis.
cGVHD features pathways such as pantothenate and CoA biosynthesis, aminoacyl-tRNA biosynthesis, and glycine, serine, and threonine metabolism. Finally, gut GVHD is associated with heightened activity in aminoacyl-tRNA biosynthesis, primary bile acid biosynthesis, and caffeine metabolism.
Fig. 7. A Metabolic pathway significance across various allo-HSCT outcomes as analyzed by MetaboAnalyst 4.0 based on the KEGG and SMPDB databases. Each bar represents the −log[10](P value) for the association between the phenotype and the metabolic pathway. Higher values indicate greater statistical significance. B–D Concentrations of causal metabolites measured in aGVHD patients and non-aGVHD patients after allo-HSCT: B concentration of pyridoxine measured by fecal LCMS metabolome analysis; C concentration of glycochenodeoxycholic acid 3-glucuronide measured by fecal LCMS metabolome analysis; D concentration of glycochenodeoxycholic acid 7-sulfate measured by fecal LCMS metabolome analysis. E Manhattan plot of PheWAS for pyridoxate on 652 endpoints. Abbreviations: GVHD, graft-versus-host disease; FS, matched siblings; URD, matched unrelated donors; LCMS, liquid chromatography mass spectrometry
Fecal metabolic profile analysis of patients in a prospective cohort
In our cohort comparing metabolite concentrations between patients with and without aGVHD (Fig. [146]7B–D), pyridoxine levels in the aGVHD group were significantly lower than those in the non-aGVHD group 4–6 weeks post-transplantation (P = 0.023), while glycochenodeoxycholic acid 3-glucuronide levels were higher 2–4 weeks post-transplantation (P = 0.027). Although the differences in other comparisons did not reach statistical significance, pyridoxine levels were generally higher in the non-aGVHD group, while both glycochenodeoxycholic acid 7-sulfate and glycochenodeoxycholic acid 3-glucuronide tended to be elevated in the aGVHD group.
Phenotype-wide association MR analysis of pyridoxine
We observed that pyridoxine plays a significant role in GVHD, prompting us to further evaluate its causal associations with other phenotypes using GWAS summary datasets from FinnGen R10 and UKBB. The results showed that pyridoxine had causal effects on 95 of 652 endpoints at a nominal significance level of 0.05. After FDR correction (FDR q < 0.10), 23 endpoints remained significant. Specifically, pyridoxine was found to reduce the risk of anemia, vestibular dysfunction, skin diseases, purpura and other hemorrhagic conditions, and depression, but it also increased the risk of certain tumors, hemangioma, and lymphangioma (Fig. [147]7E, Additional file 2: Table S11).
Discussion
Overview
The study established suggestive causal associations between 188 blood metabolites and allo-HSCT outcomes. After FDR adjustment, glycochenodeoxycholate was causally linked to increased aGVHD[3] risk in URD, while 1-stearoylglycerophosphoethanolamine was protective. Elevated pyridoxate and proline levels were causally associated with decreased risks of aGVHD[3] in FS and relapse in ALL, respectively. Validation MR analysis confirmed the protective effect of pyridoxate on GVHD risk, and fecal metabolome analysis in our cohort also supported a protective effect of pyridoxate on aGVHD. MVMR analyses evaluated independent causal effects, and mediation MR analyses revealed direct and indirect effects on GVHD risk via immune traits. Metabolic pathway analysis highlighted key pathways, including bile acid metabolism, vitamin B[6] derivatives, and amino acid derivatives.
Pyridoxate’s consistent causal effects on allo-HSCT outcomes were confirmed in both discovery and validation studies. Additionally, in our long-term follow-up cohort, levels of pyridoxine in feces were consistently lower in the aGVHD group compared to the non-aGVHD group at most time points from pre-transplant to 6 weeks post-transplant.
These findings, along with the fact that pyridoxine in feces and pyridoxate in blood are different stages of the vitamin B[6] metabolic pathway, underscore the critical protective role of vitamin B[6] against aGVHD. The anti-inflammatory properties of vitamin B[6] are not only characterized by its low expression in conditions such as rheumatoid arthritis [[148]61] and inflammatory bowel disease [[149]62], but also by its negative correlation with inflammatory markers such as TNF-α [[150]63], ESR [[151]64], and CRP [[152]65]. This suggests that vitamin B[6] plays a crucial role in modulating inflammatory responses. Furthermore, vitamin B[6] has been shown to inhibit NLRP3 inflammasome activation [[153]66] and regulate IL-33 homeostasis, leading to decreased IL-33 levels [[154]67]. IL-33 serves as a crucial costimulatory molecule for Th1 cells within the gastrointestinal tract during aGVHD [[155]68, [156]69]. Our mediation analysis has further confirmed the role of CD39^+ Treg cells in the vitamin B[6]-aGVHD pathway, consistent with previous findings that Treg cells can prevent and treat GVHD without increasing the risk of relapse or infection [[157]70, [158]71]. CD39 enhances Treg function by hydrolyzing extracellular ATP and ADP into AMP, reducing pro-inflammatory ATP levels. The AMP is further converted by CD73 into adenosine, a potent immunosuppressive molecule. Notably, CD39^+ Treg cells have demonstrated enhanced suppressive functions across various disease models, including rheumatic joints [[159]72], multiple sclerosis [[160]73], colorectal cancer [[161]74], and lipopolysaccharide-induced acute lung injury [[162]75]. These findings underscore the potential therapeutic value of monitoring and possibly augmenting vitamin B[6] levels in patients undergoing allo-HSCT. Nevertheless, additional in vivo and in vitro experiments specific to vitamin B[6] are necessary to validate the aforementioned conclusions in GVHD models. 
Emerging evidence highlights the critical role of gut microbial metabolites in the development and progression of aGVHD [[163]76]. Bile acid metabolism initiates in the liver (via CYP7A1 and CYP27A1) and produces primary bile acids that are further processed by gut microbes into secondary bile acids. Recent studies have provided detailed analyses of bile acid (BA) alterations during HSCT. Sarah et al. observed decreased total BA levels in GVHD patients compared to controls, driven by a reduction in the microbe-derived BA compartment. Metagenomic sequencing revealed a reduction in bile acid metabolism genes, including bsh and bai operon genes, which led to lower levels of unconjugated and microbe-derived bile acids and was linked to GVHD onset [[164]77]. Longitudinal metabolome analysis performed by Orberg et al. revealed a higher abundance of the primary bile acids CDCA and CA, but a lower abundance of the secondary bile acids DCA and LCA, on day 28 compared to day −7. In particular, LCA, which has immunomodulatory functions, was significantly lower in gut GVHD patients than in controls [[165]78]. Another study identified Faecalibacterium expansion, accompanied by a reduction in secondary bile acids (such as UDCA), as one of the possible factors in GVHD onset [[166]79]. Taken together, these studies suggest that alteration of the intestinal BA pool, manifested as an increase in primary bile acids (such as CDCA) or a decrease in secondary bile acids, might be an important biomarker for GVHD onset. In both studies, changes in bile acids were closely related to changes in the microbiota. Indeed, the abundance of intestinal bile acids is codetermined by microbial enzyme activity [[167]80], ileal transport [[168]81, [169]82], and the cross-regulatory FXR-FGF15/19 [[170]83] feedback mechanism. Thus, characterizing individual BAs that are differentially abundant in GVHD patients is challenging.
Although bile acid metabolism is a complex dynamic process, it is still necessary to screen and identify bile acids that can serve as disease biomarkers, as well as bile acids with immunomodulatory effects. An estimated 95% of total BAs are transported in the terminal ileum via BA transporters [[171]81, [172]82]. Our study complements previous findings by identifying the conjugated primary bile acid glycochenodeoxycholic acid in the serum as a risk factor for aGVHD using MR analysis. Together, these studies underscore the critical role of bile acid metabolism and its interaction with gut microbiota in shaping post-transplant outcomes.
1-Stearoylglycerophosphoethanolamine, a phospholipid, has not been extensively studied; existing reports primarily concern its association with deep vein thrombosis [[173]84] and pneumonic plague [[174]85]. Surprisingly, this metabolite exhibits a protective effect against GVHD, a phenomenon that warrants further exploration. Following allo-HSCT, the most typical intrinsic mechanisms for immune evasion and relapse all share the common objective of disrupting the interaction between donor T cells and tumor cells [[175]86, [176]87]. Moreover, studies have shown that lymphocyte membrane viscosity can be modulated by restoring an optimal cholesterol/phospholipid ratio [[177]88]. B cells are commonly recognized as effector cells in cGVHD [[178]89]. Interestingly, we observed a significant indirect effect of 1-stearoylglycerophosphoethanolamine on aGVHD risk via immune cells (i.e., CD19 on IgD^− CD38^dim (MFI, B cell)), broadening the understanding of the pathogenic role of B cells. B cells in this state are typically defined as either activated or memory B cells, which exhibit an enhanced capacity for rapid antigen response. Notably, research has suggested that mesenchymal stem cell (MSC) therapy for aGVHD might achieve therapeutic effects by specifically targeting the antigen-presenting capabilities of B cells [[179]90].
Proline uptake has been demonstrated to promote the activation of lymphoid tissue inducer cells [[180]91]. We speculate that a similar activating effect may occur in donor T cells. Furthermore, proline has been shown to protect Jurkat and BJAB cell lines against reactive oxygen species (ROS)-mediated oxidative stress [[181]92], a phenomenon that has been confirmed to impart a survival advantage to leukemic blasts and enhance chemotherapy resistance [[182]93]. However, given that proline can serve as a source of energy and a precursor for protein synthesis in solid tumor cells [[183]94], we posit that its relationship with hematologic malignancies merits further exploration.
A detailed examination of metabolic pathways across allo-HSCT outcomes reveals an interplay of biological processes. The relapse phenotype is marked by active porphyrin and chlorophyll metabolism, butanoate metabolism, and D-glutamine and D-glutamate metabolism pathways, indicative of increased cellular turnover and oxidative stress adaptation. Interestingly, the caffeine metabolism pathway emerges as a significant factor across four allo-HSCT outcomes, suggesting a role in metabolic disturbances. Coffee consumption has been shown to decrease biomarkers associated with oxidative stress and inflammation, an effect primarily attributed to its abundant antioxidant content [[184]95]. Given that a heightened inflammatory state is a prominent feature of GVHD [[185]96], caffeine metabolism is also hypothesized to play a significant role. Studies have shown significant differences in caffeine metabolites between participants with cGVHD and those without cGVHD [[186]97]. This intricate network, where aminoacyl-tRNA biosynthesis is a recurring element, points to potential intervention targets, meriting further investigation for specialized therapeutic approaches.
In aGVHD, this pathway, along with arginine and proline metabolism, signals disrupted protein synthesis and amino acid metabolism, characteristic of an inflammatory response [[187]98]. Although previous studies did not report the direct relationship between aminoacyl-tRNA biosynthesis pathway and GVHD risk, this pathway could participate in different types of immune response and immune diseases [[188]99]. The gut GVHD phenotype, distinguished by aminoacyl-tRNA biosynthesis, underscores the gut’s critical role, possibly affecting nutrient processing and microbiome interactions [[189]100]. The observed differences in metabolite effects by donor type, particularly for pyridoxate, can likely be attributed to the genetic similarity between donors and recipients in the FS cohort. In this cohort, donor-recipient pairs share identical SNP profiles, which may influence coenzyme biosynthesis and metabolic pathways in a way that differs from other donor types. These differences in coenzyme regulation could subsequently lead to variations in metabolite levels and their downstream effects on immune modulation [[190]101]. For pyridoxate, its protective effect against aGVHD (e.g., aGVHD[2b], aGVHD[2], aGVHD[3], and gut GVHD) and its association with increased relapse risk in the FS cohort may reflect the dual nature of its immunomodulatory properties. This explanation underscores how donor-recipient genetic backgrounds, mediated through SNP-related coenzyme regulation, can influence metabolite profiles and their functional outcomes. Clinical importance Our findings provide novel insights into the metabolic pathways underlying GVHD and relapse outcomes, with significant implications for clinical translation. The identification of pyridoxate as a protective factor against severe aGVHD highlights its potential as both a biomarker and a therapeutic agent. 
Clinically, pyridoxate and related metabolites could be developed as predictive biomarkers to stratify patients at risk of severe aGVHD, enabling earlier and more targeted prophylactic interventions. For instance, patients with lower pyridoxate levels could receive enhanced monitoring or personalized treatment strategies. Additionally, vitamin B[6] derivatives like pyridoxate could be evaluated as adjunct therapies to modulate immune responses, particularly by enhancing CD39^+ Tregs to reduce GVHD risk without impairing GVL effects. Furthermore, our findings underscore the potential of targeting bile acid metabolism, particularly by reducing harmful primary bile acids like glycochenodeoxycholate, which we identified as a risk factor for severe aGVHD. Interventions aimed at decreasing harmful bile acids or promoting beneficial secondary bile acids may further improve GVHD outcomes by rebalancing bile acid metabolism and mitigating pro-inflammatory effects. Future research could validate these therapeutic approaches in clinical settings and explore their potential to improve patient outcomes while minimizing the adverse effects of broad immunosuppression.
Strengths and limitations
Unlike previous cohort-based studies, which were constrained by their inability to infer causality and by relatively small cohort sizes, our study uses MR to infer causal relationships. Using genetic variants as instrumental variables, we identified glycochenodeoxycholate as a risk factor for aGVHD and pyridoxate as a protective factor, expanding on previously observed associations with novel mechanistic insights. Moreover, the significantly larger sample size in our study, derived from GWAS datasets, enhances the generalizability and statistical power of our findings compared to prior cohort-based studies. These methodological advancements allow us to move beyond correlations to provide causal evidence, paving the way for targeted therapeutic interventions in GVHD management.
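For readers unfamiliar with how genetic instruments yield a causal estimate, a fixed-effect inverse-variance weighted (IVW) estimator can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the per-SNP effect sizes below are hypothetical, the function name is invented for this example, and the software and IVW variant actually used are not specified in this section.

```python
import math

# Fixed-effect IVW sketch: a weighted regression through the origin of
# SNP-outcome effects on SNP-exposure effects, with weights 1 / se_outcome^2.
# All inputs below are hypothetical and only illustrate the arithmetic.

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Return (causal estimate, standard error) from per-SNP summary statistics."""
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    if denominator == 0:
        raise ValueError("instruments have no effect on the exposure")
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical instruments: three SNPs whose outcome effects are exactly
# -0.5 times their exposure effects.
snp_exposure = [0.10, 0.20, 0.30]
snp_outcome = [-0.05, -0.10, -0.15]
snp_outcome_se = [0.02, 0.02, 0.03]

beta_ivw, se_ivw = ivw_estimate(snp_exposure, snp_outcome, snp_outcome_se)
print(f"IVW beta = {beta_ivw:.3f}, SE = {se_ivw:.3f}")
```

The estimator is only as reliable as the instrumental variable assumptions behind it, which is why it is usually paired with heterogeneity and pleiotropy checks such as those the study reports.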
MR studies often rely on the analysis of GWAS summary datasets. A major limitation of this approach is that it does not allow for more detailed stratification of the sample. For example, the study may not be able to group the sample according to factors such as age, gender, lifestyle, or disease status, which may mask the potential impact of these variables on the findings. Without stratification, the results may not accurately reflect specific subgroups, limiting the applicability and generalizability of the study.
MR studies are usually conducted under linear assumptions. However, in many biological and medical scenarios, the relationships between variables may be non-linear, such as curvilinear or stepwise. This limitation may lead to important biological associations or interactions being overlooked, especially when complex metabolic pathways or immune responses are involved.
Our MR analyses rely on genetic data from populations of predominantly European descent. This population specificity limits the general applicability of the study because different ethnicities and populations may have different genetic backgrounds and environmental factors.
Another limitation of this study is the use of an FDR threshold of 0.10 instead of the more stringent 0.05. While this threshold increased sensitivity and allowed for the identification of more potential associations, it may also introduce a slightly higher rate of false positives. However, we addressed this limitation by conducting independent validation of the identified metabolites to confirm the robustness of our findings.
Despite our careful selection of instrumental variables and the use of reverse MR analysis, MR cannot completely rule out the possibility that reverse causation may influence some observed associations.
Conclusions
Our study highlights pyridoxate as a protective factor against severe aGVHD, with consistent findings across multiple lines of evidence.
This underscores the potential immunomodulatory role of pyridoxate in allo-HSCT outcomes. Our findings provide new insights into the metabolic mechanisms underlying GVHD and relapse, offering potential biomarkers and therapeutic targets for improving patient management after allo-HSCT. Further experimental validation and clinical studies are warranted to explore the translational potential of pyridoxate and other identified metabolites in GVHD prevention and treatment.
Supplementary Information
Additional file 1 (12916_2025_4026_MOESM1_ESM.docx, 2.9 MB): Supplementary Methods, Supplementary Results, and Figs. S1–S8. Supplementary Methods: a supplementary document on GVHD phenotypes, GWAS data sources, immune cell traits, multivariable Mendelian randomization, mediation analysis, and statistical methods. Supplementary Results: a supplementary document on external sensitivity analysis, pleiotropy assessment, and reverse-direction Mendelian randomization. Fig. S1. The bar plot above visually represents the count of significant metabolites across different GVHD phenotypes. Fig. S2. Heatmap of causal relationship between metabolites and GVHD outcomes. Fig. S3. Heatmap of causal relationship between metabolites and GVHD outcomes. Fig. S4. Heatmap of causal relationship between metabolites and GVHD outcomes. Fig. S5. Forest plot depicting the association between metabolites and allo-HSCT outcomes after adjusting for various confounding factors. Fig. S6. Manhattan and QQ plots of GWAS meta-analysis for four identified metabolites using discovery and validation summary statistics. Fig. S7. Causal associations between four metabolites and allo-HSCT outcomes in validation MR analysis. Fig. S8. The bar plot above visually represents the count of significant metabolites across different GVHD phenotypes.
Additional file 2 (12916_2025_4026_MOESM2_ESM.xlsx, 2.6 MB): Tables S1–S11. Table S1. Data sources. Table S2.
Causal associations between serum metabolites and allo-HSCT outcomes by using IVW methods. Table S3. Causal associations between serum metabolites and allo-HSCT outcomes by different MR methods. Table S4. Reverse MR analysis to test causal associations between allo-HSCT outcomes and serum metabolites by different MR methods. Table S5. Multivariable MR analysis for identified causal metabolites on allo-HSCT outcomes. Table S6. Validation MR analysis for identified causal metabolites on allo-HSCT outcomes in independent cohorts. Table S7. Causal effects of identified metabolites on immune cell traits. Table S8. Sensitivity analysis of identified metabolites on immune cell traits. Table S9. Mediation effect of immune cell traits between identified metabolites and GVHD. Table S10. Enrichment analysis of identified metabolites at the nominal significance level. Table S11. PheWAS analysis of pyridoxate.
Acknowledgements