Abstract

Introduction
Considerable evidence has highlighted a robust association of inflammatory proteins, gut microbiotas, and immune cells with leukemia. However, the causal relationships remain poorly defined. To investigate them, we performed a bidirectional Mendelian randomization (MR) study.

Materials and methods
This study used genetic variation data from publicly accessible genome-wide association study (GWAS) datasets. We used inverse variance weighting (IVW), among other methods, to assess the causal relationship between each exposure and the outcome of leukemia. Mediation analyses were applied to investigate the associations of immunophenotypes, gut microbiotas, and inflammatory proteins with leukemia. Genes mapped from the instrumental variables (IVs) were identified, and functional analyses of the related genes were subsequently carried out. Sensitivity analyses were performed to assess the robustness of the methods and results.

Results
This study identified four inflammatory proteins significantly associated with elevated leukemia risk, while leukemia had discernible effects on six inflammatory cytokines. IVW analysis revealed two immune cell subtypes with opposing effects on leukemia risk. One gut microbiota subtype exhibited a pro-leukemogenic association, in contrast to four subtypes with protective associations. Enrichment analysis further identified three genes differentially expressed between malignant and adjacent normal tissues, with the related genes showing pronounced enrichment in the mitogen-activated protein kinase (MAPK) signaling pathway.

Conclusion
These findings shed new light on the genetic associations of circulating inflammatory proteins, gut microbiotas, and immune cells with leukemia. They may not only enrich understanding of the disease but also guide further clinical and basic research in this domain.
Supplementary Information
The online version contains supplementary material available at 10.1007/s12672-025-02863-y.

Keywords: Leukemia, Circulating inflammatory proteins, Mendelian randomization, Causal relationship, Genome-wide association study

Introduction

Leukemia is a hematologic malignancy characterized by striking metabolic dysregulation in malignant cells and leukemic stem cells compared with their normal counterparts [1]. It is one of the leading causes of cancer-related deaths globally [2]. According to predictive models, the worldwide incidence of leukemia is expected to continue to decrease; however, men and older people remain at higher risk, particularly given the higher death rates from acute myeloid leukemia (AML) and chronic myeloid leukemia (CML) [3, 4]. The incidence of leukemia linked to metabolic syndrome rose by 1.2% per year, whereas traditional risk factors (such as exposure to benzene) declined, indicating the influence of lifestyle changes [5, 6]. Chronic inflammation plays a pivotal role in leukemia, and beyond traditional risk factors, novel inflammatory biomarkers act as critical modifiers of disease progression [7]. Circulating inflammatory proteins include various cytokines and chemokines such as interleukin-6 (IL-6), interleukin-8 (IL-8), and IFN-γ [8], which are crucial in the body’s immune response, regulating physiological functions such as inflammation, cellular growth and differentiation, and immune cell activation and migration [9, 10]. One study showed that mixed immunophenotypic profiles correlate with adverse clinical outcomes in AML patients [11]. A Mendelian randomization (MR) study revealed interactions between gut microbiotas and leukemia [12].
Signal transducer and activator of transcription 3 (STAT3), whose transcriptional targets include inflammatory factors, is somatically mutated in 30–40% of large granular lymphocyte (LGL) leukemia cases [13]. Emerging as a powerful epidemiological tool, MR employs genetic variants as instrumental variables (IVs) to infer causal links between risk factors and specific diseases [14]. By leveraging genetic diversity, MR aims to untangle causal associations between exposures and outcomes, effectively addressing the confounding inherent in epidemiological investigations [15]. Because alleles are assigned randomly during meiosis, genetic variation in MR plays a role similar to randomization in controlled trials. This feature enables MR to overcome challenges such as confounding bias, reverse causation, and biased sampling that commonly affect observational studies, while circumventing the practical constraints faced by randomized trials [16]. Despite its potential for decoding disease etiologies, bidirectional MR has not yet been applied to leukemia-inflammatory protein interactions in the current literature. To address this knowledge gap, we conducted a bidirectional MR analysis to systematically investigate the causal interplay between leukemia and circulating inflammatory proteins.

Materials and methods

Study design
This study implemented a two-sample MR to explore the causal links of 91 inflammatory proteins, the gut microbiome of the Dutch Microbiome Project, and immune cells with leukemia. The IVs used in MR analysis must satisfy three key assumptions: (1) the genetic instruments are strongly associated with the exposure; (2) the genetic variants are independent of confounders of the exposure-outcome relationship; and (3) the genetic variants affect the outcome only through the exposure [17, 18].
Data source
This study leveraged publicly accessible summary statistics from genome-wide association studies (GWAS). Population data were procured from the UK Biobank (https://gwas.mrcieu.ac.uk/datasets/). Specifically, the data for leukemia comprised 3,301 participants of European descent (identifier: prot-a-235). Data on 91 inflammatory proteins were collected from a study exploring their genetic associations. That study involved eleven cohorts, with a total of 14,824 individuals [19]. Genetic information was gathered genome-wide, and plasma protein data were obtained via the Olink Targeted Inflammation Panel. Data for the mediation analysis, comprising 207 gut microbiotas [20], 731 immune cells [21], and 91 inflammatory proteins [8], were gathered from the cited studies.

Instrumental variable selection
The criteria for selecting IVs were as follows: (1) Strong exposure relevance: the instrumental variable (IV) must be robustly associated with the target exposure, meeting a significance threshold of p < 5e-08 and an instrument strength of F-statistic > 10 in forward IV selection, and a significance threshold of p < 1e-05 in reverse IV selection. (2) Confounder exclusion: the selected IVs should be independent of any potential confounders of the exposure-outcome relationship. (3) Horizontal pleiotropy restriction: genetic effects must act exclusively through the specified exposure. This three-part validation protocol supports causal estimates that are both biologically plausible and statistically rigorous. The validity of Mendelian randomization depends on the randomness of genetic variants, which is compromised by linkage disequilibrium (LD) [22]. We therefore applied stringent clumping thresholds (r² < 0.001 within a 10,000 kb window), filtering out single nucleotide polymorphisms (SNPs) whose allelic correlation exceeded this threshold.
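As a rough illustration of the relevance criteria above, the following sketch filters candidate SNPs by p-value and F-statistic. It assumes the common per-SNP approximation F = (beta/SE)², which the study does not state explicitly. The `Snp` class, the `select_instruments` helper, and all numbers are hypothetical and are not taken from the study's data. LD clumping is not shown, since it requires a reference panel.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str    # hypothetical identifier
    beta: float  # SNP-exposure effect estimate
    se: float    # standard error of beta
    pval: float  # SNP-exposure association p-value


def f_statistic(snp: Snp) -> float:
    """Approximate per-SNP F-statistic as (beta / se)^2 (an assumption)."""
    return (snp.beta / snp.se) ** 2


def select_instruments(snps, p_threshold=5e-8, min_f=10.0):
    """Keep SNPs that pass both the p-value and the F-statistic threshold."""
    return [s for s in snps if s.pval < p_threshold and f_statistic(s) > min_f]


# Made-up summary statistics, for illustration only.
candidates = [
    Snp("rsA", beta=0.12, se=0.02, pval=2e-9),   # passes both thresholds
    Snp("rsB", beta=0.05, se=0.02, pval=1e-2),   # fails the p-value threshold
    Snp("rsC", beta=0.09, se=0.015, pval=3e-9),  # passes both thresholds
]
selected = select_instruments(candidates)
print([s.rsid for s in selected])
```

For the reverse direction, the same helper could be called with `p_threshold=1e-5`, matching the relaxed threshold described above.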
By rigorously applying these IV selection criteria, we aimed to uphold the validity and reliability of our causal inference analyses.

MR analysis
This study used the ‘TwoSampleMR’ package (R v4.3.1) for the MR analysis. We implemented five statistical techniques: inverse variance weighting (IVW), weighted median (WM) estimation, Mendelian randomization-Egger regression (MR-Egger regression), simple mode, and weighted mode. Crucially, IVW-derived estimates presume that all IVs are valid, an idealized scenario rarely achieved in practice [23]. To mitigate potential violations of the exclusion restriction assumption, a critical limitation of conventional MR, we used complementary pleiotropy-robust methods. These approaches do not depend strictly on universal instrument validity, thereby improving the stability of the causal estimates. Specifically, the WM estimator was prioritized for its capacity to yield unbiased effect estimates provided that ≥ 50% of IVs are valid [24]. Concurrently, MR-Egger regression was implemented to correct for horizontal pleiotropy. By relaxing the stringent “no pleiotropy” assumption inherent to IVW, this method bolsters the robustness of causal inference in polygenic settings [25].

Sensitivity analysis
All statistical computations were executed using R v4.2.1, with the ‘TwoSampleMR’ and ‘MR-PRESSO’ packages used for causal inference and graphical representation. IVW served as our anchor method, since its intercept-free regression provides asymptotically efficient estimates when directional pleiotropy is negligible [26]. To quantify the heterogeneity of the selected IVs, Cochran’s Q statistics were calculated, with significant deviation from homogeneity (P < 0.05) triggering adoption of random-effects IVW models [27]. Both MR-Egger and WM are important components of pleiotropy assessment in MR studies involving multiple genetic variants [28, 29].
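The IVW estimate and Cochran's Q described above can be sketched in a few lines. This is a minimal illustration, assuming first-order weights (the inverse variance of each Wald ratio) and a multiplicative random-effects correction. It is not the ‘TwoSampleMR’ implementation, and the function names and input numbers are hypothetical.

```python
import math


def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate from per-SNP Wald ratios."""
    ratios = [bo / be for be, bo in zip(beta_exp, beta_out)]
    # First-order weights: inverse variance of each Wald ratio.
    weights = [(be / so) ** 2 for be, so in zip(beta_exp, se_out)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    return estimate, se, ratios, weights


def cochran_q(estimate, ratios, weights):
    """Cochran's Q: weighted squared deviations of the ratios from the estimate."""
    return sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))


# Made-up SNP-exposure and SNP-outcome effects, for illustration only.
beta_exp = [0.10, 0.20, 0.15]
beta_out = [0.02, 0.05, 0.03]
se_out = [0.01, 0.01, 0.01]

estimate, se_fixed, ratios, weights = ivw(beta_exp, beta_out, se_out)
q = cochran_q(estimate, ratios, weights)
df = len(ratios) - 1

# Multiplicative random-effects IVW: inflate the SE only when Q exceeds
# its degrees of freedom, leaving the point estimate unchanged.
se_random = se_fixed * math.sqrt(max(1.0, q / df))

print(f"IVW estimate={estimate:.4f}, fixed SE={se_fixed:.4f}, "
      f"Q={q:.4f} on {df} df, random-effects SE={se_random:.4f}")
```

A p-value for Q would come from a chi-square distribution with `df` degrees of freedom, which is omitted here because the Python standard library has no chi-square function.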
The MR-PRESSO method systematically identifies horizontal pleiotropic outliers in summary-level MR analyses with multiple instruments. This outlier-corrected approach recalculates causal effects after excluding pleiotropic variants [30].

Mapping SNPs to genes
We used an R program to map each variant to its closest gene, which may be an overlapping gene or an upstream or downstream gene [31]. Subsequently, MR analyses were performed for instrument SNPs and exposure-associated genes. Expression quantitative trait loci (eQTL) data were derived from the CAGE study [32], which examined gene expression at the transcript level in peripheral blood from 2,765 predominantly European-ancestry individuals. All independent eQTLs for the focal gene with a conditional p-value < 0.05 were included. The flowchart of the whole process is shown in Fig. 1.

Fig. 1. The flowchart of the whole process

Results

Detailed information of the included SNPs
Information on the leukemia-associated inflammatory mediators is available in the supplementary material. The IVW method used a significance threshold of P < 0.05, and all causal associations and characteristics of the genetic variants are documented in Supplementary Table S1. This collection includes specifics such as genetic locus, effective allele (EA), and effective allele frequency (EAF). By systematically selecting SNPs, we ensured consistent associations with the target exposures, while keeping associations with the outcome and potential confounders at acceptable levels. Therefore, all chosen SNPs can be regarded as reliable IVs.
The causal impact of inflammatory proteins on leukemia
Our bidirectional MR analysis revealed four inflammatory mediators linked to leukemia: CUB domain-containing protein 1 (CDCP1) levels (OR [95% CI] = 1.316 [1.125, 1.539], P = 5.8e-04), C-X-C motif chemokine 9 (CXCL9) levels (OR [95% CI] = 1.294 [1.007, 1.661], P = 4.4e-02), interleukin-10 (IL-10) receptor subunit beta levels (OR [95% CI] = 1.129 [1.007, 1.266], P = 3.8e-02), and oncostatin-M (OSM) levels (OR [95% CI] = 1.413 [1.063, 1.879], P = 1.7e-02) (Fig. 2A). Within the two-sample MR analysis, the IVW estimate was preferred when residual heterogeneity and pleiotropic bias were statistically excluded. Sensitivity analysis supported the robustness of the causal associations. Furthermore, the MR-Egger intercept did not differ significantly from 0 (P > 0.05, Supplementary Table S2), indicating no horizontal pleiotropy.

Fig. 2. The causal impact of: (A) inflammatory proteins on leukemia; (B) leukemia on inflammatory proteins

The causal impact of leukemia on inflammatory proteins
Our MR analyses identified six inflammatory factors showing significant leukemia-associated variation (P < 0.05). Leukemia was associated with increased levels of all six: cystatin D levels (OR [95% CI] = 1.060 [1.012, 1.110], P = 1.4e-02), hepatocyte growth factor (HGF) levels (OR [95% CI] = 1.058 [1.013, 1.104], P = 1.1e-02), interleukin-12 (IL-12) subunit beta levels (OR [95% CI] = 1.061 [1.008, 1.116], P = 2.4e-02), macrophage inflammatory protein 1α (MIP-1α) levels (OR [95% CI] = 1.051 [1.004, 1.099], P = 3.3e-02), transforming growth factor-alpha (TGF-α) levels (OR [95% CI] = 1.071 [1.025, 1.119], P = 2.1e-03), and vascular endothelial growth factor A (VEGF-A) levels (OR [95% CI] = 1.051 [1.007, 1.098], P = 2.4e-02) (Fig. 2B).
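Odds ratios, confidence intervals, and P values reported in this form are linked through the log-odds scale, which allows a quick consistency check. The sketch below assumes a symmetric 95% CI on the log scale and a two-sided normal (Wald) test; the helper name `p_from_or_ci` is hypothetical. It is applied to the CDCP1 estimate reported above (OR = 1.316 [1.125, 1.539]).

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log-odds beta, SE, z, and two-sided p from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    # The CI spans beta +/- z_crit * SE on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided p-value under a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


beta, se, z, p = p_from_or_ci(1.316, 1.125, 1.539)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.1e}")
```

Under these assumptions the recovered P value is close to 6e-04, in line with the reported 5.8e-04; small differences are expected because the published OR and CI are rounded to three decimals.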
The MR-Egger intercept did not differ significantly from 0 (P > 0.05, Supplementary Table S3), indicating the absence of horizontal pleiotropy. Sensitivity analyses also supported the reliability of the causal relationships identified in the analysis (Supplementary Table S4).

The genetically associated immunophenotypes of leukemia by MR
To elucidate the causal interplay between immunophenotypes and leukemia, we employed a bidirectional two-sample MR analysis incorporating five analytical approaches. This investigation identified thirty immune cell subtypes potentially influencing leukemia risk, two of which showed significant associations (P < 0.001). The IVW method identified CD4+ CD8dim T cell absolute count as a significant risk factor (OR = 1.344, 95% CI: 1.150–1.571) and CD28+ CD45RA- CD8+ T cell absolute count as a protective immunophenotype (OR = 0.913, 95% CI: 0.861–0.968) (Fig. 3). Robustness evaluations through Cochran’s Q statistics and MR-Egger intercept analyses confirmed the absence of detectable heterogeneity or horizontal pleiotropy. Methodological validation, including reverse MR analysis and sensitivity analysis, is detailed in the supplementary material (Supplementary Tables S5-8).

Fig. 3. Causal relationships between immunophenotypes and leukemia through MR analysis. (A) A circular heatmap illustrating the MR-derived causal associations between immunophenotypes and leukemia, incorporating five distinct methods of MR analysis. (B) Volcano plot of the IVW causal estimates between immunophenotypes and leukemia. (C) Forest plot of significant MR results regarding the causal links between immunophenotypes and leukemia

The genetically linked gut microbiotas of leukemia by MR
Employing a two-sample MR analysis with five analytical approaches, we systematically investigated gut microbiotas-leukemia interactions.
This multi-method integration revealed five gut microbiotas exhibiting potential causal links with leukemia (P < 0.05). Notably, in the IVW-derived estimates, s_Ruminococcus_lactaris abundance was associated with increased leukemia risk (OR = 1.273, 95% CI: 1.030–1.573, P = 0.025). In contrast, four gut microbiota subtypes demonstrated protective effects: f_Lactobacillaceae (OR = 0.903, 95% CI: 0.832–0.981, P = 0.016), g_Lactobacillus (OR = 0.906, 95% CI: 0.832–0.988, P = 0.025), f_Veillonellaceae (OR = 0.836, 95% CI: 0.713–0.981, P = 0.028), and s_Butyrivibrio_crossotus (OR = 0.913, 95% CI: 0.836–0.998, P = 0.044) (Fig. 4A-C). Assessments of reverse causality and methodological validations are detailed in the supplementary materials (Supplementary Tables S9-12).

The causal associations between inflammatory factors and leukemia
We implemented a bidirectional MR analysis integrating five distinct approaches to examine causal interactions between inflammatory factors and leukemia. The IVW method identified four pro-leukemogenic mediators with significance (P < 0.05): CDCP1 levels (OR = 1.316, 95% CI: 1.125–1.539, P = 0.001), OSM levels (OR = 1.413, 95% CI: 1.063–1.879, P = 0.017), IL-10 levels (OR = 1.129, 95% CI: 1.007–1.266, P = 0.038), and CXCL9 levels (OR = 1.294, 95% CI: 1.007–1.661, P = 0.044) (Fig. 4D-E). Methodological validations are documented in the supplementary materials. Robustness evaluations through Cochran’s Q statistics and MR-Egger intercept analyses confirmed negligible heterogeneity and directional pleiotropy in the MR study (Supplementary Tables S13-16).

Fig. 4. MR-derived causal estimates of significant gut microbiotas and inflammatory factors in leukemia. (A) A circular heatmap illustrating the MR-based causal associations between gut microbiotas and leukemia, incorporating five different methods of MR analysis.
(B) IVW results for gut microbiotas-leukemia causal estimates, visualized as volcano plots. (C) Significant MR findings regarding gut microbiotas-leukemia associations, presented in a forest plot. (D) A heatmap displaying causal estimates between inflammatory factors and leukemia using five MR methods. (E) A forest plot summarizing significant MR results of causal associations between inflammatory factors and leukemia

The results of enrichment analysis
Our findings revealed divergent expression of KIAA1644, TMEM150C, HLA.DRB6, HCG23, BDKRB2, and RNFT2 between malignant and adjacent normal tissues, with all six genes demonstrating constitutive expression in diffuse large B-cell lymphoma (DLBC) microenvironments. Strikingly, KIAA1644, TMEM150C, and BDKRB2 exhibited pronounced tumor-specific expression differences (Fig. 5A). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis showed that, among immune cell-leukemia related genes, gut microbiota-leukemia related genes, and inflammatory protein-leukemia related genes, genes associated with the MAPK signaling pathway accounted for a prominent proportion, particularly among inflammatory protein-leukemia related genes, indicating a potentially crucial role for this pathway in mediating the relationship between leukemia and these genes (Fig. 5B-D).

Fig. 5. The results of gene association and enrichment analysis. (A) The expression of the related genes of immune cells-leukemia, gut microbiotas-leukemia, and inflammatory proteins-leukemia in DLBC. (B) KEGG pathway analysis of immune cells-leukemia related genes. (C) KEGG pathway analysis of gut microbiotas-leukemia related genes.
(D) KEGG enrichment analysis of inflammatory proteins-leukemia related genes

Discussion

Leveraging a substantial dataset of publicly accessible genetic data, this bidirectional MR investigation dissected the interplay of 91 inflammatory markers, immune cells, and gut microbiotas of the Dutch Microbiome Project with leukemia. Key discoveries include a significant causal link between four inflammatory markers and leukemia, along with six additional markers whose levels were affected by the disease. Two immune cell subtypes were shown to have opposing impacts on leukemia risk: one serving as a high-risk factor and the other as a protective factor. Four gut microbiota subtypes showed possible protective effects, whereas one showed a positive association with leukemia risk. Four inflammatory mediators were also substantially associated with an increased risk of leukemia. Functional analysis revealed pronounced overexpression of three genes in DLBC tissues and enrichment of the MAPK signaling pathway.

Emerging studies have progressively delineated the roles of inflammatory mediators in hematologic malignancies. CDCP1 is upregulated in AML, and its overexpression is associated with poor treatment response and unfavorable prognosis, positioning it as a promising prognostic biomarker [33]. Although there is no direct correlation between CDCP1 and the lifestyle variables or co-existing conditions that raise the risk of leukemia, obesity may have an indirect impact on CDCP1 expression (via the inflammatory microenvironment), which requires more research [34]. Another study identified diagnostic elevations of CXCL9 and IL-10 in B-cell acute lymphoblastic leukemia (B-ALL) patients, with these cytokines synergistically promoting leukemic blast proliferation [35]. Chronic infection-induced inflammation may enhance the CXCL9/CXCR3 axis and increase leukemia cell survival [36].
Through the IL-10 pathway, inflammation associated with diabetes may influence immunological dysregulation and promote leukemia [34]. One study investigated OSM’s potential as a cancer treatment, highlighting its importance to several physiological functions [37]. In the bone marrow, chronic inflammatory conditions (such as autoimmune illness) may increase OSM expression, which in turn induces aberrant hematopoiesis [38]. Our MR analysis substantiated causal roles for CDCP1, CXCL9, IL-10, and OSM in leukemia. These discoveries not only corroborate prior evidence but also offer insights into novel diagnostic and therapeutic strategies. For instance, IL-10R antagonists could be used to control immunosuppressive signals in the tumor microenvironment to achieve targeted therapy [36, 39]. Future work should focus on elucidating the mechanistic pathways governing these biomarkers and exploring their potential as targets for precision medicine in leukemia treatment.

Recent advances have illuminated novel molecular mechanisms driving leukemogenesis and therapeutic resistance. Cystatin D exerts potent anti-leukemic effects, suggesting potential therapeutic benefit in the treatment of leukemia [40]. MiR-140-3p levels inversely correlate with relapsed acute promyelocytic leukemia (APL), with HGF identified as a direct target [41]. One study found that AML patients had significantly lower HGF mRNA levels than healthy donors [42]. According to another study, cytokine-induced memory-like (ML) natural killer (NK) cells demonstrated strong anti-tumor responses and safely brought leukemia patients into full remission after being activated by IL-12 [43]. Adult T-cell leukemia/lymphoma (ATL) cells treated with the anti-tumor HDAC inhibitor AR-42 have higher levels of MIP-1α mRNA [44]. One study indicated that the plasma of AML patients had higher levels of tumor necrosis factor alpha (TNF-α) than that of healthy controls [45].
Research indicates a strong correlation between the prognosis and development of leukemia and higher VEGF-A levels in these individuals [46]. Our findings consolidate these discoveries, establishing leukemia risk associations with six inflammatory mediators: cystatin D, HGF, IL-12, MIP-1α, TGF-α, and VEGF-A. These findings corroborate and expand on earlier research highlighting the roles these factors play in the development of leukemia, offering novel insights for diagnostic and therapeutic strategies.

The first case of T-cell large granular lymphocyte leukemia (T-LGLL) with dual CD4+/CD8dim and CD4-/CD8+ phenotypes has been reported [47]. CD45RA/CD45RO is reciprocally expressed by normal blood CD4(+), CD8(+), and CD8(dim+) lymphocytes. Differences in CD45RA/CD45RO expression in CD4(+) proliferations may be relevant to differentiating benign and malignant CD4(+) expansions [48]. The pathogenesis and progression of T-cell lymphocytic leukemia (T-ALL) are influenced by BACH2-mediated CD28 regulation [49]. In this study, CD4+ CD8dim T cell absolute count was found to be a high-risk factor for leukemia based on IVW analysis, whereas CD28+ CD45RA- CD8+ T cell absolute count showed a protective effect against the development of leukemia. Nevertheless, research in this field is still scarce. Our findings offer theoretical support for future research by shedding more light on the connection between leukemia and immunophenotypes.

Assessing gut microbiota traits may be a useful indicator of hematological recovery in AML patients after induction treatment [50]. One study found that patients with high Lactobacillaceae abundances had higher pre-transplant levels of the antimicrobial peptide human beta-defensin 2 (hBD2), which was linked to a higher risk of moderate or severe acute graft-versus-host disease (aGvHD) and higher mortality [51].
According to one study, patients with chronic lymphocytic leukemia (CLL) and follicular lymphoma (FL) had higher levels of Ruminococcus and Lactobacillus after immunization [52]. A bidirectional MR analysis found a strong correlation between the abundance of the genus Butyrivibrio and the risk of lymphocytic leukemia [53]. According to the IVW results of this study, f_Lactobacillaceae, g_Lactobacillus, f_Veillonellaceae, and s_Butyrivibrio_crossotus showed protective effects, while increased levels of s_Ruminococcus_lactaris were positively associated with leukemia risk. However, studies of the relationships between these gut microbiota subtypes and leukemia are still lacking, especially with regard to Veillonellaceae. More investigation is required to confirm these results and explore the underlying processes.

One study reported that young women with breast cancer (BC) who had increased mRNA expression of KIAA1644 in both their normal breast tissues and BC tissues had considerably worse overall survival (OS) and recurrence-free survival (RFS) [54]. TMEM150C may be a biomarker for diabetic nephropathy (DN) [55]. BDKRB2 is essential for the interaction of oncogenic and anti-cancer processes in bladder cancer [56]. Although the connection between these genes and leukemia has not been investigated before, our results suggest a possible role for them in the pathophysiology of leukemia. Further studies to confirm these findings are worthwhile.

The p200CUX1-BMP8B-MAPK axis is essential for preserving the survival of AML cells [57]. A novel approach that targets the ROCK2/AKT/MAPK signaling pathway to overcome AML treatment resistance has been reported [58].
Chenodeoxycholic acid (CDCA) inhibits M2 macrophage polarization and the development of AML by synergistically promoting lipid droplet accumulation and lipid peroxidation through the ROS/p38 MAPK/DGAT1 pathway triggered by mitochondrial dysfunction in leukemia cells [59]. There is currently little research examining the connection between the MAPK signaling pathway and genes linked to gut microbiotas and leukemia, and none connecting the MAPK signaling pathway to genes linked to inflammatory factors or immune cells and leukemia. The KEGG pathway analysis in our study showed significant enrichment of genes connected to the MAPK signaling pathway across all of the genes linked to immune cells-leukemia, inflammatory factors-leukemia, and gut microbiotas-leukemia. Notably, the genes linked to inflammatory proteins-leukemia showed the highest enrichment, indicating that this pathway may play a crucial role in regulating the relationships between leukemia and these genetic factors. These results highlight the significance of the MAPK signaling pathway in leukemia etiology and serve as a reference for further study. More research into the mechanistic functions of the MAPK signaling pathway in these settings may improve our understanding of the intricate molecular networks underlying the development of leukemia and may also provide new treatment targets.

This study has several limitations that should be taken into account. First, a more thorough investigation of the causal links of circulating inflammatory proteins, immune cells, and gut microbiotas with leukemia was impeded by the absence of comprehensive demographic data and clinical presentation information. These restrictions, together with the lack of experimental confirmation and external validation, might compromise the accuracy and generalizability of the findings. As such, the results should be interpreted with caution.
Furthermore, the possible impact of unmeasured confounding factors cannot be ruled out even with a thorough literature review and the inclusion of numerous variables. Finally, although the enrichment analysis of SNP-mapped genes offers insights into potential pathways by which immune cells and gut microbiotas may affect leukemia, experimental validation is still required to corroborate these findings and show their biological importance.

This study has a number of noteworthy strengths. First, it is the first study to use two-sample MR analysis to investigate possible causal links of circulating inflammatory proteins, immune cells, and gut microbiotas with leukemia. MR is less susceptible to confounding variables and reverse causation, two common sources of bias in traditional observational research. Second, the study made use of the largest GWAS dataset to date, which covered a variety of populations and improved the generalizability of the findings. Furthermore, MR analysis has substantial epidemiological importance, and its application is anticipated to grow in the future. MR analysis will remain a useful approach for understanding the causal links between risk factors and disease outcomes as genetic data become more widely available and technology advances. The direction of causal networks was demonstrated by the study’s application of bidirectional MR analysis [60].

Conclusion

Our study thoroughly explored the relationships of inflammatory markers, immune cells, and gut microbiotas with leukemia, investigating causal links through a bidirectional MR analysis. By using bidirectional MR analysis, we addressed issues of reverse causality, thereby enhancing the accuracy of our causal inferences and reducing the impact of confounding factors. This approach not only provides practical insights for leukemia treatment but also offers robust genetic data that advance our understanding of leukemia’s pathophysiology and biological mechanisms.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary Material 1 (224.9KB, xlsx)

Abbreviations
aGvHD: Acute graft-versus-host disease
AML: Acute myeloid leukemia
APL: Acute promyelocytic leukemia
ATL: Adult T-cell leukemia/lymphoma
B-ALL: B-cell acute lymphoblastic leukemia
BC: Breast cancer
CDCA: Chenodeoxycholic acid
CDCP1: CUB domain-containing protein 1
CLL: Chronic lymphocytic leukemia
CML: Chronic myelogenous leukemia
CXCL9: C-X-C motif chemokine 9
DN: Diabetic nephropathy
DLBC: Diffuse large B-cell lymphoma
EA: Effective allele
EAF: Effective allele frequency
eQTL: Expression quantitative trait loci
FL: Follicular lymphoma
GWAS: Genome-wide association study
hBD2: Human beta-defensin 2
HGF: Hepatocyte growth factor
IL-6: Interleukin-6
IL-8: Interleukin-8
IL-10: Interleukin-10
IL-12: Interleukin-12
IV: Instrumental variable
IVs: Instrumental variables
IVW: Inverse variance weighting
KEGG: Kyoto Encyclopedia of Genes and Genomes
LD: Linkage disequilibrium
LGL: Large granular lymphocyte
MAPK: Mitogen-activated protein kinase
MIP-1α: Macrophage inflammatory protein 1α
ML: Memory-like
MR: Mendelian randomization
MR-Egger regression: Mendelian randomization-Egger regression
NK: Natural killer
OS: Overall survival
OSM: Oncostatin M
RFS: Recurrence-free survival
SNP: Single nucleotide polymorphism
SNPs: Single nucleotide polymorphisms
STAT3: Signal transducer and activator of transcription 3
T-ALL: T-cell lymphocytic leukemia
TGF-α: Transforming growth factor-alpha
T-LGLL: T-cell large granular lymphocyte leukemia
TNF-α: Tumor necrosis factor alpha
VEGF-A: Vascular endothelial growth factor A
WM: Weighted median

Author contributions
Conceptualization: LY.Z, XY.R, and YL.Z. Formal analysis: LY.Z, XY.R, YL.Z, SK.Y, and LZ.X. Writing-original draft preparation: LY.Z, XY.R, and YL.Z. Writing-review and editing: LY.Z and LZ.X. Supervision: LZ.X.

Funding
The study was funded by GuangDong Basic and Applied Basic Research Foundation (2023A1515010587).
Data availability
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

Declarations

Ethics approval and consent to participate
This study involved human participants only through secondary analysis of previously published data; thus, ethical approval was not necessary.

Competing interests
The authors declare no competing interests.

Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Linying Zhu, Xiaoyi Ruan and Yilun Zou have contributed equally to this work.

References