Abstract

Objectives

Nasopharyngeal carcinoma (NPC) is an aggressive malignancy with high rates of morbidity and mortality, largely because of its late diagnosis and metastatic potential. Lactate metabolism and protein lactylation are thought to play roles in NPC pathogenesis by modulating the tumor microenvironment and immune evasion. However, research specifically linking lactate-related mechanisms to NPC remains limited. This study aimed to identify lactate-associated biomarkers in NPC and explore their underlying mechanisms, with a particular focus on immune modulation and tumor progression.

Methods

We used a bioinformatics approach to analyse publicly available gene expression datasets related to NPC. Differential expression analysis identified differentially expressed genes (DEGs) between NPC and normal tissues. We performed weighted gene coexpression network analysis (WGCNA) to identify module genes significantly associated with NPC. Overlaps among DEGs, key module genes and lactate-related genes (LRGs) were analysed to derive lactate-related differentially expressed genes (LR-DEGs). Machine learning algorithms were used to screen potential biomarkers, and immune infiltration analysis was used to examine the relationships between the identified biomarkers and immune cell types, particularly M0 macrophages and B cells.

Results

A total of 1,058 DEGs were identified between the NPC and normal tissue groups. From this set, 372 key module genes associated with NPC were isolated. By intersecting the DEGs, key module genes and LRGs, 17 LR-DEGs were identified. Using three machine learning algorithms, this list was further refined, resulting in three primary lactate-related biomarkers: TPPP3, MUC4 and CLIC6. These biomarkers were significantly enriched in pathways related to “immune cell activation” and the “extracellular matrix environment”.
Additionally, M0 macrophages and B cells were found to be closely associated with these biomarkers, suggesting their involvement in shaping the NPC immune microenvironment.

Conclusion

In summary, this study identified TPPP3, MUC4 and CLIC6 as lactate-associated biomarkers linked to NPC, providing a foundation for advancing diagnostic and therapeutic strategies for this malignancy.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12967-024-05935-9.

Keywords: Nasopharyngeal carcinoma, Lactylation-related genes, Infiltrating immunocytes, Machine learning algorithms, GEO

Introduction

Nasopharyngeal carcinoma (NPC) is an aggressive malignancy arising from the epithelial cells of the nasopharynx, with a particularly high incidence in East Asia. NPC is characterized by complex interactions involving genetic predispositions, Epstein-Barr virus (EBV) infection, and environmental factors [1]. Despite advancements in NPC treatment, survival rates remain low, largely due to the high metastatic potential of this disease and the limited availability of biomarkers for its early detection [2, 3]. Metabolic reprogramming is a hallmark of many cancers, including NPC, where it manifests as a pronounced shift toward glycolysis even in oxygen-rich conditions, a phenomenon known as the Warburg effect [4]. This shift leads to an accumulation of lactate, which contributes to an acidic tumor microenvironment (TME), thereby promoting tumor growth, immune escape and therapy resistance.

Recent studies have shown that lactate, in addition to being a byproduct of glycolysis, serves as a key signalling molecule within tumors and inflamed tissues [5, 6]. Of particular interest is the role of lactate in a novel posttranslational modification (PTM) known as protein lactylation, where lactate groups bind to lysine residues on proteins, impacting gene expression and cellular function.
Protein lactylation has been shown to modulate immune responses, tumor cell proliferation and metabolic adaptation [7]. In NPC, lactylation may contribute to immune evasion and tumor aggressiveness by reprogramming immune cells within the TME, including M0 macrophages and regulatory B cells, to support a more immunosuppressive environment [8].

Lactate-driven metabolic shifts in NPC are facilitated by glycolytic enzymes such as hexokinase 2 (HK2) and pyruvate kinase M2 (PKM2), which drive glycolytic flux toward lactate production [9–11]. The rerouting of glycolysis intermediates to the pentose phosphate pathway (PPP) further supports tumor proliferation by providing ribose-5-phosphate for nucleotide synthesis and NADPH to counter oxidative stress [12]. This metabolic reprogramming is crucial for sustaining NPC growth and progression.

Given the significance of lactate metabolism and lactylation in NPC, identifying lactate-related hub genes (LRGs) could offer new insights into NPC pathogenesis and potential diagnostic biomarkers. In this study, we employed bioinformatics and machine learning techniques to analyse transcriptomic data from NPC samples, aiming to identify LRGs closely associated with glycolysis and immune modulation. By integrating immune infiltration analysis and constructing transcriptional networks, we aimed to elucidate the underlying mechanisms linking lactate metabolism to NPC progression, thus paving the way for innovative approaches for NPC diagnosis and treatment.

Materials and methods

Patients, tissue collection, and tissue microarray

From 2019 to 2021, a total of 85 tissue specimens, including 45 healthy nasal mucosa tissues and 40 NPC tissues, were collected from Shenzhen Hospital of Southern Medical University. A tissue microarray was constructed using these 85 tissue samples.
These NPC patients ranged in age from 33 to 78 years, comprising 28 male and 12 female patients. This study was approved by the Ethics Committees of Shenzhen Hospital of Southern Medical University. Written informed consent was obtained from all participants or their legal guardians.

Data acquisition

The research workflow is illustrated in Fig. 1. Gene expression profiles and clinical data from three East Asian population cohorts of NPC and normal samples (GSE12452, GSE64634 and GSE102349) were obtained from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/). The GSE12452 dataset comprises 10 normal and 31 NPC samples, whereas the GSE64634 dataset includes 4 normal and 12 NPC samples. These two datasets were combined and processed with batch correction via the ‘sva’ package to create a unified training set. For validation, the GSE102349 dataset, containing 113 NPC samples, was utilized. A total of 335 lactate-related genes (LRGs) were identified from the Molecular Signatures Database (http://www.gsea-msigdb.org/gsea/index.jsp) and supplemented by relevant literature.

Fig. 1. Flowchart of the study design

Analysis of differentially expressed genes

Principal component analysis (PCA) was performed via the ‘FactoMineR’ and ‘factoextra’ packages to evaluate the integrity of the training set. DEGs between the normal and NPC groups were identified via the ‘limma’ package with a threshold of |log2-fold change (FC)| > 0.5 and an adjusted p value < 0.05. Volcano plots were generated via the ‘ggplot2’ package to visualize the DEGs, and a heatmap of the top 50 DEGs was created via the ‘pheatmap’ package. To assess the activation or inhibition of biological pathways associated with these DEGs, pathways were classified as activated with a Z score > 2 and inhibited with a Z score < -2.
The GSEA provided insights into functionally relevant pathways, highlighting potential mechanisms underlying NPC progression and aiding in the identification of key targets for further study [13].

RNA extraction and quantitative real-time PCR (qPCR)

Total RNA was extracted using the RNA Trizol reagent (Invitrogen, Cat. No. 15596-026CN) following the manufacturer's protocol. cDNA synthesis was performed using the All-in-One First-Strand cDNA Synthesis Kit (TransGen, Cat. No. AE341-02) according to the manufacturer's instructions. The qPCR reactions were prepared using the PerfectStart Green qPCR SuperMix Kit (with Dye II, TransGen, Cat. No. AQ602-01), following the manufacturer's guidelines, and run on an ABI Prism 7500 system (ABI). The expression levels of target mRNAs were normalized to GAPDH as the internal control, and the relative expression differences were calculated using the 2^−ΔΔCt method, expressed as fold changes relative to the control group. The primer sequences used for RT-qPCR (5′ to 3′) were as follows:

TPPP3
* Forward: AAGTCTGCTCGGGTCATCAAC
* Reverse: GAGCCCGTGTATCTGCTGG

MUC4
* Forward: CAGGCCACCAACTTCATCG
* Reverse: ACACGGATTGCGTCGTGAG

CLIC6
* Forward: GGGACCCAACATCCCGAATC
* Reverse: TCAGGCAGAGGGCTATTTAAGT

GAPDH
* Forward: GTCAGCCGCATCTTCTTT
* Reverse: AGGCTGTTGTCATACTTCTC

Immunohistochemical staining

Tissue samples were fixed in 10% buffered formalin and subsequently processed into paraffin-embedded sections of 3–4 μm thickness. After antigen retrieval, the sections were incubated with specific primary antibodies followed by species-specific HRP-conjugated secondary antibodies. Signal development was performed using freshly prepared DAB solution, and nuclei were counterstained with hematoxylin [14]. The antibodies and their working concentrations used for IHC were as follows:

* CLIC6 Antibody (AFSBio, Cat. No. DF3934): 1:500
* TPPP3 Rabbit pAb (ABclonal, Cat. No. A6775): 1:1000
* MUC4 Rabbit mAb (ABclonal, Cat. No. A3438): 1:1000
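The relative quantification described in the qPCR subsection above (normalization to GAPDH, then the 2^−ΔΔCt fold change against the control group) can be illustrated with a minimal Python sketch. This is not the authors' code; the function name, argument names and the Ct values in the usage note are hypothetical and chosen only to make the arithmetic visible.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    All arguments are cycle-threshold (Ct) values. The 'ref' arguments are
    the internal control gene (GAPDH in the text); 'control' is the
    comparison group (normal tissue in the text). Names are hypothetical.
    """
    # dCt: target Ct normalized to the reference gene within each group
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # ddCt: normalized sample relative to normalized control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    # Each extra cycle corresponds to a two-fold lower starting amount
    return 2.0 ** (-delta_delta_ct)
```

For example, with hypothetical Ct values of 26 (target) and 18 (GAPDH) in a tumor sample and 24 and 18 in a normal sample, ΔΔCt is 2 and the fold change is 0.25, i.e. a four-fold lower expression in the tumor sample.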
WGCNA

WGCNA was performed on the top 25% most variable genes to identify gene coexpression modules linked to nasopharyngeal carcinoma (NPC). The soft-thresholding power was set to 12 to achieve a scale-free network with R² ≥ 0.85 [15, 16]. Modules were defined with a minimum size of 30 genes, each represented by a module eigengene (the primary expression component), which was subsequently correlated with NPC-related clinical traits.

Acquisition of LR-DEGs

The overlap of differentially expressed genes (DEGs), key module genes, and lactate-related genes (LRGs) was identified and termed lactate-related differentially expressed genes (LR-DEGs), which were visualized through a Venn diagram. Chromosomal locations of the LR-DEGs were mapped via RCircos (version 0.4.15), which marks target gene positions on the genome following its initialization. To investigate LR-DEG interactions, a protein-protein interaction (PPI) network was built in the STRING database, applying a confidence score threshold of > 0.75. Functional enrichment analysis, including Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses, was conducted on the LR-DEGs via the ‘clusterProfiler’ package (version 4.4.4), with statistical significance set at p.adjust < 0.05.

Machine-learning methods

Support vector machine recursive feature elimination (SVM-RFE), Friend, and random forest (RF) algorithms were employed to identify key genes in the training set [17–20]. The intersection of genes identified by these three methods was considered to represent critical genes. The diagnostic performance of these key genes was then evaluated by constructing a receiver operating characteristic (ROC) curve via the ‘pROC’ package [20–22].
Genes with an area under the curve (AUC) greater than 0.7, which also displayed consistent expression patterns between the training set and the external validation set (GSE102349), were classified as key biomarkers with high diagnostic potential for NPC.

Clinical nomogram model

Nomograms are extensively applied in clinical research to estimate the likelihood of individual clinical events. In this study, a nomogram incorporating selected biomarkers was constructed via the ‘rms’ package to predict NPC risk. The model’s clinical utility was assessed through decision curve analysis (DCA) [23, 24]. To evaluate the nomogram’s predictive accuracy, calibration curves and receiver operating characteristic (ROC) curves were generated.

GSEA and immune infiltration analysis

Single-gene gene set enrichment analysis (GSEA) was performed via the ‘clusterProfiler’ package to identify potential KEGG pathways associated with the biomarkers, with a significance threshold of p.adjust < 0.05. Additionally, the CIBERSORT algorithm was utilized to determine the relative abundances of 22 immune cell types present in the NPC microenvironment. Spearman correlation analysis was then conducted to explore the relationships between the biomarkers and the differentially abundant immune cells [25, 26]. The ‘estimate’ package was employed to calculate and compare the immune, stromal and ESTIMATE scores between the normal and NPC groups. Spearman correlation analysis between the biomarkers and these scores was performed via the ‘ggExtra’ package [27–29].

Construction of ‘TF-miRNA-gene’ networks

The TRRUST database was utilized to predict transcription factors (TFs) associated with key biomarkers, whereas the miRWalk database was used to identify miRNAs linked to these biomarkers [30]. The resulting TF-miRNA-gene interaction network was refined and visualized via Cytoscape software.
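The screening rule described under Machine-learning methods (keep only genes selected by all three algorithms, then keep those with AUC > 0.7) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' R pipeline: the function names, gene labels and data layout are hypothetical, the AUC is computed directly from pairwise comparisons rather than with ‘pROC’, and the additional requirement of consistent expression patterns in the validation set is not covered.

```python
def auc(case_values, control_values):
    """Area under the ROC curve as the probability that a randomly chosen
    case has a higher value than a randomly chosen control (ties count 0.5)."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


def screen_biomarkers(selections, expression, labels, threshold=0.7):
    """Intersect per-algorithm gene selections, then filter by AUC.

    selections: list of gene-name lists, one per feature-selection algorithm
    expression: dict mapping gene name -> list of expression values per sample
    labels:     list of 1 (case) / 0 (control), aligned with the samples
    All names here are hypothetical placeholders.
    """
    shared = set.intersection(*(set(genes) for genes in selections))
    kept = {}
    for gene in sorted(shared):
        values = expression[gene]
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        score = auc(cases, controls)
        # Direction-agnostic: a gene that is lower in cases is as
        # informative as one that is higher (assumption of this sketch).
        score = max(score, 1.0 - score)
        if score > threshold:
            kept[gene] = score
    return kept
```

The direction-agnostic step matters for this study because the reported biomarkers are downregulated in NPC, so a raw "higher in cases" AUC would fall below 0.5 even for a well-separating gene.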
Statistical analysis

All statistical analyses were conducted using R software (version 4.1.1) and Perl. A p-value of < 0.05 was considered statistically significant for all tests.

Results

Analysis of differentially expressed genes and functional roles in NPC

Box plot analysis indicated effective correction of the data samples, whereas principal component analysis (PCA) confirmed that batch effects were successfully removed in the merged dataset (training set) (Supplementary Figure S1A, B). As shown in Fig. 2A, B, a total of 1,058 differentially expressed genes (DEGs) were identified between the normal and NPC groups, with 682 genes downregulated and 376 genes upregulated. KEGG pathway analysis further revealed that these DEGs were involved in multiple pathways (Fig. 2C-F), including pathways related to the “cell cycle”, “mitosis” and “extracellular matrix and ECM receptor interaction”. Functional and disease pathway analyses revealed that the DEGs primarily participated in “intercellular signalling and interaction” as well as the “immune microenvironment”.

Fig. 2. Identification and analysis of lactate-related DEGs in nasopharyngeal carcinoma. (A) Volcano plot showing the DEGs between the NPC and normal groups, including 682 downregulated and 376 upregulated genes. (B) Heatmap showing the top 50 DEGs. (C-F) KEGG pathway analysis further revealed that these DEGs were involved in multiple pathways

Identification of key module genes associated with NPC

To identify NPC-associated modules, weighted gene coexpression network analysis (WGCNA) was applied. Sample clustering indicated no outliers (Supplementary Figure S2). A soft threshold power of 12 was selected as optimal, achieving a signed R² of 0.85, which supported a scale-free network topology as the average connectivity approached zero (Fig. 3A, B).
Using a dynamic tree-cutting algorithm with subsequent merging of similar modules, five distinct modules were identified (Fig. 3C-E). Among these, three modules (MEblue, MEbrown and MEyellow) were significantly correlated with NPC (|cor| > 0.6, P < 0.05) (Fig. 3C). From these modules, 372 key genes associated with NPC were extracted for further analysis (Fig. 3F-H).

Fig. 3. (A-B) Analysis of the scale-free topology fit index and mean connectivity across various soft-thresholding powers. (C-D) Hierarchical clustering produced a gene dendrogram, yielding 12 modules through the dynamic tree cut algorithm and subsequent merging of similar modules. (E) Heatmap indicating significant correlations between the MEblue, MEbrown, and MEyellow modules and NPC (|cor| > 0.3, P < 0.05). (F-H) Scatterplots of gene significance (GS) versus module membership (MM) reveal 372 key module genes associated with NPC

Identification and Functional Enrichment Analysis of Lactate-related DEGs

Seventeen lactate-related differentially expressed genes (LR-DEGs) were identified in NPC by intersecting the DEGs, key module genes, and lactate-related genes (LRGs) (Fig. 4A). Protein-protein interaction (PPI) analysis revealed that SLPI, LCN2, BPIFB1 and PIGR extensively interact with other proteins (Fig. 4B). The LR-DEGs are distributed across all chromosomes except chromosomes 1, 3, 7, 9, 12, 16, 17, 18, 20 and 21 (Fig. 4C). To investigate the functional roles of the LR-DEGs in NPC, functional enrichment analysis was performed. KEGG pathway analysis revealed significant enrichment in pathways such as “metabolic pathways”, “cytokine signalling” and the “extracellular microenvironment” (Fig. 4D). Gene Ontology (GO) analysis indicated that the LR-DEGs were associated mainly with “immune response”, “immune receptor activation” and “extracellular matrix organization” (Fig. 4E-G).

Fig. 4.
Acquisition and functional analysis of lactate-related differentially expressed genes (LR-DEGs). (A) Venn diagram illustrating the overlap of 17 LR-DEGs among DEGs, key module genes and lactate-related genes (LRGs) in NPC. (B) Protein-protein interaction (PPI) network of LR-DEGs. (C) Chromosomal distribution of the LR-DEGs visualized in a circular format. (D) KEGG pathway enrichment analysis of LR-DEGs. (E-G) Gene Ontology (GO) enrichment analysis of LR-DEGs

Development and validation of an LR-DEG-based Signature for NPC

To identify key genes, support vector machine recursive feature elimination (SVM-RFE) analysis was conducted on the 17 lactate-related DEGs (LR-DEGs), resulting in nine prominent genes: TPPP3, DNALI1, CHST9, MUC4, CLIC6, SLPI, KIF18B, FAM3D and PIGR (Fig. 5A, B). Moreover, the random forest (RF) algorithm identified the top 10 characteristic genes, including CLIC6, MUC4, CHST9, TPPP3, DNALI1, RACGAP1, FAM3D, PARPBP, INHBA and SLPI (Fig. 5C, D). Additionally, the Friend algorithm selected 10 critical genes: INHBA, LTF, LCN2, SLPI, FAM3D, RACGAP1, BPIFB1, MUC4, KIF18B and CD1C (Fig. 5E). The cross-referencing results from these three algorithms highlighted three core genes: TPPP3, MUC4 and CLIC6 (Fig. 5F). In external validation, TPPP3, MUC4 and CLIC6 showed strong diagnostic and predictive values for NPC, with AUC values over 0.7, and their expression trends were consistent with those of the training set, establishing them as lactate-related biomarkers for NPC (Fig. 5G, H). Analysis of these genes in both the training and validation datasets revealed their downregulation in NPC samples compared with normal controls, with further decreased expression observed in the PD (progressive disease) group relative to the SD (stable disease) group (Fig. 5I, J). We also validated the expression of these three genes in clinical specimens.
Both qPCR (including 10 paired NPC and normal tissues) and tissue array (including 40 NPC and 45 normal tissues) experiments indicated that the expression levels of TPPP3, MUC4 and CLIC6 were significantly downregulated in tumor tissues compared to normal nasopharyngeal tissues (Fig. 5K, L).

Fig. 5. Screening of lactate-related biomarkers for NPC. (A, B) Core gene identification within LR-DEGs via the SVM-RFE algorithm. (C, D) Core gene screening of the LR-DEGs via the RF algorithm. (E) Core gene screening with the Friend algorithm. (F) Venn diagram showing three key LR-DEGs identified at the intersection of all three machine learning algorithms. (G, H) ROC curves for the three key genes in both the training set and the external validation set. (I, J) Expression levels of TPPP3, MUC4 and CLIC6 across groups in the training and validation sets. (K) qPCR validation of TPPP3, MUC4 and CLIC6 mRNA expression levels in nasopharyngeal carcinoma (NPC) (n = 10) and normal nasopharyngeal tissues (NP) (n = 10). (L) Immunohistochemical validation of TPPP3, MUC4 and CLIC6 protein expression levels in nasopharyngeal carcinoma (n = 40) and normal nasopharyngeal tissues (n = 45). ns, not significant. *p < 0.05, **p < 0.01, ****p < 0.0001

Clinical relevance and functional enrichment analysis of NPC biomarkers

To further investigate the link between the identified biomarkers and NPC, a nomogram incorporating these biomarkers was constructed (Fig. 6A). The model’s performance and accuracy were validated through calibration, receiver operating characteristic (ROC) and Kaplan-Meier (KM) curves (Fig. 6B-D). Additionally, decision curve analysis (DCA) was employed to assess the clinical utility of the 1-year and 3-year prognostic models (Fig. 6E, F). To explore the roles of TPPP3, MUC4 and CLIC6 in NPC pathogenesis, single-gene gene set enrichment analysis (GSEA) was performed for each biomarker.
KEGG pathway enrichment revealed associations with key pathways, including “antigen processing and presentation”, “extracellular matrix organization” and “immune microenvironment regulation” (Fig. 6G-I).

Fig. 6. Clinical and functional enrichment analysis of NPC biomarkers. (A) Nomogram incorporating identified biomarkers for clinical prediction. (B-D) Calibration, receiver operating characteristic (ROC) and Kaplan-Meier (KM) curves were used to validate the nomogram’s predictive accuracy. (E-F) Decision curve analysis (DCA) confirmed the clinical utility of the prognostic model. (G-I) Single-gene GSEA for TPPP3, MUC4 and CLIC6 revealed associations with pathways related to “antigen processing and presentation”, “extracellular matrix organization” and “immune microenvironment regulation”.

Influence of TPPP3, MUC4, and CLIC6 on the immune microenvironment in NPC

Considering the link between NPC pathophysiology and the immune microenvironment, we further investigated immune cell distributions within NPC. Using three computational algorithms (CIBERSORT, ssGSEA and xCell), we analysed immune cell composition (Fig. 7A-C). The abundances of immune cells in the training set were assessed with CIBERSORT and ssGSEA (Fig. 7D, E). Notably, 14 immune cell types, including naïve and memory B cells, CD8+ T cells, naïve and resting CD4+ T cells, activated memory CD4+ T cells, follicular helper T cells, M1 macrophages, resting dendritic cells, aDCs, Th1 cells, NK CD56 dim cells, Th17 cells, Th2 cells, NK CD56 bright cells and mast cells, were significantly different in abundance between the NPC samples (Fig. 7D, E).

Fig. 7. Immune infiltration analysis of lactate-related biomarkers in NPC. (A-C) Relative proportions of infiltrating immune cells in NPC, as determined by the CIBERSORT, ssGSEA and xCell algorithms.
(D-E) Significant differences in the abundances of 23 distinct immune cell types in NPC samples. (F-I) Correlation analysis between lactate-related biomarkers and 23 immune cell types, illustrating their associations. ns, not significant. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001

Further analysis revealed strong correlations between key biomarkers (MUC4, CLIC6 and TPPP3) and specific immune cells, particularly B cells and M0 macrophages (Fig. 7F-I). Furthermore, we observed that the stromal, immune, ESTIMATE and microenvironment scores were significantly greater in the SD group than in the PD group (Fig. 8A). These scores were positively correlated with the expression levels of TPPP3, MUC4 and CLIC6, suggesting an association between these biomarkers and the tumor microenvironment (Fig. 8B).

Fig. 8. Analysis of biomarker roles within the NPC immune microenvironment. (A) Stromal, immune, ESTIMATE and microenvironment scores were significantly higher in NPC patients in the SD group than in those in the PD group. (B) Correlation analysis revealed that these four scores were positively correlated with the expression of TPPP3, CLIC6 and MUC4, suggesting that these biomarkers are associated with immune microenvironment characteristics. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001

Analysis of the regulatory network and drugs used in NPC

To investigate the regulatory mechanisms of TPPP3, MUC4 and CLIC6, a “TF-miRNA-gene” network comprising 28 nodes and 34 edges was constructed (Fig. 9A). In this network, CREB1 was identified as a likely regulator of MUC4 and TPPP3, whereas hsa-miR-574-5p appeared to regulate CLIC6 expression. Additionally, drug candidates targeting TPPP3, MUC4 and CLIC6 were identified through the DrugBank and ChEMBL databases. The resulting biomarker-drug interaction network, shown in Fig. 9B, includes 32 nodes and 71 edges.
Notably, gefitinib and bortezomib were predicted to be candidate drugs shared by all three biomarkers, suggesting their potential significance in NPC treatment.

Fig. 9. Analysis of the role of biomarkers in NPC. (A) ‘TF-miRNA-gene’ network presenting the regulatory mechanisms of TPPP3, MUC4 and CLIC6. (B) Relationships between biomarkers and drugs predicted from the DrugBank and ChEMBL databases

Discussion

Nasopharyngeal carcinoma (NPC) is a malignant tumor originating from the epithelial cells of the nasopharynx; its distinct geographic distribution is strongly linked to Epstein-Barr virus (EBV) infection [3, 12, 31]. Clinically, NPC poses significant challenges because of its typical late-stage diagnosis, aggressive behavior and high metastatic potential [32]. Uncovering the molecular mechanisms underlying NPC is essential for advancing diagnostic and therapeutic strategies. Recent studies have highlighted the role of metabolic reprogramming (specifically lactate modification) and immune dysfunction in NPC pathogenesis.

Lactate, which has long been considered a byproduct of anaerobic metabolism, has emerged as a key factor in the tumor microenvironment. The Warburg effect, which is commonly observed in many cancers, describes the preference of cancer cells for glycolysis over oxidative phosphorylation, even in oxygen-sufficient conditions [33–35]. This shift results in elevated lactate levels, leading to an acidic microenvironment that can modulate immune responses [36]. Elevated lactate in tumors has been shown to promote immune evasion by altering the function of immune cells, including macrophages and T cells, underscoring the importance of understanding the influence of lactate on immune regulation within NPC.

Our study focused on the lactate-related modifications of three key genes (TPPP3, CLIC6 and MUC4), each of which plays an important role in NPC progression.
TPPP3, which is involved in microtubule dynamics, cell proliferation, and apoptosis, has been associated with various cancers, including NPC, where it may contribute to tumor growth and metastasis [37–41]. CLIC6, a chloride intracellular channel protein, plays a role in ion transport and cellular signalling, potentially influencing immune cell activation and inflammation [42–44]. Its altered expression in cancers suggests that it may contribute to immune evasion mechanisms in NPC. MUC4, a mucin family member, supports epithelial cell protection and signalling [45–48]. In NPC, MUC4 is associated with tumor growth and metastasis through its roles in cell adhesion and immune evasion. The involvement of these genes in lactate metabolism could provide insight into how lactate influences immune suppression in NPC by affecting antigen presentation and immune recognition.

Our research identified TPPP3, CLIC6 and MUC4 as core lactate-related genes with diagnostic potential in NPC [24, 49, 50]. These genes are significantly associated with pathways such as “antigen processing and presentation”, “extracellular matrix organization” and “immune microenvironment regulation”, which are crucial for initiating immune responses against tumors [51–53]. The influence of lactate on these pathways may enable NPC cells to evade immune detection, thus facilitating tumor progression.

Additionally, we found a significant correlation between these biomarkers and immune cell populations, particularly M0 macrophages and regulatory B cells [54]. M0 macrophages, which exhibit phenotypic plasticity, can differentiate into subtypes on the basis of the microenvironment [4, 53, 55, 56]. In NPC, elevated lactate may skew M0 macrophages toward an immunosuppressive phenotype, further promoting tumor growth.
Regulatory B cells, which are known for maintaining immune homeostasis, suppress effector T-cell responses and contribute to immune evasion in tumors. The associations of TPPP3, CLIC6 and MUC4 with these immune cell types highlight the role of lactate in modulating the immune microenvironment of NPC.

This study provides key insights into the role of lactate in NPC and its effects on immune function. By identifying TPPP3, CLIC6 and MUC4 as lactate-related biomarkers [37–41, 57–66], we highlight their importance in understanding NPC pathogenesis. Furthermore, the involvement of these genes in immune modulation presents new therapeutic opportunities. Targeting lactate metabolism or the pathways associated with these biomarkers could improve treatment efficacy and patient outcomes.

Despite our best efforts, we acknowledge that future research will need to address three significant aspects. (1) Given that the mechanisms of diseases related to lactate and lactylation have become emerging hot topics in recent years, we plan to perform lactate treatment experiments in cell lines to study the underlying molecular regulatory mechanisms once research funding is secured. (2) Biomarkers should be closely linked to clinical applications. To better verify the reliability and potential significance of our data, we have initially validated the expression levels of TPPP3, MUC4 and CLIC6 in NPC specimens and normal tissues using qPCR (Fig. 5K), potentially supporting the transcriptional regulation of these genes in NPC. Furthermore, we have validated our results by using tissue microarrays (Fig. 5L). We believe that this should provide useful insights and suggestions for the future research of our team and other colleagues on the regulation mechanism of lactylation in NPC. (3) Biomarkers should be closely linked to clinical applications.
Among the three proteins mentioned, we found that MUC4 is a secreted protein and can potentially be detected in serum samples. The other two proteins are not secretory, but they may potentially have clinical utility in immunohistochemistry analyses by pathologists. They could likely be used for clinical biopsies or postoperative assessment of prognosis or treatment effectiveness in the future.

In conclusion, this research identified three lactate-related core genes (TPPP3, CLIC6 and MUC4) with significant diagnostic potential in NPC. These genes may contribute to NPC pathogenesis through immune modulation mechanisms, particularly those involving M0 macrophages and B cells. By clarifying the relationships among lactate, immune function, and these biomarkers, this study lays the groundwork for the development of new diagnostic and therapeutic strategies for NPC. Further studies are needed to explore how lactate impacts the immune microenvironment in NPC and to validate the clinical relevance of these biomarkers for diagnosing and treating this challenging malignancy.

Conclusion

In summary, this study identified three lactate-related hub genes (TPPP3, MUC4 and CLIC6) as promising biomarkers for diagnosing and predicting disease progression in nasopharyngeal carcinoma (NPC). These genes appear to play roles in NPC pathogenesis through pathways related to “antigen processing and presentation”, “extracellular matrix organization” and “immune microenvironment regulation”. Furthermore, associations between these biomarkers and immune cells, particularly M0 macrophages and regulatory B cells, were observed. These findings underscore the critical influence of lactate on NPC progression, offering a foundation for the future development of diagnostic and therapeutic approaches targeting these biomarkers for NPC management.

Electronic supplementary material

Below is the link to the electronic supplementary material.
Supplementary Material 1 (6.6MB, tif)
Supplementary Material 2 (5.8MB, tif)

Acknowledgements