Abstract Recent research emphasised the indispensable role of histone lactylation in the activation of hepatic stellate cells. The VHL mutation is extremely common in clear cell renal cell carcinoma, which normally causes a metabolic shift in cancer cells and increases lactate production, eventually creating a lactate-enriched tumour microenvironment. Cancer-associated fibroblasts (CAFs) promote tumour progression, which is also vital in clear cell renal cell carcinoma. Therefore, this study investigated histone lactylation in CAFs and its impact on patient survival. Multiomics technology was employed to determine the role of histone lactylation-related genes in the evolution of CAFs which correlated with the function and molecular signatures of CAFs. The results suggested that TIMP1 was the hub gene of histone lactylation-related genes in clear cell renal cell carcinoma. Keywords: Histone lactylation, Cancer-associated fibroblasts, Tumour microenvironment, TIMP1 Highlights * • This work provided a novel angle into the connection of histone lactylation and the evolution of CAFs. * • This work defined three histone lactylation stages of CAFs, which had distinct molecular and functioning modalities. * • Based on multi-omics data, this work illustrated the distribution and molecular patterns of high-CAF infiltrated area. * • Using machine learning methods, this work defined TIMP1 as the hub gene in the histone lactylation related genes. 1. Introduction In 2022, kidney and renal pelvis cancer comprised 5 per cent of estimated new cases, ranking sixth among all cancer types [[49]1]. Clear cell renal cell carcinoma (ccRCC) is the most common type of renal cell carcinoma, accounting for more than 70 per cent of RCC. VHL mutations occur in most ccRCC cases and induce cancer cells to undergo metabolic reprogramming to an aerobic glycolysis phenotype [[50]2,[51]3], generating a lactate-enriched tumour microenvironment (TME). A recent study disclosed the significance of histone lactylation in hepatic stellate cell activation and identified 1126 H3K18la-marked genes [[52]4]. In addition, Yang et al. elucidated that the coupling of inactive VHL and histone lactylation induced platelet-derived growth factor receptor β (PDGFRβ)promoting tumour progression, and inhibition of histone lactylation and PDGFRβ impeded tumour growth and improved therapeutic efficacy in ccRCC [[53]5]. Since the concept of histone lactylation was proposed [[54]6], its importance has been corroborated in numerous cancer types including ocular melanoma [[55]7], prostate cancer [[56]8], liver cancer [[57]9] and ccRCC [[58]10]. However, few studies have investigated the role of histone lactylation in cancer-associated fibroblasts (CAFs) in ccRCC. Considering their great value in shaping the TME [[59]11,[60]12] and assisting tumour resistance to immunotherapy [[61]13,[62]14], it is necessary to investigate the plasticity of CAFs to histone lactylation. This study employed multiomic technologies to investigate the significant role of histone lactylation-related genes in the evolution of CAFs. The evolutionary process of CAFs with histone lactylation-related genes was explored using single-cell RNA sequencing (scRNA-seq) data and three stages were defined based on the gene expression trajectories. The distribution modalities and gene expression patterns were explored in different molecular clusters, where common patterns of histone lactylation-related genes were observed in the high myoCAF infiltrated region. Moreover, patients in the TCGA-KIRC cohort were clustered into high and low histone lactylation subgroups with significant differences in clinical status and immune signatures. Finally, TIMP1 was identified as the hub gene, and it alone or together with its receptors had significant value in predicting patient overall survival which was gender-related. 2. Methods 2.1. Data acquisition The bulk RNA sequencing results and clinical data of TCGA-KIRC (tumour: 541, normal: 72) were downloaded from the Cancer Genome Atlas (TCGA) database ([63]https://www.cancer.gov/tcga) with the E-MTAB-1980 cohort containing 101 ccRCC samples utilised for validation. Notably, the raw count format data of TCGA-KIRC was used for the detection of differentially expressed genes (DEGs), while the transcripts per kilobase million (TPM) format data was utilised for other analyses. For genes with multiple sequencing results, the highest RNA expression was utilised. The single-cell RNA sequencing (scRNA-seq) data, single-cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) data and Spatial Transcriptomics (ST) data can be accessed in [64]GSE207493 [[65]15], [66]GSE210042 [[67]10] and [68]GSE175540 [[69]16]. The immunomodulatory genes were generated and revised from previous research [[70]17]. 2.2. Differentially expressed gene extraction The Wilcoxon test and R package Limma [[71]18] were used to detect differentially expressed histone lactylation-related genes in tumour and normal tissues, with the criteria set as the absolute logFC greater or equal to 0.2 and a false discovery rate (FDR) lower than 0.05. Random forest was performed using the R package randomForest [[72]19], and the uniCox and Least absolute shrinkage and selection operator (LASSO) was based on the R package glmnet [[73][20], [74][21], [75][22]]. 2.3. Calculation of enrichment scores Genes used to calculate the histone lactylation score were obtained from the study by Rho et al. [[76]4] which identified 1126 genes marked by H3K18la in the activation of hepatic stellate cells. These gene symbols were transformed to yield 1118 genes with corresponding gene symbols, referred to as the histone lactylation-related gene (HLRG) list, which was used to calculate the histone lactylation score (see Supplemental materials). The normal fibroblast and CAF-related signatures were generated from R package IOBR [[77]23] including Normal.Fibroblast, ecm.myCAF, TGFb.myCAF, wound.myCAF, detox.iCAF, IL.iCAF and IFNG.iCAF. The histone lactylation scores and CAF-related signature scores in scRNA-seq and ST data were calculated using AddModuleScore based on R package Seurat [[78][24], [79][25], [80][26], [81][27]], while the histone lactylation scores were calculated using Gene Set Variation Analysis (GSVA) in the bulk RNA sequencing data based on the 380 upregulated genes in tumour tissue according to the R package GSVA [[82]28]. The estimated CAF-related signature scores in bulk RNA sequencing data were obtained using R package IOBR based on the single sample gene set enrichment analysis (ssGSEA) algorithm. 2.4. Survival analysis The prognostic analysis was implemented using the R packages survival and survminer [[83]29]. The statistically significant criterion was set as p-value <0.05 and the surv_cutpoint was used to calculate the optimal cutoff value. A nomogram was constructed based on the R package nomogramFormula which was drawn using DynNom [[84]30]. 2.5. Correlation analysis The correlation analysis was conducted using the R package ggplot2 [[85]31] based on spearman correlation method, and the statistically significant criterion was set as p-value <0.05. 2.6. Processing of the single-cell RNA sequencing data The R package Seurat [[86][24], [87][25], [88][26], [89][27]] was used to filter out low-quality cells in [90]GSE210038, retaining cells with feature counts <3500 and a percentage of mitochondrial reads <20 %. Then, FindIntegrationAnchors (with the reduction parameter set as “rpca”) and the IntegrateData command were used for data integration and batch effect removal before the RunPCA and RunUMAP commands were used for dimensionality reduction. The 30 most remarkable principal components and the top 2000 highly variable genes were filtered and clustered using the FindNeighbors and FindClusters commands with a resolution of 0.8. The canonical marker genes to identify clusters were obtained from the literature. For the re-clustering of MSCs, the resolution parameter was set to 0.1 to avoid excessive distinction. Notably, for scRNA-seq data of [91]GSE207493, the raw data was first filtered with the following criteria: 1) after clustering, the space between each cell cluster should be distinct; 2) the absolute cell number of ACTA2 positive clusters should be greater than 200; 3) the cell proportion of ACTA2 positive clusters should be greater than 0.07. Finally, 9 of 19 samples were selected with a comparatively sufficient number of MSCs for further study ([92]Fig. S1). The filtering criteria were set as a feature count <7500, a unique molecular identifier (UMI) count ranging from 1500 to 10000 and a percentage of mitochondrial reads <10 %. The integration and dimensionality reduction methods were consistent with [93]GSE210038 and the resolution parameter was set to 0.6. 2.7. Evaluation of tissue distribution preferences