Abstract
This study aims to investigate the heterogeneity of metabolic reprogramming in hepatocellular carcinoma (HCC) cells and to identify novel therapeutic targets for HCC treatment. Single-cell RNA sequencing data from public databases were used to analyze the tumor microenvironment (TME) of HCC and to characterize its cell subsets, including mononuclear phagocytes, epithelial cells, endothelial cells, NK/T cells, B cells, and unknown cells. The analysis revealed that these cell subsets play distinct roles in tumor progression and immune escape. Copy number variation (CNV) analysis was performed on tumor-derived epithelial cells, and the Cluster 3 epithelial subgroup showed the highest CNV levels. Gene Ontology (GO) enrichment analysis revealed that the cell subsets were involved in a variety of biological processes, such as immune response, cell communication, and metabolic pathways, consistent with their functional roles. Pseudotemporal analysis further delineated the malignant trajectory of HCC cells, with Cluster 3 exhibiting enhanced phosphatidylinositol metabolism, suggesting a critical role for metabolic reprogramming in tumor invasion and proliferation. Furthermore, a diagnostic model incorporating metabolic reprogramming-associated gene signatures was established, which effectively distinguished HCC from normal tissues. Among these signatures, splicing factor 3a subunit 3 (SF3A3) was identified as both a diagnostic and an independent prognostic biomarker. Mechanistically, SF3A3 knockdown in HCC cell lines significantly suppressed proliferation, migration, PI3K/AKT signaling, and EMT marker expression, demonstrating its role in driving HCC aggressiveness. In conclusion, these findings elucidate novel molecular characteristics of HCC based on metabolic reprogramming and establish SF3A3 as a promising multi-faceted target for HCC diagnosis, prognostic assessment, and therapeutic intervention.
Introduction
Hepatocellular carcinoma (HCC) is a malignancy with high incidence and mortality worldwide [1]. According to global cancer statistics, HCC has the highest incidence among liver tumors, and its associated mortality has been steadily increasing over the years [2]. The development of HCC is related to various risk factors, including chronic hepatitis virus infection, cirrhosis, alcohol abuse, and metabolic syndrome [3]. Metabolic reprogramming refers to the extensive adjustments that cells make to their energy and material metabolism under specific physiological or pathological conditions [4]. In recent years, metabolic reprogramming has been increasingly recognized as significant in the initiation and progression of HCC [5]. Metabolic reprogramming contributes to the biological heterogeneity of HCC, which exhibits distinct biological characteristics and clinical outcomes in different patients [6]. In HCC, metabolic reprogramming is primarily manifested as significant alterations in glucose, lipid, and amino acid metabolic pathways [7,8]. These metabolic changes enable tumor cells to favor aerobic glycolysis, even under normoxic conditions [9]. Phosphatidylinositol and its phosphorylated derivatives serve as pivotal signaling molecules that orchestrate diverse cellular processes via dynamic metabolic remodeling [10]. Aberrant phosphatidylinositol remodeling promotes tumor progression through constitutive activation of oncogenic pathways, particularly the PI3K/AKT/mTOR axis, and through extensive metabolic reprogramming [11,12]. The PI3K/AKT pathway plays a crucial role in metabolic reprogramming in HCC [13].
This signaling axis promotes the Warburg effect and metabolic reprogramming in HCC through coordinated activation of the glycolytic enzymes HK2, PFK-1, and PKM2, concurrent suppression of PDK1-dependent mitochondrial oxidative metabolism, and synergistic integration with oncogenic signals including CD36/Src upregulation, SHH overexpression, and PTEN loss [14]. These oncogenic metabolic reprogramming events are mediated by the precisely regulated phosphorylation of phosphatidylinositol by specific phosphatidylinositol kinases, with phosphoinositide 3-kinase (PI3K) serving as the central effector that coordinates metabolic alterations with proliferative signaling cascades in HCC [15]. The rapid advancement of single-cell RNA sequencing (scRNA-seq) technology has paved the way for systematic dissection of the distinct metabolic landscapes of diverse cell populations within HCC [16]. In this study, we integrated single-cell multi-omics data to systematically identify critical metabolic reprogramming-associated gene signatures and verified the function of SF3A3 in HCC cells. We found that marked metabolic heterogeneity exists within tumor epithelial cells and that SF3A3 can serve as a biomarker for diagnosis and prognosis assessment as well as a potential therapeutic target in HCC.

Materials and methods

Tumor data acquisition
The single-cell sequencing data and gene expression data used in this study were obtained from the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA), and the International Cancer Genome Consortium (ICGC).
From the GEO database, we downloaded the following gene expression datasets: GSE22058 (100 tumor samples, 97 normal samples), GSE112790 (183 tumor samples, 15 normal samples), GSE101685 (24 tumor samples, 8 normal samples), GSE84402 (14 tumor samples, 14 normal samples), GSE62232 (81 tumor samples, 10 normal samples), GSE121248 (70 tumor samples, 37 normal samples), GSE36376 (240 tumor samples, 193 normal samples), GSE14520 (225 tumor samples, 220 normal samples), GSE76427 (115 tumor samples, 52 normal samples), and GSE10143 (80 tumor samples, 307 normal samples). The single-cell datasets GSE149614 and GSE151530 were also retrieved from the GEO database. Gene expression profiles and clinical information for 369 HCC samples and 50 adjacent normal tissue samples were obtained from the TCGA database, and expression profiles and clinical information for 243 tumor tissues and 202 normal tissues were downloaded from the ICGC database.

Preliminary scRNA-seq analysis
scRNA-seq analysis was conducted using datasets GSE149614 and GSE151530. During processing of the scRNA-seq data, we retained high-quality cells that met the following criteria: mitochondrial gene expression accounted for less than 10%, red blood cell gene expression accounted for less than 3%, and the number of expressed genes ranged from 200 to 8000. We used the three-step “NFS” strategy (NormalizeData, FindVariableFeatures, and ScaleData) for data normalization. Cell cycle gene normalization was performed using the “cc.genes.updated.2019” dataset from the “Seurat” package, and potential doublets were removed using the “DoubletFinder” package. In the remaining cells, 2000 highly variable genes were identified using the “FindVariableFeatures” function. Subsequently, principal component analysis (PCA) was used to reduce the dimensionality of the scRNA-seq data.
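The cell-level quality-control thresholds described above can be illustrated with a minimal sketch. This is not the pipeline used in the study, which relied on the R package “Seurat”; it is a plain-Python illustration of the filtering logic only. The names `CellQC`, `passes_qc`, and `filter_cells`, the example barcodes, and the per-cell values are hypothetical. Treating the 200 to 8000 gene range as inclusive at both ends is also an assumption, since the text does not state how the boundaries were handled.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC metrics (hypothetical container, not a Seurat object)."""
    barcode: str
    n_genes: int      # number of genes detected in the cell
    pct_mito: float   # percentage of expression from mitochondrial genes
    pct_rbc: float    # percentage of expression from red blood cell genes


def passes_qc(cell, max_pct_mito=10.0, max_pct_rbc=3.0,
              min_genes=200, max_genes=8000):
    """Return True if the cell meets all three criteria from the text.

    Percent thresholds are strict ("less than"); the gene-count range is
    assumed to be inclusive at both ends.
    """
    return (
        cell.pct_mito < max_pct_mito
        and cell.pct_rbc < max_pct_rbc
        and min_genes <= cell.n_genes <= max_genes
    )


def filter_cells(cells):
    """Keep only the cells that pass QC, preserving input order."""
    return [c for c in cells if passes_qc(c)]


# Made-up example cells, one failing each criterion in turn.
cells = [
    CellQC("cell_a", n_genes=1500, pct_mito=4.2, pct_rbc=0.1),   # passes
    CellQC("cell_b", n_genes=120, pct_mito=3.0, pct_rbc=0.0),    # too few genes
    CellQC("cell_c", n_genes=3000, pct_mito=15.0, pct_rbc=0.2),  # high mito
    CellQC("cell_d", n_genes=9000, pct_mito=2.0, pct_rbc=0.0),   # too many genes
    CellQC("cell_e", n_genes=2500, pct_mito=5.0, pct_rbc=4.0),   # high RBC
]

kept = filter_cells(cells)
print([c.barcode for c in kept])  # prints ['cell_a']
```

In a Seurat workflow the same thresholds would typically be applied to per-cell metadata columns rather than to a list of records; the sketch only makes the three conditions and their conjunction explicit.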
To eliminate batch effects between samples, the “Harmony” package, which performs soft k-means clustering, was employed. The single-cell data were split by origin into tumor and normal cells, which were then clustered separately. Cell clustering was carried out using the “FindClusters” function with the resolution parameter set to 0.31. Ultimately, 28,906 cells were classified as tumor cells and 7,694 cells as normal cells. Cells were annotated and divided into subgroups based on the results of dimensionality reduction and clustering. Epithelial cell subgroups were extracted, and the epithelial subgroups derived from tumor cells and normal cells were supplied to the “inferCNV” package for computation. To accurately identify malignant cells that exhibit clonally extensive chromosomal copy number variations (CNVs), the epithelial subgroups derived from normal cells were used as references to infer CNV profiles. K-means