Abstract

   Hepatocellular carcinoma (HCC) is a major cause of cancer‐related
   mortality, while the hepatocyte mechanisms driving oncogenesis remains
   poorly understood. In this study, single‐nucleus RNA sequencing of
   samples from 22 HCC patients revealed 10 distinct hepatocyte subtypes,
   including beneficial Hep0, predominantly malignant Hep2, and
   immunosuppressive Hep9. These subtypes were strongly associated with
   patient prognosis, confirmed in TCGA‐LIHC and Fudan HCC cohorts through
   hepatocyte composition deconvolution. A quantile‐based scoring method
   is developed to integrate data from 29 public HCC datasets, creating a
   Quantile Distribution Model (QDM) with excellent diagnostic accuracy
   (Area Under the Curve, AUC = 0.968‐0.982). QDM was employed to screen
   potential biomarkers, revealing that PDE7B functions as a key gene
   whose suppression promotes HCC progression. Guided by the genes
   specific to Hep0/2/9 subtypes, HCC is categorized into metabolic,
   inflammatory, and matrix classes, which are distinguishable in gene
   mutation frequencies, survival times, enriched pathways, and immune
   infiltration. Meanwhile, the sensitive drugs of the three HCC classes
   are identified, namely ouabain, teniposide, and TG‐101348. This study
   presents the largest single‐cell hepatocyte dataset to date, offering
   transformative insights into hepatocarcinogenesis and a comprehensive
   framework for advancing HCC diagnostics, prognostics, and personalized
   treatment strategies.

   Keywords: hepatocellular carcinoma classification, hepatocyte subtype,
   metabolic pathway, quantile distribution model, single‐cell
   transcriptomics
     __________________________________________________________________

   By combining snRNA‐seq and bulk RNA‐seq, this study identified key
   hepatocyte subtypes, genes, and pathways in HCC. It established a
   Quantile Distribution Model, providing potential targets for HCC
   diagnosis and treatment. It categorized HCC into metabolic,
   inflammatory, and matrix classes, each with sensitive drugs identified.
   The findings offer insights into HCC's oncogenic landscape and future
   applications.

   graphic file with name ADVS-12-2412944-g007.jpg

1. Introduction

   Hepatocellular carcinoma (HCC) is the predominant histological subtype
   of primary liver cancer, accounting for over 80% of cases, and ranks as
   the third leading cause of cancer‐related mortality worldwide.^[ [60]^1
   ^] HCC is a malignant tumor characterized by multi‐gene dysregulation
   and significant heterogeneity, posing challenges to its diagnosis and
   therapy. Therefore, precise characterization of the oncogenic
   mechanisms in HCC is urgently needed in clinical practice.

   Traditional bulk RNA sequencing (bRNA‐seq) provides an overall
   transcriptomic profile of cancer tissues but often obscures gene
   expression information from rare cell subpopulations. Recently,
   single‐cell RNA sequencing (scRNA‐seq) has emerged as a powerful tool
   for exploring liver cancer heterogeneity, particularly in studies of
   the tumor immune microenvironment and tumor‐associated macrophages and
   fibroblasts. However, due to their fragility and large size,
   hepatocytes are often sparsely captured by scRNA‐seq, limiting detailed
   investigations of hepatocellular carcinogenesis at the single‐cell
   level. Single‐nucleus RNA sequencing (snRNA‐seq) unbiasedly captures
   all nuclei within a sample, thereby providing a more accurate
   representation of the natural cell type composition across various
   tissues. This method is particularly suited for frozen or
   challenging‐to‐dissociate samples.^[ [61]^2 ^] Consequently, snRNA‐seq
   enables a comprehensive profiling of hepatocytes beyond immune cells,
   facilitating more targeted investigations of hepatocytes.^[ [62]^3 ^]

   Serum alpha‐fetoprotein (AFP), despite its widespread use as a pivotal
   biomarker in HCC diagnosis (specificity 80%–90%), demonstrates limited
   sensitivity (40%–60%).^[ [63]^4 ^] Conversely, multi‐parametric
   serological models incorporating AFP shows superior diagnostic
   performance compared to individual biomarkers alone. The GALAD model,^[
   [64]^5 ^] which utilizes three serological markers (AFP, AFP‐L3, and
   DCP) along with patient age and sex achieved an area under the receiver
   operating characteristic curve (AUROC) exceeding 0.90 across multiple
   international cohorts. In a multi‐institutional cohort study, the GALAD
   model demonstrated a diagnostic AUROC of 0.935, outperforming
   individual biomarkers AFP (0.863), AFP‐L3 (0.810), PIVKA‐II (0.872),
   and the BALAD‐2 model (0.883).^[ [65]^6 ^] In recent years, various
   novel diagnostic models independent of serum protein have emerged,
   including those driven by DNA methylation (AUROC: 0.978–0.981),^[
   [66]^7 ^] tumor mutational burden (AUROC: 0.928–0.950),^[ [67]^8 ^] and
   salivary protein glycopatterns (AUROC: 0.857‐0.886).^[ [68]^9 ^] While
   these new models were clinically constrained, they demonstrated
   significant potential in HCC diagnosis.

   The selection of modeling methodologies is a pivotal factor in defining
   the attributes of HCC diagnostic models, with logistic regression and
   machine learning algorithms being predominantly utilized. Logistic
   regression‐based HCC diagnostic models are characterized by simplicity,
   speed, and strong interpretability, making them well‐suited for
   linearly separable problems and amenable to facile adjustments.
   However, their diagnostic performance is relatively constrained (AUROC:
   0.862–0.94118)^[ [69]^7 , [70]^10 ^] and they exhibit sensitivity to
   outliers. Conversely, machine learning‐based HCC diagnostic models
   demonstrate robust predictive capabilities (AUROC: 0.893–0.99)^[
   [71]^11 ^] and efficient data handling capacities, adept at managing
   diverse data types and complex features. However, these models are
   highly complicated, necessitating specialized technical expertise and
   high‐performance computing platforms for continual training and
   refinement, and they frequently lack transparency in their outcomes.

   In this study, we applied snRNA‐seq and bRNA‐seq to 77 HCC patients
   (SMU cohort), conducting a comprehensive analysis of the oncogenic
   landscape and molecular regulation of HCC. Building on these findings
   and integrating 29 public human HCC datasets comprising 4192 samples,
   we developed a quantile‐based scoring method and constructed a Quantile
   Distribution Model (QDM) for HCC. Additionally, we define three classes
   of HCC: metabolic‐HCC, inflammatory‐HCC, and matrix‐HCC. We elucidated
   their clinical characteristics and further identified preferential
   therapeutic agents for each class of HCC.

2. Results

2.1. Revealing Major Cell Types and Overall Transcriptomic Profile of HCC
Based on snRNA‐Seq

   To comprehensively understand the cellular composition and diversity of
   HCC, we collected 209 liver tissue samples from a cohort of 77 HCC
   patients (SMU cohort, Table [72]S1, Supporting Information), including
   cancer, paracancer, and normal tissues (Figure [73]1A). We combined
   single‐nucleus RNA sequencing (snRNA‐seq; 22 cancer and 22 paracancer
   samples) with bulk RNA sequencing (bRNA‐seq; 165 samples) to
   investigate the mechanisms underlying hepatocellular carcinogenesis.
   Through computational analyses, we identified the major cell types and
   hepatocyte subtypes in HCC. Based on the characterization of hepatocyte
   subtypes, we integrated 29 HCC datasets to construct a diagnostic model
   for HCC.

Figure 1.

   Figure 1
   [74]Open in a new tab

   Revealing major cell types and overall transcriptomic profile of HCC
   based on snRNA‐seq. A) Workflow for detection and analysis of HCC. B)
   UMAP visualization of major cell types based on snRNA‐seq data, with
   colors representing different cell types. Samples were obtained from
   liver tissues of HCC patients undergoing surgical resection, including
   both cancer and paracancer tissues. C) Heatmap visualization of marker
   genes for major cell types, with colors representing gene expression
   levels. D) Sample‐to‐sample correlation across different tissues based
   on bRNA‐seq data. E) DEGs between different tissues based on bRNA‐seq
   data, with green indicating down‐regulated genes and red indicating
   up‐regulated genes. F) KEGG enrichment analysis of co‐downregulated and
   co‐upregulated DEGs.

   Using snRNA‐seq data, we obtained single‐cell transcriptomic profiles
   for 174756 cells and annotated six major cell types using known
   lineage‐specific marker genes: Hepatocyte (Hep), Liver immune cell
   (LIC), Endothelial cell (EC), Hepatic stellate cell (HSC), Kupffer cell
   (KC), and Cholangiocyte (Cho) (Figure [75]1B,C; Figure [76]S1A‐C,
   Supporting Information). The proportion of hepatocytes in scRNA‐seq
   data was ≈ 4.68%–24.48%,^[ [77]^12 ^] while it reached 85.23%
   (148 182/174 756, Table [78]S2, Supporting Information) in our
   snRNA‐seq data, closely reflecting the liver cellular composition under
   normal physiological conditions (Figure [79]S1D, Supporting
   Information).

   Using bRNA‐seq data, we conducted a sample‐to‐sample correlation
   analysis, showing that paracancer and normal tissues share highly
   similar molecular characteristics compared to cancer tissues
   (Figure [80]1D). Differentially expressed genes (DEGs) analysis using
   thresholds of |log2FoldChange| > 1.5 and p < 0.05 (Table [81]S3,
   Supporting Information) identified 543 downregulated and 1828
   upregulated genes in cancer vs paracancer tissues, and 604
   downregulated and 2038 upregulated genes in cancer vs normal tissues.
   Very few DEGs were found between paracancer and normal tissues,
   underscoring their high similarity (Figure [82]1E; Figure [83]S1E,
   Supporting Information). KEGG enrichment analysis of the intersection
   genes from the cancer vs paracancer group and the cancer vs normal
   group (Table [84]S4, Supporting Information) revealed that
   co‐downregulated DEGs were involved in pathways such as “Retinol
   metabolism, Tryptophan metabolism, Cytokine‐cytokine receptor
   interaction, Malaria, and Chemical carcinogenesis‐DNA adducts”.
   Conversely, co‐upregulated DEGs were associated with pathways including
   “Cell cycle, Systemic lupus erythematosus, Alcoholism, Neutrophil
   extracellular trap formation, and Cytoskeleton in muscle cells”. Many
   of these pathways are relevant to HCC and are significant pathogenic
   factors in its development.^[ [85]^13 ^] It is worth noting that GHR's
   role in HCC development is not fully elucidated, however, some studies
   have shown that GHR expression is downregulated in liver cancer
   tissues.^[ [86]^14 ^]

2.2. Deciphering the Genomic Evolution of Hepatocytes of HCC Regarding its
Pathogenesis

   To elucidate the oncogenic landscape of hepatocytes in HCC, we
   extracted 148 182 hepatocytes from snRNA‐seq data (Figure [87]2A),
   which were reclustered into 10 hepatocyte subtypes (Hep0–Hep9; Table
   [88]S5, Supporting Information). We examined the tissue origin of these
   subtypes and found that the majority of cells in Hep7/8/9 originated
   from cancer tissues (87%, 100%, and 89%, respectively; Table [89]S6,
   Supporting Information), while most cells in Hep0 originated from
   paracancer tissues (65%). Further analysis revealed that Hep7 and Hep8
   are intertumor heterogeneous subtypes, predominantly derived from
   individual patients (76% and 87%, respectively; Table [90]S6,
   Supporting Information). Using the Tricycle^[ [91]^15 ^] method based
   on transfer learning to infer cell cycle stages, we discovered that
   Hep1 is primarily in the G1/G0 phase, while Hep0/2 are mainly in the S
   phase (Figure [92]2B). RNA velocity analysis using scVelo^[ [93]^16 ^]
   revealed that Hep1 exhibits the least active transcriptional dynamics,
   characterized by the lowest levels of unspliced RNA (Figure [94]2C).
   These results suggest that Hep1 represents a quiescent hepatocyte
   subtype.

Figure 2.

   Figure 2
   [95]Open in a new tab

   Deciphering the genomic evolution of hepatocytes of HCC regarding its
   pathogenesis. A) Ten hepatocyte subtypes were identified from the
   snRNA‐seq data. The circular diagram shows the tissue origin of each
   hepatocyte subtype. B) Cell cycle status for the hepatocyte subtypes.
   C) RNA velocity analysis of hepatocyte subtypes. The direction and
   length of the arrows represent the trend and rate of change in the
   unspliced‐to‐spliced mRNA ratio. D) Copy number variations (CNVs) in
   the hepatocyte subtypes. Red indicates copy number gain, and blue
   indicates copy number loss. E) CNV regions in Hep1‐chr12, Hep2‐chr3,
   Hep9‐chr6, and Hep9‐chr11. Regions with CNV gains are highlighted in
   red, and regions with CNV losses are highlighted in blue. Circle
   symbols represent differential genes, while rectangle symbols denote
   diagnostic genes from this study; colors indicate fold change
   differences. F) CNV scores across 10 hepatocyte subtypes relative to
   endothelial cells. *p < 0.05; ****p < 0.0001. G) Correlation between
   the proportion of Hep9 in cancer tissues and serum AFP levels of HCC
   patients. H) Correlation between the proportion of Hep0/9 and the
   proportion of Tregs in the TCGA cohort. I) Correlation between the
   proportions of Hep0/9 and patient prognosis in the TCGA cohort.

   We employed inferCNV^[ [96]^17 ^] to identify copy number variations
   (CNVs) across various hepatocyte subtypes, revealing distinct CNV
   region patterns. For instance, Hep8 exhibits large‐scale CNV gains on
   chromosome 1 (position: 146.94–235.65 million), while Hep7 shows
   large‐scale gains (position: 116.37–186.42 million) and losses
   (position: 226.90–247.17 million) (Figure [97]2D). These differences in
   CNV patterns may be associated with cellular pathologic heterogeneity
   and clonal diversity within HCC. We specifically analyzed CNV patterns
   in Hep0/2/9 in relation to HCC progression. Examination of CNV regions
   in Hep1‐chr12, Hep2‐chr3, Hep9‐chr6, and Hep9‐chr11 (Figure [98]2E)
   revealed corresponding changes in DEGs (top 10 genes) consistent with
   CNV patterns, notably the PDE7B gene within a high‐frequency loss CNV
   region on chr6, which is downregulated in Hep9. PDE7B was included as a
   diagnostic gene in subsequent analyses based on systematic criteria.

   To explore the genomic evolution pattern, we constructed subclonal
   architectures for each patient using the inferCNV pipeline and
   visualized tumor evolution with UPhyloplot2.^[ [99]^18 ^] This analysis
   revealed consistency in early CNV events, including chromosomal segment
   gains at 1q, 7q, 9q, 11p, 12p, 16p, 19p, and 19q, observed in over 30%
   of patients (Figure [100]S2, Supporting Information). Notably, a gain
   in 19q was present in 72.7% (16/22) of patients, highlighting its
   potential role in HCC onset.

   CNV scores showed that Hep0–9 were significantly higher than the
   reference endothelial cells, with Hep9 ranking at the top
   (Figure [101]2F). We further examined the clinical relevance of Hep9 in
   the 22 snRNA‐seq HCC cases from the SMU cohort. Patients were
   categorized into Hep9‐High and Hep9‐Low groups according to the
   proportion of Hep9 in their cancer tissues. The results revealed that
   the serum AFP levels were significantly higher in the Hep9‐High group
   compared to the Hep9‐Low group (p = 0.013, Figure [102]2G). Using
   CIBERSORTx,^[ [103]^19 ^] we decomposed 22 immune cell types from the
   TCGA‐LIHC cohort^[ [104]^20 ^] and found that HCC tissues with a high
   proportion of Hep9 had increased levels of regulatory T cells (Tregs)
   (p = 0.013, Figure [105]2H), which inhibit cancer immune surveillance
   and promoted tumor development.^[ [106]^13 ^] Conversely, a lower
   proportion of Hep0 was associated with a higher proportion of Tregs (p
   < 0.0001). Importantly, leveraging the characteristic spectra of
   hepatocyte subtypes identified in snRNA‐seq, we decomposed the cellular
   composition of the TCGA cohort and found that a high proportion of Hep9
   (p = 0.045) and a low proportion of Hep0 (p = 0.0013) were associated
   with poor prognosis in HCC (Figure [107]2I). Finally, we identified
   Hep9 as an immunosuppressive hepatocyte subtype, Hep0 as a beneficial
   hepatocyte subtype, Hep1 as a quiescent subtype, Hep2 as a
   predominantly malignant subtype, Hep3–6 as transitional phase
   hepatocyte subtypes, and Hep7–8 as intertumor heterogeneous subtypes.

2.3. Decoding Molecular Pathways of Hepatocytes of HCC with snRNA‐Seq

   Using Hep0 as the reference, we performed multi‐pathway enrichment
   analysis^[ [108]^21 ^] to identify pathways with downregulated gene
   expression across Hep1 to Hep9. Results showed a generalized
   suppression of hepatic metabolic functions, particularly in energy
   metabolism and amino acid metabolism pathways, including “Pyruvate
   metabolism, Butanoate metabolism, Fatty acid degradation, and
   Tryptophan metabolism” (Figure [109]S3A, Supporting Information).
   Additionally, downregulation of “ABC transporters” and the “PPAR
   signaling pathway” was observed in multiple hepatocyte subtypes,
   suggesting their important role in HCC progression, particularly in
   cases associated with “Alcoholic liver disease”.^[ [110]^22 ^]

   The multi‐pathway enrichment analysis of upregulated genes across
   Hep1–9 revealed distinct differences in pathway activation (Figure
   [111]3A). Hep1 was primarily characterized by upregulation of
   “Cholesterol metabolism”, while Hep2 exhibited significant alterations
   in “Fatty acid metabolism”. Hep3 showed strong activation of the “PPAR
   pathway”, along with features related to bacterial infections,
   including “Shigellosis, Salmonella infection, Bacterial invasion of
   epithelial cells, and Pathogenic Escherichia coli infection”. Hep4 was
   similar to Hep3 but also demonstrated upregulation in “Endocytosis and
   Platelet activation”. Hep5 exhibited activation of the “Hippo pathway
   and Glycosaminoglycan biosynthesis”. Hep6 was marked by upregulation in
   “Motor proteins, Phagosome, and Cholinergic synapse” pathways. Hep7
   displayed increased “Bile secretion”, activation of the “cAMP signaling
   pathway”, and dysregulated “folate metabolism”. Hep8, a more complex
   subtype, exhibited a combination of features from Hep1/2/3/4/7,
   indicating higher malignancy. Hep9 showed similarities to Hep8, with
   upregulation in “Oxidative phosphorylation” pathways, but lacked
   increased “Bile secretion and cAMP signaling pathway activation”.
   Additionally, gene intersection analysis of the top 20 DEGs across
   Hep1–9 revealed several HCC‐related genes that were consistently
   downregulated in more than three hepatocyte subtypes, including CDA,
   DDI2, IFNLR1, MAN1C1, PEX14, and PHACTR4 (Figure [112]3B). Notably,
   genes such as ERRFI1 ^[ [113]^23 ^] and PER3 ^[ [114]^24 ^] were
   significantly downregulated in Hep1/3/4/6. While most of these genes
   have been previously implicated in HCC, their specific roles across
   different hepatocyte subtypes remain poorly understood. Further
   functional validation and comprehensive studies are required to clarify
   their contributions to HCC initiation, progression, and clinical
   consequence.

Figure 3.

   Figure 3
   [115]Open in a new tab

   Decoding molecular pathways of hepatocytes of HCC with snRNA‐seq. A)
   Multi‐pathway enrichment analysis of upregulated genes in Hep1‐9. B)
   Upset plot of the top 20 DEGs in Hep1‐9. The bubble plot in the
   upper‐right corner shows the expression levels of these genes across
   the corresponding hepatocyte subtypes. C) KEGG analysis of marker genes
   in the Hep0 group, using Hep2 and Hep9 as controls (top/bottom). *
   indicate pathways shared by both groups. D) GSEA analysis comparing the
   Hep0 vs Hep2 subset (top) and Hep0 vs Hep9 subset (bottom). The first
   circle represents enriched pathways, with colors indicating secondary
   classifications. The second circle shows the number of enriched genes
   and their p‐values. The third circle depicts the ratio and number of
   upregulated/downregulated genes. The fourth circle illustrates the
   enrichment scores of pathways, with outward direction indicating
   upregulation and inward direction indicating downregulation.

   Carlessi et al.^[ [116]^25 ^] used snRNA‐seq in combination with two
   mouse models to characterize the cellular microenvironment of healthy
   and precancerous livers. They identified a disease‐associated
   hepatocytes (daHep) subtype whose proportion progressively increased
   with chronic liver disease progression. These cells, characterized by
   elevated ANXA2 expression, have been proposed as potential biomarkers
   for predicting HCC risk. Our snRNA‐seq analysis identified Hep3/4/5/6
   as transitional‐phase hepatocyte subtypes, prompting us to investigate
   their potential connection to daHep. Further analysis of the gene
   expression profiles of the 10 hepatocyte subtypes revealed that Hep5
   exhibits the highest ANXA2 expression, while Hep0 shows the lowest
   (Figure [117]S3B, Supporting Information). Additionally, upregulation
   of “Endocytosis” pathways observed in daHep was also found in Hep4
   (Figure [118]3A). Downregulated pathways in daHep, such as “Complement
   and coagulation cascades, Valine, leucine, and isoleucine degradation,
   Fatty acid metabolism, Peroxisome and PPAR signaling pathway”, were
   also enriched in Hep3/4/5/6 (Figure [119]S3A, Supporting Information).
   These findings strongly suggest that Hep3/4/5/6 represent
   transitional‐phase hepatocytes with potential roles in HCC progression,
   paralleling the characteristics of daHep.

   Subsequently, we investigated the oncogenic landscape and molecular
   mechanisms associated with Hep0 compared to Hep2/9. Given the increased
   tumor heterogeneity in post‐cancerous hepatocytes, we defined Hep0 as a
   non‐cancerous state. Using the FindMarkers function in the Seurat
   package, we identified marker genes for Hep0 vs Hep2/9. Among the top
   10 KEGG‐enriched pathways, five were commonly enriched in both Hep0 vs
   Hep2 and Hep0 vs Hep9 group (Table [120]S7, Supporting Information):
   “Complement and coagulation cascades, Retinol metabolism, Peroxisome,
   Biosynthesis of amino acids, and Tryptophan metabolism”
   (Figure [121]3C). Downregulation of these pathways promotes liver
   cancer development,^[ [122]^26 ^] especially “Retinol metabolism and
   Tryptophan metabolism”, which were also validated in our bRNA‐seq data
   (Figure [123]1F). Similarly, comparisons of Hep9 vs Hep0 and Hep2 vs
   Hep0 revealed enrichment in tumor‐related pathways, including “Axon
   guidance,^[ [124]^27 ^] ECM‐receptor interaction,^[ [125]^28 ^]
   Chemical carcinogenesis reactive oxygen species,^[ [126]^29 ^] and
   Proteoglycans in cancer”^[ [127]^30 ^] (Figure [128]S3C, Supporting
   Information). However, no shared pathways were found between these two
   groups, highlighting tumor cell heterogeneity in HCC.

   Gene Set Enrichment Analysis (GSEA) further revealed changes in gene
   expression and their associated biological processes between the Hep0
   vs Hep2 and Hep0 vs Hep9 groups (Figure [129]3D; Table [130]S8
   Supporting Information). Key metabolic pathways related to HCC, such as
   “Glycolysis/Gluconeogenesis, Carbon Metabolism, and Purine Metabolism,”
   were identified, providing insight into the biological mechanisms
   underlying HCC development and potential therapeutic targets.

2.4. Exploration of the Oncogenic Landscape of Hepatocytes by Integration of
snRNA‐Seq with bRNA‐Seq

   To dissect the oncogenic regulatory mechanisms in HCC, we integrated
   snRNA‐seq and bRNA‐seq data from the SMU cohort, elucidated the
   molecular features driving the transition of hepatocytes to HCC. In
   snRNA‐seq, Hep0 was used as the baseline for identifying DEGs, whereas
   paracancer and normal tissues served as controls in bRNA‐seq. Venn
   analysis of these DEGs identified 74 commonly downregulated genes and 5
   commonly upregulated genes relevant to HCC (Figure [131]4A). Enrichment
   analysis using Metascape^[ [132]^31 ^] revealed that the commonly
   downregulated DEGs are associated with various chronic liver
   conditions, including “fatty liver disease, drug‐induced liver disease,
   and non‐alcoholic steatohepatitis.”

Figure 4.

   Figure 4
   [133]Open in a new tab

   Exploration of the oncogenic landscape of hepatocytes by integration of
   bRNA‐seq with snRNA‐seq. A) Venn analysis of downregulated (left) and
   upregulated (right) DEGs. * indicate DEGs included in the QDM. B, C)
   Metabolic flux analysis derived from snRNA‐seq (B) and bRNA‐seq (C)
   data using METAFlux. D) Identification of differentially expressed
   regulators in various tissues from bRNA‐seq data using SCENIC. E)
   Identification of differentially expressed regulatory factors in
   hepatocyte subtypes from bRNA‐seq data using SCENIC. F) Identification
   of Transcription factors targeting PDE7B using bRNA‐seq and snRNA‐seq
   data. X‐axis represents the activity scores of the regulators.

   During HCC progression, we observed suppressed expression of multiple
   CYP family genes (e.g., CYP2A7, CYP2B6, CYP2C8, and CYP2C19) and
   significant perturbations in critical physiological metabolic processes
   such as “carboxylic acid metabolism, olefinic compound metabolism, and
   steroid metabolism.” Notably, we identified five significantly
   upregulated DEGs highly associated with HCC progression: ACSL4, GPC3,
   IGF2BP2, PPP1R9A, and ROBO1. All but PPP1R9A showed elevated expression
   in the TCGA cohort, suggesting their potential as biomarkers for HCC
   (Figure [134]S4A, Supporting Information).

   METAFlux, a computational framework capable of predicting cancer
   metabolic flux from bulk or single‐cell transcriptomic data, revealed
   metabolic heterogeneity and interactions within the tumor
   microenvironment.^[ [135]^32 ^] Using METAFlux, we assessed metabolic
   alterations across the Hep0/2/9 subtypes and identified progressive and
   significant differences in metabolic pathways including
   “glycolysis/gluconeogenesis, nucleotide metabolism, glycosphingolipid
   biosynthesis‐ganglio series, and glycosphingolipid biosynthesis‐globo
   series” (Figure [136]4B). In parallel, bRNA‐seq data showed significant
   upregulation of these four metabolic pathways in cancer tissues,
   suggesting their potential impact on HCC development and progression
   (Figure [137]4C; Figure [138]S4C, Supporting Information).

   The pathogenesis of HCC is linked to the aberrant activation of various
   oncogenes, with the regulatory network of transcription factors (TFs)
   offering a crucial avenue for exploring the disease's molecular
   mechanisms. By applying the SCENIC^[ [139]^33 ^] algorithm to bRNA‐seq
   data, we identified several upregulated regulons in cancer tissues,
   including PDX1, MAZ, and SOX4 (Figure [140]4D). We then assessed
   regulon activity scores across Hep0/2/9 and compared them with
   regulators identified by bRNA‐seq. This analysis revealed five notable
   regulators: E2F1,^[ [141]^34 ^] SOX4,^[ [142]^35 ^] BPTF,^[ [143]^36 ^]
   EP300,^[ [144]^37 ^] IKZF1,^[ [145]^38 ^] and RXRA ^[ [146]^39 ^]
   (Figure [147]4E; Figure [148]S4E, Supporting Information). While these
   regulators are known to be involved in liver cancer pathogenesis and
   progression, fully elucidating their regulatory targets remains
   elusive, highlighting the need for further detailed research.

   Our study aimed to contribute to elucidating gene regulatory networks
   in HCC. Focusing on PDE7B, we utilized both bRNA‐seq and snRNA‐seq to
   predict transcription factors targeting PDE7B, ranking them by their
   importance coefficients (Figure [149]4F). From the intersection of
   these analyses, we identified three transcription factors as regulators
   of PDE7B: ESR1, MSRA, and PLG. Currently, there is a lack of studies
   examining the regulatory relationship between PDE7B and these
   transcription factors. Unraveling these regulatory mechanisms offers
   novel insights into the pathogenesis and progression of liver cancer.

2.5. Constructing the QDM Based on HCC Cohorts Referring snRNA‐Seq Findings

   To explore the potential clinical value of snRNA‐seq data, we
   integrated our snRNA‐seq results with public RNA‐seq data to construct
   a QDM applicable to HCC (Figure [150]5A). We first decomposed^[
   [151]^40 ^] the cellular composition of the TCGA HCC cohort (n = 371)
   based on the molecular profiles of 10 hepatocyte subtypes (Hep0–9)
   obtained from snRNA‐seq data (Figure [152]S5A, Suppurting Information).
   Kaplan‐Meier survival analysis revealed that the patients with a higher
   proportion of the Hep0 subtype had significantly longer survival times
   (p = 0.0013), while those with a higher proportion of the Hep9 subtype
   had significantly shorter survival times (p = 0.045, Figure [153]S5B,
   Supporting Information). Similarly, in the Fudan‐HCC cohort^[ [154]^41
   ^] (n = 159, Figure [155]S5A, Supporting Information), we observed a
   strong positive correlation between the proportion of the Hep0 subtype
   and survival time (p = 0.0011), as well as a strong negative
   correlation between the proportion of the Hep9 subtype and survival
   time (p = 0.0038, Figure [156]S5B, Supporting Information). These
   findings align with our expectations, as Hep0 primarily represented
   non‐cancerous components from paracancer tissues, which may be
   relatively beneficial for long‐term survival, while Hep9 represented
   highly malignant subtype associated with poorer prognosis.
   Additionally, although Hep2 showed only a weak negative trend with
   survival time in the TCGA cohort, it demonstrated a strong negative
   correlation with survival time in the Fudan cohort (p < 0.0001).

Figure 5.

   Figure 5
   [157]Open in a new tab

   Constructing the QDM based on HCC cohorts referring snRNA‐seq findings.
   A) Workflow for constructing the QDM. B) AUC values for different genes
   in diagnosing liver cancer tissues across 10 datasets. C) Regression
   coefficients for each gene in QDM. D) ROC curves generated using the
   QDM for diagnosing HCC in Training Set 1. E) ROC curves generated using
   the QDM for diagnosing HCC in Validation Set 1.

   Next, we employed mlr3verse^[ [158]^42 ^] machine learning R package to
   extract Hep0, Hep2, and Hep9 subtypes from all hepatocytes (training
   samples:test samples = 8:2) using the Random Forest algorithm. In a
   fivefold cross‐validation, the average classification accuracies for
   Hep0, Hep2, and Hep9 were 89.59%, 93.43%, and 99.44%, respectively.
   Notably, several established tumor suppressor genes (e.g., ESR1,^[
   [159]^43 ^] GHR,^[ [160]^14 ^] ALDOB,^[ [161]^44 ^] BAAT,^[ [162]^45 ^]
   and ALDH1A2 ^[ [163]^46 ^]) were among the top 10 ranked feature genes
   in Hep0 (Figure [164]S5C; Table [165]S9, Supporting Information).
   Conversely, Hep9 was enriched with known tumor‐promoting genes (e.g.,
   AFP, PEG10,^[ [166]^47 ^] GPC3,^[ [167]^48 ^] IGF2BP2,^[ [168]^49 ^]
   and ACSL4 ^[ [169]^50 ^]). We extracted the top 30 feature genes from
   Hep0/2/9 (total = 90) and plotted the receiver operating characteristic
   (ROC) curves for HCC diagnosis across ten datasets. Genes with an
   average area under the curve (AUC) below 0.70 were filtered out,
   resulting in 29 high‐performance diagnostic genes used to construct the
   diagnostic model (Figure [170]5B).

   During the construction of the QDM, we faced two main challenges. The
   first challenge was the selection of a method for model construction.
   Although machine learning algorithms often achieve high diagnostic
   accuracy, they require high quality data and have limited
   generalization ability to new data. Therefore, we opted for the
   logistic regression algorithm due to its simplicity, ease of
   application to new data, and clinical utility. The second challenge was
   data integration. Despite the availability of algorithms for data
   integration (e.g., ComBat‐seq,^[ [171]^51 ^] sva,^[ [172]^52 ^]
   Ratio‐G,^[ [173]^53 ^] etc.), they often failed to effectively
   eliminate differential effects across datasets (including variations in
   detection platforms, methods, genes, and operations). To address this,
   we proposed a quantile‐based scoring algorithm: 1) Within each dataset,
   all original expression values for each gene were ranked, and the
   decile of each gene's expression value was calculated. 2) Each original
   expression was scored based on the decile (1–10 points). 3) The scored
   values from different datasets were combined directly. As scoring was
   independently conducted within each dataset, it effectively reduced
   background differences among different datasets. Using this quantile
   scoring algorithm, we integrated 29 human HCC datasets (4192 samples,
   Table [174]S10, Supporting Information) from various countries,
   detection methods, and instrument platforms, regardless of the presence
   or type of hepatitis virus infection.

   Starting with 29 genes that showed high diagnostic performance, we
   constructed a model using 10 scoring datasets (Training Set 1, n =
   2314). During this process, we eliminated genes without significant
   differences and ultimately developed a QDM containing 16 genes. In the
   QDM regression formula, we defined the event of HCC tissue as 1 and
   non‐HCC tissue as 0. The regression coefficients for each gene
   indicated whether a gene acts as a protective factor (regression
   coefficient < 0) or a risk factor (regression coefficient > 0) for HCC
   occurrence (Figure [175]5C): QDM_score = 5.5942620 ─ 0.3870143×PDE7B ─
   0.3409017×GHR + 0.3371936×GPC3 ─ 0.3187214×SLCO1B3 ─ 0.2603766×ESRRG +
   0.2396385×SLC38A4 + 0.2373530×ROBO1 ─ 0.2361836×RALYL ─
   0.1768778×IGF2BP2 ─ 0.1744291×SERPINA1 ─ 0.1508238×APOA1 +
   0.1467488×MEP1A + 0.1377588×ACSL4 + 0.1109942×FTL ─ 0.1078583×IGF2. The
   ROC analysis of Training Set 1 yielded an AUC of 0.982
   (Figure [176]5D). When applied to the remaining 19 datasets (Validation
   Set 1, n = 1878), the QDM predicted HCC and non‐HCC tissues with an
   overall AUC of 0.968 (Figure [177]5E). Additionally, in the SMU cohort,
   the diagnostic AUC of QDM reached 0.9868. The QDM demonstrated
   exceptionally high diagnostic performance in both Training Set 1 and
   Validation Set 1 (total = 4192), supporting integrated, evaluated, and
   diagnosed cross‐dataset and cross‐platform applications.

2.6. Investigation of QDM for its Biological and Clinical Significance

   The QDM demonstrated robust predictive performance in diagnosing HCC
   tissues and provided valuable insights into the molecular mechanisms
   underlying HCC development, highlighting its significant impact on both
   scientific research and clinical practice. The 16 diagnostic genes
   identified by the QDM have been largely validated as crucial to HCC
   pathogenesis, exhibiting either tumor‐suppressive or oncogenic
   properties. For example, QDM identified protective factors such as
   GHR,^[ [178]^14 ^] SLCO1B3,^[ [179]^54 ^] and APOA1,^[ [180]^55 ^]
   alongside risk factors including GPC3,^[ [181]^48 ^] SLC38A4,^[
   [182]^56 ^] and ROBO1.^[ [183]^57 ^] We analyzed the expression
   patterns of the top four genes (PDE7B, GHR, GPC3, and SLCO1B3) based on
   the absolute values of their regression coefficients. In the bRNA‐seq
   data from our SMU cohort, these genes exhibited markedly different
   expression levels in cancer tissues compared to paracancer or normal
   tissues (Figure [184]6A). We further examined their expression in eight
   large‐scale HCC datasets ([185]GSE14520, [186]GSE22058, [187]GSE102079,
   [188]GSE144269, PMID35382356, [189]GSE89377, [190]GSE17856, and
   [191]GSE45267) and found significant difference in cancer tissues vs
   their controls (Figure [192]6B). Additionally, the upregulation or
   downregulation direction matched the regression coefficients. The
   protein‐level expression of PDE7B and GHR, validated using the Human
   Protein Atlas (HPA) database,^[ [193]^58 ^] confirmed their
   downregulation in HCC cancer tissues compared to normal tissues
   (Figure [194]6C).

Figure 6.

   Figure 6
   [195]Open in a new tab

   Investigation of QDM for its biological and clinical significance. A)
   The RNA expression levels of PDE7B, GHR, GPC3, and SLCO1B3 in bRNA‐seq
   data from the SMU cohort were assessed and compared using paired
   t‐tests. ^ns p > 0.05; *p < 0.05; **p < 0.01; ***p < 0.001; ****p <
   0.0001. B) The RNA expression levels of PDE7B, GHR, GPC3, and SLCO1B3
   were evaluated across different datasets using t‐tests. The number of
   samples is indicated above each dataset. ^ns p>0.05; *p < 0.05; **p <
   0.01; ***p < 0.001; ****p < 0.0001. C, D) Protein expression levels of
   PDE7B (C) and GHR (D) were investigated using the HPA database. E)
   Correlation between QDM score and serum AFP and AST levels in the SMU
   cohort. F) Correlation of IGF2 and GHR with serum AFP levels in the SMU
   cohort. G) Assessment of the tumor microenvironment in bRNA‐seq data
   from the SMU cohort using ESTIMATE to separately evaluate immune
   scores, stromal scores, and tumor purity. H) Correlation between PDE7B
   and tumor purity in the SMU cohort.

   To investigate the clinical significance of the QDM, we collected
   clinical assays from the SMU cohort, including AFP and AST (indicator
   of liver function impairment). We observed a weak correlation between
   the QDM score and serum AFP levels (Spearman's coefficient = 0.33, p =
   0.02) and serum AST levels (Spearman's coefficient = 0.3, p = 0.02)
   (Figure [196]6E). The Spearman's coefficient between IGF2 (a diagnostic
   gene in QDM) and AFP was 0.45 (p = 0.00102), while for GHR (another
   diagnostic gene in QDM) and AFP, it was ─ 0.42 (p = 0.00195),
   suggesting potential for developing clinical biomarkers through QDM
   (Figure [197]6F). Using the ESTIMATE^[ [198]^59 ^] algorithm, we
   assessed the tumor microenvironment in bRNA‐seq data from the SMU
   cohort and found lower immune and stromal scores, and higher tumor
   purity in cancer tissues compared to paracancer tissues
   (Figure [199]6G). Furthermore, we identified a negative correlation
   between PDE7B and tumor purity (Spearman's coefficient = ─ 0.41, p
   <0.0001) (Figure [200]6H), which was highly expressed in Hep0.

   To validate the reliability and comprehensibility of QDM, we
   investigated the tumor biological behavior of PDE7B in HCC cells.
   PDE7B, a member of the phosphodiesterase family, regulates various
   physiological processes by hydrolyzing cAMP and cGMP, including cell
   proliferation, differentiation, inflammation, and metabolic
   functions.^[ [201]^60 ^] It is implicated in tumor development of
   colorectal cancer,^[ [202]^61 ^] leukemia,^[ [203]^62 ^] and breast
   cancer;^[ [204]^63 ^] however, its specific role in HCC pathogenesis
   requires further investigation. Initially, we analyzed the gene
   expression of PDE7B in four HCC cell lines (Hep G2, PLC/PRF/5,
   MHCC‐97H, and SNU‐387) and found that PDE7B expression was highest in
   the SNU‐387 cell line (Figure [205]S6A, Supporting Information),
   consistent with data from the HPA database (Figure [206]S6B, Supporting
   Information). Therefore, we selected SNU‐387 cells for further
   experimental investigations. To assess the functional role of PDE7B, we
   designed two siRNA sequences (siPDE7B‐1 and siPDE7B‐2) and evaluated
   their interference efficiency using cell transfection and qRT‐PCR. The
   results indicated that the interference efficiency of siPDE7B‐1 was
   70.0%, while siPDE7B‐2 showed an efficiency of 68.8% (Figure [207]S6C,
   Supporting Infromation). Since no significant difference was observed
   between the two sequences, we selected siPDE7B‐1 (hereafter referred to
   as siPDE7B) for subsequent experiments. Following the transfection of
   SNU‐387 cells with siPDE7B, we performed a cell scratch assay 48 h
   post‐transfection and captured images at 0 and 24 h during the scratch
   assay. The results showed that the migration rate of the siPDE7B group
   was significantly higher than that of the control group (p <0.01,
   Figure [208]S6D, Supporting Information). Additionally, CCK‐8 assays
   demonstrated that the cell viability of the siPDE7B group was notably
   higher than that of the control group (p <0.05, Figure [209]S6E,
   Supporting Information). To summarize, inhibition of PDE7B
   significantly enhanced the migration ability and viability of HCC
   cells, which is consistent with the predictions made by the QDM. Future
   research will further explore the molecular mechanisms of PDE7B in HCC
   and assess its potential as a clinical biomarker for HCC.

   Overall, the QDM demonstrated strong diagnostic performance and a
   significant correlation with clinical indicators. Most importantly, it
   holds great potential for identifying novel biomarkers with clinical
   relevance.

2.7. Classification of HCC Using Feature Genes Identified in snRNA‐Seq

   To improve the current classification of HCC in clinics, benefiting
   from the advanced scRNA‐seq message, we extracted the molecular
   features of the three distinct hepatocyte subtypes (Hep0/2/9) (Figure
   [210]7A). We began by extracting feature genes from the snRNA‐seq data,
   selecting 30 genes for each subtype, resulting in a total of 90 genes.
   We then compiled eight human HCC datasets (n = 1350, Table [211]S11,
   Supporting Information), which were processed to remove batch effects
   and divided into Training Set 2 and Validation Set 2. After excluding
   non‐expressed genes, we retained 50 common genes. Applying unsupervised
   consensus clustering to Training Set 2, we identified three distinct
   HCC classes: Class_1, Class_2, and Class_3.

Figure 7.

   Figure 7
   [212]Open in a new tab

   Classification of using feature genes identified in snRNA‐seq. A)
   Workflow for HCC classification using feature genes from snRNA‐seq
   data. B) Mutation frequencies of genes in Training Set 2. C, D)
   Kaplan‐Meier survival analysis for Training Set 2 (C) and Validation
   Set 2 (D). E) Immune cell composition in Validation Set 2. F) TIDE
   analysis for Training Set 2. G) Enrichment analysis of DEGs in Class_1,
   Class_2, and Class_3. H) Identification of sensitive drugs for Class_1,
   Class_2, and Class_3 through integration of OncoPredict and CMap
   analyses.

   Genomic studies have revealed that CTNNB1 mutations are linked to
   well‐differentiated HCC tumors and extended overall survival,^[
   [213]^64 ^] AXIN1 mutations are associated with Wnt/β‐catenin pathway
   activation,^[ [214]^65 ^] frequent TP53 mutations are correlated with
   poor prognosis,^[ [215]^66 ^] and TSC2 mutations are often connected
   with significant matrix fibrosis and sclerosis in HCC.^[ [216]^67 ^] We
   examined gene mutation frequencies across the three HCC classes and
   found that Class_1 exhibited a significantly higher frequency of CTNNB1
   mutations compared to Class_2 and Class_3 (27.3%, 15.6%, and 16.3%,
   respectively) (Figure [217]7B). Conversely, the frequency of TP53
   mutations was notably lower in Class_1 (30.6%, 48.1%, and 48.8%,
   respectively), suggesting a lower risk of adverse prognosis for
   Class_1. Class_2 showed a higher frequency of AXIN1 mutations (5.4%,
   22.0%, and 2.3%, respectively), while Class_3 had a higher frequency of
   TSC2 mutations (2.1%, 6.5%, and 11.6%, respectively).

   To further investigate the clinical significance of these three
   classes, we performed survival analysis using integrated datasets (TCGA
   and PMID31585088) and observed that Class_1 exhibitedsuperior clinical
   outcomes (Figure [218]7C). Using TCGA and PMID31585088 as Training Set
   2, we developed an HCC classification model with an accuracy of 91.58%.
   This model was then applied to Validation Set 2, confirming a
   significantly longer survival time in Class_1 compared to Class_2 and
   Class_3 (Figure [219]7D). CIBERSORTx analysis of the immune cell
   composition in Training Set 2 revealed that Class_2 exhibited a
   significantly higher abundance of various immune cell types (memory B
   cells, monocytes, plasma cells, and follicular helper T cells) compared
   to Class_1 and Class_3 (Figure [220]7E). In addition, TIDE^[ [221]^68
   ^] analysis indicated that Class_3 exhibited higher levels of
   cancer‐associated fibroblasts (CAF) and T cell exclusion scores
   (exclusion) compared to Class_1 and Class_2, suggesting a highly
   immunosuppressive tumor microenvironment in Class_3 (Figure [222]7F).

   DEGs analysis using Training Set 2 identified 164, 68, and 327
   upregulated genes in Class_1, Class_2, and Class_3, respectively, with
   thresholds set at p < 0.05 and log2FoldChange > 1.5. Enrichment
   analysis revealed that Class_1 was predominantly associated with
   metabolic pathways, including “carboxylic acid metabolic process,
   biological oxidations, small molecule catabolic process, and steroid
   metabolic process” (Figure [223]7G). Class_2 was enriched in
   cancer‐related pathways, such as “regulation of Wnt signaling pathway,
   gastric cancer network 1, regulation of MAPK cascade, and lncRNA in
   canonical Wnt signaling and colorectal cancer” (Figure [224]7G). In
   contrast, Class_3 was associated with extracellular matrix‐related
   pathways, including “extracellular matrix organization, positive
   regulation of cell adhesion, and cell‐cell adhesion” (Figure [225]7G).
   Based on these findings, we classify Class_1, Class_2, and Class_3 as
   metabolic‐HCC, inflammatory‐HCC, and matrix‐HCC, respectively.

   Classifying HCC enhances the optimization of individualized therapeutic
   strategies. We employed OncoPredict^[ [226]^69 ^] to assess the drug
   responsiveness of the three HCC classes to sorafenib and found that
   Class_3 exhibited a higher half‐maximal inhibitory concentration (IC50)
   compared to Class_1 and Class_2, indicating reduced sensitivity to
   sorafenib (Figure [227]7H). The Connectivity Map (CMap)^[ [228]^70 ^]
   is a gene expression‐based database designed to elucidate functional
   relationships between drugs and disease states. We conducted CMap
   analysis for each HCC class and integrating these findings with the
   OncoPredict, results identified the most sensitive drugs for Class_1,
   Class_2, and Class_3 as ouabain, teniposide, and TG‐101348,
   respectively. All three drugs demonstrated lower IC50 values than
   sorafenib, making them suitable for combination dosing regimens in the
   precision treatment of HCC.

3. Discussion

   This study recruited an HCC cohort (SMU cohort) comprising 209 samples
   from 77 patients, and integrated snRNA‐seq and bRNA‐seq data to
   elucidate the oncogenic landscape of hepatocytes across subtypes,
   functional roles, and molecular regulations. We established the largest
   HCC snRNA‐seq dataset (174 756 cells) to date, providing valuable
   insights into hepatocyte oncogenesis. Using single‐cell results as a
   reference, we developed a quantile‐based scoring method to integrate 29
   human HCC datasets from diverse sources, including different countries,
   hepatitis virus infections, detection methods, and instrument
   platforms. Based on this scoring method, we constructed a diagnostic
   model for HCC, termed QDM, which demonstrated excellent diagnostic
   performance across 4 192 samples (AUC = 0.982 for Training Set 1 and
   AUC = 0.968 for Validation Set 1). Notably, QDM not only correlates
   closely with clinical indicators but also enhances our understanding of
   HCC pathogenesis and facilitates the exploration of potential
   biomarkers. Through QDM, we identified numerous genes specifically
   associated with HCC, which could be useful for developing novel
   diagnostic biomarker combinations. The newly discovered genes warrant
   further investigation into their roles and mechanisms in HCC.
   Additionally, we identified several key metabolic pathways (including
   “glycolysis/gluconeogenesis, nucleotide metabolism, glycosphingolipid
   biosynthesis‐ganglio series, and glycosphingolipid biosynthesis‐globo
   series”) and transcription factors (particularly E2F1, SOX4, BPTF,
   EP300, IKZF1, and RXRA), providing important directions for HCC
   research. Based on the characteristics of snRNA‐seq data combined with
   multicenter cohorts, we classified HCC into metabolic‐HCC,
   inflammatory‐HCC, and matrix‐HCC classes and identified potential
   therapeutic agents for each class. This classification aims to enhance
   diagnostic accuracy, optimize personalized treatment strategies, and
   potentially uncover novel biomarkers or therapeutic targets, thereby
   improving patient prognosis and treatment outcomes. Future studies will
   further evaluate the robustness and reliability of this classification,
   explore the biological characteristics and molecular mechanisms of each
   class, assess their impact on therapeutic responses, and validate its
   clinical applicability.

   Currently available transcriptomic datasets are primarily derived from
   microarray and RNA sequencing technologies, which differ significantly
   in detection principles, platforms, target genes, and data formats,
   complicating direct integration. In contrast, QDM effectively overcomes
   these obstacles by employing a quantile‐based scoring method, which
   converts gene expression values into relative expression values based
   on their rankings. This approach improves the robustness of the model,
   being interpretable, versatile, and iteratively extendable. We envision
   its clinical adoption to establish a multicenter molecular diagnostic
   system for QDM. Offering exceptional value as stated above, QDM has
   limitations. It requires a balanced number of HCC and non‐HCC samples
   to avoid skewed score distributions and necessitates complete gene
   expression data; while minor missing data can be managed through
   imputation methods, significant data loss remains problematic. The
   primary non‐HCC samples used in QDM are adjacent non‐tumorous tissues,
   while the discriminatory power may be further elevated when healthy
   liver tissues are applied as references. Moreover, large‐scale serum