Abstract Background & Aims Intratumour heterogeneity (ITH) fosters the vulnerability of RNA expression-based biomarkers derived from a single biopsy to tumour sampling bias, and is regarded as an unaddressed confounding factor for patient precision stratification using molecular biomarkers. This study aimed to identify an ITH-free predictive biomarker in hepatocellular carcinoma (HCC). Methods We interrogated the confounding effect of ITH on performance of molecular biomarkers and quantified transcriptomic heterogeneity utilising three multiregional HCC transcriptome datasets involving 142 tumoural regions from 30 patients. A de novo strategy based on the heterogeneity metrics was devised to develop a surveillant biomarker (a utility gadget using RNA; AUGUR) using three datasets involving 715 liver samples from 509 patients with HCC. The performance of AUGUR was assessed in seven cross-platform HCC cohorts that encompassed 1,206 patients. Results An average discordance rate of 39.9% at the level of individual patients was observed applying 13 published prognostic signatures to classify tumour regions. We partitioned genes into four heterogeneity quadrants, from which we developed and validated a reproducible robust ITH-free expression signature AUGUR that showed significant positive associations with adverse features of HCC. High AUGUR risk increased the risk of disease progression and mortality independent of established clinicopathological indices, which maintained concordance across seven cohorts. Moreover, AUGUR compared favourably to the discriminative ability, prognostic accuracy, and patient risk concordant rates of 13 published signatures. Finally, a well-calibrated predictive nomogram integrating AUGUR and tumour-node-metastasis (TNM) stage was established, which generated a numerical probability of mortality. Conclusions We constructed and validated an ITH-free AUGUR and nomogram that overcame sampling bias and provided reliable prognostic information for patients with HCC. Impact and Implications Intratumour heterogeneity (ITH) is prevalent in hepatocellular carcinoma (HCC), and is regarded as an unaddressed confounding factor for biomarker design and application. We examined the confounding effect of transcriptomic ITH in patient risk classification, and found existing molecular biomarkers of HCC were vulnerable to tumour sampling bias. We then developed an ITH-free expression biomarker (a utility gadget using RNA; AUGUR) that overcame clinical sampling bias and maintained prognostic reproducibility and generalisability across multiple HCC patient cohorts from different commercial platforms. Furthermore, we established and validated a well-calibrated nomogram based on AUGUR and tumour-node-metastasis (TNM) stage that provided an individualised prognostic information for patients with HCC. Keywords: Hepatocellular carcinoma, Sampling bias, Discordant risk classification, Intratumour heterogeneity-free, Prognostic prediction Graphical abstract [29]graphic file with name ga1.jpg [30]Open in a new tab Highlights * • The performance of existing transcriptomic prognostic biomarkers of HCC is vulnerable to intratumour heterogeneity (ITH). * • An ITH-free expression biomarker (AUGUR) overcoming clinical sampling bias was developed by interrogating heterogeneity. * • AUGUR exhibited prognostic reproducibility and generalisability across multiple HCC patient cohorts from different commercial platforms. * • AUGUR compared favourably to the discriminative ability, prognostic accuracy and patient risk concordant rate of 13 analysed signatures. * • We established a well-calibrated quantitative nomogram that provides reliable prognostic information tailored to the individual patient. Introduction Primary liver cancer is the sixth most commonly diagnosed cancer and the third leading cause of cancer-related death worldwide in 2020, with approximately 906,000 new cases and 830,000 deaths.[31]^1 Hepatocellular carcinoma (HCC) is the most frequent form of liver cancer, accounting for 75–85% of cases.[32]^1^,[33]^2 Although advancements have been made in the treatment of HCC in the past two decades, the prognosis of patients with HCC is dismal and varies significantly among individuals, with a relative 5-yr survival rate of approximately 18%.[34]^2 An accurate stratification reflecting the prognosis of patients with HCC is crucial for disease surveillance and treatment strategies selection, thus considerable effort has been devoted to establishing such a stratification model for HCC using patient clinical and pathological characteristics. Currently, several classification systems, such as the American Joint Committee on Cancer TNM (tumour-node-metastasis) system, the Cancer of the Liver Italian Program (CLIP) and the Barcelona Clinic Liver Cancer (BCLC) staging, have been developed and used in clinics. These assessment approaches have proved useful,[35]^3 although they have various limitations in patient stratification and provide limited predictive accuracy.[36]^4 Besides, they failed to provide biological characteristics of HCC that might account for the clinical heterogeneity, thus requiring improvement. Over the past two decades, advances in genome-wide expression profiling technology have considerably enhanced our understanding of cancer biology. With the hope that the proposed gene-lists can aid clinicians in cancer therapy decisions, many studies have assessed the prognostic abilities of gene signatures established from high-throughput expression data of HCC tumours and/or adjacent non-tumour tissues.[37][5], [38][6], [39][7], [40][8], [41][9] These investigations have identified a wide variety of multigene signatures that were able to predict prognosis, however, none has entered clinical practice, perhaps because of their poor reproducibility in independent patient cohorts and lack of standardised determination methods.[42]^10^,[43]^11 Intratumour heterogeneity (ITH) is one of the main causes of failure in clinical practice in most types of cancers including HCC.[44]^11^,[45]^12 Previous investigations focused on transcriptomes using multiregional or single-cell analyses, revealing astonishing ITH of gene expression.[46][13], [47][14], [48][15] However, these attempts are not directly considered and translated to the clinical setting of HCC. Transcriptomic ITH has been shown to confound existing expression-based biomarkers across multiple cancer types,[49][16], [50][17], [51][18] and can lead to sampling biases that may account for the lack of clinically qualified biomarkers in HCC. Therefore, addressing ITH as a confounding factor for prognostic signature design is crucial for precision medicine. In the current study, through multiregional transcriptomic data of HCC, we examined the confounding effect of transcriptomic ITH in patient risk prediction, and found existing prognostic gene expression signatures of HCC are vulnerable to tumour sampling bias. We then devised a de novo strategy to develop an ITH-free expression biomarker that overcomes clinical sampling bias and provides more reliable risk estimates for patients with HCC by integrating three RNA sequencing (RNA-seq) expression datasets with different features that encompass 715 liver tissue samples from 509 patients with HCC. We assessed and validated the prognostic and predictive accuracy of this classifier in six independent cohorts totally involving 883 patients with HCC, including three RNA-seq-based and three microarray-based transcriptome datasets. We also compared its prognostic and predictive efficacy with 13 previously reported HCC prognostic models, and established and validated a well-calibrated nomogram based on this classifier and TNM staging system to provide a more individualised method to predict prognostic information for patients with HCC. Materials and methods Multiregional gene expression data of HCC RNA-seq data of multiregional samples from 11 patients with HCC in total involving 42 tumoural regions (denoted as the MultiRRnaSeq1 cohort; median of five tumour regions per patient [range: two to six regions]; [52]Table S1) were collected from a previous study.[53]^14 For validation, the RNA-seq data of 75 tumour regions from 14 patients with HCC (denoted as the MultiRRnaSeq2 cohort, median of five tumour regions per patient [range: three to 10 regions]; [54]Table S2) and the Agilent mRNA transcriptome profiles of 25 tumour samples from five patients with HCC (denoted as the MultiRArray cohort; five regions per patient, [55]Table S3) were curated from another two studies.[56]^15^,[57]^19 Details of data collection, filtering, and normalisation are available in the Supplementary materials. Acquisition of HCC cohorts with survival information Seven HCC cohorts with clinical follow-up information involving in total 1,177 HCC specimens and 693 normal specimens from 1,206 patients (i.e. TCGA-LIHC,[58]^20 ICGC-LIRI-JP,[59]^21 CHCC-HBV,[60]^22 Mongolia-HCC,[61]^23 FULCI-HCC,[62]^24 NCI-HCC,[63]^25 INSERM-HCC[64]^26; [65]Table S4) were collected. Transcriptome data of TCGA-LIHC, ICGC-LIRI-JP, CHCC-HBV, and Mongolia-HCC were RNA-seq data, whereas transcriptome data of FULCI-HCC, NCI-HCC, and INSERM-HCC were microarray data. Details of data collection, filtering, and normalisation are available in the Supplementary materials. Collection of previously published HCC prognostic gene expression signatures Thirteen previously published HCC prognostic gene expression signatures (ProGESigs) were collected. Each of the 13 signatures is associated with a specific formula. The detailed information of each signature and the detailed strategy to apply it is presented in [66]Table S5 and in the Supplementary material. Establishment and validation of the ITH-free prognostic signature A five-step analysis pipeline was developed to construct and validate an ITH-free prognostic model based on three independent RNA-Seq HCC cohorts. First, to restrict gene-ITH, gene heterogeneity scores were calculated using data from the MultiRRnaSeq1 cohort (see the Supplementary materials for detailed methods), and genes with high inter- and low intratumour heterogeneity scores were selected. Second, dysregulated genes were identified using 175 paired tumour-normal samples from the ICGC-LIRI-JP cohort. Differential expression analysis was performed using R package DESeq2 (version 1.28.1, University of North Carolina, Chapel Hill, United States), wherein genes with absolute log 2-fold change >1.0 and adjusted p value <0.05 were considered differentially expressed. Third, univariate Cox regression analysis was used to identify genes significantly associated with overall survival (OS) in the TCGA-LIHC cohort using R package survival (version 3.2.10, Mayo Clinic, Rochester, United States). A value of p <0.05 was considered statistically significant. Last, least absolute shrinkage and selection operator (LASSO) penalised Cox regression against OS based on the TCGA-LIHC training cohort was utilised to construct the ITH-free prognostic signature from the genes selected by above three criteria using the R package glmnet (version 4.1.1, Stanford University, Stanford, United States). The risk score for each patient was calculated as a linear combination of gene expression values, weighted by the model coefficients fitted in the training cohort. Patients were subsequently median-dichotomised into high- and low-risk groups based on the risk scores in each cohort. The internal (ICGC-LIRI-JP), external RNA-Seq (CHCC-HBV and Mongolia-HCC) and external microarray (FULCI-HCC, NCI-HCC, and INSERM-HCC) cohorts were used to assess the predictive performance of signature via the Harrell’s concordance index (C-index) and time-dependent receiver operating characteristic (ROC) curve analysis. Statistical analysis Contingency table variables were analysed by the Fisher’s exact test. Pearson’s and Spearman’s correlation test were used to assess correlation between two variables, as appropriate. Survival analysis was conducted using the Kaplan–Meier method in the R package survival (version 3.2.10, Mayo Clinic, Rochester, United States), with p values determined using the log-rank test. Hazard ratio, univariate analysis, and multivariate analysis adjusting for age, gender, AJCC stage, cirrhosis, alpha-foetoprotein (AFP) and histological grade (if available) were determined through a Cox proportional hazards model. The nomogram and corresponding calibration maps were constructed via R software rms (version 6.2.0, Vanderbilt University Medical Center, Nashville, United States). Calibration plots calculated via a bootstrap method with 1,000 re-samples were drawn to evaluate the concordance between actual and predicted survival. The C-index was calculated using the R package Hmisc (version 1.38.0, Vanderbilt University Medical Center, Nashville, United States). Time-dependent receiver ROC analysis and the AUC values were calculated using the R package timeROC (version 0.4, University of Copenhagen, Copenhagen, Denmark). The frequencies and types of mutations were analysed and visualised using the R package Maftools (version 2.10.5, National University of Singapore, Queenstown, Singapore). All statistical analyses were two-tailed and performed using the R statistical software (version 4.0.2), a value of p <0.05 was considered statistically significant. Results Intratumour heterogeneity broadly affects signature performance Multiregional tumour sequencing could help to characterise the tumour spectrum more adequately than using a single biopsy, including clonality, ITH, and tumour evolution.[67]^14 Unsupervised hierarchical clustering on the most variant genes across 42 multiregion primary HCC samples from 11 patients[68]^14 (MultiRRnaSeq1 cohort), 75 multiregion primary HCC samples from 14 patients[69]^19 (MultiRRnaSeq2 cohort), or 25 multiregion primary HCC samples from 5 patients[70]^15 (MultiRArray cohort) all exhibited separate clusters by patients, which suggested clustering concordance in regions from the same tumour, and revealed stronger RNA inter-tumour heterogeneity than ITH ([71]Fig. 1A, [72]Fig. S1A and B). Nevertheless, dimensionality reduction for the entire transcriptome profiles clearly revealed RNA ITH between different regions of the same HCC, such as samples from H2, H3, H9, and H11 in MultiRRnaSeq1 cohort that without peculiar geographic distribution of sampling[73]^14 ([74]Fig. 1B, [75]Fig. S1C and D). Fig. 1. [76]Fig. 1 [77]Open in a new tab ITH disturbs HCC molecular biomarkers. (A) The heat map (top) shows the unsupervised hierarchical clustering of HCC samples (columns) in the MultiRRnaSeq1 cohort according to the top 50 variable expression genes (rows). The sparse heat map (bottom) shows HCC samples per patient. (B) Principal component analysis (PCA) of the entire transcriptome profiles of the MultiRRnaSeq1 cohort. (C) A published epithelial-related prognostic signature for HCC is evaluated in the MultiRRnaSeq1 cohort. Each point represents a single tumour sample (left). Points are coloured according to the concordance of risk classification for samples from each patient: concordant low, concordant high, and discordant risk are marked in violet, green, and grey, respectively. The horizontal dashed line represents median of risk scores of these samples. The bar chart (right) represents the percentage of patients who are classified as concordant low, concordant high, and discordant risk. (D) The bar chart represents the percentage of patients who are classified as concordant low, concordant high, and discordant risk using 13 published HCC prognostic signatures in the MultiRRnaSeq1 cohort. The x-axis shows the corresponding articles (PubMed ID). HCC, hepatocellular carcinoma; ITH, intratumour heterogeneity. ITH could cause the bias of clinical strategies when using molecular features derived from a single biopsy, including biopsy strategy, molecular pathological diagnosis, target therapy, and prognosis prediction.[78]^14^,[79]^27 To assess the influence of RNA ITH on performance of molecular biomarkers, we firstly investigated patient risk bias using previously published epithelial-[80]^8 ([81]Fig. 1C) and immune-related[82]^7 ([83]Fig. S1E) HCC ProGESigs in the MultiRRnaSeq1 cohort. Using the same risk score calculating methods executed in the original studies to dichotomise tumour regions as either high risk or low risk, 55% and 46% of patients, respectively, exhibited discordant risk classification. Similarly, using pyroptosis-related,[84]^28 DNA methylation-driven,[85]^9 mutation-derived,[86]^29 microvascular invasion-related,[87]^30 neoangiogenesis-related,[88]^6 the most extensively cited five-gene signature[89]^31 and other HCC ProGESigs, an average discordance rate of 39.9% (range: 18–73%) was observed ([90]Fig. 1D). The discordant risk classification rate was not significantly correlated with the gene size of the signature (rho = -0.02, p = 0.95 in the MultiRRnaSeq1 cohort; rho = 0.13, p = 0.68 in the MultiRRnaSeq2 cohort). These results indicated that prognostic risk of patients with HCC evaluated by ProGESigs without considering tumour heterogeneity were frequently influenced by sampling bias, which might contribute to the limited validity and reproducibility of ProGESigs in independent cohorts,[91]^17 thus potentially limiting the clinical utility of ProGESigs and emphasising the importance of considering ITH in signature design. A de novo strategy to develop ITH-free HCC prognostic signature We hypothesised that circumventing RNA ITH and magnifying RNA inter-tumour heterogeneity might improve the ability to discriminate between patients and simultaneously maintain the stability of biomarker risk prediction for each patient when using different tumour regions. Thus, we devised a de novo five-step strategy to construct and validate a HCC prognostic RNA marker to test this hypothesis ([92]Fig. 2A). In the discovery phase, three RNA-seq-based expression datasets with different features that encompass 715 liver tissue samples from 509 patients with HCC were used: (1) the aforementioned MultiRRnaSeq1 dataset, to derive candidate RNA molecules with high inter-tumour heterogeneity and low ITH; (2) the ICGC-LIRI-JP HCC dataset (350 samples from 175 paired tumour-normal tissues), to calculate differentially expressed genes; and (3) the TCGA-LIHC dataset (323 HCC patients with prognostic information), to deduce genes associated with survival. In the validation phase, a total of 883 patients with HCC from six cohorts were utilised, including three RNA-seq-based (ICGC-LIRI-JP, CHCC-HBV, and Mongolian-HCC cohorts; n = 226, 159, and 70 patients, respectively) and three microarray-based (FULCI-HCC, NCI-HCC, and INSERM-HCC cohorts; n = 231, 140, and 57 patients, respectively) transcriptome datasets ([93]Fig. 2A, [94]Table S4). Fig. 2. [95]Fig. 2 [96]Open in a new tab A five-step de novo strategy to develop AUGUR. (A) Schematic depicting the summary of strategy to construct and validate the ITH-free HCC prognostic signature. (B) Gene expression inter- and intratumour heterogeneity quadrants calculated using the MultiRRnaSeq1 cohort. The plot is divided into four quadrants (Q1, Q2, Q3, and Q4 Set) by the mean inter- (dashed vertical line) and intratumour (dashed horizontal line) heterogeneity scores. The density curves of inter- and intratumour heterogeneity scores are plotted on the x- and y-axis, respectively. (C) Venn diagram illustrating candidate reproducible prognostic genes that free of ITH. Q4 Set is inherited from (B) that represents genes with high inter-tumour and low intratumour heterogeneity. DE Set represents differential expression genes calculated using paired tumour and normal samples in the ICGC-LIRI-JP cohort. Surv Set represents genes associated with survival in TCGA-LIHC cohort. (D) Heat map showing the expression of 34 candidate genes in TCGA-LIHC paired tumour and normal samples. (E) Diagram of 10-fold cross-validated error under the value of tuning parameter lambda in the LASSO Cox regression model. The red dashed vertical line represents the lambda that gives minimum mean error. AUGUR, a utility gadget using RNA; ITH, intratumour heterogeneity; LASSO, least absolute shrinkage and selection operator. To define RNA heterogeneity, we derived inter- and intratumour heterogeneity metrics for each gene using multiregion HCC samples, and split both heterogeneity metrics into a high or low group by their mean, which resulted in four RNA heterogeneity quadrants for HCC ([97]Fig. 2B).[98]^16 Genes in the Q4 Set (1,477 genes with high inter-tumour heterogeneity and low ITH) exhibited highly variable between tumours and highly homogeneous within tumours, thus restricting sampling bias and potentially facilitating patient stratification ([99]Fig. 2B). To obtain genes with reproducible survival associations in the Q4 gene set, we combined differentially expressed genes and prognostic genes of HCC obtained from two independent datasets (ICGC-LIRI-JP and TCGA-LIHC, respectively), which resulted in a candidate gene set containing 34 genes ([100]Fig. 2C). Differential expression analysis in five cohorts containing paired HCC and normal liver tissues showed that all 34 genes were significantly differentially expressed in all cohorts (a total of 660 paired tumour-normal tissues), including four RNA-seq-based and one microarray-based dataset ([101]Fig. 2D, [102]Fig. S2, [103]Tables S6–S10), validating these genes were generally dysregulated in HCC regardless of the original profiling platform, thus they might play critical roles in HCC progression. Subsequently, we used the elastic-net algorithm that was proved to be more robust than other algorithms[104]^16 to narrow down the 34 candidate genes to remove redundancies and select the most useful prognostic markers of HCC, which generated a 12-gene prognostic signature that was ITH-free and we termed a utility gadget using RNA (AUGUR) ([105]Fig. 2E, [106]Table S11 presents the ingredients of AUGUR). ITH-free signature AUGUR performs robust prognostic efficacy in HCC To investigate the prognostic performance of AUGUR, we first dichotomised TCGA-LIHC patients (training cohort of AUGUR) using the median risk score, and found that the AUGUR risk score was significantly positively associated with mortality (n = 323 patients, log rank: p <0.0001; univariate Cox regression: p <0.0001, HR = 2.5, [107]Fig. 3A, [108]Table S12). The median OS interval of patients with HCC in the high AUGUR group was 3.48 yr (95% CI: 2.07–5.07), whereas that of patients with low AUGUR scores was 6.72 yr (95% CI: 4.90–NA). Subsequently, we applied AUGUR weights trained in the TCGA-LIHC cohort to three additional independent RNA-seq-based HCC datasets, and the same results were observed in all three datasets that patients with HCC in the high AUGUR risk group had a shorter median survival time ([109]Fig. 3B–D, [110]Tables S13–S15, log rank: p < 0.0001; univariate Cox regression: p = 0.0003, HR = 4.38 in the ICGC-LIRI-JP cohort; log rank: p = 0.0011; univariate Cox regression: p = 0.0016, HR = 2.47 in the CHCC-HBV cohort; log rank: p = 0.02; univariate Cox regression: p = 0.026, HR = 2.74 in the Mongolian-HCC cohort). Moreover, three microarray-based HCC expression datasets were also included to evaluate the prognostic power of AUGUR. We expected the performance of AUGUR to be poorer, considering the cross-platform usage of weights trained in TCGA-LIHC transcriptome sequencing data. However, AUGUR significantly associated with survival in all three microarray datasets ([111]Fig. 3E, [112]Fig. S3A and B, [113]Tables S16 and S17, log rank: p = 0.00013; univariate Cox regression: p = 0.0002, HR = 2.33 in the FULCI-HCC cohort; log rank: p = 0.00044; univariate Cox regression: p = 0.0006, HR = 2.33 in the NCI-HCC cohort; log rank: p = 0.015; univariate Cox regression: p = 0.02, HR = 3.18 in the INSERM-HCC cohort). In the meta-analysis considering all training and testing cohorts (combined: n = 1,127 HCC patients), AUGUR also showed significant association with outcome ([114]Fig. 3F, univariate Cox regression: p = 8.66 × 10^-19; HR = 2.56 [2.08–3.15]). These results demonstrate the prognostic reproducibility and concordance of AUGUR across multiple cohorts from different profiling platforms. Fig. 3. [115]Fig. 3 [116]Open in a new tab AUGUR provides robustly reproducible prognostic efficacy for HCC across cohorts and profiling platforms. (A–E) AUGUR risk scores are used in the Kaplan–Meier estimations to assess mortality of patients with HCC utilising TCGA-LIHC training cohort (A), independent RNA-seq-based validation cohorts ICGC-LIRI-JP (B), CHCC-HBV (C) and Mongolian-HCC (D) and independent microarray-based validation cohort FULCI-HCC (E). Patients in each dataset are divided into high-risk and low-risk groups by the median of AUGUR risk scores. The bar charts (top) exhibit the distribution of AUGUR risk scores for each cohort. (F) Prognostic power of AUGUR evaluated in a meta-analysis across all seven cohorts. Univariate Cox regression analysis is performed in each dataset and hazard ratios with a 95% confidence interval and p values are shown. The diamond indicates the hazard ratio for the meta-analysis. Statistical analysis applied log-rank test (A–E), univariate Cox regression analysis (F, top) and fixed-effects meta-regression analysis (F, bottom). A value of p <0.05 is defined as statistical significance. AUGUR, a utility gadget using RNA; HCC, hepatocellular carcinoma; PMID, PubMed ID; RNA-seq, RNA sequencing. In five out of six training and validation datasets with clinicopathological factors, AUGUR was significantly associated with OS in multivariate Cox proportional hazards analysis adjusting for age, sex, TNM stage, histological grade, cirrhosis, and AFP ([117]Tables S12–S17, TCGA-LIHC: p = 0.0006, HR = 3.23 [1.66–6.29]; ICGC-LIRI-JP: p = 0.0031, HR = 3.52 [1.53–8.12]; CHCC-HBV: p = 0.0175, HR = 2.02 [1.13–3.61]; FULCI-HCC: p = 0.014, HR = 1.81 [1.13–2.91]; and INSERM-HCC: p = 0.0099, HR = 3.86 [1.38–10.75]). The insignificant results obtained in the Mongolian-HCC cohort might be a result of the small sample size when multiple variables were considered (less than 50 samples when considering multivariate). AUGUR remained a significant prognostic model when replacing TNM stage with BCLC stage or CLIP stage that were collinear with TNM stage ([118]Tables S18–S20). This analysis suggested that AUGUR could provide independent prognostic values from established clinicopathological indices. Furthermore, we found that HCC patients with higher AUGUR risk scores showed significantly faster disease progression than those with lower scores in four datasets ([119]Fig. S3C–F, log rank: p <0.05 in TCGA-LIHC, CHCC-HBV, FULCI-HCC, and INSERM-HCC cohorts). Taken together, these results indicate that high AUGUR score can increase the risk of disease progression and mortality independently in patients with HCC regardless of the original profiling platform, implying that a prognostic signature resistant to differences in cohort and expression profiling technology could be achieved by restricting transcriptomic ITH. Comparison of ITH-free signature AUGUR with other gene expression signatures As previously reported expression signatures nominally have excellent performance in predicting the outcomes of patients with HCC, we evaluated the discrimination and prognostic accuracy of 13 established multigene signatures[120][5], [121][6], [122][7], [123][8], [124][9]^,[125][28], [126][29], [127][30], [128][31], [129][32], [130][33], [131][34], [132][35] and our AUGUR in parallel. The results of time-dependent ROC curves and C-index showed AUGUR was superior or comparable to the other 13 models (the majority of them [10/13] were also developed using TCGA-LIHC as a training cohort [[133]Table S5]) in terms of 1-, 3-, and 5-yr survival prediction ([134]Fig. 4A–C, [135]Table S21), especially in AUC for 1-yr survival and C-index, AUGUR unequivocally outperformed the others ([136]Fig. 4A, [137]Table S21, AUC for 1-yr survival of AUGUR: 0.81, 95% CI: 0.75–0.88; C-index of AUGUR: 0.72, 95% CI: 0.67–0.77). Additionally, only 9% (1/11) of MultiRRnaSeq1-enrolled ([138]Fig. 4D and E) and 14% (2/14) of MultiRRnaSeq2-enrolled ([139]Fig. S4A and B) patients with HCC exhibited discordant risk classification using AUGUR, which was even lower than the most extensively cited five-gene signature developed by Nault et al. (18% (2/11) in MultiRRnaSeq1 and 36% (5/14) in MultiRRnaSeq2 cohort, [140]Fig. 1D, [141]Fig. S4C). Also, AUGUR predicted the lowest median risk bias of multiregional samples from the same patient, which compared to the other signatures ([142]Fig. 4F). We corroborated this result in two other multiregional samples datasets – the MultiRRnaSeq2 and the MultiRArray cohorts ([143]Fig. 4G, [144]Fig. S4D). These results suggested that AUGUR compares favourably to the concordant rates of established ProGESigs ([145]Fig. 4E, [146]Fig. 1D, [147]Fig. S4A–C, 91% and 86% concordant rates for AUGUR in MultiRRnaSeq1 and MultiRRnaSeq2 cohorts), overcomes sampling bias more powerfully, and can be applied to a single biopsy to perform patient risk classification. Fig. 4. [148]Fig. 4 [149]Open in a new tab Comparison of prognostic accuracy and discrimination between AUGUR and established HCC signatures. (A–C) Time-dependent ROC curves are compared between AUGUR and the other 13 signatures in terms of 1-yr (A), 3-yr (B), and 5-yr (C) survival prediction. (D) AUGUR is evaluated in the MultiRRnaSeq1 cohort. (E) The bar chart represents the percentage of patients in the MultiRRnaSeq1 cohort who are classified as concordant low, concordant high, and discordant risk using AUGUR risk scores. (F and G) Boxplots showing the standard deviation of risk scores for multiregional samples per patient that derived from each signature, except for the signature of PMID23567350 (for this signature, the standard deviation of distance to good reference sample group is shown) in the MultiRRnaSeq1 cohort (F) and the MultiRArray cohort (G). Each point represents a patient, and each box represents a signature. The median of each signature is shown. (H) Percentage of genes located in each heterogeneity quadrant corresponding to [150]Fig. 2B. Each stacked bar in (H) represents one of the 13 collected signatures. (I) Stacked bar marked as ‘Observed’ represents percentage of merged genes of 13 signatures located in each heterogeneity quadrant. Stacked bar marked as ‘Expected’ represents expected percentage of genes in each heterogeneity quadrant calculated from [151]Fig. 2B. The p value represents Fisher test of gene percentage in Q4 Set between ‘Observed’ and ‘Expected’. A value of p <0.05 is defined as statistical significance. AUGUR, a utility gadget using RNA; PMID, PubMed ID; ROC, receiver operating characteristic. Given that the size of the signature was not correlated with the discordant risk classification of a patient, we asked what makes a signature more or less sensitive to intratumour heterogeneity. Considering that the components of AUGUR are derived from the Q4 quadrant, we wondered in which quadrant are the genes of other signatures mainly distributed? We evaluated the inter- and intratumour heterogeneity of all genes in the 13 signatures (n = 134 genes) using the MultiRRnaSeq1 cohort ([152]Fig. 4H) and found that 60% (80/134) of the genes that significantly exceeded the expected number (28% [37/134], p = 2.31 × 10^-7) were located in the Q1 quadrant, in which genes exhibited high inter- and intratumour heterogeneity ([153]Fig. 4I). The mutation-derived signature[154]^29 and the ceRNA regulatory network-related signature[155]^34 comprising 100% (9 of 9) and 75% (6 of 8) Q1 genes, respectively, exhibited 46% and 55% discordant risk classification ([156]Fig. 1D, [157]Fig.4H) that are higher than AUGUR that comprises 100% Q4 genes (low ITH) ([158]Fig. 4D and E), which suggested that the ITH levels of the genes in the signature affected the sensitivity of a signature to ITH. These phenomena can explain that the reported HCC ProGESigs possess strong discrimination abilities for patients, however, they also possess strong discrimination abilities for intratumour multiregional samples (vulnerable to sampling bias) because of the lack of circumvention of ITH, which may contribute to the low reproducibility rate in independent patient cohorts or using tumour samples from different regions of the same patient. To further determine what biological pathways are likely to be lost in AUGUR compared with other signatures, we performed pathway enrichment analysis on genes in Q1 and Q4 quadrants ([159]Tables S22 and S23). We found that Q1 genes were significantly enriched for pathways involved in cell adhesion (e.g. regulation of cell–cell adhesion; cell–substrate adhesion), extracellular matrix organisation, and immune system process (e.g. regulation of immune effector process; activation of immune response), while the top pathways enriched in Q4 showed involvement in regulation of small GTPase mediated signal transduction. The characteristics of high ITH and significant enrichment for immune-related pathways of Q1 genes are consistent with a previous study,[160]^13 which demonstrated significant immune-ITH within HCC tumours. Cell adhesion and extracellular matrix organisation are associated with invasion and metastasis which are considered as late events in tumorigenesis.[161]^36 Hence, the genes in Q4 seem to lose the regulation of tumoural evolutionary late-related and immune-related biological pathways. The small GTPase mediated signal transduction functions as signalling hubs that regulate many important physiological and pathophysiological processes such as cancer.[162]^37 Thus, the genes in Q4 are likely to function as nodal points that integrate broad upstream regulatory inputs and disseminate broad effector outputs. Further studies should be conducted to explore whether biomarkers developed from signalling hubs in Q4 were more robust to clinical sampling bias, and whether immune signatures were essentially heterogeneous. AUGUR is associated with tumour adverse features Exploring the clinicopathological and biological underpinnings of the AUGUR signature using TCGA-LIHC HCC cohort ([163]Table S24, [164]Fig. 5A), we observed that higher TNM stage ([165]Fig. 5B, proportion of patients at stage III/IV: 35% in high vs. 19% in low AUGUR risk group, p = 2.07 × 10^-3) and histological grade ([166]Fig. 5C, proportion of patients at grade 3/4: 50% in high vs. 26% in low AUGUR risk group, p = 1.98 × 10^-5) experienced significantly high enrichment rates in the high AUGUR risk group. In the ICGC-LIRI-JP RNA-seq-based ([167]Table S25) and FULCI-HCC microarray-based ([168]Table S26) cohorts, we validated that patients with advanced HCC with high histological grade, TNM stage, BCLC stage, or CLIP stage exhibited a significant increase in AUGUR risk scores ([169]Fig. S5A–D, p < 0.05). Moreover, we found that AUGUR risk score was significantly negatively correlated with HCC doubling times that were calculated based on imaging data in the [170]GSE54236 cohort (R = -0.36, p = 0.0014, [171]Fig. 5D) and patients with high AUGUR scores had significantly shorter HCC doubling times (log rank: p < 0.0001, [172]Fig. 5E).[173]^6 A high AUGUR risk score predicted a high risk of vascular invasion (p = 2.07 × 10^-2, [174]Fig. 5F) and metastasis (p = 2.09 × 10^-9, [175]Fig. 5G) in the TCGA-LIHC cohort, and the same results were obtained in the ICGC-LIRI-JP cohort ([176]Fig. S5E and F, [177]Table S25). The serum level of AFP, a well-established HCC biomarker, was also significantly positively associated with AUGUR risk score in both training and validation datasets and in both RNA-seq-based and microarray-based platforms ([178]Fig. 5H, [179]Fig. S5G and H, [180]Table S24, [181]Tables S26–S28, p < 0.01 for TCGA-LIHC; CHCC-HBV, and FULCI-HCC cohorts). These results indicated aggressive characteristics including advanced tumour status, faster tumour growth rate and high invasive and metastatic potential in HCC patients with high AUGUR scores. Fig. 5. [182]Fig. 5 [183]Open in a new tab A comprehensive view of clinicopathological and molecular features of tumour samples with high or low AUGUR risk scores. (A) Waterfall plot showing the SMGs with a mutation frequency higher than 3%, and the corresponding AUGUR risk and clinicopathological characteristics of HCC samples in the TCGA-LIHC cohort. The bar chart on the right exhibits the mutation frequency of each gene in all analysed samples. (B and C) Bar charts showing the distribution of HCC patients with different TNM stage (B) and histological grades (C) in AUGUR high- and low-risk groups of the TCGA-LIHC cohort. (D) Correlation of AUGUR risk score and calculated tumour volume doubling time of HCC based on imaging data in the [184]GSE54236 cohort. Correlation analysis applied Pearson’s product-moment correlation test. (E) Kaplan–Meier analysis for tumour volume doubling time of HCC in the [185]GSE54236 cohort according to the AUGUR risk. (F–H) Bar charts showing the distribution of HCC patients with different vascular invasion status (F), different metastatic risk (G), and different serum levels of AFP (H) in AUGUR high- and low-risk groups of the TCGA-LIHC cohort. (I) Stacked bar charts show genes with significantly different mutation frequencies between AUGUR high- and low-risk groups in the TCGA-LIHC cohort. Statistical analysis applied Fisher’s exact test (B and C, F–I) and log-rank test (E). A value of p <0.05 is defined as statistical significance. AUGUR, a utility gadget using RNA; SMG, significantly mutated gene; TNM stage, tumour-node-metastasis stage. Significantly mutated gene (SMG) analysis was performed in the TCGA-LIHC cohort. The mutational landscapes revealed that TP53 (31%) and CTNNB1 (27%) were the two most frequently mutated genes in HCC ([186]Fig. 5A). Intriguingly, patients with high AUGUR scores had a significantly higher mutation probability of TP53 (47% in the high vs. 15% in the low AUGUR risk group, p = 1.14 × 10^-9), whereas patients with low scores had a significantly higher mutation probability of CTNNB1 (34% in the low vs. 19% in the high AUGUR risk group, p = 2.05 × 10^-3). The significant mutational differences were further confirmed in four independent validation cohorts ([187]Fig. S5I–L, [188]Table S29, ICGC-LIRI-JP, CHCC-HBV, Mongolian-HCC, and INSERM-HCC cohorts). These results were consistent with previous studies that found TP53 mutations were significantly enriched in high-risk and poorly differentiated HCC, whereas CTNNB1 mutations were predominantly associated with low-risk and well-differentiated HCC,[189]^5^,[190]^38 which associated high AUGUR risk with TP53 mutation and high histological grade, whereas it associated low AUGUR risk with CTNNB1 mutation and low histological grade. Construction and validation of an outcome predictive nomogram integrating AUGUR and other independent predictive factors To provide clinicians with a quantitative model to predict the survival probability of a particular patient with HCC, we constructed a nomogram that integrated both AUGUR and clinicopathological risk factors to predict 1-yr, 3-yr, and 5-yr outcomes using the TCGA-LIHC dataset. AUGUR and TNM stage that also demonstrated independent prognostic capacities in multiple cohorts according to multivariate analysis ([191]Tables S12, S13, and S16) were incorporated ([192]Fig. 6A). Calibration plots for the 1-yr, 3-yr, and 5-yr survival rates were generated and showed that the outcomes predicted by the nomogram were approximated to actuality in all TCGA-LIHC training ([193]Fig. 6B) and three validation cohorts ([194]Fig. 6C–E, ICGC-LIRI-JP, CHCC-HBV, and FULCI-HCC cohorts). The discriminative ability of the nomogram (C-index: 0.75 [0.69–0.80]; 0.74 [0.66–0.82]; 0.66 [0.59–0.73]; and 0.69 [0.63–0.74] for TCGA-LIHC, ICGC-LIRI-JP, CHCC-HBV, and FULCI-HCC cohorts, respectively) was stronger than that of either AUGUR or tumour TNM stage alone in multiple cohorts ([195]Table S30). Time-dependent ROC curves further demonstrated that the specificity and sensitivity of the prognostic nomogram were superior to any single independent predictive factor for 1-yr ([196]Fig. 6F, AUC: 0.84, 0.82, and 0.69 for nomogram, AUGUR, and TNM stage, respectively), 3-yr ([197]Fig. 6G, AUC: 0.76, 0.70, and 0.66 for nomogram, AUGUR, and TNM stage, respectively) and 5-yr ([198]Fig. 6H, AUC: 0.75, 0.70, and 0.63 for nomogram, AUGUR, and TNM stage, respectively) survival, which were also validated in three independent cohorts ([199]Fig. S6A–G, [200]Table S30). Taken together, the cooperative nomogram based on AUGUR and TNM stage could enhance the survival prediction compared with a single prognostic factor. Fig. 6. [201]Fig. 6 [202]Open in a new tab Nomogram construction and performance evaluation. (A) Nomogram is generated according to multivariate analysis in the TCGA-LIHC training cohort to predict 1-yr, 3-yr, and 5-yr survival. (B–E) Calibration curves indicate the applicability of the nomogram for 1-yr, 3-yr and 5-yr survival in the TCGA-LIHC (B), ICGC-LIRI-JP (C), CHCC-HBV (D), and FULCI-HCC (E) cohorts. The dotted line represents the ideal nomogram, the violet, green, and orange solid lines represent the observed nomograms in terms of 1-yr, 3-yr and 5-yr survival prediction, respectively. (F–H) Time-dependent ROC curves are used to evaluate the sensitivity and specificity of the nomogram, AUGUR, and TNM stage in predicting 1-yr (F), 3-yr (G), and 5-yr (H) survival in the TCGA-LIHC training cohort. AUGUR, a utility gadget using RNA; OS, overall survival; ROC, receiver operating characteristic. Discussion ITH, which is prevalent across cancer types,[203]^14^,[204]^16^,[205]^39^,[206]^40 is regarded as an unaddressed confounding factor for biomarker discovery and application.[207]^16 Molecular biomarkers inferred by a single biopsy without considering tumour heterogeneity could either over- or underestimate the prognostic risk of a patient, as recently described in lung cancer, breast cancer, and clear cell renal cell carcinoma.[208]^17^,[209]^18^,[210]^41 Multiregional tumour analysis demonstrates discordant prognostic risks derived from different tumour regions assessed by established HCC ProGESigs from a single biopsy, including epithelial-related, immune-related, mutation-derived, epigenetic-related, the most extensively cited five-gene signature and other signatures, which emphasises the importance of multiregional sampling or spatial analyses and incorporating ITH in tumour biomarker design. Here, we leveraged multiregional HCC transcriptome profiles to minimise the confounding effects of ITH, and simultaneously combined differentially expressed genes and genes with prognostic value to develop a robust ITH-free RNA-based biomarker, from which we defined a signature AUGUR that consists of 12 core genes and maintains prognostic reproducibility and generalisability across multiple HCC patient cohorts from different countries (e.g. American [TCGA-LIHC and NCI-HCC], China [CHCC-HBV and FULCI-HCC], France [INSERM-HCC], Japan [ICGC-LIRI-JP] and Mongolia [Mongolia-HCC]) and across different commercial profiling platforms. Previous recommendations have suggested that multiregional tumour sequencing may be more informative for prognostication, thus we evaluated the discrimination and prognostic accuracy of AUGUR via comparing it with that of 13 established multigene expression signatures. The results demonstrated AUGUR was superior to the other 13 signatures. None of the 13 established signatures had overlapping genes with AUGUR. Further, we collected another 52 reported ProGESigs (totally 65 signatures involving 390 unique genes, [211]Table S31), and found that only one gene (RACGAP1) in one signature overlapped with the core gene set of AUGUR ([212]Table S11). We also detected whether core genes of AUGUR were included in the biomarkers of the CancerLivER database that included more than 594 liver cancer biomarkers,[213]^42 or appeared in all fields of a scientific literature that included ‘hepatocellular carcinoma’ in the title using CoCites (a citation-based method for searching scientific literature).[214]^43 Three of 12 genes (CBFA2T3, S100A10, and RACGAP1) are included in the CancerLivER database and six of 12 genes (CDKN2B, MSRA, RAP2A, S100A10, RACGAP1, and TMCO3) co-occur with HCC in the scientific literatures. Overall, although more than half of core genes of AUGUR have been reported to be associated with HCC, few of them have been reported as biomarkers for HCC, which suggested the different layers of tumour were mined when ITH was circumvented and inter-tumour heterogeneity was amplified. Intriguingly, we noticed that the signatures developed by Kim et al.,[215]^5 Xu et al.,[216]^32 and Nault et al.[217]^31 exhibited the lowest discordant risk classification rate (18%) among the 13 collected signatures. However, the signatures developed by Kim et al. and Xu et al. were two extreme cases that showed the highest and lowest median risk bias of multiregional samples from the same patient, respectively, and genes of these two signatures were mainly located in Q1 quadrant and completely located in the Q3 quadrant, respectively. As we described in the results, although signatures with high inter- and intratumour heterogeneity possessed strong inter-patient discrimination, they were highly susceptible to confounding by sampling bias, whereas signatures with low inter- and intratumour heterogeneity could maintain the concordant intratumour risk, but the ability to discriminate between patients was also reduced, which was prone to misclassification (e.g. patients H10 and H9 were respectively classified as low risk and high risk in our AUGUR and the 65-gene signature developed by Kim et al., whereas in the double low heterogeneity signature developed by Xu et al., patients H10 and H9 were classified as high risk and low risk, respectively; [218]Fig. S7). Our AUGUR neutralises the strengths and weaknesses of the above two signatures, provides the strongest inter-patient discrimination, and maintains intratumour stability. Actually, tumour heterogeneity performs in multiple dimensions, including genomic, epigenetic modification, transcriptomic and regulatory heterogeneity, etc. As we discussed in our previous study, RNA is the next-level executor of DNA according to the central dogma of genetics,[219]^44 thus the variations and modifications at the DNA level will be manifested through RNA expression, and genomic heterogeneity and DNA epigenetic modification heterogeneity will naturally be reflected through RNA heterogeneity. In the past, scientists have primarily focused on identifying the greatest level of mutation and copy number variation heterogeneity that fostered tumour evolution, we hypothesised that the transcriptomic heterogeneity might be more likely to reflect the heterogeneity of biological phenotypes and clinicopathological characteristics of tumours comparing to genomic heterogeneity. Therefore, biomarker design based on transcriptomic heterogeneity might be preferable, superior, more straightforward, and more able to circumvent total tumour heterogeneity than using other upstream omics heterogeneity. Of course, protein is the ultimate function executor, and proteomic heterogeneity has also been reported in tumours,[220]^45 however, the measurement techniques of protein are more expensive and not as sophisticated as transcriptome sequencing, which leads to the relatively scarce proteomic data and its limited application. Thus, we suggest AUGUR currently serves as a pragmatic solution for predicting outcomes of patients with HCC using single-region tumour samples, and biomarker design schemes based on protein heterogeneity should be further investigated in future studies. In summary, to the best of our knowledge, this study is the first to introduce, quantify and integrate inter- and intratumour heterogeneity, and circumvent ITH in HCC biomarker design. A machine-learning algorithm was utilised to extract ITH-free features that clonal expression in tumour and provided the greatest inter-patient discrimination, from which we developed a prediction signature AUGUR that overcame sampling bias. The higher AUGUR risk score was significantly associated with tumour adverse features and mortality of patients with HCC, which was validated in multiple cohorts and across profiling platforms with robust prognostic significance. A quantitative nomogram integrating AUGUR and TNM stage was generated and performance validated, which might provide reliable prognostic information tailored to the individual patient and help clinicians select a personalised therapeutic regimen for patients with HCC. Nevertheless, it will be necessary to further hone our ITH-free AUGUR prognostic model and integrated nomogram in a prospective study and a large cohort with multiregional tumour samples. Financial support This work was supported by the National Natural Science Foundation of China (62102122), the Tou-Yan Innovation Team Program of the Heilongjiang Province (Grant 2019-15), the China Postdoctoral Science Foundation (2020M681111 and 2021MD703831), the Heilongjiang Postdoctoral Young Talents Program (LBH-TZ2114), the Heilongjiang Postdoctoral Funds for Scientific Research (LBH-[221]Z20168 and LBH-[222]Z20170), and the Basic Scientific Research Funds of Heilongjiang Provincial Universities in 2021. Authors’ contributions Conceived the idea and conceptualised the study: SL, YZ, XZ. Conducted the bioinformatic analysis, generated the figures and tables and wrote the manuscript: SL, YZ. Collected and pre-processed data: SL, YZ, YJ. All authors have read and approved the final version. Data availability statement Data sources and handling of the publicly available datasets used in this study are described in the Materials and methods and Supplementary materials. The code and data used in this study is available from the following website(s): [223]https://github.com/BioinfoDriver/AUGUR. Further information is available from the corresponding authors upon request. Conflicts of interest The authors declare that they have no potential conflicts of interest. Please refer to the accompanying ICMJE disclosure forms for further details. Please refer to the accompanying ICMJE disclosure forms for further details. Acknowledgements