Graphical abstract graphic file with name fx1.jpg [47]Open in a new tab Highlights * • A large-scale multi-omics study of hepatocellular carcinoma (HCC) * • Robustness and simplified panel for HCC proteomic subtypes * • Proteomic model to personalize prediction of sorafenib response for HCC precision therapy * • Workflow from generating large-scale omics datasets to drug testing __________________________________________________________________ Xing et al. develop a workflow from generating large-scale omics datasets to drug testing. The robustness and simplified panel for hepatocellular carcinoma (HCC) proteomic subtypes, as well as effective prediction of sorafenib response, demonstrate that targeting multi-omics can improve the implementation of personalized therapeutic strategies for HCC. Introduction Hepatocellular carcinoma (HCC) accounts for about 75%–85% of all primary liver cancer, with limited therapies and accompanied by poor prognosis.[48]^1 The majority of HCC patients are diagnosed at an advanced stage and can only receive systemic anti-tumor therapy, which is less effective and has a low response rate.[49]^2^,[50]^3^,[51]^4^,[52]^5^,[53]^6^,[54]^7 In particular, the prognoses of patients with HCC at the same clinical stage have been remarkably different, making it challenging to predict the outcomes. Therefore, accurate staging and subsequent appropriate personalized treatment are the keys to improving clinical outcomes of HCC. Extensive efforts have been devoted to identify new subtypes for HCC using genomic and transcriptomic data,[55]^8^,[56]^9^,[57]^10^,[58]^11^,[59]^12 but with limited impact on clinical decisions. Proteins ultimately perform the real biological functions and represent the real world of complex diseases; in particular, the important post-translational modifications (PTMs) of proteins cannot be informed from genomic and transcriptomic data, such as phosphorylation-related kinases that are important drug targets. Proteomics, which reflects systemic changes in functional executors, has been extensively applied for biomarker discovery and therapy selection in cancers[60]^13^,[61]^14^,[62]^15^,[63]^16^,[64]^17 and other complex diseases,[65]^18^,[66]^19^,[67]^20^,[68]^21^,[69]^22^,[70]^23 and provides unlimited potential for precision medicine. Recently, Jiang et al. performed a comprehensive proteomics study and stratified early-stage HCC patients into three subtypes with different clinical outcomes (SI, SII, and SIII), and they identified a therapeutic target sterol O-acyltransferase 1 (SOAT1) for subtype SIII with the worst prognosis.[71]^24 Subsequently, Gao et al. performed the proteogenomic characterization of hepatitis B virus (HBV)-related HCC and also reported three similar HCC proteomic subtypes; meanwhile, they further identified two prognostic biomarkers involved in metabolic reprogramming.[72]^25 These two studies opened the avenue to precision typing and prognosis determination of HCC based on proteomics. To apply HCC proteomic subtypes in clinical precision therapy, further verifying the robustness and universality of HCC proteomic subtypes and identifying signatures for HCC subtypes are crucial. In addition, occurrence and progression of HCC is an extremely complex process, and the proteomics also needs to be integrated with genomics and transcriptomics to panoramically and comprehensively understand the molecular changes of HCC from the genetic level to the protein level and even the PTM level to better predict prognosis, find personalized drug targets, and evaluate drug efficacy.[73]^26^,[74]^27^,[75]^28^,[76]^29^,[77]^30^,[78]^31^,[79]^32^ ,[80]^33^,[81]^34^,[82]^35^,[83]^36^,[84]^37^,[85]^38^,[86]^39^,[87]^40 Here, we comprehensively analyzed the genomic, transcriptomic, proteomic, and phosphoproteomic profiles of an independent HCC cohort to reveal and verify the robustness and universality of the proteomic subtypes and have developed a simplified discriminating panel of proteomic subtypes using machine learning. Integrated multi-omics analysis further revealed the functional impact of the molecular alterations for different HCC subtypes. In addition, we identified actionable drug targets and corresponding drugs for different proteomic subtypes by integrating proteomics and phosphoproteomics and further assessed the therapeutic responses in patient-derived cells (PDCs). Overall, our study highlights the importance of proteomics and phosphoproteomics in clinical decisions for HCC patients, especially in precise prognosis evaluation and personalized therapeutic strategy selection. Results Overview of the study Proteomics-based precision typing of HCC leads us to the era of precision medicine. Regarding the robustness and universality of HCC proteomic subtypes, molecular characterization and a simplified panel for clinical application are the main obstacles to further advancement of proteomics to the clinic. To this end, we performed multi-omics profiling of primary tumor tissues (T) and paired para-tumor tissues (N) from 160 HCC patients (proteomics, N = 160; phosphoproteomics, N = 132 from 160; whole-exome sequencing [WES], N = 58 from 132; transcriptome sequencing [RNA_Seq], N = 57 from 58) ([88]Figure S1A and [89]Table S1). HCC subtypes were identified by proteomic analysis based on data-independent acquisition (DIA) ([90]Figure S1B) and validated in three independent publicly available external validation sets to confirm its robustness and universality. Machine learning was used to develop a simplified panel for discriminating HCC proteomic subtypes for facilitating clinical applications. Furthermore, to explore the molecular characteristics and screen the potential clinically actionable drugs for different HCC proteomic subtypes, integrated multi-omics analysis was performed, especially kinase abundance and kinase-substrate regulatory network analysis. Finally, 26 PDC models were established to investigate the personalized response to sorafenib in different proteomic subtypes, and a sorafenib therapeutic effect prediction model was constructed and validated by machine learning to provide precise clinical decisions for patients who should undergo sorafenib treatment. Proteomic characterization identified three HCC subtypes In total, we identified 6,512 proteins (false discovery rate <0.01) across all 152 paired samples that passed quality control, accounting for 76.33% of the hybrid spectral library ([91]Figures S1C and S1D; [92]Table S2). The high stability of the mass spectrometry (MS) platform (median Pearson’s correlation coefficient of 0.954 in data-dependent acquisition [DDA] mode, 0.976 in DIA mode) resulted in high quantitative stability and accuracy of HCC samples ([93]Figures S1E and S1F). As expected, the coefficient of variation values were higher in T than that in N (6.37% vs. 5.37%) ([94]Figure S1G), and the T and N were distinctly distinguished ([95]Figures S1H and S1I), highlighting the high inter-tumor heterogeneity of HCC, consistent with previously reported studies.[96]^24 These results strongly affirmed the high quality of our proteomic data. Unsupervised consensus clustering identified three proteomic subtypes in 152 HCC patients, which were designated as SI (n = 49), SII (n = 47), and SIII (n = 56) ([97]Figures 1A and [98]S2A–S2C). Furthermore, the three proteomic subtypes significantly differed in overall survival (OS) and recurrence-free survival (RFS) (log-rank test, p < 0.0001) ([99]Figure 1B). The three proteomic subtypes were further verified in two independent surgical specimen cohorts from Jiang et al. (N = 101)[100]^24 ([101]Figure 1C) and Gao et al. (N = 159)[102]^25 ([103]Figure 1D), and even in an independent biopsy specimen cohort from Ng et al. (N = 51)[104]^41 (p < 0.01) ([105]Figure 1E), which supported the reliability of our proteomic subtypes. Consistent with the current clinical consensus, the advanced BCLC stage (p = 0.041) and TNM stage (p = 0.032, Fisher’s exact test), positive α-fetoprotein (AFP) (p < 0.001), microscopic vascular invasion (MVI^+, p = 0.006), tumor low differentiation (p = 0.001), multiple tumor number (p = 0.027, Fisher’s exact test), and absence of tumor capsule (p = 0.045, Fisher’s exact test) were more prominent in SIII than in SI, and SII was intermediate ([106]Figures 1A and [107]S2D–S2J; [108]Table S2). In addition, the proteomic subtypes were also authenticated as an independent prognosis indicator by multivariate analysis (hazard ratio, 2.43; 95% confidence interval, 1.22–4.82; p = 0.011) ([109]Figure S2K). Figure 1. [110]Figure 1 [111]Open in a new tab Proteomic characterization identified three HCC subtypes (A) Consensus clustering of 152 HCC tumors. The associations of HCC proteomic subtypes with clinical characteristics are annotated in the upper panel (chi-squared test, ∗p < 0.05, ∗∗p < 0.01, ∗∗∗p < 0.001). The heatmap depicts the relative abundance of signature proteins (log[2]-transformed). Each column represents a patient sample, and rows indicate proteins. (B) Kaplan-Meier (KM) curves of OS and RFS for each proteomic subtype (log-rank test). (C–E) KM curves of OS for each proteomic subtype in Jing et al.’s cohort (C), Gao et al.’s cohort (D), and Ng et al.’s cohort (E). The p values are calculated by log-rank test. (F) ssGSEA reveals the pathways that are significantly enriched in the three respective proteomic subtypes. The specific enriched signaling pathways of each subtype are analyzed and summarized into four clusters. C1, upregulated in SI only; C2, upregulated in both SI and SII; C3, upregulated in SII and SIII; C4, upregulated in SIII only. (G) Signal pathway changing trend in the four clusters (C1–C4). See also [112]Figures S1 and [113]S2. Molecular features of three HCC proteomic subtypes To comprehensively characterize the molecular features of three HCC proteomic subtypes, the specific enriched signaling pathways of each subtype were analyzed and summarized ([114]Figure 1F). SI was characterized by the highest enrichment of metabolism-related pathways, which accounted for the majority of all enriched pathways and was highly consistent with Jiang et al.’s and Gao et al.’s (denoted as metabolism subgroup, S-Mb) results.[115]^24^,[116]^25 SII was a transitional subtype from SI to SIII with an increase in cell-growth- and immunity-related pathways but a decrease in metabolism-related pathways. The enrichment of immune-related pathways, such as antigen processing and presentation and Toll-like receptor, was consistent with the immune properties of SII in Jiang et al.’s and Gao et al.’s studies (denoted as microenvironment dysregulated subgroup, S-Me). SIII showed an increase in proliferation-, metastasis-, and immune-related pathways but almost no metabolic pathways, which validated the immune/metastatic characteristics of Jiang et al.’s SIII subtype and the proliferation characteristics of Gao et al.’s SIII subtype (denoted as proliferation subgroup, S-Pf), suggesting more aggressive characteristics in SIII than in both SI and SII ([117]Figure 1G and [118]Table S2). Thus, the molecular characteristics of the three proteomic subtypes in the three independent studies were highly consistent, indicating the stability and reliability of HCC proteomic subtypes. Meanwhile, the relevant or specific pathways of HCC significantly correlated with their proteomic subtypes, further highlighting their clinical implications. The HCC proteomic subtypes were robust and universal To investigate the universality of HCC proteomic subtypes, we performed a cross-validation with our subtypes and the proteomic subtypes from Jiang et al.’s cohort[119]^24 and Gao et al.’s cohort,[120]^25 with patient number >100. These three cohorts belonged to different clinical centers with different inclusion/exclusion criteria and MS strategies ([121]Figure 2A). The subtype-specific signatures (namely, signatures) contained 761 proteins in our cohort, which was significantly lower than the signatures of Jiang et al. (1,269 proteins) and Gao et al. (1,274 proteins) ([122]Figure S3A and [123]Table S3). However, to increase the potential for clinical application, these signatures need to be further simplified, as it is generally believed that fewer proteins are more applicable for clinical translation. For cross-validation, firstly, clustering our DIA data with Jiang et al.’s and Gao et al.’s signatures from DDA data using unsupervised consensus clustering also generated three subtypes with different OS (p = 0.00015 and p = 0.00047), and there was obvious concordance of patient allocation with our original subtypes (Jiang et al.: 85.5%; Gao et al.: 82.9%), supporting the reliable data-acquisition and subtyping procedure in this study ([124]Figure 2B and [125]Table S3). In turn, the consistency was up to 79.2% in Jiang et al.’s cohort (p = 0.0051) and 78.6% in Gao et al.’s cohort (p < 0.0001) when clustering their data using our signatures ([126]Figure 2C). Furthermore, in the mutual verification of the signatures from each of Jiang et al.’s or Gao et al.’s cohort, the consistency between these subtypes and the original subtypes reached up to 88.1% in Jiang et al.’s cohort (p = 0.05) and 77.9% in Gao et al.’s cohort (p = 0.00032) ([127]Figures 2D and [128]S3B; [129]Table S3). These results showed that the signatures from three proteomic studies were robust and universal. Figure 2. [130]Figure 2 [131]Open in a new tab The robustness and universality of HCC proteomic subtypes and their simplified discriminating panel (A) Workflow of cross-validation of proteomic signatures in three cohorts (Gao et al.’s cohort: N = 159; Jiang et al.’s cohort: N = 101; present cohort: N = 152). (B) Validation of Jiang et al.’s and Gao et al.’s signatures in our cohort. The left panel shows the KM curves of OS and RFS according to the proteomic subtypes (log-rank test). The right alluvial plot shows the comparison between these subtypes and the original subtypes. (C) Validation of our signatures in Jiang et al.’s and Gao et al.’s cohorts. The confusion matrices are provided. (D) Validation of Jiang et al.’s and Gao et al.’s signatures in each other’s cohorts. The confusion matrices are provided. (E) Concordance rate of three proteomic signatures in three cohorts. (F) Workflow for developing the simplified panel for discriminating proteomic subtypes. (G) Receiver-operating characteristic accuracy, sensitivity, and specificity of simplified panel for discriminating proteomic subtypes in the validation set. (H) Alluvial plot shows the comparison between proteomic subtypes identified by simplified panel and the original subtypes in the validation set. The predictor represents subtypes from the SP9, while the response represents the original subtypes from the full panel. See also [132]Figure S3. Generally, a strong concordance of three proteomic subtypes was validated in three independent cohorts (over 80% on average), with SIII having the highest concordance among the three subtypes in three independent cohorts, followed by SI, while SII showed the lowest consistency because of its transition properties ([133]Figure 2E). In addition, most of the inconsistencies by different signatures in the same cohort were found in SII or SIII because of the similar molecular features in SII and SIII as mentioned above ([134]Figures S3C–S3H). A machine-learning-based simplified panel for distinguishing HCC proteomic subtypes It is expressly necessary to reduce the number of signatures of HCC proteomic subtypes to facilitate potential clinical applications. For machine learning, our cohort and Jiang et al.’s cohort were used as the training set, and Gao et al.’s cohort was used as the independent external validation set, after removing the batch effect from these three cohorts ([135]Figures 2F and [136]S3I). The machine-learning analysis resulted in a simplified panel (SP9) comprising nine proteins (DCXR, EHHADH, ALDH4A1, ABAT, ALDH6A1, ALDH7A1, SULT2A1, SORD, and ACSM2B), which were all drastically downregulated in SIII ([137]Figure S3J) and significant for OS to stratify the patients ([138]Figure S3K). Mechanistically, they were involved with fatty acid metabolism (ACSM2B, EHHADH), amino acid metabolism (ALDH6A1, ALDH7A1, ALDH4A1, ABAT), glycometabolism (SORD, DCXR), and bile secretion (SULT2A1). SP9 showed high sensitivity and specificity to discriminate SI or SIII from the total HCC cohort ([139]Figures S3L and S3M). In the validation cohort, the sensitivity and specificity of SP9 for SI discrimination were 84.5% and 96.4%, respectively with an area under the curve (AUC) of 0.961; for SII discrimination the sensitivity and specificity were 72.7% and 89.4% with an AUC of 0.881; and the highest sensitivity (95.8%) and specificity (95.5%) with the highest AUC of 0.988 were obtained in discriminating SIII from SI and SII patients, respectively ([140]Figure 2G). Furthermore, the subtypes using SP9 showed a similarity up to 84.3% with the original subtypes, in conformance with the similarity of the three subtypes in different cohorts ([141]Figure 2H). These results indicated that SP9 was stable and reliable for HCC subtyping, which provided a possibility for clinical application of HCC proteomic subtypes. Mutation and copy-number variation landscapes of three HCC proteomic subtypes To further explore the molecular characteristics of these three proteomic subtypes, we performed WES (N = 58), RNA_Seq (N = 57), and phosphoproteomic (N = 132) analysis of paired T and N samples ([142]Figure 3A), whose clinicopathological characteristics were not significantly different from the total samples ([143]Figure S4A). The mutation landscape showed that TP53, TNN, OBSCN, CTNNB1, RYR2, PCLG, MUC16, RYR1, ARIDA1, and RB1 were the most frequently mutated genes in HCC ([144]Figures 3B and [145]S4B; [146]Table S4). CTNNB1 mutations were the most important feature in SI and SIII and were significantly associated with OS and RFS (p < 0.05) rather than WNT pathway alterations ([147]Figures 3C–3F and [148]S4C–S4G). These findings were consistent with previous reports, indicating the reliability of our results.[149]^42^,[150]^43 Figure 3. [151]Figure 3 [152]Open in a new tab Proteogenomic and immune landscape of three HCC proteomic subtypes (A) Illustration of 152 paired HCC cases used in the individual omics experiments. The omics experiments are colored blue, and the tumor tissues and non-tumor tissues were detected in pairs. (B) WES-based genomic landscape of the three HCC proteomic subtypes. The top panel represents the mutation profile, the middle panel represents TMB and TNB levels, and the bottom panel represents the CNV. (C–E) Feature importance ranking by the random forest algorithm for SI (C), SII (D), and SIII (E). The feature with the highest rank of importance score indicates the highest association (either positive or negative) with the proteomic subtype. (F) KM curves for OS of HCC patients with CTNNB1 mutation or wild type (log-rank test). (G) Heatmap shows the immune cell populations of HCC patients belonging to different proteomic subtypes. (H) Principal component analysis plot of immune scores of immune cell populations based on proteomic data in three proteomic subtypes. (I) Proteome-based immune scores in three proteomic subtypes (two-tailed Wilcoxon test). Box plots show median (central line), upper and lower quartiles (box limits), and 1.5× interquartile range (whiskers). (J) Proteomics-based immune scores of anti-tumor immunity and pro-tumor immune suppression in three proteomic subtypes (two-tailed Wilcoxon test). Box plots show median (central line), upper and lower quartiles (box limits), and 1.5× interquartile range (whiskers). (K) Correlation between anti-tumor immunity and pro-tumor immune suppression based on proteome in three proteomic subtypes. Each point represents a patient sample, the lines represent the fitted curves of correlation in each subtype, and the shaded area represents 95% confidence interval. Pearson’s correlation coefficient (r) and p values are present in the table. The p values were calculated using Pearson’s correlation method. See also [153]Figure S4. Meanwhile, the copy-number variation (CNV) landscape showed significant differences among different HCC subtypes, with different regulatory genes during hepatocarcinogenesis and progression. 1q was predominantly co-amplified in SII and SIII, suggesting that they may co-regulate gene expression ([154]Figure 3B). Additionally, the tumor mutation burden (TMB) and tumor neoantigen burden (TNB) in individual patients revealed the heterogeneity of HCC, consistent with previous reports.[155]^44 Nevertheless, there was no significant difference in TMB and TNB among the three proteomic subtypes ([156]Figure 3B and [157]Table S4). Immune landscape of three HCC proteomic subtypes To investigate the immune landscape of three HCC proteomic subtypes, we estimated the abundance of 28 immune cell populations using the single-sample gene set enrichment analysis (ssGSEA) method based on the proteomic and transcriptomic data. The immune landscape of three HCC proteomic subtypes showed high heterogeneity, with distinctly distinguished abundance of immune cell populations ([158]Figures 3G, 3H, [159]S4H, and S4I). The immune infiltration score was significantly higher in SIII than that in SI and SII (p < 0.05, median ± SD: SIII [162.74 ± 258.87] vs. SII [17.72 ± 154.67] vs. SI [−186.26 ± 163.07]) ([160]Figures 3I and [161]S4J). In detail, SI was characterized by increasing CD56 dim natural killer cells compared to SII and SIII. Meanwhile the abundance of most immune cells, including but not limited to central memory CD4 T cells, natural killer T cells, regulatory T cells, and myeloid-derived suppressor cells, were significantly higher in SIII than in the other two subtypes ([162]Figure S4K). The enriched ssGSEA score of immune activation cells (p < 0.05, median ± SD: SIII [1.25 ± 3.05] vs. SII [0.32 ± 2.64] vs. SI [−0.73 ± 4.05]) and immune suppression cells (p < 0.05, median ± SD: SIII [0.57 ± 3.33] vs. SII [−3.10 ± 2.37] vs. SI [−0.48 ± 2.32]) were both significantly higher in SIII than in SI and SII ([163]Figures 3J and [164]S4L). Meanwhile, anti-tumor immunity and pro-tumor immune suppression were significantly positively correlated in SIII, indicating a co-existence of immune activation and immunosuppression in aggressive HCC ([165]Figures 3K and [166]S4M). In addition, the abundance of key molecules of the immune microenvironment were significantly different among the three proteomic subtypes. As expected, SIII displayed significant upregulation of human leukocyte antigen (HLA) molecules (including HLA class I and II molecules), immune checkpoints (such as PDCD1 [PD-1], CD86, and CD47), immune-related CT antigens (such as ADAM28, CEP55, SPAG9, and PBK), and cytokines (such as IL-16, CCL3, TGFB-3, TGFB-1, CCL21, CFS1, and IL-18) compared with the other two subtypes ([167]Figure S4N). These results suggested that SIII HCC patients might be more suitable for immunotherapy, although this would need to be validated in other prospective clinical trials. Phosphoproteomic and kinase profiles of three HCC proteomic subtypes In phosphoproteomics analysis, the high inter-tumor heterogeneity in HCC that was shown in proteome was also clearly presented in phosphoproteome, and the characteristics of the phosphorylation modifications were similar to those in other reports ([168]Figures 4A and [169]S5A–S5G; [170]Table S5).[171]^45 The signaling pathways enriched in three proteomic subtypes at the phosphorylation level were also very similar to that at RNA level and protein level ([172]Figures 4B, 4C, and [173]S5H–S5J; [174]Table S6). Considering that protein kinases have been developed as viable drug targets for cancer therapies, we next investigated the kinase activity (reflected by the overall phosphorylation level of the substrates) in three proteomic subtypes and its correlation with the substrate phosphorylation level and kinase abundance. In total, 215 kinases were found in HCC, among which the activity of 75 kinases differed among three subtypes, and the enriched pathways also suggested the close association of kinase activity with the occurrence and development of HCC ([175]Figures 4D and 4E, [176]Table S5). Furthermore, the kinase-substrate correlation was significantly lower in SIII than that in SI and SII (p < 0.01, median ±SD: SIII [0.15 ± 0.22] vs. SII [0.17 ± 0.22] vs. SI [0.16 ± 0.23]) ([177]Figures 4F and 4G). Nevertheless, there was no correlation between kinase abundance and kinase activity ([178]Figure 4H). Moreover, the clinical value of kinase activity and kinase abundance in HCC was also significantly different. For example, the kinase activity of BCKDK, CDK16, and MAP3K2, but not the kinase abundance, was significant for the prognosis of HCC ([179]Figures 4I, 4J, and [180]S5K). Therefore, it is necessary to systematically analyze the abundance and activity of kinases by integrating proteome and phosphoproteome for drug screening and precise treatment. Figure 4. [181]Figure 4 [182]Open in a new tab Phosphoproteomic profile and kinase-substrate regulatory network of three HCC proteomic subtypes (A) Summary of the identification of phosphoproteins and kinases. (B) Pathway alterations of phosphoproteins in three proteomic subtypes. (C) Enriched functions of three proteomic subtypes by ssGSEA. (D) Kinase activation enrichment of differentially abundant phosphosites among three proteomic subtypes. Each column represents a patient sample, and each row indicates a kinase. (E) Kinase regulation-pathway network. Rhombus indicates signaling pathways, and roundness indicates kinases differentially expressed in phosphoproteomic data. (F) Kinase-substrate regulation networks in three proteomic subtypes. The edges represent Pearson’s correlation coefficient between kinases and the corresponding phosphosubstrates. (G) Distribution of Pearson’s correlation coefficients of kinase-substrate networks in (F) (two-tailed Wilcoxon test). Box plots show median (central line), upper and lower quartiles (box limits), and 1.5× interquartile range (whiskers). (H) Correlation between kinase abundance and kinase activity. r represents Pearson’s correlation coefficient. The p values were calculated using the Pearson’s correlation method. (I and J) KM curves of OS (I) and RFS (J) for BCKDK activity (log-rank test). See also [183]Figure S5. Multi-omics landscape of three HCC proteomic subtypes To explore the inter-omics correlation, we visualized the correlations of different omics using Pearson’s correlation coefficient. The results demonstrated that the correlation decreased along with the “central dogma,” which may be attributed to the enrichment of attenuated proteins, reflecting RNA splicing, protein synthesis, protein degradation, and even PTMs ([184]Figures 5A–5C, [185]S6A, and S6B). Notably, phosphorylation levels poorly correlated with CNV alterations, mRNA, and protein abundance (median Pearson’s r < 0.02), indicating the important role of the protein PTM in the development and progression of HCC ([186]Figures 5D and 5E). The inter-omics correlations differed among the three proteomic subtypes, especially the correlation between RNA abundance and protein abundance, while the correlation between protein abundance and phosphorylation level was not significantly different among the three proteomic subtypes, mainly due to the low overall correlation between them ([187]Figure S6C). Furthermore, we found that two-thirds of the significant mRNA-protein co-variates were positive while showing a higher number of negative regulatory interactions of protein phosphorylation, which were classified into three clusters ([188]Figures 5F and [189]S6D). Consistent with the characterization of proteomic subtypes, the SI-specific positive correlation cluster featured metabolism characteristics, and SIII-specific positive correlation clusters were significantly involved in proliferation and metastasis pathways ([190]Figures 5G and [191]S6E). Figure 5. [192]Figure 5 [193]Open in a new tab Integrated multi-omics analysis of three HCC proteomic subtypes (A) Correlation between WES, transcriptome, proteome, and phosphoproteome. The overlap patients of every two individual omics were analyzed. (B) Spearman’s correlation of CNV to mRNA and protein. Each dot represents a transcript/protein. Attenuated proteins are represented in red using a Gaussian mixture model with two mixture components. (C) Enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the attenuated proteins. (D) Scatterplots depicting log[2] (T/N) of protein (x axis) and phosphoprotein (y axis) abundance. The red dots indicate positive correlations, and the blue dots indicate negative correlations. The p values were calculated using the Pearson’s correlation method. (E) Enriched KEGG pathways for negative-correlated proteins in (D). (F) Hierarchical clustering analysis map of significantly changed phosphosite-to-protein correlations among three proteomic subtypes. Pearson’s correlation coefficients between matched pairs of phosphosite abundances versus protein abundances were calculated. (G) Functional enrichment for significant phosphosite-to-protein correlations in each cluster. (H) Phosphosite-to-protein co-varying MCODE complexes/subnetworks of significantly changed phosphosite-to-protein correlations among three proteomic subtypes. Top five phosphosite-to-protein co-varying MCODE complexes/subnetworks are shown in different colors. (I–K) Top specific MCODEs in SI (I), SII (J), and SIII (K) positivity are combined with upstream kinases. See also [194]Figure S6. Furthermore, MCODE complex/subnetwork analysis showed the top five key co-varying phosphosite-to-protein MCODEs ([195]Figure 5H). We noticed that some kinases were shared within MCODEs; for example, AKT1 and CDK1, the top co-varying phosphosite-to-protein MCODEs, differed in the three proteomic subtypes. mRNA processing was the top MCODE in SI-positive cluster, and the kinase PRKCA involved in transcriptional regulation of upstream mRNA splicing was identified, consistent with previous reports ([196]Figure 5I).[197]^46 Furthermore, nucleocytoplasmic transport was the top MCODE in the SII-positive cluster, and the experimentally verified or predicted upstream kinases (CLK1, MAPKAPK2) were also mapped ([198]Figure 5J). In contrast, regulation of actin cytoskeleton was the top MCODE in the SIII-positive cluster, and the kinase AKT1 involved in the upstream regulation of actin cytoskeleton organization was also reported ([199]Figure 5K). Altogether, due to the high heterogeneity and patient specificity, the integrated analysis of multi-omics is more suitable for the comprehensive molecular characterization of HCC. In particular, the integrated analysis of proteomics and phosphoproteomics has promising prospects for drug screening and precision treatment of HCC. Drug-response prediction for three HCC proteomic subtypes To identify potential clinical drugs for the three HCC subtypes, we screened drug targets of solid tumor within the DrugBank database using quantitative proteomic and phosphoproteomic data. A total of 38 quantifiable drug targets (including 11 kinases) belonging to 33 clinically actionable drugs were identified, with significant difference in abundance among three proteomic subtypes ([200]Figure 6A). We identified targets of bevacizumab (C1QC, C1QB, C1QA, FCGR3A, FCGR2C), dasatinib (RSC, HSPA8, PPAT, CSK, LYN, MAPK14, FRK), and neratinib (ORM1, ORM2), with higher abundance in SIII than in SI and SII. Notably, RAF1 and LYN, key targets of sorafenib and regorafenib, had significantly higher kinase abundance in SIII than in SI and SII. However, enzymes affecting the drug activity (UGT1A9, UGT1A1, UGT1A3) or drug metabolism (CYP2D6, CYP3A4) of sorafenib and regorafenib were identified as highly ranked targets in SI and SII but not in SIII. Furthermore, phosphorylated substrates of kinase RAF1, MAP2K1, MAP2K2, and LYN were also significantly higher in SIII, suggesting abnormal regulation of kinase activity in SIII. In addition, the inconsistency between kinase abundance and substrate phosphorylation suggested that the aberrant regulation of kinases was caused by both kinase abundance and activity ([201]Figure 6A). Figure 6. [202]Figure 6 [203]Open in a new tab Drug prediction and key drug target screening for three HCC subtypes (A) Phosphosubstrates of kinases with clinical available drugs and fold change at proteomic and phosphoproteomic levels for kinases and substrates, respectively. The top section displays the abundance of drug-targeting proteins across the three subtypes, with each row representing a drug-targeting protein and “k” labeling representing drug-targeting kinase. The middle section shows the phosphorylation site abundance of substrate for differentially abundance kinases among the three subtypes, with each row representing the phosphorylation sites of substrate. The bottom section displays the abundance of the protein where the phosphorylation site is located, with each row representing a protein. The color gradient represents the abundance of drug-targeting proteins (kinases), phosphorylation sites, and substrate proteins, with green indicating low expression and red indicating high expression. Red labels indicate statistically significant differences in abundance among the three proteomic subtypes. (B) Kinase activity of FDA-approved drug targets in three proteomic subtypes (two-tailed Wilcoxon test). (C) Pathways based on the selected phosphosubstrates and kinases, with relevant drugs shown by targets. (D) Prognostic risk scores of each target from FDA-approved HCC clinical drugs. The x axis indicates log[2]-based hazard ratio for each target (log-rank test); y axis indicates log[2]-based T/N fold change for each target (two-tailed Wilcoxon test). (E) KM curves of OS and RFS for RAF1. The p values were calculated by log-rank test. (F) Abundance of RAF1 among three proteomic subtypes (two-tailed Wilcoxon test). (G) Correlation between substrate RAF1 and upstream kinase activity among three proteomic subtypes (two-tailed Wilcoxon test). See also [204]Figure S6. We next analyzed the kinase activity of the drug targets and observed high heterogeneity in kinase activity among the three proteomic subtypes ([205]Figure S6F). Specifically, the kinase activity of EGFR was significantly increased in SII compared to SI and SIII (p < 0.05); the kinase activity of ERBB2 was significantly decreased in SIII compared to SI and SII (p < 0.05); while the kinase activity of MAP2K1 was significantly increased in SIII compared to SI and SII (p < 0.05), highly consistent with the protein abundance of these kinases in the three subtypes. However, the kinase activity of RAF1 was significantly downregulated in SIII compared to SI and SII, which was completely opposite to the protein abundance ([206]Figures 6B and [207]S6G). Combined with the significant increase in protein abundance of BARF, MAP2K2, and MAPK1 in SIII, we speculated that the RAF-MEK-ERK signaling pathway was significantly upregulated in SIII ([208]Figure S6G). Additionally, the decrease in kinase activity of downstream mTOR and the increase of its substrate EIF4EBP1 phosphorylation in SIII suggested that combination therapy targeting both the RAF-MEK-ERK and mTOR pathways may be a more effective approach for SIII ([209]Figures 6C and [210]S6H). Therefore, a more comprehensive investigation of the signaling pathway regulatory networks could provide effective evidence for expanding treatment options beyond the current Food and Drug Administration (FDA)-approved HCC therapies. Furthermore, after stringent filtering processes, 19 out of 22 drug targets with clinically available drugs showed significant prognostic values for HCC ([211]Figures 6D and [212]S6I; [213]Table S7). Among these targets, RAF1 had the highest negative correlation with OS and RFS (p < 0.01) ([214]Figure 6E). The abundance of RAF1 was also significantly higher in SIII than in SI (p < 0.05) ([215]Figure 6F). Additionally, the correlation between RAF1 and its upstream kinases significantly increased in SII and SIII compared to SI (p < 0.01), suggesting the important regulating role of RAF1 in SII and SIII ([216]Figure 6G). These observations can explain the potential clinical benefits to SIII subtype HCC patients of RAF1-targeted therapies such as sorafenib, regorafenib, and dabrafenib. SIII subtype has more specific drug sensitivity to sorafenib than SI To further explore the potential drug response of HCC patients in three proteomic subtypes, we investigated the therapeutic effects of sorafenib on 26 PDC models ([217]Figures 7A and 7B; [218]Table S1). The clinicopathological features of these PDC models were not significantly different from those of the proteomic subtype discovery cohort ([219]Figure S7A). We assigned these 26 PDCs to their corresponding proteomic subtypes by consensus clustering of their proteomic data ([220]Figure S7B). Figure 7. [221]Figure 7 [222]Open in a new tab Subtype-specific drug sensitivities and machine-learning-based efficacy prediction model for sorafenib (A) Workflow of pharmacological tests using PDC models. (B) Immunofluorescence of PDCs. Blue represents DAPI staining of the cell nucleus, green represents the distribution of GPC3 on the cell membrane, and red represents the localization of α-fetoprotein (AFP) and albumin (ALB) in the cytoplasm. Scale bar, 20 μm. (C) Sorafenib sensitivity results of the PDC models in three proteomic subtypes (two-tailed Wilcoxon test). Box plots show median (central line), upper and lower quartiles (box limits), and 1.5× interquartile range (whiskers). (D) Sorafenib sensitivity results of the PDC models under different concentrations in three proteomic subtypes (two-tailed Wilcoxon test). Box plots show median (central line), upper and lower quartiles (box limits), and 1.5× interquartile range (whiskers). (E) Proportion of patients who achieved IC[50] response in three proteomic subtypes. IC[50], half-maximal inhibitory concentration. (F) Correlations between elastic-net-predicted and observed AUC in the validation set. Correlations were calculated by Pearson’s correlation method. (G) Regulation networks of selected protein features in sensitivity prediction model for sorafenib (Pearson’s correlation). (H) Correlations between enriched pathways and observed AUC in PDC models. (I) Enrichment of pathways associated with sorafenib sensitivity in three proteomic subtypes (two-tailed Wilcoxon test). See also [223]Figure S7. We then measured the drug response of each PDC to sorafenib by calculating the AUC of the dose-response curve after 4 days of treatment and compared the differences in response to sorafenib among the three proteomic subtypes ([224]Figure S7C). Overall, sorafenib effectively inhibited the proliferation of PDCs with a wide range of drug sensitivities for each subtype, indicating high heterogeneity of HCC PDCs ([225]Figure S7D). Notably, SII and SIII tumors showed high sensitivity to sorafenib while SI was mostly resistant (p < 0.05) ([226]Figures 7C and[227]S7D). In particular, the inhibition of PDC growth in SII and SIII was significantly stronger than in SI at sorafenib concentration of 32 μM or even higher ([228]Figure 7D). Moreover, the percentage of tumors achieving half-maximal inhibitory concentration (IC[50]) in SIII was significantly higher than that in SI ([229]Figure 7E), suggesting that SIII tumors may have more benefit from sorafenib. These findings demonstrated the strong potential of proteomic subtypes in predicting targeted drug sensitivity of HCC patients. Machine-learning-based efficacy prediction model for sorafenib We developed an elastic net model to predict the responses to sorafenib using 16 tumors as the training set, while the remaining 10 tumors were regarded as the validation set. Remarkably, we observed high correlations (Pearson’s r^2 = 0.753, p = 0.0011) between the predicted and observed AUC for sorafenib ([230]Figure 7F). In total, 19 protein features were selected by elastic net in the prediction of drug response to sorafenib, with most showing negative correlations with sorafenib sensitivity ([231]Figure 7G). Notably, most of the key proteins identified in the prediction model have not been reported to contribute to sorafenib response, such as EPHX2, ENPP1, and AKR7A3, which need further attention in future studies. As shown in [232]Figure 7H, metabolism-related pathways were highly enriched in the tumors resistant to sorafenib, while cell cycle and mRNA surveillance pathways were highly enriched in tumors sensitive to sorafenib ([233]Figures 7H, 7I, and [234]S7E). These findings were highly consistent with the molecular characteristics of SI and SIII, further demonstrating the higher sensitivity of SIII and SII tumors to sorafenib than SI. Interestingly, herpes simplex virus 1 infection, Salmonella infection, and viral life cycle HIV-1 showed high correlations with sorafenib sensitivity, suggesting that viral or bacterial infection might enhance the sorafenib sensitivity of HCC patients due to the close relationship between viral or bacterial infection and the immune microenvironment, which may also affect the efficacy of sorafenib ([235]Figure S7E).[236]^47 Collectively, these results suggest that proteomic subtypes can serve as proxies for predicting the treatment efficacy of targeted therapies for HCC patients. Discussion Currently, major advances and breakthroughs in precision medicine rely entirely on genomic analysis.[237]^48^,[238]^49^,[239]^50 A major challenge for proteomics-driven precision medicine is how to use the comprehensive proteome to identify subtypes of patients with shared biological factors that can be targeted for treatment. In this study, we identified three HCC proteomic subtypes characterized by metabolism, proliferation & immune, and proliferation & immune & metastasis, respectively. These findings were generally consistent with the proteomic subtypes of HCC reported by Jiang et al. and Gao et al. Furthermore, the cross-validation of three independent large-scale proteomics studies resulted in a stable and universal proteomic subtype for HCC, which was not limited to multiple centers, patient characteristics (BCLC stages and HBV infection), protein quantitative strategies (labeling and label-free), and data-acquisition mode of MS. These results demonstrated the reliability of the proteomic subtype strategies for HCC reported by Jiang et al. and Gao et al. as well as the feasibility of using the proteomics for HCC classification. Moreover, we developed a multi-centric and robust simplified panel based on machine learning for distinguishing HCC proteomic subtypes, suggesting that proteomics holds great promise in identifying HCC-subtype patients associated with different prognosis who might benefit from further clinical treatment. The integrated multi-omics analysis revealed alterations in mutation profiles, immune landscapes, and kinase-substrate regulation networks among three proteomic subtypes. SIII had stronger immune infiltration, suggesting that SIII HCC patients might derive potential benefits from immunotherapy. Therefore, a clinical trial of immunotherapy for SIII HCC patients could be attempted to test this speculation. Furthermore, the drug-targetable proteins identified through proteomic and phosphoproteomic data may provide evidence for expanding treatment selection beyond the current FDA-approved HCC therapies, which is promising for overcoming the limited availability of therapeutic drugs and their low response rates in HCC. Sorafenib remains the first-line drug for advanced HCC.[240]^2^,[241]^51 Based on the multi-omics dataset and PDC drug testing, we found that response rates to sorafenib were significantly higher in SII and SIII than in SI. We therefore hypothesize that SII and SIII patients may derive more benefit from sorafenib than SI patients. However, this hypothesis needs to be tested separately in a clinical trial. We further established a response prediction model for sorafenib based on machine learning. For patients who have lost the opportunity for surgery, we proposed a strategy that exploits the kinase-substrate regulation networks to effectively predict the benefit form sorafenib, regardless of whether tumors carry targets of sorafenib. Promisingly, the accumulation of multi-omics data combined with effective drug testing has the potential to establish a precise strategy for identifying the most appropriate drugs for HCC patients. In summary, we validated the proteomic subtypes of HCC in multi-centric datasets and developed a simplified panel for distinguishing these subtypes. Integrated multi-omics analysis revealed the functional impact of characterized molecular alterations in three proteomic subtypes. Effective prediction of sorafenib response demonstrated that targeting multi-omics could improve the implementation of advanced personalized therapeutic strategies for HCC. These findings provided important insights into the development of precision medicine for HCC patients and offered promising strategies to improve the selection of appropriate therapies for individual patients. Limitations of the study A potential limitation of our study is primarily the sample size of HCC patients in the clinical cohort. Inclusion of a larger sample size could allow for more sufficient statistical analyses among the three proteomic subtypes, including prognostic and ssGSEA analyses. Especially for the construction of a sorafenib efficacy prediction model, expanding the sample size of the training set and validation set could greatly improve the reliability. Future studies in large prospective cohorts are warranted to validate the proteomic subtypes of HCC and evaluate the efficacy of the prediction model for consideration in clinical trials. Taken together, while our study has some limitations, it establishes a strong foundation for the investigation of precision medicine for HCC patients. STAR★Methods Key resources table REAGENT or RESOURCE SOURCE IDENTIFIER Antibodies __________________________________________________________________ AFP Polyclonal antibody Proteintech Cat# 14550-1-AP; RRID: [242]AB_2223933 Albumin Polyclonal antibody Proteintech Cat# 16475-1-AP; RRID: [243]AB_2242567 glypican-3 (1G12) SANTA Cat# sc-65443; RRID: [244]AB_831376 Goat anti-Mouse IgG (H + L) Highly Cross-Adsorbed Secondary Antibody, Alexa Fluor Plus 488 Invitrogen Cat# A32723; RRID: [245]AB_2633275 Alexa Fluor® 546 goat anti-rabbit IgG (H + L) Life Technologies Cat# A11010; RRID: [246]AB_2534077 Alexa Fluor® 546 goat anti-mouse IgG (H + L) Life Technologies Cat# A11003; RRID: [247]AB_2534071 __________________________________________________________________ Biological samples __________________________________________________________________ Tumor with paired surrounding non-tumor tissues from primary HCC patients Mengchao Hepatobiliary Hospital of Fujian Medical University (Fujian, China) This paper Patient-derived cell (PDC) Mengchao Hepatobiliary Hospital of Fujian Medical University (Fujian, China) This paper __________________________________________________________________ Chemicals, peptides, and recombinant proteins __________________________________________________________________ Urea Sigma-Aldrich Cat# U1250 β-Glycerophosphate disodium salt hydrate Sigma-Aldrich Cat# G9422 Sodium orthovanadate Sigma-Aldrich Cat# S6508 Sodium pyrophosphate tetrabasic decahydrate Sigma-Aldrich Cat# S6422 Sodium dihydrogen phosphate dihydrate Sigma-Aldrich Cat# 04269 Protease inhibitor cocktail tablets Roche Cat# 11697498001 PhosSTOP Roche Cat# 04906837001 Iodoacetamide Sigma-Aldrich Cat# I6125 DL-Dithiothreitol Sigma-Aldrich Cat# D0632 Acetonitrile, LC-MS Grade Thermo Fisher Scientific Cat# 51101 Water, LC-MS Grade Thermo Fisher Scientific Cat# 51140 Triethylammonium bicarbonate buffer Sigma-Aldrich Cat# 90360 Sequencing Grade Modified Trypsin Promega Cat# V5111 0.1% Formic acid in water Thermo Fisher Scientific Cat# 85170 0.1% Formic acid in ACN Thermo Fisher Scientific Cat# 85174 Pierce HeLa Protein Digest Standard Thermo Fisher Scientific Cat# 88329 iRT Biognosys Cat# KI-3002-1 Collagenase D Roche Cat# 11088866001 Dnase I Roche Cat# 10104159001 EBSS Gibco Cat# 14155063 DMEM/F12 Gibco Cat# C11330500BT GlutaMax Gibco Cat# 35050061 HEPES Gibco Cat# 15630106 Pen/Strep Gibco Cat# 15140122 B27 Gibco Cat# 12587010 N2 Gibco Cat# 17502048 Forskolin Sigma-Aldrich Cat# F3917 Nicotinamide Sigma-Aldrich Cat# N0636 N-acetyl-L-cysteine Sigma-Aldrich Cat# A9165 recombinant human EGF PeproTech Cat# AF-100-15 recombinant human FGF-10 PeproTech Cat# 100-26 recombinant human HGF PeproTech Cat# 100-39 [Leu[248]^15]-Gastrin I human Sigma-Aldrich Cat# G9145 A83-01 Tocris Bioscience Cat# 2939 Y-27632 MCE Cat# HY-10071 Dexamethasone Sigma-Aldrich Cat# D1756 Hoechst 33342 Beyotime Cat# C1028 Sorafenib Selleckchem Cat# S7397 __________________________________________________________________ Critical commercial assays __________________________________________________________________ EasyⅡProtein Quantitative Kit (BCA) TransGen Biotech Cat# DQ111 High-Select™ Fe-NTA Phosphopeptide Enrichment Kit Thermo Fisher Scientific Cat# A32992 Pierce High pH Reversed-Phase Peptide Fractionation Kit Thermo Fisher Scientific Cat# 84868 QIAamp DNA Mini Kit Qiagen Cat# 51306 Agilent SureSelect Human All Exon V6 kits Agilent Technologies Cat# 5190-8864 TransDetect Cell Counting Kit (CCK) TransGen Biotech Cat# FC101 __________________________________________________________________ Deposited data __________________________________________________________________ DrugBank database NA RRID:[249]SCR_002700; [250]https://go.drugbank.com/drugs MSigDB database Broad Institute RRID: [251]SCR_002700; [252]https://data.broadinstitute.org/gsea-msigdb/msigdb/release/2022.1. Hs/c2.cp.kegg.v2022.1.Hs.symbols.gmt NetworKIN 3.0 Horn et al.[253]^46 RRID: [254]SCR_007818; [255]http://networkin.info PhosphoSitePlus Hornbeck et al.[256]^52 RRID:[257]SCR_001837; [258]https://www.phosphosite.org Proteomic data of this cohort (N = 152) This paper iPROX: IPX0005108000; ProteinXchange: PXD046519 WES data of this cohort (N = 58) Cai et al.[259]^53 NGDC: HRA000045 RNA_Seq data of this cohort (N = 57) Li et al.[260]^44 NGDC: HRA000464 MSigDB c2 gene sets (version 6.2) NA [261]http://software.broadinstitute.org/gsea/msigdb/index.jsp Proteogenomic data of the Jiang et al. ’s cohort (N = 101) Jiang et al.[262]^24 PMID: [263]30814741 Proteogenomic data of the Gao et al. ’s cohort (N = 159) Gao et al.[264]^25 PMID: [265]31585088 Proteomic data of the Ng et al. ’s cohort (N = 51) Ng et al.[266]^41 PMID: [267]35508466 __________________________________________________________________ Software and algorithms __________________________________________________________________ HCC Proteomic Molecular subtypes and the SP9 This paper [268]https://doi.org/10.5281/zenodo.10053446 Spectronaut (version 13.2) Biognosys Inc. [269]https://www.biognosys.com R (version3.6.3) R Core Team [270]https://www.r-project.orgs NIS-Elements (version 4.40) Nikon RRID: [271]SCR_014329 TitanCNA (version 1.23.1) Ha et al.[272]^54 [273]https://github.com/gavinha/TitanCNA Gephi (version 0.9.2) Bastian et al.[274]^55 RRID:[275]SCR_004293; [276]https://gephi.org/ Cytoscape (version 3.9.0) Shannon et al.[277]^56 RRID:[278]SCR_003032; [279]https://cytoscape.org/ ConsensusClusterPlus R package (version 1.50.0) Wilkerson et al.[280]^57 RRID:[281]SCR_016954; [282]http://bioconductor.org/packages/release/bioc/html/ConsensusCluste rPlus.html GISTIC (version 2.0.23) Mermel et al.[283]^58 RRID:[284]SCR_000151; [285]https://portals.broadinstitute.org/cgi-bin/cancer/publications/vie w/216 NMF (version 2.23.0) Gaujoux et al.[286]^59 [287]https://cran.r-project.org/web/packages/NMF/index.html Mutect (version 2) Broad Institute RRID:[288]SCR_000151; [289]https://portals.broadinstitute.org/cgi-bin/cancer/publications/vie w/216 Genome Analysis Toolkit (GATK) (version 4.1.0.0) Broad Institute RRID:[290]SCR_001876; [291]https://software.broadinstitute.org/gatk/ caret R package (version 1.50.0) Github [292]http://topepo.github.io/caret/index.html ANNOVAR (version 2017 Jul 17) Wang et al.[293]^60 [294]http://annovar.openbioinformatics.org/en/latest/ ggplot2 R package (version 3.3.6) Github [295]https://github.com/tidyverse/ggplot2 [296]Metascape software (version 3.5) Zhou et al.[297]^61 [298]https://metascape.org/gp/index.html#/main/step1 __________________________________________________________________ Other __________________________________________________________________ Centrifugal filter Millipore Cat# UFC5010BK ACQUITY UPLC BEH C18 Column Waters Cat# 186002352 nanoEase M/Z HSS T3 Column Waters Cat# 186008818 Acclaim PepMap™ 100 C18 column Thermo Fisher Scientific Cat# 164946 Cell Strainers Biologix Cat# 15-1070 Glass bottom cell culture dish Biosharp Cat# BS-15-GJM 96 Well cell culture plate Corning Cat# CLS3599 SpectraMax M5e Molecular Device RRID: [299]SCR_020300 Waters Acquity H-class UPLC system Waters Cat# ACQUITY UPLC H-Class; RRID: [300]SCR_022217 EASY-nLC 1000 Thermo Fisher Scientific Cat# LC120; RRID: [301]SCR_014993 Q Exactive Plus mass spectrometry Thermo Fisher Scientific Cat# IQLAAEGAAPFALGMBDK; RRID: [302]SCR_020556 Zeiss LSM 780 Confocal Laser Scanning Microscope Carl Zeiss RRID: [303]SCR_020922 [304]Open in a new tab Resource availability Lead contact Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Xiaolong Liu (xiaoloong.liu@gmail.com). Materials availability This study did not generate new unique reagents. Data and code availability * • Raw data of proteome and phosphoproteome derived from HCC tissues have been deposited at iProX: IPX0005108000 and ProteomeXchange: PXD046519.[305]^62^,[306]^63 Raw data of genome and transcriptome derived from HCC tissues have been deposited at Genome Sequence Achieve of National Genomics Data Center (NGDC): [307]HRA000045, [308]HRA000464. Accession numbers are listed in the [309]key resources table. * • All analysis code has been deposited at Zenodo: [310]https://doi.org/10.5281/zenodo.10053446 and is publicly available as of the date of publication. DOls are listed in the [311]key resources table. * • Any additional information required to reanalyze the data reported in this work paper is available from the [312]lead contact upon request. Experimental model and study participant details Patients and tissue samples for multi-omics analysis A total number of 160 primary HCC patients obtained from Mengchao Hepatobiliary Hospital of Fujian Medical University in China were recruited for multi-omics analysis in this study (Figure S1A). The clinical and pathological characteristics of these HCC patients were summarized in [313]Table S1. The included HCC patients represented all major subtypes at different pathological stages, such as age, gender, tumor number, tumor size, tumor capsule, AFP concentration, HBV-DNA, PVTT, MVI, BCLC stage, TNM stage, cirrhosis, and differentiation. The median follow-up of this cohort for overall survival was 30.5 months, and recurrence-free survival had a median of 7.2 months ([314]Figure S1A and Table S1). This project was approved by the Institution Review Board of Mengchao Hepatobiliary Hospital of Fujian Medical University. Informed consent was obtained from each participant before the operation. The use of clinical specimens was completely in compliance with the “Declaration of Helsinki”. Patient derived cell models of HCC To established Patient derived cell models of HCC, a total of 26 fresh HCC tumor tissue samples from patients who underwent surgical resection were collected, and the study protocol of clinical samples was approved by the Institutional Ethics Committee of Mengchao Hepatobiliary Hospital of Fujian Medical University. Tumor tissue acquisition was approved by the ethics committees of Mengchao Hepatobiliary Hospital of Fujian Medical University (Fuzhou, China) and agreed by each patient via written informed consent, and the whole procedure was carried out according to national and institutional regulations on experimental use of human tissues. The surgical tissue samples were enzymatically dissociated using digestive lysate (2.5 mg/mL Collagenase D, 1 mg/mL DNase I, 1×EBSS), and then the tumor cells from HCC tissues were collected by centrifugation at 500g for 3 min, followed by washing with washing buffer (1×DMEM/F12, 2mM GlutaMax, 10mM HEPES, 1×Pen/Strep).[315]^64 PDCs were cultured in defined medium (1×DMEM/F12, 2mM GlutaMax, 10mM HEPES, 1×Pen/Strep, 1×B27, 1×N2, 10μM forskolin, 10mM Nicotinamide, 1.25 mM N-Ac, 50 ng/mL hEGF, 100 ng/mL FGF-10, 25 ng/mL hHGF, 10 nM Gastrin, 5 μM A83-01, 10 μM Gastrin, 10 μM Y-27632, 3 nM Dexamethasone), and seeded in 96-well plates at 50000 cells/well in quintuplicate for sorafenib (purchased from Selleckchem) treatment at 37°C in a humidified atmosphere with 5% CO[2]. In addition, the PDCs were inoculated in 15 mm confocal dishes (Biosharp Life sciences) with 1 × 10^5 cells in triplicate for immunofluorescence. The cells were cultured as described above. Method details Protein extraction and digestion Total protein was extracted from tissue samples by lysis buffer containing 9 M urea, 75 mM NaCl, 10 mM Tris-Hcl (pH8.0), 10 mM IAA, 1 mM NaF, 1 mM β-glycerophosphate, 1mM Sodium orthovanatate, 10 mM Sodium pyrophosphate, 100 mM Sodium dihydrogen phosphate, 1 mM PMSF and protease inhibitors as described previously. Briefly, tissue was disrupted using a tissue homogenizer on ice for 3 min followed by centrifugation at 17000× g for 10 min at 4°C and then the supernatant was collected. The proteins were sequentially reduced with 10 mM dithiothreitol at 55°C for 30 min and alkylated with 50 mM iodoacetamide in the dark for 30 min at room temperature. Then, the protein concentration was determined using the BCA assay. Subsequently, 100 μg of total protein was loaded in 10 kDa centrifugal filter tubes (Millipore), washed twice with 400 mL 100 mM triethyl ammonium bicarbonate (TEAB), and enzymatic digestion was carried out with trypsin at a concentration of 1:50 (w/w) using filter-aided sample preparation for 18 h at 37°C. Peptides were eluted by centrifugation at 14,000g for 15 min and evaporated to dryness for LC-MS/MS analysis. To prepare samples for spectral library construction, each 20 samples were pooled into a 1 mg protein mixture (equally to a final of 16 pools), and prepared in the same method as mentioned above ([316]Figure S1B). Phosphopeptide enrichment The phosphopeptide enrichment was performed using High-Select Fe-NTA Phosphopeptide Enrichment Kit (Thermo Scientific) according to the kit manual. In brief, the peptide mixture from 1 mg proteins were dissolved with 200 μL binding buffer, then were added to a Fe-NTA spin column and incubated for 20 min at room temperature with end-over-end rotation. After 1 min of centrifugation at 1,000 g for 1 min, the column was washed for 4 times with wash buffer to remove nonspecific peptides. Finally, phosphopeptides were eluted with elution buffer and were immediately dried by speed-vac for MS analysis. To prepare samples for spectral library construction, each 40 samples were pooled into a 10 mg protein mixture (equally to a final of 8 pools), and prepared in the same method as mentioned above ([317]Figure S1B). High pH reversed-phase separation In order to increase the depth of protein or phosphopeptide identification for generating the spectral libraries, the pooled samples were fractionated by high-pH reverse phase liquid chromatography. Peptide mixtures were resuspended in FA/water (1/1000) for peptide fractionation. Offline basic RP UPLC fractionation was performed on a XBridge BEH C18 column (1.7 μm, 100 μm × 10 cm) using a Waters Acquity H-class UPLC system. Chromatographic separation was performed with 30 min gradient as following: 0–5 min, increase to 7% B; 5–20 min, 7–35% B; 20–25 min, 35–80% B (mobile phase A: 0.1% FA (v/v); mobile phase B: 90% ACN, 0.08% FA (v/v), flow rate 300 nL/min). A fraction was collected every 30 s; to reduce the MS running time, the 60 fractions were pooled (F01 + F13 + F25, F02 + F14 + F26 and so on) as described[318]^45 to a final of 12 samples ([319]Figure S1B). For each sample, 2 μg were taken out and dried by vacuum centrifugation for proteome analysis. LC-MS/MS analysis Peptide fractions were analyzed on an EASY-nLC 1000 LC system (Thermo Fisher Scientific,Waltham, MA) coupled with an Orbitrap Exactive Plus mass spectrometry (Thermo Fisher Scientific, Waltham, MA). Briefly, the peptide was resuspended in mobile phase C (0.1% formic acid in water) and equal amounts of 11 indexed retention time (iRT) peptide standards (Biognosys) were spiked into each shotgun run in data-dependent acquisition (DDA) mode. And the peptides were separated onto the analytical column (1.8 μm, 75 μm × 25 cm) with a 120 min gradient at a constant flow rate of 300 nL/min (0–3 min, 4 to 7% of D; 3–79 min, 7 to 20% D; 79–103 min, 20 to 32% of D, 103–105 min, 32 to 90% of D and held at 90% D for 15min mobile phase D: 0.1% Formic acid in ACN). MS was operated under a data-dependent acquisition mode. The spectra of full MS scan (m/z 300–1800) were acquired by Orbitrap mass analyzer at 70,000 resolution for a maximum injection time of 60 ms with an AGC target value of 3e6. Up to 20 precursors were selected for MS2 analysis and fragmented by HCD using a normalized collision energy of 27% and analyzed using the Orbitrap at 17,500 resolution for a maximum injection time of 45 ms with AGC target value of 1e5. The dynamic exclusion was 20 s and the precursor ions with unassigned charge state as well as charge state of 1+, or superior to 8+ were excluded from fragmentation selection. As above, equal amount of iRT peptide standards were mixed into individual sample peptide for each shotgun run at DIA mode, and nano-LC MS/MS basic parameters were as the same as described above. In DIA mode, precursor ions were acquired using an MS1 master scan (m/z range: 400–1200, max. injection time: 20 ms, AGC target: 3e6, resolution: 70K), following 32 DIA scans for MS2 with two kinds of wide isolation window as follows: m/z range 400–1000 using an isolation window width of m/z 20, and m/z range 1000–1200 using an isolation window width of m/z 100. Each isolation window overlapped by 1 m/z. Generation of proteomic and phosphoproteomic spectral library To generate an extensive spectral library, both DDA and DIA files were processed using Pulsar search engine inside Spectronaut with default settings. Reference FASTA files for human was downloaded from UniProt (release 2019-04, 23046 entries), combining with the fusion sequence of iRT. Enzyme specificity was set as trypsin/P. The maximum missing cleavage site was set as 2. The mass tolerances of precursor ion and fragment ion for peptides were set at 10 parts per million (ppm) and 0.02 Da, respectively. The minimal peptide length was set at 7. Carbamidomethyl (C) was set as fixed modifications. Oxidation (M) and Acetyl (Protein N-term) were set as variable modifications. For phosphoproteomic analysis, phospho (STY) was also set as a variable modification. In additional, the false discovery rate (FDR) was set to 1% at peptide spectrum matches, 1% at peptide precursor level and 1% at protein level. DIA mode analysis to get proteomic and phosphoproteomic data In total, 304 proteome DIA files and 264 phosphoproteome DIA files passed the quality-control were searched using Spectronaut based on the hybrid spectral library of HCC proteome or phosphoproteome, respectively. Reference FASTA files from UniProt database were combining with the fusion sequence of iRT. The samples were grouped by T (HCC tumor tissue group) and N (paratumor noncancerous tissue group). Calibration was set to non-linear iRT calibration with enabled precision iRT. Identification was performed using 5% q-value cutoff on precursor and protein level, while the maximum number of decoys was set to a fraction of 0.1 of library size. Quantity was determined on MS/MS level using area of XIC peaks with enabled cross run normalization. For phosphoproteomic analysis, minor quantified (Peptide) grouping was set by modified sequence and PTM localization was activated and the probability cutoff was set to 0, in order to summarize phosphopeptide or phosphosite later. Quality control of the MS platform To evaluate the performance of the MS system, the HeLa protein digest standard (Pierce) was measured every 10 samples from beginning to the end of the project as the quality-control standard. DDA analysis were performed all through the project, and DIA analysis were performed during the DIA process. The standard was analyzed using the same method and conditions as using in the HCC related tissue samples. A Pearson’s correlation coefficient was calculated for all quality-control runs in the statistical analysis environment R3.6.3. Quality assessment of proteomic and phosphoproteomic data The two-component mixture model implemented in the R MCLUST package v.5.2 was applied to filter the samples with insufficient peptide counts and protein identifications. In total, 152 paired samples among the 160 total involved paired samples passed the quality-control and were used for further proteomic analysis. Furthermore, 132 paired samples among the 152 paired samples passed the quality-control and were used for further phosphoproteomic analysis. All samples that passed quality control showed good consistency in the proteome quantification and phosphoproteome quantification ([320]Figures S1F and [321]S5F). Furthermore, PCA (Principal Component Analysis) and C.V (Coefficient of Variance) distribution of protein quantification showed that there was no batch effect in proteomic and Phosphoproteomic data ([322]Figures S1G, S1I, [323]S5E, and S5G). Pre-processing of proteomic and phosphoproteomic data In order to correct the initial loading volume for each sample, proteomic (304 samples and 6512 proteins) and phosphoproteomic (264 samples and 54789 phosphopeptides) data from Spectronaut software were first log2-transformed, and then normalized using the median centering method. Glmnet Ridge Regression (GRR) of NAguideR ([324]https://www.omicsolution.org/)[325]^65 was used to impute missing values in tumor or paired non-tumor samples. Before missing value imputation, proteins and phosphopeptides with more than 50% missing data were taken out to make sure that each sample had enough data for imputation. Consensus clustering for HCC in proteomic data For proteomic subtype identification, 1128 proteins from the training set with the top 1500 of MAD (median absolute deviation) and with expression in more than 80% of the tumor samples were selected for consensus clustering using the ConseususClusterPlus R package.[326]^57 The parameters were as follows: maxK = 8, reps = 1,000 bootstraps, pItem = 0.8, pFeature = 0.8, clusterAlg = “kmdist”, distance = “euclidean”. The number of clustering was determined by the average pairwise consensus matrix within consensus clusters, the delta plot of the relative change in the area under the cumulative distribution function (CDF) curve, and the average silhouette distance for consensus clusters. To verify the reliability of the proteomic subtypes, consensus clustering was also performed in the validation set using the same parameters. Identification of signatures for HCC subtypes Signatures for HCC subtypes were a group of proteins that meet the following criteria: (1) the Benjamini-Hochberg-adjusted p values in each subtype should be less than 0.01 compared to the other subtypes; (2) fold change of the protein abundance between proteomic subtypes was greater than 2 for upregulation or less than 0.5 for downregulation. We performed these analyses using the “Wilcoxon test” function in the R3.6.3 environment. In total, we identified the signatures containing 761 proteins from the 3 HCC proteomic subtypes, and these signatures were then subjected to further analyses. To verify the reliability of the signatures, consensus clustering was also performed in all 152 tumor patients using both the 1128 proteins and the signatures containing 761 proteins with the same parameters. The subtypes identified using 1128 proteins were taken as the true subtypes, and the results were compared using a confusion matrix. Correlations between proteomic subtypes and clinical features The chi-square and Fisher-exact tests were performed to identify the correlations of proteomic subtypes with clinicopathologic feature. Survival curves were generated using the KM method, and the log rank test was applied to calculate differences between the curves. Hazard ratios (HR), 95% confidence intervals (CI) and figures were estimated by the survival and survminer or plotted by ggplot2 R packages. The univariable and multivariable Cox analyses were further used to evaluate the prognostic power of HCC proteomic subtypes. All statistical analyses were performed in R (version 3.6.3), and a significance level of 0.05 was used. Pathway alterations of 3 HCC proteomic subtypes To explore the pathway alterations of 3 HCC proteomic subtypes, we performed single sample gene set enrichment analysis (ssGSEA) with 4026 proteins that were identified in more over 50% of patients using the R package GSVA,[327]^66 and the enrichment scores were scaled by R. In this study, 186 canonical pathway gene sets derived from the KEGG database were obtained from the MSigDB database ([328]https://data.broadinstitute.org/gsea-msigdb/msigdb/release/2022.1 .Hs/c2.cp.kegg.v2022.1.Hs.symbols.gmt). The clusterProfiler R package[329]^67 was used to perform pathway enrichment analysis of differentially abundant proteins or phosphoproteins. The statistical significance of pathways between 3 subtypes were analyzed by the “Wilcoxon test”, and p value less than 0.01 was considered as significant. Cross-validation of HCC proteomic subtypes from multi-centers To verify the robustness and universality of HCC proteomic subtypes, the protein abundance matrix, signatures, proteomic subtypes, and prognostic information were collected from Jiang et al.’s and Gao et al.’s cohorts. 1269 signatures obtained from Jiang et al.’s cohort containing 101 patients and 9252 proteins, as well as 1274 signatures were obtained from Gao et al.’s cohort containing 159 patients and 6478 proteins in total. The cross-validation was mainly focused on the two-by-two comparison of the 3 subtypes in 3 cohorts. For example, consensus clustering was performed using Gao et al.’s signatures and Jiang et al.’s signatures in our cohort, and they were further compared with our original subtypes based our signatures. When reclassifying samples in different cohort based on other signatures, the classification method in the original article related to the cohort was used. The validation results were presented in a confusion matrix, and the proportion of consistently assigned patients to the total patients was used to assess the consistency of the two subtypes. For patients with different assignments, survival analysis with two assignments by the KM method were required. Simplified panel for discriminating proteomic subtypes To construct the simplified panel for discriminating proteomic subtypes, a total of 412 patients (152 patients from our cohort, 101 patients from Jiang et al.’s cohort, and 159 patients from Gao et al.’s cohort) were used. Of these, our cohort and Jiang et al.’s cohort (N = 253) were used as the training set for model training and parameter tuning, and Gao et al.’s cohort was used as an external independent validation set to evaluate the model performance (N = 159). The scale function in R was used to normalize 3 datasets to eliminate the batch effect. Then, the panel for discriminating proteomic subtypes was constructed and validated by the following steps. Firstly, a random sampling method was used to screen for differentially expressed proteins in 3 subtypes, a total of 50 replicates were performed, 40 of which reached the threshold (p value < 0.01, fold change >1.5) that were initially selected as differentially expressed proteins. After removing the proteins with correlations higher than 0.9 with other proteins, a total of 1133 differentially expressed proteins were selected in the training set. Secondly, for reducing the misleading impact of random fluctuations and correlations, Boruta algorithm from the Boruta R package was used to select the signatures. Then, 59 SI specifically expressed proteins and 88 SIII specifically expressed proteins were selected in total. Thirdly, the KNN (K-Nearest Neighbors) algorithm of the caret R package was used to construct the SI discriminating panel and SIII discriminating panel, respectively. And a 5-fold cross-validation was performed to further reduce the number of signatures of the discriminating panel. The specific parameters are as follows: method = “cv”, 5, summaryFunction = twoClassSummary, and classProbs = TRUE for the trainControl function; method = “knn”, tuneLength = 30, and metric = “ROC” for the train function. Finally, 9-protein simplified panel containing 4 proteins from the SI subtype and 6 proteins from the SIII subtype (1 protein was shared) were integrated to identify HCC proteomic subtypes. In addition, the simplified panel for discriminating proteomic subtypes was validated in an external independent validation set, and the accuracy, sensitivity and specificity were calculated to evaluate the panel performance. DNA extraction and whole-exome sequencing Total DNA from HCC tissues and para-tumor tissues were extracted and subjected to DNA library preparation using QIAamp DNA Mini Kit (Qiagen) according to the manufacturer’s instructions. Whole-exome capture was carried out using Agilent SureSelect Human All Exon V6 kits. And sequencing was performed using the Illumina HiSeq X ten system (Annoroad Gene Tech. Co., Ltd). As described previously, qualified WES reads were aligned to hg19 human genome assembly (GRCh37) using BWA, and duplicates of all mapped reads were then marked and discarded using Picard.[330]^50 Transcriptome sequencing and mapping RNA was extracted from HCC tumor and para-tumor tissues, and the whole transcriptome sequencing (paired-end, 150 bp) was performed on the Illumina HiSeq X10 platform by Anoda Gene Technology (Beijing) Co. All high quality reads were aligned to GRCh37 with GENCODE gene annotations using STAR as described previously.[331]^44 The abundance of all genes was quantified using transcripts per kilobase per million mapped reads (TPM). Somatic mutation detection, CNV, TMB calculation and neoantigen prediction Somatic mutations of HCC tumor were identified with Mutect2 in GATK (version 4.1.0.0) using paired peritumoral tissue samples as the control. Somatic mutations that meet the following criteria are retained for downstream analysis: ≥20×depth in both tumor and peritumoral samples; variant allele frequencies (VAF) ≥5% in tumor samples; VAF≤1% in peritumoral samples. CNVs were identified using TitanCNA R/Bioconductor package (version 1.23.1),[332]^54 and regions of the genome significantly amplified or deleted were identified by Gistic2.0 in “bzhanglab/gistic2”. BAM format files and the data of TMB and neoantigens were analyzed according to our previously published protocols.[333]^44^,[334]^53 Random forest algorithm for ranking the importance of mutations Random forest algorithm was applied to the WES data to identify the most important mutations associated with the proteomic subtypes. In brief, the gene mutation dataset and subtype tags were applied as input. The caret R package was used to find the best hyperparameter for the random forest model. Immune microenvironment and immune infiltration analysis 28 immune cell types and their feature gene panels were downloaded from Charoentong et al.’s study.[335]^68 Immune cell infiltration of each cell type was estimated by ssGSEA of GSVA R package. Then, the enrichment score was scaled to represent the relative level of infiltration. The estimate R package was also used to assess the overall immune infiltration of every tumor sample. The anti-tumor immunity and pro-tumor immune suppression were calculated from the enrichment score of infiltration cell types executing anti-tumor immunity (activated CD4 T cell, activated CD8 T cell, central memory CD4 T cell, central memory CD8 T cell, effector memory CD4 T cell, effector memory CD8 T cell, type 1 T helper cell, type 17 T helper cell, activated dendritic cell, CD56 bright natural killer cell, natural killer cell and natural killer T cell) and cell types executing pro-tumor immune suppressive functions (regulatory T cell, type 2 T helper cell, CD56 dim natural killer cell, immature dendritic cell, macrophage, MDSC, neutrophil, plasmacytoid dendritic cell).[336]^69 Correlation analysis of multi-omics data Correlations of multi-omics data were calculated for each gene/protein across the tumor samples using Pearson’s correlation coefficient, including CNV alterations and RNA abundance, CNV alterations and protein abundance, CNV alterations and phosphopeptide abundance, RNA abundance and protein abundance, RNA abundance and phosphopeptide abundance, as well as protein abundance and phosphopeptide abundance. Protein attenuation was calculated by the difference between the Pearson coefficient of the transcript correlation (correlation between CNV and transcript measurements) and the Pearson coefficient of the protein correlation (correlation between copy-number and protein measurements). Then proteins were classified according to their attenuation effect using a Gaussian mixture model with 2 mixture Components.[337]^70 Differentially expressed proteins or phosphopeptides between tumor and paired non-tumor samples were filtered using p values of less than 0.01 and fold change greater than 2. Phosphosite-to-protein co-variation analysis As for all proteomic and phosphoproteomic data from 132 HCC tumor samples, Pearson correlations for each protein and corresponding phosphorylation sites were also calculated in the 3 proteomic subtypes, where the upper quartile of all differences were considered as significantly co-regulated among subtypes. In total, we obtained 609 proteins corresponding to 1,185 phosphorylation sites for further analysis. Functional enrichment and interactome analysis of these proteins were performed using Metascape.[338]^61 The top 5 MCODE complexes extracted from the interactome networks were selected and visualized by Gephi software. These significantly co-regulated proteins were classified into 3 classes by the k-means algorithm. Also, for each cluster, known or predicted upstream kinases of the phosphosites in the significantly co-regulated phosphosite-to-protein correlations were given based on PhosphoSitePlus (PSP)[339]^52 or NetworKIN 3.0 (NetworKIN Score >1).[340]^46 Kinase activity prediction To estimate the kinase activity for each patient, enrichment analysis was performed by the ssGSEA method using the GSVA R package.[341]^71 Kinase activity for each kinase was estimate using a kinase enrichment score, which was calculated based on the overall phosphorylation status of its all substrates.[342]^72^,[343]^73 To deduct protein levels and abundance in the para-tumor, we used the following formula to correct each phosphorylation site abundance: [MATH: yj=(Tj T)(Nj N) :MATH] “ [MATH: j :MATH] ” represented a single phosphorylation site; “ [MATH: yj :MATH] ” was the abundance of this phosphorylation site; “ [MATH: Tj :MATH] ” and “ [MATH: Nj :MATH] ” referred to the log2-transformed abundance of this phosphorylation site in tumor and non-tumor tissues, respectively; while “ [MATH: T :MATH] ” and “ [MATH: N :MATH] ” represented the log2-transformed abundance of the protein that corresponded to this phosphorylation site in tumor and non-tumor tissues. The known or predicted kinase-substrate relations, were also gained from PSP or NetworKIN 3.0. The gene symbol and Uniprot ID were corrected by the Uniprot Database ([344]https://www.uniprot.org/), and any ambiguous correspondence was excluded. In total, 239759 pairs of kinase-substrate relations were collected, which contained 381 kinases, 6906 protein substrates, and 48730 phosphorylation sites. Clinical available drugs and target mapping The FDA-approved drugs or candidate drugs and corresponding targets were downloaded from the DrugBank database (release version 5.1.7). Only targets with known pharmacological action as inhibitors, enzyme, carrier and transporter were selected, and 46 candidate targets were identified in this study. To get subtype-specific drug targets, we comprehensively evaluated their protein abundance, kinase activity, and correlations of kinase-substrate among the 3 proteomic subtypes. Immunofluorescence analysis The PDCs on confocal dishes were fixed with 4% paraformaldehyde for 15min, and then permeabilized with 0.5% Triton X-100 in PBS for 15 min at room temperature. After blocking (5% BSA in TBS containing 0.01% Triton X-100) for 1h, the PDCs were subsequently incubated overnight at 4°C with primary antibodies against AFP (1:300, proteintech), albumin (1:300, proteintech) and glypican-3 (1:300, SANTA) diluted in blocking buffer. After washing with TBS for 4 times, PDCs were incubated with secondary Alexa Fluor Plus 488 (1:1000, Invitrogen) and Alexa Fluor 546 (1:1000, Life Technologies). In addition, cell nucleus was stained with Hoechst 33342 (1:500, Beyotime). Following extensive washing, stained organoids were imaged in confocal (Zeiss LSM780, Gemany). Drug treatment and viability assays PDCs were treated with sorafenib in an eight-point serial dilution series from 0.25 nM to 32 μM. After 4 days of incubation, cell viability was analyzed using CCK8 assay (TransGen Biotech, Beijing, China). Viable cells were estimated using SpectraMax M5e (Molecular Device, USA). Control cells treated with dimethyl sulfoxide (DMSO) were used to calculate relative cell viability and to normalize the data. Dose-response curve fitting was performed using R (3.6.3) and was evaluated by measuring the area under curve (AUC). In brief, the plate was normalized to the mean value from the seven serial conditions compared with DMSO control. The AUC of curve was determined using R (3.6.3), ignoring regions defined by fewer than two peaks. Non-convergence or ambiguous curves were excluded in every analysis. Drug sensitivity prediction model To find sorafenib response associated proteomic features and build prediction models, the 26 PDC models were randomly split into two datasets. The 16 PDC models were used as the training dataset, and the remaining 10 PDC models were regarded as testing dataset. The Elastic net (EN) algorithm was used to build drug sensitivity prediction models.[345]^31^,[346]^74^,[347]^75 We firstly pre-selected the proteins in the training set based on their Pearson’s correlation coefficients with drug sensitivity, and then elastic net regression models were built using the 19 selected proteins. RMSE was used to select the optimal model using the smallest value. The final values used for the model were alpha = 0.770 and L1 ratio = 0.1. The model was used to predict the drug response of the 10 PDC models in the testing cohort. The Pearson’s correlation coefficients were calculated between the predicted drug sensitivity and examined ones to assess the prediction performance. Identification of sorafenib sensitivity-related pathways To explore the pathways that may affect sorafenib sensitivity, the ssGSEA algorithm was used to evaluate the alteration of different pathways between PDC samples. The correlation analysis of sorafenib sensitivity with pathways was performed using the Pearson test, and p values of less than 0.01 were considered as statistically significant. Quantification and statistical analysis Quantification and statistical analysis methods for single-omics and multi-omics analyses were mainly presented and cited in the respective [348]STAR Methods details. Standard statistical tests were used to analyze the clinical data, including but not limited to Wilcoxon test, Chi-square test, Fisher’s exact test, Log rank test. For categorical variables versus continuous variables, Wilcoxon test was used to test if any of the differences between the subtypes were statistically significant; for categorical variables versus categorical variables, Chi-square test and Fisher’s exact test were used; and for continuous variables versus continuous variables, Pearson correlation was used. All statistical tests were two-sided, and p value <0.05 statistical was considered statistically significant. Kaplan-Meier plots (Log rank test) were used to describe OS and RFS. Univariate Cox proportional hazards regression models were used to identify the variables associated with OS and RFS. Significant factors in the univariate analysis were further subjected to multivariate Cox regression analysis using a forward LR approach. All the analyses of clinical data were performed in R (version 3.6.3). Acknowledgments