Highlights
• Subtype-specific features across a pan-RCC cohort revealed by multi-omics data resources
• Diverse cell-of-origin predictions and tumor signatures revealed by snRNA-seq
• Proteogenomic, metabolic, glycoproteomic signatures of high-wGII tumors
• Biomarkers GPNMB, ADGRF5, MAPRE3 for chRCC/RO and PIGR, SOSTDC1 for pRCC

Li et al. perform comprehensive multi-omics characterization of a broad range of renal cell carcinomas, improving our understanding of the biology, proteogenomics, post-translational modifications, and metabolism of kidney cancer. These findings identify important biomarkers IGF2BP3, PYCR1, GPNMB, ADGRF5, MAPRE3, PIGR, and SOSTDC1 that illuminate molecular differences within renal cell carcinoma.

Introduction

The World Health Organization (WHO) 2022 classification lists 20 different renal cell carcinoma (RCC) subtypes, of which 7 are defined by specific molecular aberrations.[128]^1 Non-clear cell RCC accounts for ∼20% of RCCs and encompasses a variety of rare subtypes largely defined by histopathologic features,[129]^1^,[130]^2^,[131]^3 collectively referred to here as non-ccRCCs (ccRCC denotes clear cell RCC). Among non-ccRCC tumors, papillary RCC (pRCC) (10%–15%) and chromophobe RCC (chRCC) (3%–5%) are relatively common, while the other subtypes are much rarer. 
Several rare renal tumors with benign clinical courses show morphological overlap with malignant counterparts.[132]^4^,[133]^5 We have previously discovered several biomarkers to aid in differential diagnosis of many RCC subtypes[134]^6^,[135]^7^,[136]^8^,[137]^9; however, diagnosis from limited biopsy samples remains challenging.[138]^10^,[139]^11 In addition, identifying biomarkers for patients at high risk of disease relapse within each RCC subtype, who would benefit from increased surveillance and adjuvant therapy, is another unmet clinical need.[140]^12 Our recent ccRCC study[141]^13 nominated biomarkers associated with features of worse prognosis, such as genome instability (GI). Similar markers for non-ccRCC tumors remain to be identified. Likewise, while immune checkpoint and angiogenesis inhibitors are treatment options in metastatic ccRCC,[142]^14^,[143]^15^,[144]^16 immune infiltration and tumor vascularity vary widely among non-ccRCC tumors, necessitating further evaluation of markers of responsiveness. To address the knowledge gaps in non-ccRCC differential diagnoses, prognoses, and therapeutic avenues, as part of the Clinical Proteomic Tumor Analysis Consortium (CPTAC), we performed integrated proteogenomic multi-omic analysis of non-ccRCC and ccRCC tumors. Apart from a few studies detailing genomics,[145]^17^,[146]^18^,[147]^19 transcriptomics,[148]^17^,[149]^19 and proteomics[150]^20^,[151]^21^,[152]^22^,[153]^23 of rare RCCs, multi-omic profiling is largely unavailable. Here, we report multi-omic analysis of 48 non-ccRCC cases (non-ccRCC cohort) along with the reported ccRCC (n = 103) discovery cohort samples.[154]^24 This integrative pan-RCC analysis identified shared and subtype-specific proteogenomic, glycoproteomic, and metabolic features across RCC subtypes, nominated various diagnostic biomarkers, and provided validation for selected candidates. 
Single-nucleus RNA sequencing (snRNA-seq) analysis captured transcriptomic heterogeneity of tumor subclusters and helped predict cell-of-origin. Combined, the data from this study provide a rich resource for identifying diagnostic biomarkers, disease mechanisms, and potentially new therapeutic targets for non-ccRCC subtypes.

Results

Specimens and multi-omics data types

We performed multi-omics data analysis of 48 non-ccRCC and 103 ccRCC tumors and 101 normal adjacent tissues (NATs) (22 and 79 from non-ccRCC and ccRCC patients, respectively) ([155]Figure S1A; [156]Table S1). Multi-omics data available from common sample aliquots[157]^25 include whole-genome sequencing, whole-exome sequencing, DNA methylation profiling, and RNA-seq for all 151 tumor samples, and RNA-seq data for 89 NATs (ccRCC n = 71; non-ccRCC n = 18) ([158]Figure S1A). snRNA-seq data were generated for eight non-ccRCC tumors. Histopathological subtyping information and signature molecular aberrations such as copy number variation patterns, somatic/germline mutations, marker gene expression, and gene fusions were collectively assessed to arrive at tumor-molecular annotation ([159]Figure 1A; [160]Table S1).[161]^26 Based on the WHO 2018 renal tumor histological classification (available at data freeze), the analysis cohort comprises 103 ccRCC, 15 renal oncocytomas (ROs), 13 pRCC (8 of them with type 1 features; WHO 2018), 3 chRCC, 2 angiomyolipoma (AML), 2 eosinophilic solid and cystic RCC (ESCRCC), 1 Birt-Hogg-Dube syndrome-associated renal cell carcinoma, 1 mixed epithelial and stromal tumor of the kidney, 1 MTOR mutated RCC, 1 translocation RCC (TRCC), and 8 tumors whose genomic aberration patterns did not concur with histological classification, annotated as molecularly divergent to histology (MDTH) ([162]Figure 1A; [163]Table S1).

Figure 1. 
Proteogenomic biomarkers of copy number-based genome instability in renal cell carcinoma (A) Proteogenomic aberration landscape of ccRCC and non-ccRCC. Top panel: histo-molecular annotations condensed as tracks (∗excluded sample). RNA and protein automatic relevance determination in non-negative matrix factorization (ARD-NMF) classification. Middle panel: non-ccRCC display distinct recurrent events. Bottom panel: heatmaps show the top 10 differentially expressed genes and proteins enriched in annotated biological processes. Top 20 protein and RNA features (log2 fold change) from selected pathways. (B) Differentially enriched pathways (RNA and protein) among the various RCC subtypes. (C) Predicted immune composition for ccRCC and non-ccRCC. (D) Heatmap of absolute copy number variation (CNV) deduced from CNVEX output for non-ccRCC (top) and ccRCC (bottom) sorted by ploidy. Ploidy, RCC subtype, wGII annotations tracks provided (left). (E) Distribution of BAP1 mutation, wGII, immune subtype, tumor classes, and NMF clustering in five methylation subgroups. Significant enrichment (p < 0.01) of BAP1 mutation, high wGII, myeloid-lymphoid high immune subtype, and NMF cluster1 hyper-methylated group. (F) Subtype composition among low- and high-wGII tumors, in TCGA (left) and CPTAC (right) non-ccRCC (upper), and ccRCC (lower) cohorts. Bold black borders, high-wGII samples. (G) Comparison of significance levels (signed –log10 p value) between protein (x axis) and mRNA (y axis) under high to low-wGII comparison within a subset of non-ccRCC samples. Significantly upregulated genes are labeled and colored. The inset shows the global correlation between the changes. (H) Overlap between TCGA and CPTAC high-wGII mRNA expression gene markers in non-ccRCC (left) and ccRCC (right). 
The sample cohorts have comparable demographic and clinical composition except for a higher proportion of female patients in non-ccRCC compared with ccRCC cohorts (p = 0.036) ([166]Figure S1B). The multi-omics dataset can be queried using the interactive ProTrack website [167]http://ccrcc-conf.cptac-data-view.org ([168]Figure S1B).[169]^27 We identified a total of 12,299 proteins, 9,396 phosphorylated proteins, and 1,035 glycoproteins, of which 9,528 proteins, 6,465 phosphorylated proteins, and 639 glycoproteins were quantified in more than half of all samples. Principal-component analysis (PCA) of global proteome, phosphoproteome, and glycoproteome data showed clear separation between different tumor subtypes and normal samples in two-dimensional space ([170]Figure S1C). Known mutation consequences on kinase protein expression and phosphorylation are consistent with previous findings[171]^18^,[172]^28 ([173]Figures S1D, S3B, and S3C). Subtype-specific proteogenomic signatures Different subtypes of non-ccRCC tumors displayed recurrent genomic aberrations distinct from ccRCC ([174]Figure 1A). Notable non-ccRCC subtype-specific events include: the signature chromosomal losses and TP53 mutations in chRCC (3/3 cases), chr7/17 gain (8/13 cases), and MET mutations (3/13 cases) in pRCC, TSC gene mutations in ESCRCC (2/2 cases) and AML tumors (2/2 cases), and the TFE3 gene fusion in a TRCC case. Consistent with previous classification of RO molecular subtypes,[175]^18 RO type 1 was enriched with CCND1 gene rearrangement with a diploid genome, and type 2 was marked by one copy loss of chromosome 1 (chr1), and RO cases that did not show either of these molecular features were categorized as the “RO variant” subgroup.[176]^18 Gene set enrichment analysis (GSEA) revealed several interesting pathway similarities and differences among RCC subtypes ([177]Figure 1B; [178]Tables S2 and[179]S3). 
For instance, immune/inflammatory response concepts, including allograft rejection, inflammatory response, and interferon alpha/gamma pathways, were significantly upregulated, especially at the protein level, in both pRCC and ccRCC. In contrast, glycolysis, hypoxia, and epithelial-to-mesenchymal transition (EMT) were significantly enriched in the ccRCC proteome but showed a negative enrichment trend in pRCC and RO. Interestingly, oxidative phosphorylation showed significant positive enrichment in RO but was negatively enriched in pRCC and ccRCC, as expected ([180]Figures 1A and 1B).[181]^18^,[182]^22^,[183]^24^,[184]^29 Next, the status of tumor-immune infiltration was assessed through immune deconvolution followed by clustering analysis ([185]STAR Methods; [186]Figure 1A). In addition to four previously described ccRCC clusters,[187]^24 three non-ccRCC clusters were identified: one myeloid-lymphoid high non-ccRCC cluster, a myeloid-high cluster containing most pRCC samples, and an immune-absent cluster comprising all oncocytic tumors ([188]Figure 1A). Overall, the extent of immune infiltration was lower in non-ccRCC than in ccRCC ([189]Figure 1C). Cases with a high weighted genome instability index (wGII) were observed among non-ccRCC (∼37%) and ccRCC (∼23%) in the CPTAC cohort, and among non-ccRCC (∼20%) and ccRCC (21%) in the TCGA cohort ([190]Figures 1D–1F). Interestingly, myeloid-lymphoid high non-ccRCC tumors showed high immune infiltration and high wGII ([191]Figures 1A and [192]S1E).

Proteogenomics of high-wGII samples

Integrative analysis of RNA, protein, and phosphorylation site level expression data performed using automatic relevance determination non-negative matrix factorization (ARD-NMF)[193]^30 defined six multi-omics clusters. Among these, most pRCC tumors clustered in ARD-NMF-0 and oncocytic tumors (RO, chRCC) in ARD-NMF-3; ccRCC samples were distributed across ARD-NMF-1 and ARD-NMF-5, while the NATs populated ARD-NMF-2 and ARD-NMF-4 ([194]Figure 1A; [195]Table S2). 
The smaller ARD-NMF-1 is associated with DNA hypermethylation Methyl1 group, higher-grade ccRCC,[196]^13 and worse prognosis, while the larger ARD-NMF-5 ccRCC cluster is enriched (p < 0.05, chi-square test) in low-grade ccRCC tumors. Next, as DNA hypermethylation subgroups have been associated with worse survival,[197]^17^,[198]^19 we performed consensus clustering with DNA methylation data and identified five different methylation clusters. Methyl3 and Methyl5 were largely subtype specific and contained ccRCC and all oncocytic tumors, respectively. Interestingly, Methyl1 was enriched with ccRCC samples with high wGII, BAP1 mutants, and a subset of non-ccRCC samples with high wGII and high ploidy mostly from the MDTH category ([199]Figures 1E and [200]S1E). We next compared the mRNA and protein differential expression (DE) between high-wGII (n = 9) versus low-wGII (n = 15) non-ccRCC samples including pRCCs, TRCC, ESCRCC, MTOR mutated, and MDTH ([201]Figure 1G; [202]Table S4). A collection of prominent wGII markers was concordantly identified including the mitochondrial proline biosynthetic pathway enzyme PYCR1, associated with cancer cell survival, invasion, and progression across multiple cancer types.[203]^31^,[204]^32^,[205]^33 PYCR1 was confirmed to be upregulated in high-wGII samples based on RNA in situ hybridization (RNA-ISH) ([206]Figure S1F). In addition, the RNA binding protein and N6-methyladenosine reader IGF2BP3 showed a significantly higher mRNA expression in non-ccRCC high-wGII samples ([207]Figure 1G), and upregulation trend noted in protein expression was validated by immunohistochemistry (IHC) ([208]Figure S1G). Importantly, high IGF2BP3 RNA expression was noted in high-wGII samples across CPTAC and TCGA datasets ([209]Figure 1H). IGF2BP3 has been associated with worse survival in several cancer types,[210]^34 but has not been previously associated with high wGII. 
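The high- versus low-wGII comparison above depends on a per-sample instability score. As a minimal sketch, assuming the commonly used definition of wGII (for each chromosome, the fraction of its length whose copy number differs from the rounded sample ploidy, averaged across chromosomes), the score could be computed as below. The segment tuple layout and the toy values are illustrative assumptions, not taken from this study, and the cutoff used here to call a sample "high wGII" is not reproduced; the STAR Methods remain the reference for the actual procedure.

```python
from collections import defaultdict

def wgii(segments, ploidy):
    """Weighted genome instability index (sketch, assumed definition).

    segments: iterable of (chrom, start, end, copy_number) tuples.
    ploidy:   estimated sample ploidy; rounded to the nearest integer here.
    Each chromosome contributes equally regardless of its length, which is
    what distinguishes wGII from a plain genome-wide aberrant fraction.
    Only chromosomes present in `segments` are averaged.
    """
    baseline = round(ploidy)
    total = defaultdict(int)      # covered length per chromosome
    aberrant = defaultdict(int)   # length with copy number != baseline
    for chrom, start, end, copy_number in segments:
        length = end - start
        total[chrom] += length
        if round(copy_number) != baseline:
            aberrant[chrom] += length
    fractions = [aberrant[c] / total[c] for c in total if total[c] > 0]
    return sum(fractions) / len(fractions) if fractions else 0.0

# Toy example: half of a short chromosome is lost, a long one is intact.
toy_segments = [
    ("chr1", 0, 100, 2), ("chr1", 100, 200, 1),
    ("chr2", 0, 400, 2),
]
score = wgii(toy_segments, ploidy=2.0)
print(score)
```

In this toy case the per-chromosome fractions are 0.5 and 0.0, so the chromosome-averaged score differs from the plain aberrant fraction of the genome (100 of 600 units), which is the point of the weighting.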
In general, minimal overlap of high-wGII-associated differentially expressed genes and proteins was noted between ccRCC and non-ccRCC, but Hallmark pathways associated with high-wGII cases included cell-cycle/proliferation concepts (e.g., enrichment of E2F targets, G2-M checkpoint), as well as immune- and inflammation-related concepts, EMT, hypoxia, and glycolysis ([211]Figure S1H). Notably, the MDTH non-ccRCC tumors, which are largely genome unstable (6/7), tend to be both hypermethylated (4/7) and immune infiltrated (6/7).

Non-ccRCC snRNA-seq reveals intra-tumor transcriptomic heterogeneity and low immune infiltration

To investigate cellular-level associations in non-ccRCCs, snRNA-seq data for 8 samples (9,673 single-nuclei transcriptomes [median 10,592 nuclei/sample]) were analyzed along with 3 ccRCC samples from our companion study[212]^13 ([213]Table S5). Dimensionality reduction analysis post downsampling (2,000 nuclei/sample) showed distinct immune, endothelial, and stromal cell clusters irrespective of the patient of origin ([214]Figures 2A and [215]S2A), while the tumor epithelia formed patient-specific clusters ([216]Figure 2B). Most non-ccRCC samples had higher tumor cell fractions, implying higher tumor content and lower immune infiltration[217]^17 compared with ccRCC ([218]Figures S2A–S2C), except for the two high immune fraction AML cases. ROs and chRCC were closer in space, while pRCC, ESCRCC, and TRCC were more distinctly positioned.

Figure 2. Tumor transcriptomic heterogeneity, immune infiltration status, and tumor cell-of-origin by snRNA-seq
(A) UMAP of snRNA-seq data from eight non-ccRCC tumors. Nuclei are colored by RCC subtypes for tumor cells (left) and cell types (right). (B) First three principal components of six tumors (AML excluded) colored by tumor types. (C) Probabilities of cell-of-origin are predicted by a random forest classifier for different tumor subclusters for RCC subtypes. 
Classifier was trained on Lake et al.[221]^35 benign renal epithelial cell snRNA-seq data. (D) Averaged abundance of DE protein (top) and mRNA (bottom) markers from each RCC subtype versus NATs among the epithelial cell types identified from normal kidney scRNA-seq data.[222]^36 Multiple tumor subclusters in AML samples revealing intra-tumor transcriptomic heterogeneity were associated with different constituent tumor cell types, presumably a result of trans-differentiation from a common cell of origin[223]^37 ([224]Figure S2D). AML tumor compartment comprises an admixture of cells that are histologically and molecularly similar to vascular (angio-), smooth muscle (myo-), and fat (lipo-) lineages.[225]^38 Finally, among the ROs, RO type 1 showed multiple tumor subclusters, including one entirely associated with the S-phase of the cell cycle, indicating higher tumor proliferation rates ([226]Figure S2E). Among other tumors analyzed, all tumor clusters from a given sample showed corresponding mRNA expression changes associated with clonal copy number events such as chr7 and 17 gains (pRCC) and chr1 loss (type 2 RO) ([227]Figure S2F). Tumor single-cell transcriptome data have been employed to predict cell-of-origin of tumors, using a random forest model trained on benign nephronal epithelial cell types.[228]^36 We performed similar analysis using snRNA-seq datasets from various RCC subtypes and the publicly available benign human kidney samples[229]^35 ([230]Figure 2C; [231]STAR Methods). Interestingly, TRCC, ESCRCC, and pRCC showed highest origin-probability to the proximal tubule 2 (PT2) population, a rare cell type that is equivalent to the PT-B population (designated from single-cell RNA-seq data) that we previously demonstrated to contain stem-like marker gene expression[232]^36 ([233]Figure 2D). In contrast, the ROs and chRCC consistently showed highest probability to the intercalated-A (IC-A) population, suggesting a distal nephron origin ([234]Figure 2D). 
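To make the cell-of-origin readout concrete, the sketch below shows one plausible way to summarize classifier output per tumor subcluster. It assumes a classifier trained on benign kidney cell types has already produced a probability vector for each tumor nucleus, and it only averages those vectors within each subcluster and reports the most probable benign cell type. The function name, the input layout, and all numbers are hypothetical; the random forest training itself is not shown and the study's exact aggregation may differ.

```python
from collections import defaultdict

def subcluster_origin(nuclei):
    """Average per-nucleus class probabilities within each tumor subcluster.

    nuclei: list of (subcluster_id, {benign_cell_type: probability}) pairs,
            one entry per nucleus (hypothetical input layout).
    Returns {subcluster_id: (top_cell_type, {cell_type: mean_probability})}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for cluster, probs in nuclei:
        counts[cluster] += 1
        for cell_type, p in probs.items():
            sums[cluster][cell_type] += p
    result = {}
    for cluster, totals in sums.items():
        means = {ct: s / counts[cluster] for ct, s in totals.items()}
        top = max(means, key=means.get)  # cell type with highest mean probability
        result[cluster] = (top, means)
    return result

# Toy probabilities for three nuclei in two hypothetical subclusters.
toy_nuclei = [
    ("RO_sub1", {"IC-A": 0.7, "PT2": 0.3}),
    ("RO_sub1", {"IC-A": 0.9, "PT2": 0.1}),
    ("pRCC_sub1", {"IC-A": 0.2, "PT2": 0.8}),
]
origin = subcluster_origin(toy_nuclei)
for cluster, (top, means) in origin.items():
    print(cluster, top, means)
```

Averaging probabilities rather than taking a majority vote over hard labels keeps the uncertainty of each nucleus in the summary, which matters when a subcluster sits between two benign cell types.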
Among the AML tumor compartments, we noted similarities of tumor subclusters to mesenchymal vSMCs and endothelial cells, as expected ([235]Figure 2C). Similar results were obtained when bulk tumor RNA-seq data were analyzed with single-cell data from benign kidney ([236]Figure S2G). Finally, to bridge the single-cell and snRNA-seq-based predictions, we demonstrated that the PT2 and cluster 29 populations of PT cells published by Lake et al.[237]^35 were equivalent to the previously identified PT_B and PT_C rare stem-like populations, and we nominated PT2/PT_B cells as the cell-of-origin for several RCC subtypes ([238]Figure S2H). Our analysis supports IC-A as the putative cell-of-origin for oncocytomas. Furthermore, we identified the top 100 proteogenomic DE markers in each RCC subtype ([239]Figure S2I; [240]Table S3) and noted their distinct enrichment among benign nephron cell types ([241]Figure 2D), for example MAPRE3 in RO and PIGR in pRCC, which are described in later sections.

Phosphoproteomic signatures of RCC subtypes and GI tumors

Phosphoproteomics can reveal potentially targetable kinase signaling pathways in tumors. First, as expected, we observed enrichment of vascular endothelial growth factor receptor FLT1 in ccRCC,[242]^24 receptor tyrosine kinases MET and KIT (CD117) in pRCC and chRCC/RO, respectively, and serine/threonine kinase MYLK in AML ([243]Figures 3A and 3B; [244]Table S6). In addition, we discovered higher expression of CDK18, NEK6, and PNCK in ccRCC, and BAZ1B and TNIK in pRCC type 1. While LATS1, PRKCD, PRKAG2, and STK39 were common between RO and chRCC, DAPK2, MAPK13, MAP3K1, SYK, DDR1, EIF2AK4, PAK4, and PTK2B were specific to chRCC. Therapeutic inhibition of many of these kinases is currently being evaluated in clinical and preclinical settings ([245]Figure 3A).

Figure 3. Phosphoproteomic changes in non-ccRCC and genome-unstable tumors
(A) DE kinases across major subtypes. 
Colors represent protein abundance fold change between tumor subtype and NATs. Highlighted kinases are significantly differentially expressed in certain tumor subtypes (adjusted p < 0.01, abs(log2 fc) > 1). CD8+, CD8 positive; CD8–, CD8 negative; MID, metabolic immune-desert; VEGF, VEGF immune-desert. Drug discovery stages (for kinases) from the drug repurposing hub[248]^39 indicated. (B) Subtype-specific upregulated kinases. Top to bottom: FLT1 in ccRCC, MET in pRCC type 1, KIT in oncocytic tumors, and MYLK in AML. (C) Pathways enriched among the differentially regulated phosphorylation sites across subtypes. Black borders, pathways with FDR < 20%. (D) Kinases that are enriched with down- or upregulated phosphorylation in high compared with low-wGII non-ccRCC. Kinases with enrichment p ≤ 0.05 are labeled. (E) Significantly co-regulated kinase-substrate pairs in high-wGII tumors (FDR < 0.05, abs(log2fc of kinase) > 0.05, abs(log2fc of substrates) > 0.5). Diamonds and circles represent kinases and substrate proteins, respectively, and arrows point from former to latter. Diamonds filled with color represent protein abundance log2 fold change between high- and low-wGII non-ccRCC. Border color around circles represents average phosphorylation intensity log2 fold changes between high- and low-wGII non-ccRCC. Size of nodes and thickness of colored arrows are proportional to the number of significant phosphorylation events between kinases and substrate proteins. (F) Protein 3D structure of CDK2. Highlighted residues are significantly upregulated phosphorylation clusters identified by CLUMPS-PTM.

Next, to assess phosphorylation changes in subtype-enriched kinases, we compared phospho data and highlighted selected phosphosite changes across the RCC subtypes ([249]Figure S3A; [250]Table S6). For example, phosphorylation of S645 and T507 in the protein kinase C, delta type (PRKCD) is significantly elevated in RO and chRCC compared with pRCCs and ccRCCs ([251]Figure S3A). 
PRKCD phosphorylation, necessary to prime protein kinase maturation,[252]^40^,[253]^41 is associated with PLC-PKC signaling in leptin stimulation,[254]^42 which is significantly enriched in RO tumors ([255]Figures 3C and [256]S3A). Leptin regulates the PI3K-AKT pathway, signaling through the JAK-STAT axis[257]^43 ([258]Figures 3C and [259]S3A). On the other hand, the IL-2 pathway was uniquely upregulated in RO type 2, with high phosphorylation intensity noted in BAD and STAT1 ([260]Figure S3A). Other immune-related pathways including IL-33, TSLP, and T and B cell receptors were generally highly phosphorylated in different immune subtypes of ccRCC and in pRCC, consistent with the snRNA-seq-based observation of higher immune content in ccRCC and pRCC ([261]Figure S2B). We also explored phosphorylation changes associated with GI, comparing kinase-substrate co-regulation in high- versus low-wGII non-ccRCC samples ([262]STAR Methods; FDR < 0.05, abs(kinase log2 fc) > 0.05, abs(substrate site log2 fc) > 0.5). Remarkably, cyclin-dependent kinases (CDK1, CDK2) were the most enriched in wGII-high samples ([263]Figures 3D, 3E, and [264]S3D). CDK1 and CDK2 are critical regulators of multiple steps in the cell cycle and DNA synthesis,[265]^44 and are thus closely linked to genomic stability.[266]^45^,[267]^46 Significantly upregulated CDK1 substrates include E2F targets such as RRM2, MCM4, DUT, RFC1, PAICS, NASP, and HMGA1, which regulate DNA replication and chromosome stability[268]^47 ([269]Figure 3E). Phosphorylation of RB1 T356 by CDK2 (and CDK4/6)[270]^48^,[271]^49 promotes E2F activity[272]^50 as well as apoptosis in response to replication stress and DNA damage.[273]^51 Interestingly, CDK2 can also be phosphorylated at Y15 by LYN kinase.[274]^52 CLUMPS-PTM analysis, used to identify phosphorylation clusters in protein 3D structure,[275]^53 revealed three phosphorylation sites in CDK2 (T14, Y15, and T160) forming a phosphorylation hotspot ([276]Figure 3F). 
Phosphorylation of Y15 and of T160 has opposing effects on CDK2 function: Y15 phosphorylation is inhibitory and T160 phosphorylation is activating. Both events were previously noted together in ovarian high-grade serous cancer.[277]^54 Furthermore, increased phosphorylation of CDK2 Y15 is associated with cell-cycle exit in response to replication stress,[278]^55^,[279]^56 altogether supporting mechanistic links with genomic instability.[280]^46^,[281]^56

RCC glycoproteome reflects tumor immune infiltration and angiogenesis

Protein glycosylation is linked with cancer development and progression,[282]^57^,[283]^58 as well as the tumor microenvironment (TME).[284]^59 To explore RCC glycobiology and its implications for TMEs, we analyzed two different glycoproteomics[285]^60 datasets generated independently for this cohort. First, 41 non-ccRCC and 19 NAT samples were enriched for N-glycopeptides[286]^61 and analyzed by the MSFragger-Glyco search pipeline[287]^62^,[288]^63 ([289]STAR Methods). Second, samples subjected to phosphopeptide enrichment via immobilized metal affinity chromatography, which co-enriches a substantial number of glycopeptides, particularly sialoglycopeptides,[290]^64 were analyzed similarly. Our N-linked glycoproteomics pipeline identified 12,503 intact glycopeptides (IGPs) with glycans (glycoforms) from 1,035 glycoproteins in glyco-enriched samples and 29,850 glycoforms from 1,591 glycoproteins in the phospho-enriched samples, with an overlap of 521 glycoproteins ([291]Figure 4A; [292]Table S7).

Figure 4. RCC glycoproteome reflects tumor immune infiltration and angiogenesis
(A) Glycoprotein overlap between glyco searches on glyco-enriched samples (glyco enrichment) and phospho-enriched samples (phospho enrichment). (B) Distribution of various glycoforms found in the glyco-enriched samples. (C) Distribution of differentially expressed glycoforms. 
(D) DE glycoproteins (left) and proteins (right) in glyco-enriched samples and their cell type annotation, delineated by cell-type-specific expression from previous scRNA-seq data.[295]^36 (E) Cell-type enrichment analysis for glycoprotein markers in oncocytoma (left) and pRCC (right) in glyco-enriched samples. (F) DE cell-type-specific glycoprotein markers in glyco-enriched samples. Asterisks indicate significant marker expression (adjusted q value < 0.05). (G) Selected glycoprotein marker expression was validated using data from the Human Protein Atlas. Scale bars, 50 μm. (H) FUT8 protein expression across different RCC subtypes and NATs. (I) FUT8 RNA expression among different cell types identified in type 1 pRCC (C3N-00439) snRNA-seq data. (J) Expression of putative FUT8 glycoprotein targets in pRCC by GSEA. (K) DE glycoproteins (unnormalized data) between high- and low-wGII non-ccRCC.

Based on glycan monosaccharide composition, IGPs were classified into five categories: oligomannose, sialylated, fucosylated, fuco-sialylated, and neutral moieties.[296]^65 In glyco-enriched samples, the IGPs were mainly attached to oligomannose glycans, followed by sialylated glycans ([297]Figure 4B), while in phospho-enriched samples IGPs were largely sialylated ([298]Figure S4A), as expected.[299]^64 Due to sample number constraints, we focused on RO, pRCC, and ccRCC samples. In the glyco-enriched dataset, glycopeptides attached with oligomannose glycans accounted for a large number of the tumor versus normal DE events in both RO and pRCC ([300]Figure 4C). Differential glycosylation abundance positively correlated with corresponding protein abundance changes, but discordant events were also noted ([301]Figures S4B and S4C). Integration with kidney scRNA-seq data[302]^36 ([303]STAR Methods) revealed that a significant fraction of the dysregulated glycoproteins was contributed by the TME. 
For example, we observed more upregulated immune compartment changes in pRCC (∼30%) compared with RO (∼5%) ([304]Figures 4D and [305]S4D). Interestingly, only RO samples showed higher fractions of upregulated markers of intercalated cells, a cell type we propose as cell-of-origin for RO. These observations are also consistent with GSEA ([306]Figure 4E). Similar trends of immune infiltration were seen in the phospho-enriched dataset as well for the RO and pRCC samples ([307]Figure S4E). In addition, significant differences between ccRCC immune subtypes were also observed ([308]Figure S4E). As differential glycosylation of key targets has been associated with altered immune and endothelial cell functions,[309]^59 we looked at selected glycoprotein markers in TME cell types ([310]Figures 4F and [311]S4F). Specifically, RO showed upregulation of IGPs of known marker PLCG2,[312]^66 as well as ADGRF5 from epithelial/tumor, VWF, POSTN, and STAT5 from endothelial, and CTSD from the immune compartments. On the other hand, pRCC showed upregulation of TFPI2, FSTL1, FAS, and PIGR in the epithelial/tumor, C1QTNF3 and GRN in the endothelial, and ITGAX, HLA-DQA1, IL4I1, and CTSC in the immune compartments. We also observed differential glycosylations not specific to any cell type, for example involving the cancer stem cell marker CD44[313]^67 in ccRCC and pRCC ([314]Figure 4F). Protein expression of selected markers in different cell types was corroborated by Human Protein Atlas IHC data[315]^68 ([316]Figure 4G). Next, evaluating the expression of glycosylation enzymes associated with glycosylation alterations, we noted high levels of glycotransferases (e.g., MGAT1, FUT11) and low levels of glycohydrolases (e.g., GLB1, FUCA1, FUCA2, HEXA, HEXB) in ccRCC versus NATs and other RCC subtypes, at both RNA and protein levels[317]^69 ([318]Figure S4G). 
Meanwhile, RO showed upregulated expression of MAN2A1 and ST3GAL1, while pRCC showed higher expression of glycotransferase FUT8[319]^70^,[320]^71 ([321]Figures 4H, 4I, S2B, and S4H). Consistent with FUT8 overexpression, N-glycoproteomics profiling data showed upregulated glycosylation of its putative targets[322]^71^,[323]^72 including CTSC, FSTL1, and LGALS3BP[324]^20^,[325]^73^,[326]^74 ([327]Figures 4J and [328]S4I), and MET ([329]Figure S4J), the oncogenic driver of pRCC type 1 classification[330]^75 and a crucial regulator of EMT.[331]^76 c-MET (encoded by MET) activity is regulated by N-glycosylation.[332]^77^,[333]^78^,[334]^79^,[335]^80 Upregulation of c-MET glycosylation in pRCC type 1 samples was further localized to MET_N785, recently reported to be largely core-fucosylated[336]^81 ([337]Figure S4K). Interestingly, L1CAM, which is a FUT8 target and mediator of cancer progression in melanoma,[338]^71 showed downregulation in this case, suggesting an alternate mechanism in RCCs ([339]Figures 4F and [340]S4F). Finally, comparing the glycosylation patterns in high- versus low-wGII tumors, immune marker glycoproteins such as GZMA (cytotoxic T cells), FCGR1A, PTPRC (lymphocyte), and CD163 (macrophage), endothelial glycoproteins such as POSTN, ITRIP, ANO6, CD74, CD14, and STAB, stromal markers such as FBN, FBLN2, ITGA5, and COL1A1, and other markers MERTK and FH were enriched in high-wGII tumors ([341]Figure 4K). This supports increased TME cell involvement in high-wGII samples. GZMA is proposed to promote colorectal cancer development.[342]^82 MERTK is a receptor tyrosine kinase aberrantly expressed in several malignancies and represents a novel target for cancer therapeutics.[343]^83 GSEA also revealed that glycosylation upregulated in high-wGII samples is involved in EMT hallmark ([344]Table S7), similar to our observation in global proteomics and transcriptomics data ([345]Figure S1H). 
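The five glycoform categories used throughout this section (oligomannose, sialylated, fucosylated, fuco-sialylated, and neutral) are assigned from glycan monosaccharide composition. The exact rules are not spelled out in the text, so the sketch below is only an illustration under assumed rules: any sialic acid (NeuAc) marks a glycan as sialylated, any fucose (Fuc) as fucosylated, both together as fuco-sialylated, a composition of exactly two HexNAc with at least five hexoses and nothing else as oligomannose, and everything remaining as neutral. The dictionary keys and the function name are hypothetical.

```python
def classify_glycoform(composition):
    """Assign a glycan composition to one of five categories (assumed rules).

    composition: dict of monosaccharide counts, e.g.
                 {"HexNAc": 2, "Hex": 5, "Fuc": 0, "NeuAc": 0}.
    Missing keys are treated as zero.
    """
    hexnac = composition.get("HexNAc", 0)
    hexose = composition.get("Hex", 0)
    fucose = composition.get("Fuc", 0)
    sialic = composition.get("NeuAc", 0)

    if fucose > 0 and sialic > 0:
        return "fuco-sialylated"
    if sialic > 0:
        return "sialylated"
    if fucose > 0:
        return "fucosylated"
    # Assumption: oligomannose means the chitobiose core plus >= 5 hexoses.
    if hexnac == 2 and hexose >= 5:
        return "oligomannose"
    return "neutral"

# Toy compositions, one per category.
examples = [
    {"HexNAc": 2, "Hex": 5},
    {"HexNAc": 4, "Hex": 5, "NeuAc": 2},
    {"HexNAc": 4, "Hex": 5, "Fuc": 1},
    {"HexNAc": 4, "Hex": 5, "Fuc": 1, "NeuAc": 1},
    {"HexNAc": 4, "Hex": 5},
]
labels = [classify_glycoform(c) for c in examples]
print(labels)
```

Checking the combined fucose-plus-sialic-acid case first keeps the categories mutually exclusive, so each intact glycopeptide is counted once in distributions such as those in Figure 4B.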
RCC subtypes metabolome delineates tumor growth dynamics RCCs are known to exhibit a wide array of mutation-driven metabolic defects.[346]^84 ccRCCs displaying increased glycolysis and decreased oxidative phosphorylation (Warburg effect) have been associated with high grade, high stage, and low survival.[347]^85 To explore tumorigenic metabolic reprogramming[348]^86 in non-ccRCCs, we profiled 253 metabolites across 28 non-ccRCC tumors and 7 NATs ([349]Figure S1A; [350]Table S8). The quantified metabolites include organic acids and derivatives (68), nucleosides, nucleotides, and analogs (48), organic oxygen compounds (42), and other intermediates of major metabolic pathways such as organoheterocyclic compounds, lipids, and benzenoids ([351]Figures 5A and [352]S5A). Differential metabolomic characteristics across RCC subtypes and AML samples were resolved by PCA ([353]Figure 5B), including 65, 136, and 97 differential compounds significantly enriched in pRCC type 1, AML, and ROs, respectively, compared with NATs (≥ 2-fold change and q ≤ 0.05) ([354]Figure S5B). Next, analysis of differentially expressed metabolic enzymes identified metabolic pathways perturbed across RCC subtypes. For example, ccRCC and pRCC type 1 tumors shared some common pathway enrichments compared with ccRCC and ROs ([355]Figure 5C). Specifically, purine nucleotide de novo biosynthesis and TCA cycle were depleted in both ccRCC and pRCC type 1 but were enriched in ROs. Pentose phosphate pathway and dermatan sulfate degradation were potentially upregulated in pRCC type 1 but not in other tumor types. Pyrimidine deoxyribonucleoside salvage pathway and glycolysis were active in both AML and ROs. High levels of ACACA, ACACB enzymes, and phosphoric acid in AML indicate increased fatty acid biosynthesis ([356]Figures S5C and S5D). Figure 5. 
Metabolomic aberrations across RCC subtypes (A) Filtered metabolites analyzed and their distribution across functional categories. (B) Clustering of metabolomics data from different non-ccRCC and NATs. (C) DE pathways between tumor subtypes. Bubble size, number of compounds per pathway. (D) Schematic sketch of key pathways; protein and metabolite abundance log2 fold changes are represented in rounded-corner and regular-corner color boxes, respectively. (E) Distribution of tumor subtypes stratified by high- and low-wGII groups. (F) Metabolites with significant differential abundance (abs(log2fc) > 1 and p < 0.05) between high- and low-wGII tumors.

Several enzymes in the oxidative (e.g., G6PD) as well as non-oxidative (e.g., TALDO1, TKT) phases of the pentose phosphate pathway were highly expressed in pRCC type 1 ([359]Figures 5D, [360]S5C, and S5D), associated with increased demand for ribonucleotides in the rapidly proliferating cancer cells.[361]^66 In renal ROs,[362]^66 these enzymes are not differentially expressed, likely representing a metabolic barrier to progression. Indeed, ROs showed an accumulation of pyruvate, a product of glycolysis ([363]Figures 5D, [364]S5C, and S5D). Furthermore, low levels of TCA cycle enzymes such as succinate dehydrogenase (SDHB, SDHC, SDHD) and FH are seen in pRCC type 1,[365]^87 in contrast to high levels of FH, IDH3, and CS seen in ROs. These metabolomic observations are consistent with previously noted high numbers of defective mitochondria (with high abundance of mitochondrial protein) in ROs.[366]^21^,[367]^88^,[368]^89 Finally, we compared the metabolomic profile of 4 high-wGII versus 10 low-wGII non-ccRCC samples and identified 6 compounds significantly upregulated and 5 downregulated in the high-wGII group ([369]Figures 5E and 5F).
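The high- versus low-wGII comparison uses the thresholds stated in the Figure 5F legend (abs(log2fc) > 1 and p < 0.05). The sketch below shows only that filtering step. It assumes that log2 fold changes and p values have already been computed by whatever statistical test was used (the test is not specified here), and the function name, metabolite names, and values are hypothetical placeholders rather than study data.

```python
from typing import Dict, List, Tuple


def split_differential(
    stats: Dict[str, Tuple[float, float]],
    lfc_cutoff: float = 1.0,
    p_cutoff: float = 0.05,
) -> Tuple[List[str], List[str]]:
    """Split metabolites into up- and downregulated sets.

    `stats` maps a metabolite name to (log2 fold change, p value), where the
    fold change is high-wGII relative to low-wGII. A metabolite is kept only
    if abs(log2fc) > lfc_cutoff and p < p_cutoff.
    """
    up, down = [], []
    for name, (log2fc, p_value) in stats.items():
        if p_value >= p_cutoff or abs(log2fc) <= lfc_cutoff:
            continue  # fails at least one of the two thresholds
        (up if log2fc > 0 else down).append(name)
    return sorted(up), sorted(down)


# Hypothetical toy input, for illustration only (not values from the study).
toy_stats = {
    "metabolite_A": (1.6, 0.01),   # passes both thresholds, upregulated
    "metabolite_B": (-1.2, 0.03),  # passes both thresholds, downregulated
    "metabolite_C": (2.0, 0.20),   # large change but not significant
    "metabolite_D": (0.4, 0.001),  # significant but small change
}
up_hits, down_hits = split_differential(toy_stats)
```

With only 4 high-wGII and 10 low-wGII samples, the lists produced by such a filter are best read as candidates for follow-up rather than as definitive differences.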
High levels of proline and NADH, coupled with high PYCR1 expression ([370]Figure S1G), indicated higher proline biosynthesis, which might support cancer cell proliferation and survival in oxygen-limiting conditions.[371]^81 On the other hand, genome-stable samples showed high levels of saccharic acid, glucosamine, and 8-hydroxyquinoline, derivatives of which are known to have anticancer effects.

Papillary RCC biomarkers and proteogenomics of activating MET mutations

Malignant pRCCs,[372]^75 accounting for 15% of all RCCs, are histomorphologically and genetically heterogeneous tumors that currently lack specific diagnostic biomarkers. Importantly, a subset of pRCCs show overlapping morphology with mucinous tubular and spindle cell carcinoma (MTSCC), a rare benign tumor,[373]^90 confounding clinical care decisions. To delineate pRCC-specific biomarkers using our multi-omics data, we identified a number of pRCC type 1-specific candidates significantly upregulated (n = 176, log2 fc > 1, q < 0.05) and downregulated (n = 108, log2 fc < −1, q < 0.05) ([374]Figure 6A; [375]Table S9). Top pRCC-specific candidates included sclerostin domain-containing protein 1 (SOSTDC1)[376]^91 and polymeric immunoglobulin receptor (PIGR), further validated in the pan-RCC RNA-seq data from a combined cohort of TCGA plus MCTP (n = 1,000) ([377]Figures 6B, [378]S6A, and S6B). Comparing MTSCC (n = 18) and pRCC (n = 8) proteomics data from Xu et al.,[379]^23 we noted that both PIGR and SOSTDC1 proteins were highly upregulated in pRCC compared with MTSCC ([380]Figure 6C), and we validated these findings by IHC and RNA-ISH ([381]Figures 6D and 6E). SOSTDC1 as a biomarker for pRCC has not been studied previously. PIGR has been listed as pRCC-specific in Jorge et al.’s DE analysis,[382]^20 corroborated by our observations.

Figure 6.
Proteogenomic biomarkers that distinguish pRCC from MTSCC (A) Significantly differential events (abs(log2fc) > 2 and q < 0.05) in protein expression (x axis) and RNA expression (y axis) between pRCC type 1 and other tumors. (B) Specificity of pRCC type 1 protein markers PIGR and SOSTDC1. (C) Expression of pRCC type 1 protein markers PIGR and SOSTDC1 in the proteomics data from Xu et al.[385]^23 (PXD027972). (D) H&E, protein IHC, and RNA-ISH images (top to bottom) of biomarker PIGR in normal kidney tissue, pRCC, MTSCC tumors (upper panels from left to right) and SOSTDC1 in chRCC, pRCC, and MTSCC (lower panels from left to right). (E) RNA-ISH comparative scores of PIGR and SOSTDC1 in different tumor types. Red points represent external University of Michigan samples. (F) Locations of missense mutations in MET across TCGA cohorts are colored on the MET protein domain diagram. (G) PTM-SEA analysis shows pathways such as EGFR are significantly enriched with increased phosphorylation in MET mutant pRCC samples. (H) Enrichment of chromosome 7 and 17 gene sets was tested using the protein expression difference between non-ccRCC sample groups with and without chromosome 7 gain.

Next, we explored the proteogenomic impact of activating mutations in the MET kinase domain frequently observed in type 1 pRCC. Two type 1 pRCC samples with hotspot mutations in the MET kinase domain (Asp1246Asn and Met1268Thr) ([386]Figure 6F), compared against type 1 pRCC cases with wild-type MET (n = 5), revealed upregulated phosphoserine/threonine and phosphotyrosine events, including several known MET substrates such as GAB1 Y689 ([387]Figure S6C). In addition to known MET substrates, enrichment analysis with PTM-SEA identified signaling pathways enriched with upregulated phosphorylation sites, such as EGFR, PI3K-AKT, and MAPK ([388]Figure 6G).
The intracellular signaling cascades activated by MET include the PI3K-AKT, RAC1-cell division control protein CDC42, RAP1, and RAS-MAPK pathways. Cooperative signaling between MET and EGFR has been observed during kidney development,[389]^92 and aberrant cross-signaling in renal cancer has been noted to have major implications for therapy.[390]^93 Gain of chr7 is common in type 1 pRCC and occurs in some ccRCCs. Increased abundance of chr7 genes is notably observed at the RNA level ([391]Figures S6D and S6E). As chr7 and chr17 gains tend to co-occur in pRCC, we also saw increased chr17 gene expression in non-ccRCC tumors, which was not observed in ccRCC ([392]Figures 6H and [393]S6F). Pathway enrichment analysis revealed upregulation of EMT, angiogenesis, and KRAS signaling, and downregulation of adipogenesis and fatty acid metabolism in both ccRCC and non-ccRCC tumors with chr7 gain ([394]Figure S6G). In addition, papillary lineage tumors exhibiting chr7 gain show elevated phosphorylation activities associated with several chr7 kinases, including HIPK2, CDK13, MET, CDK6, and BRAF ([395]Figure S6H).

Regulons and differential diagnosis biomarkers in oncocytic tumors

Current IHC clinical markers for chRCC[396]^17^,[397]^94 and RO[398]^95^,[399]^96 include KRT7, CD117 (c-kit), epithelial membrane antigen, parvalbumin, S100A, and kidney-specific cadherins. Patchy staining and pattern overlap with chRCC[400]^17^,[401]^94^,[402]^97 are limitations: in clinical diagnostic criteria, patchy KRT7 expression usually supports RO, while strong uniform staining usually supports chRCC. KRT7 was higher in chRCC (at both RNA and protein levels); markers such as KIT and FOXI1 were equally expressed in ROs and chRCC, while CCND1 overexpression was specific to fusion-positive RO ([403]Figure S7A).
We next employed the SCENIC tool,[404]^98^,[405]^99 which examines transcriptional modules or regulons (coexpression of a given transcription factor and its target genes), to characterize differences among the different tumor and benign tissues ([406]Figure S7B). The transcription factor FOXI1 is specifically expressed in intercalated cells and tumors such as chRCC and ROs.[407]^8^,[408]^100 Regulons shared between chRCC and RO include the lineage-specific transcription factors FOXI1 and DMRT2, and DMRT2 is a known target of FOXI1. We also identified several regulons that were enriched only in chRCC, such as ZBTB7A, SMARCB1, E4F1, and FOXJ2, among others ([409]Figure S7B), that were not previously associated with this disease. We next performed RNA and protein DE analysis between 3 chRCC and 15 ROs profiled in this study to identify diagnostic biomarkers ([410]Figure 7A). Compared with two publicly available datasets, the CPTAC proteomic dataset had better coverage of FOXI1 and DMRT2, where both transcription factor proteins and most of their gene targets (such as ATP6V0D2, HEPACAM2, DMRT2, etc.) showed differential expression in tumor versus normal comparisons, as expected ([411]Figure 7A; [412]Table S10).[413]^66^,[414]^96

Figure 7. Proteogenomic biomarkers that distinguish oncocytomas (RO) from chRCC (A) DE proteins (x axis) and mRNA (y axis) between RO and chRCC. Indicated genes have p < 0.01 in both dimensions, and candidates in red were subsequently validated as RO-specific (MAPRE3, ADGRF5) and chRCC-specific (GPNMB) biomarkers. (B) chRCC marker GPNMB (left) and RO biomarkers ADGRF5 and MAPRE3 protein abundance in different subtypes. (C) Overlap between DE proteins identified in this study (CPTAC) and the publicly available PXD007633 dataset in RO (left) and PXD019123 chRCC dataset (right). Genes in red are associated with FOXI1 and DMRT2.
(D) Immunohistochemistry validation of nominated markers seen in representative tumor sections. Corresponding H&E staining images are shown alongside.

DE analysis ([417]Figure 7A) discovered candidates such as microtubule-associated protein RP/EB family member 3 (MAPRE3) and adhesion G protein-coupled receptor F5 (ADGRF5), both specific to RO, and glycoprotein nonmetastatic melanoma protein B (GPNMB), upregulated in chRCC ([418]Figures 7B, 7C, and [419]S7C). We validated our findings in independent publicly available mass spectrometry-based proteomics data for RO (n = 6, PXD007633) and chRCC (n = 9, PXD019123) ([420]Figure 7C). Using IHC, we next independently validated biomarker specificity, including CCND1 protein overexpression in gene fusion-positive ROs, MAPRE3 and ADGRF5 (also identified in the glycoproteomics analysis) expression in all RO subtypes ([421]Figure 7D), and GPNMB expression in chRCC. While CCND1 and FOXI1 were enriched in the nuclei, GPNMB showed a homogeneous and moderate/strong expression within the cytoplasmic compartment of the chRCC tumor cells, and MAPRE3 protein showed a predominant membranous expression pattern in RO. ADGRF5, also called GPR116, is an adhesion G protein-coupled receptor, and an emerging role in cancers for this family of proteins is being investigated.[422]^101 Furthermore, we observed THSD4 upregulation in RO type 1 but downregulation in RO type 2 compared with NAT ([423]Figure S7D); future validation of this marker might enable rapid distinction between the two subtypes.

Discussion

NGS and global proteomics data generated by CPTAC provide a high-quality data resource that can be explored further to derive novel biomarkers and gain deeper insights into disease biology.
Motivated by the specific clinical need of biomarkers specific to rare subtypes of renal cell carcinoma, we carried out multi-omics analyses to identify protein/mRNA biomarkers to distinguish benign ROs from chRCC (MAPRE3, ADGRF5, GPNMB), pRCC from MTSCC (SOSTDC1, PIGR), and tumors with high wGII (PYCR1, IGF2BP3). A number of these markers were validated by IHC and RNA-ISH, supporting further evaluation in independent cohorts to facilitate development of renal cancer biomarker panels for clinical use. A number of single-cell studies in renal cancers have mainly characterized ccRCC focusing on immune infiltration,[424]^102^,[425]^103^,[426]^104 immunotherapy resistance,[427]^105 and cell-of-origin,[428]^36^,[429]^106 while non-ccRCC tumors largely remain uncharacterized. Here, we analyzed snRNA-seq data from eight non-ccRCC samples covering all oncocytoma subtypes. Our analysis highlights intra-tumor transcriptomic heterogeneity and a wide variation in the degree of immune infiltration among non-ccRCC subtypes, wherein malignancies such as pRCC and AML showed higher levels of immune infiltration compared with chRCC, ROs, TRCC, and others ([430]Figure S2A). We also identified several cell-type-specific markers representing putative cell-of-origin that could be further characterized for expansion of diagnostic panels. Some non-ccRCC tumors have been previously profiled proteomically,[431]^23^,[432]^66^,[433]^96 but the landscape of their PTMs remains uncharacterized. Here, we examined two different PTM profiles, namely protein phosphorylations and glycosylations from non-ccRCC and ccRCC samples. Besides identifying known kinase expression patterns and therapeutic targets in RCC subtypes such as FLT1 (ccRCC), KIT (chRCC and RO), and MET (pRCC type 1), we have identified several additional subtype-enriched kinases, with some among them being evaluated for their therapeutic utility in preclinical and clinical settings. 
Additional characterization studies of these potential kinase targets to evaluate their therapeutic utility are warranted. Our integrative phosphoproteomics analysis of tumors with GI, besides identifying important biomarkers for early detection of this molecular subset, clarifies signaling cascades that might drive it. We show significantly increased cyclin-dependent kinase activities, which suggest increased proliferation (taken together with pathways found enriched at whole-protein and RNA abundance), and decreased MTOR activity in GI tumors. The latter observation provides a reason, at least in part, for why MTOR inhibitors such as everolimus show poor response in metastatic RCC. To explore RCC glycobiology and its implications for TMEs, we analyzed both phospho-glyco (which contained more sialylated events) and glyco-enriched (which had more oligomannose IGPs) data. Using cell type gene expression annotation from previous single-cell data, we inferred TME contribution within the differentially expressed glycoproteins. A larger impact of TME was noted in both ccRCC and pRCC compared with ROs, revalidating the biology of these tumors. Further core-fucosylation characterization using the glyco data generated here might deepen our understanding of the tumor and immune microenvironment in pRCC. Finally, differential glycosylation events noted on proteins potentially contributed by the TME compartment in higher wGII non-ccRCC support higher immune infiltration. Genomic drivers of RCC are linked to dysregulated metabolism.[434]^107 Interesting similarities and differences observed among kidney tumor subtypes include the depletion of purine nucleotide de novo biosynthesis and TCA cycle intermediates in ccRCC and pRCC type 1 tumors, and, in contrast, their enrichment in ROs.
Enrichment of the pentose phosphate pathway and dermatan sulfate degradation in pRCC type 1, oncometabolite SAICAR in ROs, and proline and NADH, coupled with high PYCR1 expression, in non-ccRCC tumors with high wGII warrant further investigations. In conclusion, proteogenomic analysis provided insights into a variety of non-ccRCC subtypes, identifying histologically specific diagnostic biomarkers and markers of GI and revealing the interconnectedness between the omic layers, while single-nucleus analysis highlighted the potential intra-tumor heterogeneity and differences in putative cell-of-origin within non-ccRCC subtypes. Fundamentally, this study provides a comprehensive proteogenomic data resource to enable further in-depth exploration of the biology of these rare kidney tumors.

Limitations of the study

This study evaluates a wide range of non-ccRCC subtypes with an extensive array of multi-omic analyses but has its limitations. The specific tissue procurement protocols necessary to facilitate high-quality protein-based multi-omics limited this study to prospective sample recruitment, thereby limiting the number of tumors analyzed. The lack of samples representing other non-ccRCC subtypes, such as FH-deficient RCC and clear cell pRCC, among others, owing to the unavailability of those rare subtypes, is another limitation. Some subtypes are represented by only one or two samples, which cannot account for heterogeneity within these subtypes. However, there is currently little or no high-quality multi-omics data available for most of these tumor subtypes; therefore, the observations presented here can represent a foundation for further, targeted analyses of specific features. This future work will be essential for confirming and refining this study’s observations, which serve as an initial stepping stone for a deeper understanding of the complexity of non-ccRCCs.
Consortia The members of the National Cancer Institute Clinical Proteomic Tumor Analysis Consortium are Eunkyung An, Shankara Anand, Andrzej Antczak, Alexander J. Lazar, Meenakshi Anurag, Jasmin Bavarva, Chet Birger, Michael J. Birrer, Melissa Borucki, Shuang Cai, Anna Calinawan, Wagma Caravan, Steven A. Carr, Daniel W. Chan, Feng Chen, Lijun Chen, Siqi Chen, David Chesla, Arul M. Chinnaiyan, Hanbyul Cho, Seema Chugh, Marcin Cieslik, Sandra Cottingham, Reese Crispen, Felipe da Veiga Leprevost, Aniket Dagar, Saravana M. Dhanasekaran, Rajiv Dhir, Li Ding, Marcin J. Domagalski, Brian J. Druker, Nathan J. Edwards, David Fenyö, Stacey Gabriel, Gad Getz, Yifat Geffen, Michael A. Gillette, Charles A. Goldthwaite Jr., Anthony Green, Shenghao Guo, Jason Hafron, Sarah Haynes, Tara Hiltke, Barbara Hindenach, Bart Williams, Katherine A. Hoadley, Alex Hopkins, Noshad Hosseini, Galen Hostetter, Andrew Houston, Yi Hsiao, Scott D. Jewell, Xiaojun Jing, Ivy John, Corbin D. Jones, Karen A. Ketchum, Iga Kołodziejczak, Chandan Kumar-Sinha, Anne Le, Toan Le, Ginny Xiaohe Li, Yize Li, W. Marston Linehan, Tao Liu, Yin Lu, Jie Luo, Weiping Ma, Avi Ma’ayan, D.R. Mani, Rahul Mannan, Peter B. McGarvey, Rohit Mehra, Mehdi Mesri, Nataly Naser Al Deen, Alexey I. Nesvizhskii, Chelsea J. Newton, Kristen Nyce, Gilbert S. Omenn, Amanda G. Paulovich, Samuel H. Payne, Francesca Petralia, Daniel A. Polasky, Sean Ponce, Barb Pruetz, Ratna R.Thangudu, Boris Reva, Christopher J. Ricketts, Ana I. Robles, Karin D. Rodland, Henry Rodriguez, Eric E. Schadt, Michael Schnaubelt, Yvonne Shutack, Richard D. Smith, Mathangi Thiagarajan, Pamela VanderKolk, Negin Vatanian, Josh Vo, Pei Wang, Xiaoming Wang, George Wilson, Maciej Wiznerowicz, Fengchao Yu, Kakhaber Zaalishvili, Cissy Zhang, Hui Zhang, Yuping Zhang, Stephanie Miner, Bing Zhang, Zhen Zhang, and Xu Zhang. 
STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Antibodies
Goat Polyclonal IgG Human Osteoactivin/GPNMB antibody | R&D Systems | Catalog: AF2550, RRID: [435]AB_416615
Rabbit Polyclonal IgG Human MAPRE3 antibody | Atlas Antibodies | Catalog: HPA009263 RRID: [436]AB_1078716
Mouse Monoclonal IgG Human FOXI1 antibody | Origene Technologies | Catalog: TA800146 RRID: [437]AB_2625262
Rabbit Monoclonal IgG Human Cyclin D1 | Cell Marque | Catalog: 241R-18 RRID: [438]AB_1158233
Mouse Monoclonal IgG Human PIGR antibody | Santa Cruz | Catalog: SC-374343, RRID: [439]AB_10989564
Rabbit Polyclonal IgG Human PYCR1 antibody | Cell Signaling Technology | Catalog: 47935
Rabbit Polyclonal IgG Human AKT antibody | Cell Signaling Technology | Catalog: 9272
Rabbit Polyclonal IgG Human Phospho-AKT (Ser473) Antibody | Cell Signaling Technology | Catalog: 9271
Rabbit Polyclonal IgG Human p44/42 MAPK (Erk1/2) Antibody | Cell Signaling Technology | Catalog: 9102
Rabbit Monoclonal IgG Human Phospho-p44/42 MAPK (Erk1/2) (Thr202/Tyr204) antibody | Cell Signaling Technology | Catalog: 4376
Rabbit Polyclonal IgG Human Vinculin Antibody | Cell Signaling Technology | Catalog: 4650

Biological samples
Primary tumor and normal adjacent tissue samples | See [440]experimental model and study participant details | See [441]Table S1

Critical commercial assays
Discovery CC1 | Roche-Ventana Medical System | Catalog: 950-500
Discovery CC2 | Roche-Ventana Medical System | Catalog: 950-123
OptiView Universal DAB Detection Kit | Roche-Ventana Medical System | Catalog: 760-700
UltraView Universal DAB Detection Kit | Roche-Ventana Medical System | Catalog: 760-500
Discovery mRNA DAB Detection RUO | Roche-Ventana Medical System | Catalog: 760-224
RNAscope® 2.5 HD Reagent Kit -BROWN | Advanced Cell Diagnostics, Inc | Catalog: 322300
RNAscope® VS Universal HRP Reagent Kit | Advanced Cell Diagnostics, Inc | Catalog: 323200
RNAscope Target Probe - Hs-PIGR | Advanced Cell Diagnostics, Inc | Catalog: 472681
RNAscope Target Probe - Hs-PYCR1 | Advanced Cell Diagnostics, Inc | Catalog: 509259
RNAscope Target Probe - Hs-SOSTDC1 | Advanced Cell Diagnostics, Inc | Catalog: 469929
RNAscope PositiveProbe - Hs-PPIB | Advanced Cell Diagnostics, Inc | Catalog: 313901/313909
RNAscope Negative Probe – DapB | Advanced Cell Diagnostics, Inc | Catalog: 310043/312039
Rabbit Polyclonal IgG Human IGF2BP3 | Proteintech | Catalog: 14642-1-AP, RRID: [442]AB_2122782
Rabbit Polyclonal IgG Human ADGRF5 (GPR116) | Proteintech | Catalog: 14047-1-AP, RRID: [443]AB_2113095

Deposited data
CPTAC non-ccRCC clinical data and proteomic data | This manuscript | [444]https://pdc.cancer.gov/
CPTAC ccRCC genomic, transcriptomic, and snRNA-seq data | This manuscript | [445]https://portal.gdc.cancer.gov/projects/CPTAC-3
CPTAC non-ccRCC pathology and radiology images | This manuscript | [446]https://portal.imaging.datacommons.cancer.gov/
TCGA KIRC | Cancer Genome Atlas Research Network[447]^108 | [448]https://portal.gdc.cancer.gov/
TCGA KIRP | Cancer Genome Atlas Research Network[449]^75 | [450]https://portal.gdc.cancer.gov/

Software and algorithms
CNVEX | | [451]https://github.com/mctp/cnvex
R v4.1 | R Development Core Team | [452]https://www.R-project.org
Python | Python Software Foundation | [453]https://www.python.org/
Philosopher | da Veiga Leprevost et al.[454]^109 | [455]https://philosopher.nesvilab.org/
MSFragger | Kong et al.[456]^110 | [457]https://msfragger.nesvilab.org/
PTM-Shepherd | Geiszler et al.[458]^111 | [459]https://ptmshepherd.nesvilab.org/
TMT-Integrator | Djomehri et al.[460]^112 | [461]https://github.com/Nesvilab/TMT-Integrator
ARD-NMF | Tan et al.[462]^30 | [463]https://github.com/getzlab/getzlab-SignatureAnalyzer
CancerSubtypes | Xu et al.[464]^113 | [465]https://www.bioconductor.org/packages/release/bioc/html/CancerSubtypes.html
Limma | Ritchie et al.[466]^90 | [467]https://bioconductor.org/packages/release/bioc/html/limma.html
PTM-SEA | Krug et al.[468]^114 | [469]https://github.com/broadinstitute/ssGSEA2.0
KSA-2D | Han et al.[470]^115 | [471]https://github.com/ginnyintifa/KSA2D
CLUMPS-PTM | Geffen et al.[472]^53 | [473]https://github.com/getzlab/CLUMPS-PTM
IMPaLA | Kamburov et al.[474]^116 | [475]http://impala.molgen.mpg.de/
ClusterProfiler | Yu et al.[476]^117 | [477]https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html
pySCENIC | Van de Sande et al.[478]^118 | [479]https://github.com/aertslab/pySCENIC
BayesDeBulk | Petralia et al.[480]^119 | [481]http://www.bayesdebulk.com/
DreamAI | Ma et al.[482]^120 | [483]https://github.com/WangLab-MSSM/DreamAI
OmniPathR | D Turei et al.[484]^99^,[485]^121 | [486]https://www.bioconductor.org/packages/release/bioc/html/OmnipathR.html
Survival | Therneau et al.[487]^122 | [488]https://cran.r-project.org/web/packages/survival/index.html

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Alexey Nesvizhskii, nesvi@med.umich.edu.

Materials availability

* This study did not generate new unique reagents.

Data and code availability

* Clinical data and raw proteomic data reported in this paper can be accessed via the CPTAC Data Portal at: [490]https://cptac-data-portal.georgetown.edu/cptac. Genomic, transcriptomic, and snRNA seq data files can be accessed via Genomic Data Commons (GDC) at: [491]https://portal.gdc.cancer.gov/projects/CPTAC-3.
Proteomic data files can be accessed via Proteomic Data Commons (PDC) at: [492]https://pdc.cancer.gov/ with the following accession codes: PDC000464, PDC000465, and PDC000466. Processed data used in this publication can be found in the CPTAC PDC. An interactive ProTrack web portal[493]^30 is also provided to visualize multi-omics data in interactive heatmap and boxplot visualizations, as well as for reviewing histological images, exploring the cohort with a sample dashboard, and reviewing quality control results for ccRCC and non-ccRCC data ([494]http://ccrcc-conf.cptac-data-view.org/).
* This paper does not report original code.
* Any additional information required to reanalyze the data reported in this paper is available from the [495]STAR Methods upon request.

Experimental model and study participant details

Human subjects

A total of 151 participants were included in this study. Institutional review boards at each Tissue Source Site (TSS) reviewed protocols and consent documentation, in adherence to Clinical Proteomic Tumor Analysis Consortium (CPTAC) guidelines.

Clinical data annotation

Clinical data were obtained from TSS and aggregated by the CPTAC Biospecimen Core Resource (BCR, at the Pathology and Biorepository Core of Van Andel Research Institute (Grand Rapids, MI)). Data forms were stored as Microsoft Excel files (.xls). Clinical data can be accessed and downloaded from the CPTAC Data Portal. Demographics of patients can be viewed at ProTrack ([496]http://ccrcc-conf.cptac-data-view.org/). Patients with any prior history of other malignancies within twelve months or any systemic treatment (chemotherapy, radiotherapy, or immune-related therapy) were excluded from this study.

Method details

Sample processing

The CPTAC BCR manufactured and distributed biospecimen kits to the TSS located in the US and Europe.
Each kit contained a set of pre-manufactured labels for unique tracking of every specimen respective to TSS location, disease, and sample type, used to track the specimens through the BCR to the CPTAC proteomic and genomic characterization centers. Tissue specimens averaging 200 mg were snap-frozen by the TSS within a 30 min cold ischemic time (CIT) (CIT average = 15 min) and an adjacent segment was formalin-fixed paraffin embedded (FFPE) and H&E stained by the TSS for quality assessment to meet the CPTAC tissue requirements. Routinely, several tissue segments for each case were collected. Tissues were flash-frozen in liquid nitrogen (LN2) and then transferred to a liquid nitrogen freezer for storage until approval for shipment to the BCR. Specimens were shipped using a cryoport that maintained an average temperature of under −140°C to the BCR with a time and temperature tracker to monitor the shipment. Receipt of specimens at the BCR included a physical inspection and review of the time and temperature tracker data for specimen integrity, followed by barcode entry into a biospecimen tracking database. Specimens were again placed in LN2 storage until further processing. Acceptable non-ccRCC tumor tissue segments were determined by TSS pathologists based on the percent viable tumor nuclei (>80%), total cellularity (>50%), and necrosis (<20%). Segments received at the BCR were verified by BCR and Leidos Biomedical Research (LBR) pathologists and the percent of the total area of tumor in the segment was also documented. Additionally, disease-specific working group pathology experts reviewed the morphology to clarify or standardize specific disease classifications and correlation to the proteomic and genomic data. The cryopulverized specimen was divided into aliquots for DNA (30 mg) and RNA (30 mg) isolation and proteomics (50 mg) for molecular characterization.
Nucleic acids were isolated and stored at −80°C until further processing and distribution; cryopulverized protein material was returned to the LN2 freezer until distribution. Shipment of the cryopulverized segments used cryoports for distribution to the proteomic characterization centers and shipment of the nucleic acids used dry ice shippers for distribution to the genomic characterization centers; a shipment manifest accompanied all distributions for the receipt and integrity inspection of the specimens at the destination.

Sample cohort details

In this study, we performed proteogenomics profiling of 194 tumor and NAT samples from the discovery cohort[497]^27 (110 tumors profiled with proteomics and RNA-seq, 84 NATs profiled with proteomics and 73 NATs profiled with RNA-seq), 4 samples from the confirmatory cohort[498]^13 (2 tumors and 2 NATs profiled with both proteomics and RNA-seq), and 56 samples from the non-ccRCC cohort (39 tumors profiled with proteomics and RNA-seq, 17 NATs profiled with proteomics and 14 NATs profiled with RNA-seq). Within the 110 tumor samples from the discovery ccRCC cohort,[499]^27 103 were confirmed ccRCC and 7 were non-ccRCC. Across all three cohorts, we profiled 103 ccRCC tumor samples (all from the discovery cohort) and 48 non-ccRCC tumor samples (7 samples from the discovery cohort, 2 samples from the confirmatory cohort, 39 from the non-ccRCC cohort). Within the 48 non-ccRCC samples, we counted 15 ROs (3 RO type 1, 8 RO type 2, 4 RO variant), 13 papillary RCC (pRCC, 8 pRCC with the previously defined type 1 features (pRCC-1) and 5 without (pRCC-2)), 3 chromophobe RCC (chRCC), 2 angiomyolipoma (AML), 2 eosinophilic solid and cystic RCC (ESCRCC), 1 Birt-Hogg-Dube syndrome-associated renal cell carcinoma (BHD), 1 mixed epithelial and stromal tumor of the kidney (MEST), 1 MTOR mutated RCC, 1 translocation RCC (TRCC), 8 molecularly divergent to histology RCC (MDTH), and 1 plasmacytoid urothelial carcinoma (PUC).
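The cohort arithmetic above can be cross-checked with a short tally. This is only a consistency sketch: the counts are transcribed from the text, while the dictionary keys and the function name are illustrative labels introduced here and are not identifiers used by the study.

```python
# Counts transcribed from the cohort description; names are illustrative.
NON_CCRCC_SUBTYPES = {
    "RO": 15, "pRCC": 13, "chRCC": 3, "AML": 2, "ESCRCC": 2, "BHD": 1,
    "MEST": 1, "MTOR-mutated RCC": 1, "TRCC": 1, "MDTH": 8, "PUC": 1,
}
RO_SUBGROUPS = {"type 1": 3, "type 2": 8, "variant": 4}
PRCC_SUBGROUPS = {"pRCC-1": 8, "pRCC-2": 5}
NON_CCRCC_BY_COHORT = {"discovery": 7, "confirmatory": 2, "non-ccRCC": 39}
CCRCC_TUMORS = 103


def cohort_totals() -> dict:
    """Return the derived totals, raising if the stated counts disagree."""
    by_subtype = sum(NON_CCRCC_SUBTYPES.values())
    by_cohort = sum(NON_CCRCC_BY_COHORT.values())
    if by_subtype != by_cohort:
        raise ValueError("non-ccRCC subtype and cohort tallies disagree")
    if sum(RO_SUBGROUPS.values()) != NON_CCRCC_SUBTYPES["RO"]:
        raise ValueError("RO subgroup counts do not add up")
    if sum(PRCC_SUBGROUPS.values()) != NON_CCRCC_SUBTYPES["pRCC"]:
        raise ValueError("pRCC subgroup counts do not add up")
    return {
        "non_ccRCC_tumors": by_subtype,
        "all_tumors": by_subtype + CCRCC_TUMORS,
    }


totals = cohort_totals()
```

Under these counts, the non-ccRCC tally comes to 48 tumors and the combined total to 151, which matches the number of participants stated under Human subjects.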
The following three samples were excluded from all downstream analysis: 2 NAT samples (C3N-00314-N and C3N-01524-N) that were found to be contaminated with tumor tissue and 1 PUC sample (C3L-02212-T), which is not a renal cell carcinoma. One tumor (C3N-02204-T), indicated with an asterisk in [500]Figure 1A, is excluded from the wGII analysis since genomics data were not fully available at the time of data freeze.

Immunohistochemistry (IHC)

Immunohistochemistry (IHC) was performed on 4-micron formalin-fixed, paraffin-embedded (FFPE) tissue sections. The Ventana Benchmark XT staining platform with Discovery CC1 and CC2 (Ventana cat#950-500 and 950-123) was used for antigen retrieval. The immune complexes were developed with either the ultraView or optiView Universal DAB (diaminobenzidine tetrahydrochloride) Detection Kit (Ventana cat#760-500 and cat#760-700). The details of the panel of primary antibodies utilized are as follows: polymeric immunoglobulin receptor (PIGR/Anti-SC; Santa Cruz, mouse monoclonal, catalog no. SC-374343), cyclin D1 (CCND1; Cell Marque, rabbit monoclonal, catalog no. 241R-18), transmembrane glycoprotein NMB (GPNMB, R&D systems, goat polyclonal, catalog no. AF2550), microtubule-associated protein RP/EB family member-3 (MAPRE3, Atlas antibodies, rabbit polyclonal, catalog no. HPA009263), and forkhead box I1 (FOXI1, Origene antibodies, mouse monoclonal, catalog no. TA800146). Brown pigmentation within the subcellular component (cytoplasmic and/or membranous for PIGR, GPNMB, and MAPRE3, and nuclear for FOXI1 and CCND1) was taken as positive expression. In addition, for PIGR, the presence and intensity of cytoplasmic staining were scored: the percentage of PIGR-positive neoplastic cells and the staining intensity (none, 0; weak, 1; moderate, 2; strong, 3) were recorded for each tumor as described previously.[501]^8 Appropriate positive and negative control tissues were run in each assay batch.
RNA in situ hybridization (RNA-ISH) RNA-ISH was performed using the RNAscope 2.5 HD Brown kit (Advanced Cell Diagnostics, Newark, CA) and target probes against PIGR (472681 Hs-PIGR targeting [502]NM_002644.3, 2-903nt), PYCR1 (509259 Hs-PYCR1 targeting [503]NM_001282281.1, 64-1770nt), and SOSTDC1 (469929 Hs-SOSTDC1 targeting [504]NM_015464.2, 2-938nt) according to the manufacturer’s instructions. RNA quality was evaluated in each case utilizing a positive and a negative control probe against the human housekeeping gene Peptidylprolyl Isomerase B (PPIB) (313901 for manual and 313909 for Ventana automated system) and the Bacillus bacterial gene DapB (310043 for manual and 312039 for Ventana automated system), respectively. The assay was run according to the protocol previously described.[505]^6^,[506]^9 Stained slides were evaluated under a light microscope at ×100 and ×200 magnification for RNA-ISH signals in neoplastic cells by multiple study investigators. In this assay, each RNA molecule appears as a punctate brown dot. The expression level was evaluated according to the RNAscope scoring criteria: score 0 = no staining or <1 dot per 10 cells; score 1 = 1–3 dots per cell; score 2 = 4–9 dots per cell, and no or very few dot clusters; score 3 = 10–15 dots per cell and <10% dots in clusters; score 4 = >15 dots per cell and >10% dots in clusters. The H-score was calculated for each examined tissue section as the weighted sum of the percentages of cells with scores 0–4 [(A% × 0) + (B% × 1) + (C% × 2) + (D% × 3) + (E% × 4), where A + B + C + D + E = 100], using previously published scoring criteria.[507]^6^,[508]^9 Sample processing for genomic DNA and total RNA extraction Our study sampled a single site of the primary tumor from surgical resections, due to the internal requirement to process a minimum of 125 mg of tumor tissue and 50 mg of adjacent normal tissue.
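The RNA-ISH H-score defined above is a weighted sum and can be sketched in a few lines of Python. This is a minimal illustration, not the scoring software used in the study; the function name `h_score` and the example percentages are hypothetical.

```python
def h_score(percent_by_score):
    """Return the H-score for one tissue section.

    percent_by_score: five percentages of neoplastic cells, ordered by
    RNAscope score 0, 1, 2, 3, 4 (A%, B%, C%, D%, E% in the text).
    """
    if len(percent_by_score) != 5:
        raise ValueError("expected five percentages, for scores 0 through 4")
    if any(p < 0 for p in percent_by_score):
        raise ValueError("percentages must be non-negative")
    if abs(sum(percent_by_score) - 100.0) > 1e-6:
        raise ValueError("percentages must sum to 100")
    # Weight each percentage by its score: (A% x 0) + (B% x 1) + ... + (E% x 4)
    return sum(score * pct for score, pct in enumerate(percent_by_score))


# Hypothetical section: 10% score 0, 20% score 1, 30% score 2,
# 25% score 3, 15% score 4.
example = [10, 20, 30, 25, 15]
print(h_score(example))  # 215
```

Because the weights run from 0 to 4 and the percentages sum to 100, the H-score is bounded between 0 (no cells stained) and 400 (all cells at score 4).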
DNA and RNA were extracted from tumor and blood normal specimens in a co-isolation protocol using Qiagen’s QIAsymphony DNA Mini Kit and QIAsymphony RNA Kit. Genomic DNA was also isolated from peripheral blood (3–5 mL) to serve as matched normal reference material. The Qubit dsDNA BR Assay Kit was used with the Qubit 2.0 Fluorometer to determine the concentration of dsDNA in an aqueous solution. Any sample that passed quality control and produced enough DNA yield to go through various genomic assays was sent for genomic characterization. RNA was quantified using the NanoDrop 8000, and its quality was assessed using the Agilent Bioanalyzer. A sample that passed RNA quality control and had a minimum RIN (RNA integrity number) score of 7 was subjected to RNA sequencing. Identity match for germline, normal adjacent tissue, and tumor tissue was assayed at the BCR using the Illumina Infinium QC array. This beadchip contains 15,949 markers designed to prioritize sample tracking, quality control, and stratification. Preparation of libraries for cluster amplification and WGS sequencing An aliquot of genomic DNA (350 ng in 50 μL) was used as the input into DNA fragmentation (shearing). Shearing was performed acoustically using a Covaris focused-ultrasonicator, targeting 385 bp fragments. Following fragmentation, additional size selection was performed using an SPRI cleanup. Library preparation was performed using a commercially available kit provided by KAPA Biosystems (KAPA Hyper Prep without amplification module) and with palindromic forked adapters with unique 8-base index sequences embedded within the adapter (purchased from IDT). Following sample preparation, libraries were quantified using quantitative PCR (kit purchased from KAPA Biosystems), with probes specific to the ends of the adapters. This assay was automated using Agilent’s Bravo liquid handling platform. Based on qPCR quantification, libraries were normalized to 1.7 nM and pooled into 24-plexes.
Cluster amplification and sequencing (HiSeq X) Sample pools were combined with HiSeq X Cluster Amp Reagents EPX1, EPX2, and EPX3 into single wells on a strip tube using the Hamilton Starlet Liquid Handling system. Cluster amplification of the templates was performed according to the manufacturer’s protocol (Illumina) with the Illumina cBot. Flow cells were sequenced to a minimum of 15x on HiSeq X utilizing sequencing-by-synthesis kits to produce 151 bp paired-end reads. Output from Illumina software was processed by the Picard data processing pipeline to yield BAMs containing demultiplexed, aggregated, aligned reads. All sample information tracking was performed by automated LIMS messaging. Whole exome sequencing library construction Library construction was performed as described in Fisher et al.,[509]^123 with the following modifications: initial genomic DNA input into shearing was reduced from 3 μg to 20–250 ng in 50 μL of solution. For adapter ligation, Illumina paired-end adapters were replaced with palindromic forked adapters, purchased from Integrated DNA Technologies, with unique dual-indexed molecular barcode sequences to facilitate downstream pooling. Kapa HyperPrep reagents in 96-reaction kit format were used for end repair/A-tailing, adapter ligation, and library enrichment PCR. In addition, during the post-enrichment SPRI cleanup, elution volume was reduced to 30 μL to maximize library concentration, and a vortexing step was added to maximize the amount of template eluted. In-solution hybrid selection After library construction, libraries were pooled into groups of up to 96 samples. Hybridization and capture were performed using the relevant components of Illumina’s Nextera Exome Kit and following the manufacturer’s suggested protocol, with the following exceptions. First, all libraries within a library construction plate were pooled prior to hybridization.
Second, the Midi plate from Illumina’s Nextera Exome Kit was replaced with a skirted PCR plate to facilitate automation. All hybridization and capture steps were automated on the Agilent Bravo liquid handling system. Preparation of libraries for cluster amplification and sequencing After post-capture enrichment, library pools were quantified using qPCR (automated assay on the Agilent Bravo) using a kit purchased from KAPA Biosystems with probes specific to the ends of the adapters. Based on qPCR quantification, libraries were normalized to 2 nM. Cluster amplification and sequencing Cluster amplification of DNA libraries was performed according to the manufacturer’s protocol (Illumina) using exclusion amplification chemistry and flowcells. Flowcells were sequenced utilizing sequencing-by-synthesis chemistry. The flow cells were then analyzed using RTA v.2.7.3 or later. Each pool of whole-exome libraries was sequenced on paired 76 cycle runs with two 8 cycle index reads across the number of lanes needed to meet coverage for all libraries in the pool. Pooled libraries were run on HiSeq 4000 paired-end runs to achieve a minimum of 150x on target coverage per each sample library. The raw Illumina sequence data were demultiplexed and converted to fastq files; adapter and low-quality sequences were trimmed. The raw reads were mapped to the hg38 human reference genome, and the validated BAMs were used for downstream analysis and variant calling. Quality assurance and quality control of RNA analytes All RNA analytes were assayed for RNA integrity, concentration, and fragment size. Samples for total RNA-seq were quantified on a TapeStation system (Agilent, Inc. Santa Clara, CA). Samples with RINs >8.0 were considered high quality. 
Total RNA-seq library construction Total RNA-seq library construction was performed from the RNA samples using the TruSeq Stranded RNA Sample Preparation Kit and bar-coded with individual tags following the manufacturer’s instructions (Illumina, Inc. San Diego, CA). Libraries were prepared on an Agilent Bravo Automated Liquid Handling System. Quality control was performed at every step and the libraries were quantified using the TapeStation system. Total RNA sequencing Indexed libraries were prepared and run on HiSeq 4000 paired-end 75 base pairs to generate a minimum of 120 million reads per sample library with a target of greater than 90% mapped reads. Typically, these were pools of four samples. The raw Illumina sequence data were demultiplexed and converted to FASTQ files, and adapter and low-quality sequences were quantified. Samples were then assessed for quality by mapping reads to the hg38 human genome reference, estimating the total number of reads that mapped, amount of RNA mapping to coding regions, amount of rRNA in sample, number of genes expressed, and relative expression of housekeeping genes. Samples passing this QA/QC were then clustered with other expression data from similar and distinct tumor types to confirm expected expression patterns. Atypical samples were then SNP typed from the RNA data to confirm the source analyte. FASTQ files of all reads were then uploaded to the GDC repository. Single-nuclei RNA library preparation and sequencing About 20–30 mg of cryopulverized powder from ccRCC specimens was resuspended in Lysis buffer (10 mM Tris-HCl (pH 7.4); 10 mM NaCl; 3 mM MgCl2; and 0.1% NP-40). This suspension was pipetted gently 6–8 times, incubated on ice for 30 s, and pipetted again 4-6 times. The lysate containing free nuclei was filtered through a 40 μm cell strainer. We washed the filter with 1 mL Wash and Resuspension buffer (1X PBS +2% BSA +0.2 U/μL RNase inhibitor) and combined the flow through with the original filtrate. 
After 6-min centrifugation at 500 x g and 4°C, the nuclei pellet was resuspended in 500 μL of Wash and Resuspension buffer. After staining by DRAQ5, the nuclei were further purified by Fluorescence-Activated Cell Sorting (FACS). FACS-purified nuclei were centrifuged again and resuspended in a small volume (about 30 μL). After counting and microscopic inspection of nuclei quality, the nuclei preparation was diluted to about 1,000 nuclei/μL. About 20,000 nuclei were used for single-nuclei RNA sequencing (snRNA-seq) by the 10X Chromium platform. We loaded the single nuclei onto a Chromium Chip B Single Cell Kit, 48 rxns (10x Genomics, PN-1000073), and processed them through the Chromium Controller to generate GEMs (Gel Beads in Emulsion). We then prepared the sequencing libraries with the Chromium Single Cell 3′ GEM, Library & Gel Bead Kit v3, 16 rxns (10x Genomics, PN 1000075) following the manufacturer’s protocol. Sequencing was performed on an Illumina NovaSeq 6000 S4 flow cell. The libraries were pooled and sequenced using the XP workflow according to the manufacturer’s protocol with a 28 × 8 × 98 bp sequencing recipe. The resulting sequencing files were available as FASTQs per sample after demultiplexing. Illumina Infinium methylationEPIC beadchip array The MethylationEPIC array uses an 8-sample version of the Illumina Beadchip capturing >850,000 DNA methylation sites per sample. 250 ng of DNA was used for the bisulfite conversion using the Infinium MethylationEPIC BeadChip Kit. The EPIC array workflow includes sample plating, bisulfite conversion, and methylation array processing. After scanning, the data was processed through an automated genotype calling pipeline. Data generated consisted of raw idats and a sample sheet. Sample processing for protein extraction and tryptic digestion All samples for the current study were prospectively collected as described above and processed for mass spectrometric (MS) analysis at Johns Hopkins University.
Tissue lysis and downstream sample preparation for global proteomic, phosphoproteomic and glycoproteomic analysis were carried out as previously described.[510]^24^,[511]^25^,[512]^124 Each cryopulverized renal tumor tissue or NAT was homogenized separately in an appropriate volume of lysis buffer (8 M urea, 75 mM NaCl, 50 mM Tris, pH 8.0, 1 mM EDTA, 2 μg/mL aprotinin, 10 μg/mL leupeptin, 1 mM PMSF, 10 mM NaF, Phosphatase Inhibitor Cocktail 2 and Phosphatase Inhibitor Cocktail 3 [1:100 dilution], and 20 μM PUGNAc) by repeated vortexing. Lysates were clarified by centrifugation at 20,000 x g for 10 min at 4°C, and protein concentrations were determined by BCA assay (Pierce). The proteins were diluted to a final concentration of 8 mg/mL with lysis buffer for the downstream reduction, alkylation and digestion. 1.2 mg of protein was reduced with 5 mM dithiothreitol (DTT) for 1 h at 37°C and subsequently alkylated with 10 mM iodoacetamide for 45 min at RT (room temperature) in the dark. Samples were then diluted 1:4 with 50 mM Tris-HCl (pH 8.0) and subjected to proteolytic digestion with LysC (Wako Chemicals, at a 1:50 enzyme-to-substrate weight ratio for 2 h incubation at RT) followed by the addition of sequencing-grade modified trypsin (Promega, at a 1:50 enzyme-to-substrate weight ratio for overnight incubation at RT). The digested samples were then acidified with 50% formic acid (FA, Fisher Chemicals) to pH < 3. Tryptic peptides were desalted on reversed-phase C18 SPE columns (Waters) and dried using a Speed-Vac (Thermo Scientific). TMT labeling of peptides Tandem-mass-tag (TMT) quantitation utilizes reporter ion intensities to determine protein abundance and facilitate quantitative proteomic analysis.[513]^125 The samples from the discovery cohort were labeled with TMT-10plex as described in the ccRCC discovery paper,[514]^24 while the samples from the non-ccRCC cohort were labeled with TMT-11plex reagents (Thermo Fisher Scientific).
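The quantities given above for reduction, alkylation, and digestion imply a few derived volumes and amounts, which the following Python sketch works out. It is illustrative only: the function name `digestion_plan` is hypothetical, and it assumes that "diluted 1:4" means a four-fold dilution (one part lysate to a final volume of four parts), which the text does not state explicitly.

```python
def digestion_plan(protein_mg=1.2, conc_mg_per_ml=8.0, urea_molar=8.0,
                   dilution_fold=4.0, enzyme_to_substrate=1.0 / 50.0):
    """Derive volumes and enzyme amounts for one sample.

    dilution_fold=4.0 encodes the ASSUMPTION that a 1:4 dilution is a
    four-fold dilution of the lysate with 50 mM Tris-HCl.
    """
    lysate_ul = protein_mg / conc_mg_per_ml * 1000.0  # mL -> uL
    final_ul = lysate_ul * dilution_fold
    return {
        "lysate_volume_ul": lysate_ul,
        "tris_added_ul": final_ul - lysate_ul,
        "urea_after_dilution_molar": urea_molar / dilution_fold,
        # 1:50 (w/w) enzyme-to-substrate, applied to LysC and to trypsin alike
        "enzyme_ug": protein_mg * 1000.0 * enzyme_to_substrate,
    }


plan = digestion_plan()
for name, value in plan.items():
    print(f"{name}: {value:.1f}")
```

Under that assumption, 1.2 mg of protein at 8 mg/mL corresponds to about 150 μL of lysate, the urea concentration falls from 8 M to about 2 M before digestion, and about 24 μg each of LysC and trypsin is added.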
70 non-ccRCC samples were co-randomized to 7 TMT 11-plex sets. The sample-to-TMT channel mapping is available in the PDC portal ([515]https://proteomic.datacommons.cancer.gov/). Desalted peptides (300 μg) from each non-ccRCC and NAT sample were dissolved in 120 μL of 100 mM HEPES, pH 8.5 solution. 5 mg of TMT reagent was dissolved in 500 μL of anhydrous acetonitrile, and 45 μL of each TMT reagent was added to the corresponding aliquot of peptides. After 1 h incubation at RT, the reaction was quenched by incubation with 5% hydroxylamine at RT for 15 min. The reference sample used in the ccRCC discovery cohort study[516]^24 was included in all TMT 11-plexes as a reference channel in the non-ccRCC cohort study, labeled with the TMT-131 reagent. Following labeling, peptides were mixed according to the sample-to-TMT channel mapping, concentrated and desalted on reversed-phase C18 SPE columns (Waters), and dried using a Speed-Vac (Thermo Scientific). Peptide fractionation by basic reversed-phase liquid chromatography To reduce the likelihood of peptides co-isolating and co-fragmenting in these highly complex samples, we employed extensive, high-resolution fractionation via basic reversed-phase liquid chromatography (bRPLC). The desalted and dried peptides from each TMT set were reconstituted in 900 μL of 5 mM ammonium formate (pH 10) and 2% acetonitrile (ACN) and loaded onto a 4.6 mm × 250 mm RP Zorbax 300 A Extend-C18 column with 3.5 μm size beads (Agilent). Peptides were separated at a flow rate of 1 mL/min using an Agilent 1200 Series HPLC instrument with Solvent A (2% ACN, 5 mM ammonium formate, pH 10) and a non-linear gradient of Solvent B (90% ACN, 5 mM ammonium formate, pH 10) as follows: 0% Solvent B (7 min), 0%–16% Solvent B (6 min), 16%–40% Solvent B (60 min), 40%–44% Solvent B (4 min), 44%–60% Solvent B (5 min), and holding at 60% Solvent B for 14 min.
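The bRPLC gradient above can be written down as a list of segments, which makes the total run time and the Solvent B percentage at any time point easy to check. The Python sketch below is an illustration rather than instrument method code; the names `SEGMENTS`, `total_minutes`, and `percent_b_at` are hypothetical, and it assumes Solvent B changes linearly within each segment.

```python
# (start %B, end %B, duration in min), in the order given in the text
SEGMENTS = [
    (0.0, 0.0, 7.0),
    (0.0, 16.0, 6.0),
    (16.0, 40.0, 60.0),
    (40.0, 44.0, 4.0),
    (44.0, 60.0, 5.0),
    (60.0, 60.0, 14.0),
]


def total_minutes(segments=SEGMENTS):
    """Total gradient length in minutes."""
    return sum(duration for _, _, duration in segments)


def percent_b_at(t_min, segments=SEGMENTS):
    """Solvent B percentage at time t_min, ASSUMING linear ramps per segment."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for start, end, duration in segments:
        if t_min <= elapsed + duration:
            return start + (end - start) * (t_min - elapsed) / duration
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient")


print(total_minutes())       # 96.0
print(percent_b_at(43.0))    # 28.0, halfway through the 16%-40% ramp
```

The segments sum to 96 min, so at the stated flow rate of 1 mL/min the gradient delivers 96 mL of mobile phase per TMT set.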
Collected fractions were concatenated into 24 fractions by combining four fractions that are 24 fractions apart as described previously[517]^25; a 5% aliquot of each of the 24 fractions was used for global proteomic analysis, dried in a Speed-Vac, and resuspended in 3% ACN/0.1% formic acid prior to ESI-LC-MS/MS analysis. The remaining sample was utilized for phosphopeptide enrichment. Enrichment of phosphopeptides by Fe-IMAC The remaining 95% of the sample was further concatenated into 12 fractions before being subjected to phosphopeptide enrichment using immobilized metal affinity chromatography (IMAC) as previously described.[518]^25 In brief, Ni-NTA agarose beads (Qiagen) were conditioned and incubated with 10 mM FeCl3 to prepare Fe3+-NTA agarose beads. Dried peptides from each fraction were reconstituted in 80% ACN/0.1% trifluoroacetic acid and incubated with 10 μL of the Fe3+-IMAC beads for 30 min. Samples were then centrifuged at 1,000 x g for 1 min to collect the beads coupled with phosphopeptides, and the supernatant containing unbound peptides was removed for the subsequent glycopeptide enrichment (Cao et al., Cell, 2021). The beads were resuspended with 80% ACN/0.1% trifluoroacetic acid and then transferred onto equilibrated C-18 Stage Tips. Tips were washed twice with 80% ACN/0.1% trifluoroacetic acid followed by 1% formic acid. The flowthroughs were collected and combined with the supernatants for subsequent glycopeptide enrichment. Phosphopeptides were eluted from the Fe3+-IMAC beads onto the C-18 Stage Tips with 70 μL of 500 mM dibasic potassium phosphate, pH 7.0 three times. C-18 Stage Tips were then washed twice with 1% formic acid to remove salts, followed by elution of the phosphopeptides from the C-18 Stage Tips with 50% ACN/0.1% formic acid twice. Eluted phosphopeptides were dried down and resuspended in 3% ACN/0.1% formic acid prior to ESI-LC-MS/MS analysis.
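The concatenation of bRPLC fractions into 24 pools, described at the start of this passage, follows a simple modular pattern. The Python sketch below illustrates it under the assumption that 96 fractions were collected (four fractions per pool times 24 pools), a count the text implies but does not state; the function name `concatenate_fractions` is hypothetical. How the 24 pools were further combined into 12 for phosphopeptide enrichment is not specified, so that step is not modeled.

```python
def concatenate_fractions(n_collected=96, n_pools=24):
    """Map each pool (1-based) to the collected fractions it combines.

    Fractions that are n_pools apart go into the same pool, so early-,
    mid-, and late-eluting peptides are mixed within every pool.
    n_collected=96 is an ASSUMPTION inferred from 4 fractions x 24 pools.
    """
    if n_collected % n_pools != 0:
        raise ValueError("n_collected must be a multiple of n_pools")
    pools = {pool: [] for pool in range(1, n_pools + 1)}
    for fraction in range(1, n_collected + 1):
        pools[(fraction - 1) % n_pools + 1].append(fraction)
    return pools


pools = concatenate_fractions()
print(pools[1])   # [1, 25, 49, 73]
print(pools[24])  # [24, 48, 72, 96]
```

Pooling non-adjacent fractions in this way keeps the peptides within each pool spread across the second-dimension (low-pH) gradient, which is the usual rationale for concatenation.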
Enrichment of intact glycopeptides by MAX columns All unbound peptides from phosphopeptide enrichment were desalted on a reversed-phase C18 SPE column (Waters). The glycopeptides were enriched with OASIS MAX solid-phase extraction (Waters). The MAX cartridge was conditioned with 3 × 1 mL ACN, then 3 × 1 mL of 100 mM triethylammonium acetate buffer, followed by 3 × 1 mL of water, and finally 3 × 1 mL of 95% ACN (1% TFA). The peptides were loaded twice. The cartridge was washed with 4 × 1 mL of 95% ACN (1% TFA) to remove non-glycosylated peptides. The glycopeptide fraction was eluted with 50% ACN (0.1% TFA), dried down, and reconstituted in 3% ACN, 0.1% FA prior to ESI-LC-MS/MS analysis. ESI-LC-MS/MS for global proteome, phosphoproteome, and glycoproteome analysis The TMT-labeled global proteome, phosphoproteome, and glycoproteome fractions were analyzed using an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific). Approximately 0.8 μg of peptides were separated on an in-house packed 28 cm × 75 μm diameter C18 column (1.9 μm Reprosil-Pur C18-AQ beads (Dr. Maisch GmbH); Picofrit 10 μm opening (New Objective)) coupled to an Easy nLC 1200 UHPLC system (Thermo Scientific). The column was heated to 50°C using a column heater (Phoenix-ST). The flow rate was set at 200 nL/min. Buffer A and B were 3% ACN (0.1% FA) and 90% ACN (0.1% FA), respectively. The peptides were separated with a 6%–30% B gradient in 84 min. Peptides were eluted from the column and nanosprayed directly into the mass spectrometer. The mass spectrometer was operated in a data-dependent mode. Parameters for global proteomic and phosphoproteomic samples were set as follows: MS1 resolution – 60,000, mass range – 350 to 1800 m/z, RF Lens – 30%, AGC Target – 4.0e5, Max injection time – 50 ms, charge state include – 2–6, dynamic exclusion – 45 s. The cycle time was set to 2 s, and within this 2 s the most abundant ions per scan were selected for MS/MS in the orbitrap.
MS2 resolution – 50,000, high-energy collision dissociation activation energy (HCD) – 34, isolation width (m/z) – 0.7, AGC Target – 2.0e5, Max injection time – 100 ms. Parameters for glycoproteomic samples were set as follows: MS1 resolution – 60,000, mass range – 500 to 2000 m/z, RF Lens – 30%, AGC Target – 5.0e5, Max injection time – 50 ms, charge state include – 2–6, dynamic exclusion – 45 s. The cycle time was set to 2 s, and within this 2 s the most abundant ions per scan were selected for MS/MS in the orbitrap. MS2 resolution – 50,000, high-energy collision dissociation activation energy (HCD) – 35, isolation width (m/z) – 0.7, AGC Target – 1.0e5, Max injection time – 100 ms. Metabolomic acquisition A solution consisting of 80% (v/v) mass spectrometry-grade methanol and 20% (v/v) mass spectrometry-grade water was used to extract the metabolites from the tissue samples as described previously.[519]^126^,[520]^127^,[521]^128 The metabolite samples then underwent speed vacuum processing to evaporate the methanol and lyophilization to remove the water. The dried metabolites were resuspended in a solution consisting of 50% (v/v) acetonitrile and 50% (v/v) mass spectrometry-grade water before data acquisition. Data acquisition was performed using a Vanquish ultra-performance liquid chromatography (UPLC) system and a Thermo Scientific Q Exactive Plus Orbitrap Mass Spectrometer. The samples were kept at 4°C inside the Vanquish UPLC auto-sampler. The injection volume for each sample was 2 μL. A Discovery HSF5 reverse phase HPLC column (Sigma) kept at 35°C with a guard column was used for reverse-phase chromatography. The mobile aqueous phase was mass spectrometry-grade water containing 0.1% formic acid, while the mobile organic phase was acetonitrile containing 0.1% formic acid. Mass calibration was performed prior to data acquisition to ensure the sensitivity and accuracy of the system.
The total run time for each sample was 15 min, of which 11 min was used for data acquisition. Full MS data were acquired to quantify the metabolites, while Full MS/ddMS2 data were also acquired to identify the metabolites based on fragmentation matching. Quantification and statistical analysis Somatic mutation calling WES reads were aligned FASTQ files to the GRCh38 references, including