Graphical abstract

   graphic file with name fx1.jpg
   [127]Open in a new tab

Highlights

     * •
       Subtype-specific features across a pan-RCC cohort revealed by
       multi-omics data resources
     * •
       Diverse cell-of-origin predictions and tumor signatures revealed by
       snRNA-seq
     * •
       Proteogenomic, metabolic, glycoproteomic signatures of high-wGII
       tumors
     * •
       Biomarkers GPNMB, ADGRF5, MAPRE3 for chRCC/RO and PIGR, SOSTDC1 for
       pRCC
     __________________________________________________________________

   Li et al. perform comprehensive multi-omics characterization of a broad
   range of renal cell carcinomas, improving our understanding of the
   biology, proteogenomics, post-translational modifications, and
   metabolism of kidney cancer. These findings identify important
   biomarkers IGF2BP3, PYCR1, GPNMB, ADGRF5, MAPRE3, PIGR, and SOSTDC1
   that illuminate molecular differences within renal cell carcinoma.

Introduction

   World Health Organization (WHO) 2022 lists 20 different renal cell
   carcinoma (RCC) subtypes, of which 7 are defined by specific molecular
   aberrations.[128]^1 Non-clear cell RCC (ccRCC) accounts for ∼20% of
   RCCs and encompasses a variety of rare subtypes largely defined by
   histopathologic features,[129]^1^,[130]^2^,[131]^3 collectively
   referred to here as non-ccRCCs. Among non-ccRCC tumors, papillary RCC
   (pRCC) (10%–15%) and chromophobe RCC (chRCC) (3%–5%) are relatively
   common, while the other subtypes are much rarer. Several rare renal
   tumors with benign clinical courses show morphological overlap with
   malignant counterparts.[132]^4^,[133]^5 We have previously discovered
   several biomarkers to aid in differential diagnosis of many RCC
   subtypes[134]^6^,[135]^7^,[136]^8^,[137]^9; however, diagnosis in
   limited biopsy samples settings remains challenging.[138]^10^,[139]^11
   In addition, biomarkers to identify patients at high risk of disease
   relapse within each RCC subtype who will benefit from increased
   surveillance and adjuvant therapy is another unmet clinical
   need.[140]^12 Our recent ccRCC study[141]^13 nominated biomarkers
   associated with features of worse prognosis, such as genome instability
   (GI). Similar markers for non-ccRCC tumors remain to be identified.
   Similarly, while immune checkpoint and angiogenesis inhibitors are
   treatment options in metastatic ccRCC,[142]^14^,[143]^15^,[144]^16
   immune infiltration and tumor vascularity vary widely among non-ccRCC
   tumors, necessitating further evaluation of markers of responsiveness.

   To address the knowledge gaps in non-ccRCC differential diagnoses,
   prognoses, and therapeutic avenues, as part of the Clinical Proteomic
   Tumor Analysis Consortium (CPTAC), we performed integrated
   proteogenomic multi-omic analysis of non-ccRCC and ccRCC tumors.
   Besides the few studies detailing genomics,[145]^17^,[146]^18^,[147]^19
   transcriptomics,[148]^17^,[149]^19 and
   proteomics[150]^20^,[151]^21^,[152]^22^,[153]^23 of rare RCCs,
   multi-omic profiling is largely unavailable. Here, we report multi-omic
   analysis of 48 non-ccRCC cases (non-ccRCC cohort) along with the
   reported ccRCC (n = 103) discovery cohort samples.[154]^24 This
   integrative pan-RCC analysis identified shared and subtype-specific
   proteogenomic, glycoproteomic, metabolic features across RCC subtypes,
   nominated various diagnostic biomarkers, and provided validation for
   selected candidates. Single-nucleus RNA sequencing (snRNA-seq) analysis
   captured transcriptomic heterogeneity of tumor subclusters and helped
   predict cell-of-origin. Combined, the data from this study provide a
   rich resource for identifying diagnostic biomarkers, disease
   mechanisms, and potentially new therapeutic targets for non-ccRCC
   subtypes.

Results

Specimens and multi-omics data types

   We performed multi-omics data analysis of 48 non-ccRCC and 103 ccRCC
   tumors and 101 normal adjacent tissues (NATs) (22 and 79 from non-ccRCC
   and ccRCC patients, respectively) ([155]Figure S1A; [156]Table S1).
   Multi-omics data available from common sample aliquots[157]^25 include
   whole-genome sequencing, whole-exome sequencing, DNA methylation
   profiling, and RNA-seq for all 151 tumor samples, and RNA-seq data for
   89 NATs (ccRCC n = 71; non-ccRCC n = 18) ([158]Figure S1A). snRNA-seq
   data were generated for eight non-ccRCC tumors.

   Histopathological subtyping information and signature molecular
   aberrations such as copy number variation patterns, somatic/germline
   mutations, marker gene expression, and gene fusions were collectively
   assessed to arrive at tumor-molecular annotation ([159]Figure 1A;
   [160]Table S1).[161]^26 Based on the WHO 2018 renal tumor histological
   classification (available at data freeze), the analysis cohort
   comprises 103 ccRCC, 15 renal oncocytomas (ROs), 13 pRCC (8 of them
   with type 1 features; WHO 2018), 3 chRCC, 2 angiomyolipoma (AML), 2
   eosinophilic solid and cystic RCC (ESCRCC), 1 Birt-Hogg-Dube
   syndrome-associated renal cell carcinoma, 1 mixed epithelial and
   stromal tumor of the kidney, 1 MTOR mutated RCC, 1 translocation RCC
   (TRCC), and 8 tumors where genomics aberrations patterns did not concur
   with histological classification were annotated as molecularly
   divergent to histology (MDTH) ([162]Figure 1A; [163]Table S1).

Figure 1.

   [164]Figure 1
   [165]Open in a new tab

   Proteogenomic biomarkers of copy number-based genome instability in
   renal cell carcinoma

   (A) Proteogenomic aberration landscape of ccRCC and non-ccRCC. Top
   panel: histo-molecular annotations condensed as tracks (∗excluded
   sample). RNA and protein automatic relevance determination in
   non-negative matrix factorization (ARD-NMF) classification. Middle
   panel: non-ccRCC display distinct recurrent events. Bottom panel:
   heatmaps show the top 10 differentially expressed genes and proteins
   enriched in annotated biological processes. Top 20 protein and RNA
   features (log2 fold change) from selected pathways.

   (B) Differentially enriched pathways (RNA and protein) among the
   various RCC subtypes.

   (C) Predicted immune composition for ccRCC and non-ccRCC.

   (D) Heatmap of absolute copy number variation (CNV) deduced from CNVEX
   output for non-ccRCC (top) and ccRCC (bottom) sorted by ploidy. Ploidy,
   RCC subtype, wGII annotations tracks provided (left).

   (E) Distribution of BAP1 mutation, wGII, immune subtype, tumor classes,
   and NMF clustering in five methylation subgroups. Significant
   enrichment (p < 0.01) of BAP1 mutation, high wGII, myeloid-lymphoid
   high immune subtype, and NMF cluster1 hyper-methylated group.

   (F) Subtype composition among low- and high-wGII tumors, in TCGA (left)
   and CPTAC (right) non-ccRCC (upper), and ccRCC (lower) cohorts. Bold
   black borders, high-wGII samples.

   (G) Comparison of significance levels (signed –log10 p value) between
   protein (x axis) and mRNA (y axis) under high to low-wGII comparison
   within a subset of non-ccRCC samples. Significantly upregulated genes
   are labeled and colored. The inset shows the global correlation between
   the changes.

   (H) Overlap between TCGA and CPTAC high-wGII mRNA expression gene
   markers in non-ccRCC (left) and ccRCC (right).

   The sample cohorts have comparable demographic and clinical composition
   except for a higher proportion of female patients in non-ccRCC compared
   with ccRCC cohorts (p = 0.036) ([166]Figure S1B). The multi-omics
   dataset can be queried using the interactive ProTrack website
   [167]http://ccrcc-conf.cptac-data-view.org ([168]Figure S1B).[169]^27
   We identified a total of 12,299 proteins, 9,396 phosphorylated
   proteins, and 1,035 glycoproteins, of which 9,528 proteins, 6,465
   phosphorylated proteins, and 639 glycoproteins were quantified in more
   than half of all samples. Principal-component analysis (PCA) of global
   proteome, phosphoproteome, and glycoproteome data showed clear
   separation between different tumor subtypes and normal samples in
   two-dimensional space ([170]Figure S1C). Known mutation consequences on
   kinase protein expression and phosphorylation are consistent with
   previous findings[171]^18^,[172]^28 ([173]Figures S1D, S3B, and S3C).

Subtype-specific proteogenomic signatures

   Different subtypes of non-ccRCC tumors displayed recurrent genomic
   aberrations distinct from ccRCC ([174]Figure 1A). Notable non-ccRCC
   subtype-specific events include: the signature chromosomal losses and
   TP53 mutations in chRCC (3/3 cases), chr7/17 gain (8/13 cases), and MET
   mutations (3/13 cases) in pRCC, TSC gene mutations in ESCRCC (2/2
   cases) and AML tumors (2/2 cases), and the TFE3 gene fusion in a TRCC
   case. Consistent with previous classification of RO molecular
   subtypes,[175]^18 RO type 1 was enriched with CCND1 gene rearrangement
   with a diploid genome, and type 2 was marked by one copy loss of
   chromosome 1 (chr1), and RO cases that did not show either of these
   molecular features were categorized as the “RO variant”
   subgroup.[176]^18 Gene set enrichment analysis (GSEA) revealed several
   interesting pathway similarities and differences among RCC subtypes
   ([177]Figure 1B; [178]Tables S2 and[179]S3). For instance,
   immune/inflammatory response concepts, including allograft rejection,
   inflammatory response, interferon alpha/gamma pathways, were
   significantly upregulated, especially at the protein levels, in both
   pRCC and ccRCC. In contrast, glycolysis, hypoxia, and
   epithelial-to-mesenchymal transition (EMT) were significantly enriched
   in the ccRCC proteome but showed a negative enrichment trend in pRCC
   and RO. Interestingly, oxidative phosphorylation showed significant
   positive enrichment in RO but was down in pRCC and ccRCC as expected
   ([180]Figures 1A and 1B).[181]^18^,[182]^22^,[183]^24^,[184]^29

   Next, the status of tumor-immune infiltration was assessed through
   immune deconvolution followed by clustering analysis ([185]STAR
   Methods; [186]Figure 1A). In addition to four previously described
   ccRCC clusters,[187]^24 three non-ccRCC clusters were identified: one
   myeloid-lymphoid high non-ccRCC cluster, a myeloid-high cluster
   containing most pRCC samples, and an immune-absent cluster comprising
   all oncocytic tumors ([188]Figure 1A). Overall, the extent of immune
   infiltration was lower in non-ccRCC than in ccRCC ([189]Figure 1C).
   High weighted genome instability (wGII) containing cases were observed
   among non-ccRCC (∼37%), ccRCC (∼23%) in the CPTAC, and non-ccRCC (∼20%)
   and ccRCC (21%) in the TCGA cohorts ([190]Figures 1D–1F).
   Interestingly, myeloid-lymphoid high non-ccRCC tumors showed high
   immune infiltration and high wGII ([191]Figures 1A and [192]S1E).

Proteogenomics of high-wGII samples

   Integrative analysis of RNA, protein, and phosphorylation site level
   expression data performed using automatic relevance determination
   non-negative matrix factorization (ARD-NMF)[193]^30 defined six
   multi-omics clusters. Among these, most pRCC tumors clustered in
   ARD-NMF-0, oncocytic tumors (RO, chRCC) in ARD-NMF-3, ccRCC samples
   were distributed in ARD-NMF-1 and -5, while the NATs populated
   ARD-NMF-2 and ARD-NMF-4 ([194]Figure 1A; [195]Table S2). The smaller
   ARD-NMF-1 is associated with DNA hypermethylation Methyl1 group,
   higher-grade ccRCC,[196]^13 and worse prognosis, while the larger
   ARD-NMF-5 ccRCC cluster is enriched (p < 0.05, chi-square test) in
   low-grade ccRCC tumors. Next, as DNA hypermethylation subgroups have
   been associated with worse survival,[197]^17^,[198]^19 we performed
   consensus clustering with DNA methylation data and identified five
   different methylation clusters. Methyl3 and Methyl5 were largely
   subtype specific and contained ccRCC and all oncocytic tumors,
   respectively. Interestingly, Methyl1 was enriched with ccRCC samples
   with high wGII, BAP1 mutants, and a subset of non-ccRCC samples with
   high wGII and high ploidy mostly from the MDTH category
   ([199]Figures 1E and [200]S1E).

   We next compared the mRNA and protein differential expression (DE)
   between high-wGII (n = 9) versus low-wGII (n = 15) non-ccRCC samples
   including pRCCs, TRCC, ESCRCC, MTOR mutated, and MDTH ([201]Figure 1G;
   [202]Table S4). A collection of prominent wGII markers was concordantly
   identified including the mitochondrial proline biosynthetic pathway
   enzyme PYCR1, associated with cancer cell survival, invasion, and
   progression across multiple cancer types.[203]^31^,[204]^32^,[205]^33
   PYCR1 was confirmed to be upregulated in high-wGII samples based on RNA
   in situ hybridization (RNA-ISH) ([206]Figure S1F). In addition, the RNA
   binding protein and N6-methyladenosine reader IGF2BP3 showed a
   significantly higher mRNA expression in non-ccRCC high-wGII samples
   ([207]Figure 1G), and upregulation trend noted in protein expression
   was validated by immunohistochemistry (IHC) ([208]Figure S1G).
   Importantly, high IGF2BP3 RNA expression was noted in high-wGII samples
   across CPTAC and TCGA datasets ([209]Figure 1H). IGF2BP3 has been
   associated with worse survival in several cancer types,[210]^34 but has
   not been previously associated with high wGII. In general, minimal
   overlap of high wGII associated differentially expressed genes and
   proteins was noted between ccRCC and non-ccRCC, but Hallmark pathways
   associated with high-wGII cases included cell-cycle/proliferation
   concepts (e.g., enrichment of E2F targets, G2-M checkpoint), as well as
   immune- and inflammation-related concepts, EMT, hypoxia, and glycolysis
   ([211]Figure S1H). Notably, the MDTH non-ccRCC tumors that are largely
   genome unstable (6/7) tend to be both hypermethylated (4/7) and immune
   infiltrated (6/7).

Non-ccRCC snRNA-seq reveals intra-tumor transcriptomic heterogeneity and low
immune infiltration

   To investigate cellular level associations in non-ccRCCs, snRNA-seq
   data for 8 samples (9,673 single-nuclei transcriptomes [median 10,592
   nuclei/sample]), were analyzed along with 3 ccRCC samples from our
   companion study[212]^13 ([213]Table S5). Dimensionality reduction
   analysis post downsampling (2,000 nuclei/sample) showed distinct
   immune, endothelial, and stromal cell clusters irrespective of the
   patient of origin ([214]Figures 2A and [215]S2A), while the tumor
   epithelia formed patient-specific clusters ([216]Figure 2B). Most
   non-ccRCC samples had higher tumor cell fractions, implying higher
   tumor content and lower immune infiltration[217]^17 compared with ccRCC
   ([218]Figures S2A–S2C) except for the two high immune fraction AML
   cases. ROs and chRCC were closer in space, while pRCC, ESCRCC, and TRCC
   were more distinctly positioned.

Figure 2.

   [219]Figure 2
   [220]Open in a new tab

   Tumor transcriptomic heterogeneity, immune infiltration status, and
   tumor cell-of-origin by snRNA-seq

   (A) UMAP of snRNA-seq data from eight non-ccRCC tumors. Nuclei are
   colored by RCC subtypes for tumor cells (left) and cell types (right).

   (B) First three principal components of six tumors (AML excluded)
   colored by tumor types.

   (C) Probabilities of cell-of-origin are predicted by a random forest
   classifier for different tumor subclusters for RCC subtypes. Classifier
   was trained on Lake et al.[221]^35 benign renal epithelial cell
   snRNA-seq data.

   (D) Averaged abundance of DE protein (top) and mRNA (bottom) markers
   from each RCC subtype versus NATs among the epithelial cell types
   identified from normal kidney scRNA-seq data.[222]^36

   Multiple tumor subclusters in AML samples revealing intra-tumor
   transcriptomic heterogeneity were associated with different constituent
   tumor cell types, presumably a result of trans-differentiation from a
   common cell of origin[223]^37 ([224]Figure S2D). AML tumor compartment
   comprises an admixture of cells that are histologically and molecularly
   similar to vascular (angio-), smooth muscle (myo-), and fat (lipo-)
   lineages.[225]^38 Finally, among the ROs, RO type 1 showed multiple
   tumor subclusters, including one entirely associated with the S-phase
   of the cell cycle, indicating higher tumor proliferation rates
   ([226]Figure S2E). Among other tumors analyzed, all tumor clusters from
   a given sample showed corresponding mRNA expression changes associated
   with clonal copy number events such as chr7 and 17 gains (pRCC) and
   chr1 loss (type 2 RO) ([227]Figure S2F).

   Tumor single-cell transcriptome data have been employed to predict
   cell-of-origin of tumors, using a random forest model trained on benign
   nephronal epithelial cell types.[228]^36 We performed similar analysis
   using snRNA-seq datasets from various RCC subtypes and the publicly
   available benign human kidney samples[229]^35 ([230]Figure 2C;
   [231]STAR Methods). Interestingly, TRCC, ESCRCC, and pRCC showed
   highest origin-probability to the proximal tubule 2 (PT2) population, a
   rare cell type that is equivalent to the PT-B population (designated
   from single-cell RNA-seq data) that we previously demonstrated to
   contain stem-like marker gene expression[232]^36 ([233]Figure 2D). In
   contrast, the ROs and chRCC consistently showed highest probability to
   the intercalated-A (IC-A) population, suggesting a distal nephron
   origin ([234]Figure 2D). Among the AML tumor compartments, we noted
   similarities between tumor subclusters to mesenchymal vSMC cells and
   endothelial cells as expected ([235]Figure 2C). Similar results were
   obtained when bulk tumor RNA-seq data were analyzed with single-cell
   data from benign kidney ([236]Figure S2G). Finally, to bridge the
   single-cell and snRNA-seq-based predictions, we demonstrated that the
   PT2 and cluster 29 populations of PT cells published by Lake
   et al.[237]^35 were equivalent to the previously identified PT_B and
   PT_C rare stem-like populations, and we nominated PT2/PT_B cells as the
   cell-of-origin for several RCC subtypes ([238]Figure S2H). Our analysis
   supports IC-A as putative cell-of-origin for oncocytomas. Furthermore,
   we identified the top 100 proteogenomic DE markers in each RCC subtype
   ([239]Figure S2I; [240]Table S3) and noted their distinct enrichment
   among benign nephron cell types ([241]Figure 2D), for example MAPRE3 in
   RO and PIGR in pRCC, which are described in later sections.

Phosphoproteomic signatures of RCC subtypes and GI tumors

   Phosphoproteomics can reveal potentially targetable kinase signaling
   pathways in tumors. First, as expected we observed enrichment of
   vascular endothelial growth factor receptor FLT1 in ccRCC,[242]^24
   receptor tyrosine kinases MET and KIT (CD117) in pRCC and chRCC/RO,
   respectively, and serine threonine kinase MYLK in AML ([243]Figures 3A
   and 3B; [244]Table S6). In addition, we discovered higher expression of
   CDK18, NEK6, and PNCK in ccRCC, and BAZ1B and TNIK in pRCC type 1.
   While LATS1, PRKCD, PRKAG2, and STK39 were common between RO and chRCC,
   DAPK2, MAPK13, MAP3K1, SYK, DDR1, EIF2AK4, PAK4, and PTK2B were
   specific to chRCC. Therapeutic inhibition of many of these kinases are
   currently being evaluated in clinical and preclinical settings
   ([245]Figure 3A).

Figure 3.

   [246]Figure 3
   [247]Open in a new tab

   Phosphoproteomic changes in non-ccRCC and genome-unstable tumors

   (A) DE kinases across major subtypes. Colors represent protein
   abundance fold change between tumor subtype and NATs. Highlighted
   kinases are significantly differentially expressed in certain tumor
   subtypes (adjusted p < 0.01, abs(log2 fc) > 1). CD8+, CD8 positive;
   CD8–, CD8 negative; MID, metabolic immune-desert; VEGF, VEGF
   immune-desert. Drug discovery stages (for kinases) from the drug
   repurposing hub[248]^39 indicated.

   (B) Subtype-specific upregulated kinases. Top to bottom: FLT1 in ccRCC,
   MET in pRCC type 1, KIT in oncocytic tumors, and MYLK in AML.

   (C) Pathways enriched among the differentially regulated
   phosphorylation sites across subtypes. Black borders, pathways with
   FDR < 20%.

   (D) Kinases that are enriched with down- or upregulated phosphorylation
   in high compared with low-wGII non-ccRCC. Kinases with enrichment p ≤
   0.05 are labeled.

   (E) Significantly co-regulated kinase-substrate pairs in high-wGII
   tumors (FDR < 0.05, abs(log2fc of kinase) > 0.05, abs(log2fc of
   substrates >0.5)). Diamonds and circles represent kinases and substrate
   proteins, respectively, and arrows point from former to latter.
   Diamonds filled with color represent protein abundance log2 fold change
   between high- and low-wGII non-ccRCC. Border color around circles
   represents average phosphorylation intensity log2 fold changes between
   high- and low-wGII non-ccRCC. Size of nodes and thickness of colored
   arrows are proportional to the number of significant phosphorylation
   events between kinases and substrate proteins.

   (F) Protein 3D structure of CDK2. Highlighted residues are
   significantly upregulated phosphorylation clusters identified by
   CLUMPS-PTM.

   Next, to assess phosphorylation changes in subtype-enriched kinases, we
   compared phospho data and highlighted selected phosphosite changes
   across the RCC subtypes ([249]Figure S3A; [250]Table S6). For example,
   phosphorylation of S645 and T507 in the protein kinase C, delta type
   (PRKCD) is significantly elevated in RO and chRCC compared with pRCCs
   and ccRCCs ([251]Figure S3A). PRKCD phosphorylation, necessary to prime
   protein kinase maturation,[252]^40^,[253]^41 is associated with PLC-PKC
   signaling in leptin stimulation,[254]^42 which is significantly
   enriched in RO tumors ([255]Figures 3C and [256]S3A). Leptin regulates
   the PI3K-AKT pathway, signaling through the JAK-STAT axis[257]^43
   ([258]Figures 3C and [259]S3A). On the other hand, the IL-2 pathway was
   uniquely upregulated in RO type 2, with high phosphorylation intensity
   noted in BAD and STAT1 ([260]Figure S3A). Other immune-related pathways
   including IL-33, TSLP, and T and B cell receptors were generally highly
   phosphorylated in different immune subtypes of ccRCC and in pRCC,
   consistent with the snRNA-seq-based observation of higher immune
   content in ccRCC and pRCC ([261]Figure S2B).

   We also explored phosphorylation changes associated with GI, comparing
   kinase-substrate co-regulation in high- versus low-wGII non-ccRCC
   samples ([262]STAR Methods; FDR < 0.05, abs(kinase log2 fc) > 0.05,
   abs(substrate site log2 fc) > 0.5)). Remarkably, cyclin-dependent
   kinases (CDK1, CDK2) were the most enriched in wGII-high samples
   ([263]Figures 3D, 3E, and [264]S3D). CDK1 and 2 are critical regulators
   of multiple steps in cell-cycle and DNA synthesis,[265]^44 thus closely
   linked to genomic stability.[266]^45^,[267]^46 Significantly
   upregulated CDK1 substrates include E2F targets such as RRM2, MCM4,
   DUT, RFC1, PAICS, NASP, and HMGA1, which regulate DNA replication and
   chromosome stability[268]^47 ([269]Figure 3E). Phosphorylation of RB1
   T356 by CDK2 (and CDK4/6)[270]^48^,[271]^49 promotes E2F
   activity[272]^50 as well as apoptosis in response to replication stress
   and DNA damage.[273]^51 Interestingly, CDK2 can also be phosphorylated
   at Y15 by LYN kinase.[274]^52 CLUMPS-PTM analysis used to identify
   phosphorylation clusters in protein 3D structure[275]^53 revealed three
   phosphorylation sites in CDK2 (T14, Y15, and T160) forming a
   phosphorylation hotspot ([276]Figure 3F). Phosphorylation of Y15 and
   T160 have opposing effects on CDK2 function, Y15 is inhibitory and T160
   activating, both events noted together previously in ovarian high-grade
   serous cancer.[277]^54 Furthermore, increased phosphorylation of CDK2
   Y15 is associated with cell-cycle exit in response to replication
   stress,[278]^55^,[279]^56 altogether supporting mechanistic links with
   genomic instability.[280]^46^,[281]^56

RCC glycoproteome reflects tumor immune infiltration and angiogenesis

   Protein glycosylation is linked with cancer development and
   progression,[282]^57^,[283]^58 as well as tumor microenvironment
   (TME).[284]^59 To explore RCC glycobiology and its implications on
   TMEs, we analyzed two different glycoproteomics[285]^60 datasets
   generated independently for this cohort. First, 41 non-ccRCC and 19 NAT
   samples were enriched for N-glycopeptides,[286]^61 analyzed by
   MSFragger-Glyco search pipeline[287]^62^,[288]^63 ([289]STAR Methods).
   Second, phosphorylation enrichment via immobilized metal affinity
   chromatography, co-enriched with a substantial number of glycopeptides,
   particularly sialoglycopeptides,[290]^64 were analyzed similarly. Our
   N-linked glycoproteomics pipeline identified 12,503 intact
   glycopeptides (IGPs) with glycans (glycoforms) from 1,035 glycoproteins
   in glyco-enriched samples and 29,850 glycoforms from 1,591
   glycoproteins in the phospho-enriched samples, respectively, with an
   overlap of 521 glycoproteins ([291]Figure 4A; [292]Table S7).

Figure 4.

   [293]Figure 4
   [294]Open in a new tab

   RCC glycoproteome reflects tumor immune infiltration and angiogenesis

   (A) Glycoprotein overlap between glyco searches on glyco-enriched
   samples (glyco enrichment) and phospho-enriched samples (phospho
   enrichment).

   (B) Distribution of various glycoforms found in the glyco-enriched
   samples.

   (C) Distribution of differentially expressed glycoforms.

   (D) DE glycoproteins (left) and proteins (right) in glyco-enriched
   samples and their cell type annotation, delineated by
   cell-type-specific expression from previous scRNA-seq data.[295]^36

   (E) Cell-type enrichment analysis for glycoproteins markers in
   oncocytoma (left) and pRCC (right) in glyco-enriched samples.

   (F) DE cell-type-specific glycoprotein markers in glyco-enriched
   samples. Asterisks indicate significant adjusted q value <0.05) marker
   expressions.

   (G) Selected glycoprotein marker expression was validated using data
   from the Human Protein Atlas. Scale bars, 50 μm.

   (H) FUT8 protein expression across different RCC subtypes and NATs.

   (I) FUT8 RNA expression among different cell types identified in type 1
   pRCC (C3N-00439) snRNA-seq data.

   (J) Expression of putative FUT8 glycoprotein targets in pRCC by GSEA.

   (K) DE glycoproteins (unnormalized data) between high- versus low-wGII
   non-ccRCC.

   Based on glycan monosaccharide composition, IGPs were classified into
   five categories: oligomannose, sialylated, fucosylated,
   fuco-sialylated, and neutral moieties.[296]^65 In glyco-enriched
   samples, the IGPs were mainly attached to oligomannose glycans,
   followed by sialylated glycans ([297]Figure 4B), while in
   phospho-enriched samples IGPs were largely sialylated ([298]Figure S4A)
   as expected.[299]^64 Due to sample number constraints, we focused on
   RO, pRCC, and ccRCC samples. In the glyco-enriched dataset,
   glycopeptides attached with oligomannose glycans accounted for a large
   number of the tumor versus normal DE events in both RO and pRCC
   ([300]Figure 4C). Differential glycosylation abundance positively
   correlated with corresponding protein abundance changes, but discordant
   events were also noted ([301]Figures S4B and S4C). Integration with
   kidney scRNA-seq data[302]^36 ([303]STAR Methods) revealed a
   significant fraction of the dysregulated glycoproteins contributed by
   the TME. For example, we observed more upregulated immune compartment
   changes in pRCC (∼30%) compared with RO (∼5%) ([304]Figures 4D and
   [305]S4D). Interestingly, only RO samples showed higher fractions of
   upregulated markers of intercalated cells, a cell type we propose as
   cell-of-origin for RO. These observations are also consistent with GSEA
   ([306]Figure 4E). Similar trends of immune infiltration were seen in
   the phospho-enriched dataset as well for the RO and pRCC samples
   ([307]Figure S4E). In addition, significant differences between ccRCC
   immune subtypes were also observed ([308]Figure S4E). As differential
   glycosylation of key targets has been associated with altered immune
   and endothelial cell functions,[309]^59 we looked at selected
   glycoprotein markers in TME cell types ([310]Figures 4F and [311]S4F).
   Specifically, RO showed upregulation of IGPs of known marker
   PLCG2,[312]^66 as well as ADGRF5 from epithelial/tumor, VWF, POSTN, and
   STAT5 from endothelial, and CTSD from the immune compartments. On the
   other hand, pRCC showed upregulation of TFPI2, FSTL1, FAS, and PIGR in
   the epithelial/tumor, C1QTNF3 and GRN in the endothelial, and ITGAX,
   HLA-DQA1, IL4I1, and CTSC in the immune compartments. We also observed
   differential glycosylations not specific to any cell type, for example
   involving the cancer stem cell marker CD44[313]^67 in ccRCC and pRCC
   ([314]Figure 4F). Protein expression of selected markers in different
   cell types was corroborated by Human Protein Atlas IHC data[315]^68
   ([316]Figure 4G).

   Next, evaluating the expression of glycosylation enzymes associated
   with glycosylation alterations, we noted high levels of
   glycotransferases (e.g., MGAT1, FUT11) and low levels of
   glycohydrolases (e.g., GLB1, FUCA1, FUCA2, HEXA, HEXB) in ccRCC versus
   NATs and other RCC subtypes, at both RNA and protein levels[317]^69
   ([318]Figure S4G). Meanwhile, RO showed upregulated expression of
   MAN2A1 and ST3GAL1, while pRCC showed higher expression of
   glycotransferase FUT8[319]^70^,[320]^71 ([321]Figures 4H, 4I, S2B, and
   S4H). Consistent with FUT8 overexpression, N-glycoproteomics profiling
   data showed upregulated glycosylation of its putative
   targets[322]^71^,[323]^72 including CTSC, FSTL1, and
   LGALS3BP[324]^20^,[325]^73^,[326]^74 ([327]Figures 4J and [328]S4I),
   and MET ([329]Figure S4J), the oncogenic driver of pRCC type 1
   classification[330]^75 and a crucial regulator of EMT.[331]^76 c-MET
   (encoded by MET) activity is regulated by
   N-glycosylation.[332]^77^,[333]^78^,[334]^79^,[335]^80 Upregulation of
   c-MET glycosylation in pRCC type 1 samples was further localized to
   MET_N785, recently reported to be largely core-fucosylated[336]^81
   ([337]Figure S4K). Interestingly, L1CAM, which is a FUT8 target and
   mediator of cancer progression in melanoma,[338]^71 showed
   downregulation in this case, suggesting an alternate mechanism in RCCs
   ([339]Figures 4F and [340]S4F).

   Finally, comparing the glycosylation patterns in high- versus low-wGII
   tumors, immune marker glycoproteins such as GZMA (cytotoxic T cells),
   FCGR1A, PTPRC (lymphocyte), and CD163 (macrophage), endothelial
   glycoproteins such as POSTN, ITRIP, ANO6, CD74, CD14, and STAB, stromal
   markers such as FBN, FBLN2, ITGA5, and COL1A1, and other markers MERTK
   and FH were enriched in high-wGII tumors ([341]Figure 4K). This
   supports increased TME cell involvement in high-wGII samples. GZMA is
   proposed to promote colorectal cancer development.[342]^82 MERTK is a
   receptor tyrosine kinase aberrantly expressed in several malignancies
   and represents a novel target for cancer therapeutics.[343]^83 GSEA
   also revealed that glycosylation upregulated in high-wGII samples is
   involved in EMT hallmark ([344]Table S7), similar to our observation in
   global proteomics and transcriptomics data ([345]Figure S1H).

RCC subtypes metabolome delineates tumor growth dynamics

   RCCs are known to exhibit a wide array of mutation-driven metabolic
   defects.[346]^84 ccRCCs displaying increased glycolysis and decreased
   oxidative phosphorylation (Warburg effect) have been associated with
   high grade, high stage, and low survival.[347]^85 To explore
   tumorigenic metabolic reprogramming[348]^86 in non-ccRCCs, we profiled
   253 metabolites across 28 non-ccRCC tumors and 7 NATs ([349]Figure S1A;
   [350]Table S8). The quantified metabolites include organic acids and
   derivatives (68), nucleosides, nucleotides, and analogs (48), organic
   oxygen compounds (42), and other intermediates of major metabolic
   pathways such as organoheterocyclic compounds, lipids, and benzenoids
   ([351]Figures 5A and [352]S5A). Differential metabolomic
   characteristics across RCC subtypes and AML samples were resolved by
   PCA ([353]Figure 5B), including 65, 136, and 97 differential compounds
   significantly enriched in pRCC type 1, AML, and ROs, respectively,
   compared with NATs (≥ 2-fold change and q ≤ 0.05) ([354]Figure S5B).
   Next, analysis of differentially expressed metabolic enzymes identified
   metabolic pathways perturbed across RCC subtypes. For example, ccRCC
   and pRCC type 1 tumors shared some common pathway enrichments compared
   with ccRCC and ROs ([355]Figure 5C). Specifically, purine nucleotide de
   novo biosynthesis and TCA cycle were depleted in both ccRCC and pRCC
   type 1 but were enriched in ROs. Pentose phosphate pathway and dermatan
   sulfate degradation were potentially upregulated in pRCC type 1 but not
   in other tumor types. Pyrimidine deoxyribonucleoside salvage pathway
   and glycolysis were active in both AML and ROs. High levels of ACACA,
   ACACB enzymes, and phosphoric acid in AML indicate increased fatty acid
   biosynthesis ([356]Figures S5C and S5D).

Figure 5.

   [357]Figure 5
   [358]Open in a new tab

   Metabolomic aberrations across RCC subtypes

   (A) Filtered metabolites analyzed and their distribution across
   functional categories.

   (B) Clustering of metabolomics data from different non-ccRCC and NATs.

   (C) DE pathways between tumor subtypes. Bubble size, number of
   compounds per pathway.

   (D) Schematic sketch of key pathways, protein and metabolite abundance
   log2 fold changes are represented in rounded-corner and regular-corner
   color boxes, respectively.

   (E) Distribution of tumor subtypes stratified by high- and low-wGII
   groups.

   (F) Metabolites with significant differential abundance
   (abs(log2fc) > 1 and p < 0.05) between high- and low-wGII tumors.

   Several enzymes in the oxidative (e.g., G6PD) as well as non-oxidative
   (e.g., TALDO1, TKT) phases of pentose phosphate pathway was highly
   expressed in pRCC type 1 ([359]Figures 5D, [360]S5C, and S5D),
   associated with increased demand for ribonucleotides in the rapidly
   proliferating cancer cells.[361]^66 In renal ROs,[362]^66 these enzymes
   are not differentially expressed, likely representing a metabolic
   barrier to progression. Indeed, ROs showed an accumulation of pyruvate,
   a product of glycolysis ([363]Figures 5D, [364]S5C, and S5D).
   Furthermore, low levels of TCA cycle enzymes such as succinate
   dehydrogenase (SDHB, SDHC, SDHD) and FH are seen in pRCC type
   1,[365]^87 in contrast to high levels of FH, IDH3, and CS seen in ROs.
   These metabolomic observations are consistent with previously noted
   high numbers of defective mitochondria (with high abundance of
   mitochondrial protein) in ROs.[366]^21^,[367]^88^,[368]^89

   Finally, we compared the metabolomic profile of 4 high-wGII versus 10
   low-wGII non-ccRCC samples and identified 6 compounds significantly
   upregulated and 5 downregulated in the high-wGII group ([369]Figures 5E
   and 5F). High levels of proline and NADH, coupled with high PYCR1
   expression ([370]Figure S1G), indicated higher proline biosynthesis,
   which might support cancer cell proliferation and survival in
   oxygen-limiting conditions.[371]^81 On the other hand, genome-stable
   samples showed high expression of saccharic acid, glucosamine, and
   8-hydroxyquinoline, of which the derivatives are known to have
   anticancer effects.

Papillary RCC biomarkers and proteogenomics of activating MET mutations

   Malignant pRCCs[372]^75 accounting for 15% of all RCCs are
   histomorphologically and genetically heterogeneous tumors that
   currently lack specific diagnostic biomarkers. Importantly, a subset of
   pRCCs show overlapping morphology with mucinous tubular and spindle
   cell carcinoma (MTSCC), a rare benign tumor[373]^90 confounding
   clinical care decisions. To delineate pRCC-specific biomarkers using
   our multi-omics data, we identified a number of pRCC type 1-specific
   candidates significantly upregulated (n = 176, log2 fc > 1, q < 0.05)
   and downregulated (n = 108, log2 fc < −1, q < 0.05) ([374]Figure 6A;
   [375]Table S9). Top pRCC-specific candidates included sclerostin
   domain-containing protein1 (SOSTDC1)[376]^91 and polymeric
   immunoglobulin receptor (PIGR), further validated in the pan-RCC
   RNA-seq data from a combined cohort of TCGA plus MCTP (n = 1,000)
   ([377]Figures 6B, [378]S6A, and S6B). Comparing MTSCC (n = 18) and pRCC
   (n = 8) proteomics data from Xu et al.[379]^23 we noted both PIGR and
   SOSTDC1 proteins highly upregulated in pRCC compared with MTSCC
   ([380]Figure 6C), and we validated these findings by IHC and RNA-ISH
   ([381]Figures 6D and 6E). SOSTDC1 as a biomarker for pRCC has not been
   studied previously. PIGR has been listed as pRCC-specific in Jorge
   et al.’s DE analysis,[382]^20 corroborated by our observations.

Figure 6.

   [383]Figure 6
   [384]Open in a new tab

   Proteogenomic biomarkers that distinguish pRCC from MTSCC

   (A) Significantly differential events (abs(log2fc) > 2 and q < 0.05) in
   protein expression (x axis) and RNA expression (y axis) between pRCC
   type 1 and other tumors.

   (B) Specificity of pRCC type 1 protein markers PIGR and SOSTDC1.

   (C) Expression of pRCC type 1 protein markers PIGR and SOSTDC1 in the
   proteomics data from Xu et al.[385]^23 (PXD027972).

   (D) H&E, protein IHC, and RNA-ISH images (top to bottom) of biomarker
   PIGR in normal kidney tissue, pRCC, MTSCC tumors (upper panels from
   left to right) and SOSTDC1 in chRCC, pRCC, and MTSCC (lower panels from
   left to right).

   (E) RNA-ISH comparative scores of PIGR and SOSTDC1 in different tumor
   types. Red points represent external University of Michigan samples.

   (F) Location of missense mutations in MET across TCGA cohorts are
   colored on the MET protein domain diagram.

   (G) PTM-SEA analysis shows pathways such as EGFR are significantly
   enriched with increased phosphorylation in MET mutant pRCC samples.

   (H) Enrichment in chromosomes 7 and 17 gene sets are tested with
   protein expression difference between chromosome 7 gain and no gain
   non-ccRCC sample groups.

   Next we explored the proteogenomic impact of activating mutations in
   the MET kinase domain frequently observed in type 1 pRCC. Two type 1
   pRCC samples with hotspot mutations in the MET kinase domain
   (Asp1246Asn and Met1268Thr) ([386]Figure 6F), compared against type 1
   pRCC cases with wild-type MET (n = 5), revealed upregulated
   phosphoserine/threonine and phosphotyrosine events, including several
   known MET substrates such as GAB1 Y689 ([387]Figure S6C). In addition
   to known MET substrates, enrichment analysis with PTM-SEA identified
   signaling pathways enriched with upregulated phosphorylation sites,
   such as EGFR, PI3K-AKT, and MAPK ([388]Figure 6G). The intracellular
   signaling cascades activated by MET include the PI3K-AKT, RAC1-cell
   division control protein CDC42, RAP1, and RAS-MAPK pathways.
   Cooperative signaling between MET and EGFR has been observed during
   kidney development,[389]^92 and aberrant cross-signalling in renal
   cancer noted to have major implications for therapy.[390]^93

   chr7 gain is common in type 1 pRCC and occurs in some ccRCCs. Increased
   abundance of chr7 genes is notably observed on RNA level
   ([391]Figures S6D and S6E). As both chr7/17 gains tend to co-occur in
   pRCC, we also saw increased chr17 gene expression in non-ccRCC tumors,
   which was not observed in ccRCC ([392]Figures 6H and [393]S6F). Pathway
   enrichment analysis revealed upregulation in EMT, angiogenesis and KRAS
   signaling, and downregulation in adipogenesis and fatty acid metabolism
   in both ccRCC and non-ccRCC tumors with chr7 gain ([394]Figure S6G). In
   addition, papillary lineage tumors exhibiting chr7 gain show elevated
   phosphorylation activities associated with several chr7 kinases,
   including HIPK2, CDK13, MET, CDK6, and BRAF ([395]Figure S6H).

Regulons and differential diagnosis biomarkers in oncocytic tumors

   Current IHC clinical markers for chRCC[396]^17^,[397]^94 and
   RO[398]^95^,[399]^96 include KRT7, CD117 (c-kit), epithelial
   mesenchymal antigen, parvalbumin, S100A, and kidney-specific cadherins.
   Instances of patchy staining and pattern overlap with
   chRCC[400]^17^,[401]^94^,[402]^97 are limitations, as in clinical
   diagnostic criteria, patchy KRT7 expression usually supports RO, while
   strong uniform staining is usually supportive of chRCC.

   KRT7 was higher in chRCC (both RNA and protein levels), markers such as
   KIT and FOXI1 were equally expressed in ROs and chRCC, while CCND1
   overexpression was specific to fusion-positive RO ([403]Figure S7A). We
   next employed SCENIC tool,[404]^98^,[405]^99 which examines
   transcriptional modules or regulons (coexpression of a given
   transcription factor and its target genes) to characterize differences
   among the different tumor and benign tissues ([406]Figure S7B). The
   transcription factor FOXI1 is specifically expressed in intercalated
   cells and tumors such as chRCC and ROs.[407]^8^,[408]^100 Regulons
   shared between chRCC and RO include the lineage-specific transcription
   factors FOXI1 and DMRT2, and DMRT2 is a known target of FOXI1. We also
   identified several regulons that were enriched only in chRCC, such as
   ZBTB7A, SMARCB1, E4F1, and FOXJ2, among others ([409]Figure S7B) that
   were not previously associated with this disease. We next performed RNA
   and protein DE analysis between three chRCC and 15 ROs profiled in this
   study to identify diagnostic biomarkers ([410]Figure 7A). Compared with
   two publicly available datasets, the CPTAC proteomic dataset had a
   better coverage of FOXI1 and DMRT2 where both transcription factor
   proteins and most of their gene targets (such as ATPV0D2, HEPACAM2,
   DMRT2, etc.) showed differential expression in tumor versus normal
   comparisons, as expected ([411]Figure 7A;
   [412]Table S10).[413]^66^,[414]^96

Figure 7.

   [415]Figure 7
   [416]Open in a new tab

   Proteogenomic biomarkers that distinguish oncocytomas (RO) from chRCC

   (A) DE proteins (x axis) and mRNA (y axis) between RO and chRCC.
   Indicated genes have p < 0.01 in both dimensions, and candidates in red
   (MAPRE3, ADGRF5, GPNMB) were subsequently validated as RO- and
   chRCC-specific biomarkers, respectively.

   (B) chRCC marker GPNMB (left) and RO biomarkers ADGRF5 and MAPRE3
   protein abundance in different subtypes.

   (C) Overlap between DE proteins identified in this study (CPTAC) and
   the publicly available PXD007633 dataset in RO (left) and PXD019123
   chRCC dataset (right). Genes in red are associated with FOX1 and DMRT2.

   (D) Immunohistochemistry validation of nominated markers seen in
   representative tumor sections. Corresponding H&E staining images are
   shown alongside.

   DE analysis ([417]Figure 7A) discovered candidates such as
   microtubule-associated protein RP/EB family member 3, adhesion G
   protein-coupled receptor F5 (MAPRE3, ADGRF5, specific to RO) and
   glycoprotein nonmetastatic melanoma protein B (GPNMB upregulated in
   chRCC) ([418]Figures 7B, 7C, and [419]S7C). We validated our findings
   in independent publicly available mass spectrometry-based proteomics
   data for RO (n = 6, PXD007633) and chRCC (n = 9, PXD019123)
   ([420]Figure 7C). Using IHC, we next independently confirmed and
   validated biomarker specificity including CCND1 protein overexpression
   in gene fusion-positive ROs, MAPRE3, and ADGRF5 (also identified in the
   glycoproteomics analysis) expression in all RO subtypes
   ([421]Figure 7D) and GPNMB in chRCC. While CCND1 and FOXI1 were
   enriched in the nuclei, GPNMB showed a homogeneous and moderate/strong
   expression within the cytoplasmic compartment of the chRCC tumor cells,
   and MAPRE3 protein showed a predominant membranous expression pattern
   in RO. ADGRF5, also called GPR116, is an adhesion G protein-coupled
   receptor, and an emerging role in cancers for this family of proteins
   is being investigated.[422]^101 Furthermore, we observed THSD4
   upregulation in RO type 1, but downregulation in RO type 2 compared
   with NAT ([423]Figure S7D), future validation of this marker might
   enable rapid distinction between the two subtypes.

Discussion

   NGS and global proteomics data generated by CPTAC provide a
   high-quality data resource that can be explored further to derive novel
   biomarkers and gain deeper insights into disease biology. Motivated by
   the specific clinical need of biomarkers specific to rare subtypes of
   renal cell carcinoma, we carried out multi-omics analyses to identify
   protein/mRNA biomarkers to distinguish benign ROs from chRCC (MAPRE3,
   ADGRF5, GPNMB), pRCC from MTSCC (SOSTDC1, PIGR), and tumors with high
   wGII (PYCR1, IGF2BP3). A number of these markers were validated by IHC
   and RNA-ISH, supporting further evaluation in independent cohorts to
   facilitate development of renal cancer biomarker panels for clinical
   use.

   A number of single-cell studies in renal cancers have mainly
   characterized ccRCC focusing on immune
   infiltration,[424]^102^,[425]^103^,[426]^104 immunotherapy
   resistance,[427]^105 and cell-of-origin,[428]^36^,[429]^106 while
   non-ccRCC tumors largely remain uncharacterized. Here, we analyzed
   snRNA-seq data from eight non-ccRCC samples covering all oncocytoma
   subtypes. Our analysis highlights intra-tumor transcriptomic
   heterogeneity and a wide variation in the degree of immune infiltration
   among non-ccRCC subtypes, wherein malignancies such as pRCC and AML
   showed higher levels of immune infiltration compared with chRCC, ROs,
   TRCC, and others ([430]Figure S2A). We also identified several
   cell-type-specific markers representing putative cell-of-origin that
   could be further characterized for expansion of diagnostic panels.

   Some non-ccRCC tumors have been previously profiled
   proteomically,[431]^23^,[432]^66^,[433]^96 but the landscape of their
   PTMs remains uncharacterized. Here, we examined two different PTM
   profiles, namely protein phosphorylations and glycosylations from
   non-ccRCC and ccRCC samples. Besides identifying known kinase
   expression patterns and therapeutic targets in RCC subtypes such as
   FLT1 (ccRCC), KIT (chRCC and RO), and MET (pRCC type 1), we have
   identified several additional subtype-enriched kinases, with some among
   them being evaluated for their therapeutic utility in preclinical and
   clinical settings. Additional characterization studies of these
   potential kinase targets to evaluate their therapeutic utility is
   warranted. Our integrative phosphoproteomics analysis on tumors with GI
   besides identifying important biomarkers for early detection of this
   molecular subset, clarifies signaling cascades that might drive this
   molecular disease subset. We show significantly increased
   cyclin-dependent kinase activities in GI tumors, which suggests
   increased proliferation (taken together with pathways found enriched at
   whole-protein and RNA abundance) and decreased MTOR activity in these
   tumors. The latter observation now provides a reason, at least in part,
   on why MTOR inhibitors such as everolimus show poor response in
   metastatic RCC.

   To explore RCC glycobiology and its implication on TMEs, we analyzed
   both phospho-glyco (contained more sialylated events) and
   glyco-enriched (had more oligomannose IGPs) data. Using cell type gene
   expression annotation from previous single-cell data, we inferred TME
   contribution within the differentially expressed glycoproteins. Larger
   impact of TME was noted in both ccRCC and pRCC compared with ROs,
   revalidating the biology of these tumors. Further core-fucosylation
   characterization using the glyco data generated here, might deepen our
   understanding of tumor and immune microenvironment in pRCC. Finally,
   differential glycosylation events noted on proteins potentially
   contributed by the TME compartment in higher wGII non-ccRCC support
   higher immune infiltration.

   Genomic drivers of RCC are linked to dysregulated metabolism.[434]^107
   Interesting similarities and differences observed among kidney tumor
   subtypes include the depletion of purine nucleotide de novo
   biosynthesis and TCA cycle intermediates in ccRCC and pRCC type 1
   tumors, and, in contrast, their enrichment in ROs. Enrichment of the
   pentose phosphate pathway and dermatan sulfate degradation in pRCC type
   1, oncometabolite SAICAR in ROs, and proline and NADH, coupled with
   high PYCR1 expression in non-ccRCC tumors with high wGII warrant
   further investigations.

   In conclusion, proteogenomic analysis provided insights into a variety
   of non-ccRCC subtypes, identifying histologically specific diagnostic
   biomarkers, markers of GI, and revealed the interconnectedness between
   the omic layers. While single-nucleus analysis highlighted the
   potential intra-tumor heterogeneity and differences in putative
   cell-of-origin within non-ccRCC subtypes. Fundamentally, this study
   provides a comprehensive proteogenomic data resource to enable further
   in-depth exploration of the biology of these rare kidney tumors.

Limitations of the study

   This study evaluates a wide range of non-ccRCC subtypes with an
   extensive array of multi-omic analyses but has its limitations. The
   specific tissue procurement protocols necessary to facilitate
   high-quality protein-based multi-omics limited this study to
   prospective sample recruitment, thereby limiting the number of tumors
   analyzed. Lack of samples representing other non-ccRCC subtypes such as
   FH-deficient RCC, clear cell pRCC, among others, due to nonavailability
   of those rare subtypes is another limitation. Some subtypes are
   represented by one or two samples, and do not account for any
   heterogeneity within these subtypes. However, currently there is little
   or no high-quality multi-omics data available for most of these tumor
   subtypes, therefore observations presented here can represent a
   foundation for further, targeted analyses of specific features. This
   future work will be essential for confirming and refining this study’s
   observations, which serve as an initial stepping stone for a deeper
   understanding of the complexity of non-ccRCCs.

Consortia

   The members of the National Cancer Institute Clinical Proteomic Tumor
   Analysis Consortium are Eunkyung An, Shankara Anand, Andrzej Antczak,
   Alexander J. Lazar, Meenakshi Anurag, Jasmin Bavarva, Chet Birger,
   Michael J. Birrer, Melissa Borucki, Shuang Cai, Anna Calinawan, Wagma
   Caravan, Steven A. Carr, Daniel W. Chan, Feng Chen, Lijun Chen, Siqi
   Chen, David Chesla, Arul M. Chinnaiyan, Hanbyul Cho, Seema Chugh,
   Marcin Cieslik, Sandra Cottingham, Reese Crispen, Felipe da Veiga
   Leprevost, Aniket Dagar, Saravana M. Dhanasekaran, Rajiv Dhir, Li Ding,
   Marcin J. Domagalski, Brian J. Druker, Nathan J. Edwards, David Fenyö,
   Stacey Gabriel, Gad Getz, Yifat Geffen, Michael A. Gillette, Charles A.
   Goldthwaite Jr., Anthony Green, Shenghao Guo, Jason Hafron, Sarah
   Haynes, Tara Hiltke, Barbara Hindenach, Bart Williams, Katherine A.
   Hoadley, Alex Hopkins, Noshad Hosseini, Galen Hostetter, Andrew
   Houston, Yi Hsiao, Scott D. Jewell, Xiaojun Jing, Ivy John, Corbin D.
   Jones, Karen A. Ketchum, Iga Kołodziejczak, Chandan Kumar-Sinha, Anne
   Le, Toan Le, Ginny Xiaohe Li, Yize Li, W. Marston Linehan, Tao Liu, Yin
   Lu, Jie Luo, Weiping Ma, Avi Ma’ayan, D.R. Mani, Rahul Mannan, Peter B.
   McGarvey, Rohit Mehra, Mehdi Mesri, Nataly Naser Al Deen, Alexey I.
   Nesvizhskii, Chelsea J. Newton, Kristen Nyce, Gilbert S. Omenn, Amanda
   G. Paulovich, Samuel H. Payne, Francesca Petralia, Daniel A. Polasky,
   Sean Ponce, Barb Pruetz, Ratna R.Thangudu, Boris Reva, Christopher J.
   Ricketts, Ana I. Robles, Karin D. Rodland, Henry Rodriguez, Eric E.
   Schadt, Michael Schnaubelt, Yvonne Shutack, Richard D. Smith, Mathangi
   Thiagarajan, Pamela VanderKolk, Negin Vatanian, Josh Vo, Pei Wang,
   Xiaoming Wang, George Wilson, Maciej Wiznerowicz, Fengchao Yu, Kakhaber
   Zaalishvili, Cissy Zhang, Hui Zhang, Yuping Zhang, Stephanie Miner,
   Bing Zhang, Zhen Zhang, and Xu Zhang.

STAR★Methods

Key resources table

   REAGENT or RESOURCE SOURCE IDENTIFIER
   Antibodies
     __________________________________________________________________

   Goat Polyclonal IgG Human Osteoactivin/GPNMB antibody R&D Systems
   Catalog: AF2550,
   RRID: [435]AB_416615
   Rabbit Polyclonal IgG Human MAPRE3 antibody Atlas Antibodies Catalog:
   HPA009263
   RRID: [436]AB_1078716
   Mouse Monoclonal IgG Human FOXI1 antibody Origene Technologies Catalog:
   TA800146
   RRID: [437]AB_2625262
   Rabbit Monoclonal IgG Human Cyclin D1 Cell Marque Catalog: 241R-18
   RRID: [438]AB_1158233
   Mouse Monoclonal IgG Human PIGR antibody Santa Cruz Catalog: SC-374343,
   RRID: [439]AB_10989564
   Rabbit Polyclonal IgG Human PYCR1 antibody Cell Signaling Technology
   Catalog: 47935
   Rabbit Polyclonal IgG Human AKT antibody Cell Signaling Technology
   Catalog: 9272
   Rabbit Polyclonal IgG Human Phospho-AKT (Ser473) Antibody Cell
   Signaling Technology Catalog: 9271
   Rabbit Polyclonal IgG Human p44/42 MAPK (Erk1/2) Antibody Cell
   Signaling Technology Catalog: 9102
   Rabbit Monoclonal IgG Human Phospho-p44/42 MAPK (Erk1/2)
   (Thr202/Tyr204) antibody Cell Signaling Technology Catalog: 4376
   Rabbit Polyclonal IgG Human Vinculin Antibody Cell Signaling Technology
   Catalog: 4650
     __________________________________________________________________

   Biological samples
     __________________________________________________________________

   Primary tumor and normal adjacent tissue samples See [440]experimental
   model and study participant details See [441]Table S1
     __________________________________________________________________

   Critical commercial assays
     __________________________________________________________________

   Discovery CC1 Roche-Ventana Medical System Catalog: 950-500
   Discovery CC2 Roche-Ventana Medical System Catalog: 950-123
   OptiView Universal DAB Detection Kit Roche-Ventana Medical System
   Catalog: 760-700
   UltraView Universal DAB Detection Kit Roche-Ventana Medical System
   Catalog: 760-500
   Discovery mRNA DAB Detection RUO Roche-Ventana Medical System Catalog:
   760-224
   RNAscope® 2.5 HD Reagent Kit -BROWN Advanced Cell Diagnostics, Inc
   Catalog: 322300
   RNAscope® VS Universal HRP Reagent Kit Advanced Cell Diagnostics, Inc
   Catalog: 323200
   RNAscope Target Probe - Hs-PIGR Advanced Cell Diagnostics, Inc Catalog:
   472681
   RNAscope Target Probe - Hs-PYCR1 Advanced Cell Diagnostics, Inc
   Catalog: 509259
   RNAscope Target Probe - Hs-SOSTDC1 Advanced Cell Diagnostics, Inc
   Catalog: 469929
   RNAscope PositiveProbe - Hs-PPIB Advanced Cell Diagnostics, Inc
   Catalog: 313901/313909
   RNAscope Negative Probe – DapB Advanced Cell Diagnostics, Inc Catalog:
   310043/312039
   Rabbit Polyclonal IgG Human IGF2BP3 Proteintech Catalog: 14642-1-AP,
   RRID: [442]AB_2122782
   Rabbit Polyclonal IgG Human ADGRF5 (GPR116) Proteintech Catalog:
   14047-1-AP, RRID: [443]AB_2113095
     __________________________________________________________________

   Deposited data
     __________________________________________________________________

   CPTAC non-ccRCC clinical data and proteomic data This manuscript
   [444]https://pdc.cancer.gov/
   CPTAC ccRCC genomic, transcriptomic, and snRNA-seq data This manuscript
   [445]https://portal.gdc.cancer.gov/projects/CPTAC-3
   CPTAC non-ccRCC pathology and radiology images This manuscript
   [446]https://portal.imaging.datacommons.cancer.gov/
   TCGA KIRC Cancer Genome Atlas
   Research Network[447]^108 [448]https://portal.gdc.cancer.gov/
   TCGA KIRP Cancer Genome Atlas
   Research Network[449]^75 [450]https://portal.gdc.cancer.gov/
     __________________________________________________________________

   Software and algorithms
     __________________________________________________________________

   CNVEX [451]https://github.com/mctp/cnvex
   R v4.1 R Development Core Team [452]https://www.R-project.org
   Python Python Software Foundation [453]https://www.python.org/
   Philosopher da Veiga Leprevost et al.[454]^109
   [455]https://philosopher.nesvilab.org/
   MSFragger Kong et al.[456]^110 [457]https://msfragger.nesvilab.org/
   PTM-Shepherd Geiszler et al.[458]^111
   [459]https://ptmshepherd.nesvilab.org/
   TMT-Integrator Djomehri et al.[460]^112
   [461]https://github.com/Nesvilab/TMT-Integrator
   ARD-NMF Tan et al.[462]^30
   [463]https://github.com/getzlab/getzlab-SignatureAnalyzer
   CancerSubtypes Xu et al.[464]^113
   [465]https://www.bioconductor.org/packages/release/bioc/html/CancerSubt
   ypes.html
   Limma Ritchie et al.[466]^90
   [467]https://bioconductor.org/packages/release/bioc/html/limma.html
   PTM-SEA Krug et al.[468]^114
   [469]https://github.com/broadinstitute/ssGSEA2.0
   KSA-2D Han et al.[470]^115 [471]https://github.com/ginnyintifa/KSA2D
   CLUMPS-PTM Geffen et al.[472]^53
   [473]https://github.com/getzlab/CLUMPS-PTM
   IMPaLA Kamburov et al.[474]^116 [475]http://impala.molgen.mpg.de/
   ClusterProfiler Yu et al.[476]^117
   [477]https://bioconductor.org/packages/release/bioc/html/clusterProfile
   r.html
   pySCENIC Van de Sande et al.[478]^118
   [479]https://github.com/aertslab/pySCENIC
   BayesDeBulk Petralia et al.[480]^119 [481]http://www.bayesdebulk.com/
   DreamAI Ma et al.[482]^120 [483]https://github.com/WangLab-MSSM/DreamAI
   OmniPathR D Turei et al.[484]^99^,[485]^121
   [486]https://www.bioconductor.org/packages/release/bioc/html/OmnipathR.
   html
   Survival Therneau et al.[487]^122
   [488]https://cran.r-project.org/web/packages/survival/index.html
   [489]Open in a new tab

Resource availability

Lead contact

   Further information and requests for resources and reagents should be
   directed to and will be fulfilled by the Lead Contact, Alexey
   Nesvizhskii, nesvi@med.umich.edu.

Materials availability

     * This study did not generate new unique reagents.

Data and code availability

     * •
       Clinical data and raw proteomic data reported in this paper can be
       accessed via the CPTAC Data Portal at:
       [490]https://cptac-data-portal.georgetown.edu/cptac. Genomic,
       transcriptomic, and snRNA seq data files can be accessed via
       Genomic Data Commons (GDC) at:
       [491]https://portal.gdc.cancer.gov/projects/CPTAC-3. Proteomic data
       files can be accessed via Proteomic Data Commons (PDC) at:
       [492]https://pdc.cancer.gov/with following accession codes:
       PDC000464, PDC000465, and PDC000466. Processed data used in this
       publication can be found in the CPTAC PDC. An interactive ProTrack
       web portal[493]^30 is also provided to visualize multi-omics data
       in interactive heatmap and boxplot visualizations, as well
       reviewing histological images, exploring the cohort with a sample
       dashboard, and reviewing quality control results for ccRCC and
       non-ccRCC data ([494]http://ccrcc-conf.cptac-data-view.org/).
     * •
       This paper does not report original code.
     * •
       Any additional information required to reanalyze the data reported
       in this work paper is available from the [495]STAR Methods upon
       request.

Experimental model and study participant details

Human subjects

   A total of 151 participants were included in this study.Institutional
   review boards at each Tissue Source Site (TSS) reviewed protocols and
   consent documentation, in adherence to Clinical Proteomic Tumor
   Analysis Consortium (CPTAC) guidelines.

Clinical data annotation

   Clinical data were obtained from TSS and aggregated by the CPTAC
   Biospecimen Core Resource (BCR, at the Pathology and Biorepository Core
   of Van Andel Research Institute (Grand Rapids, MI)). Data forms were
   stored as Microsoft Excel files (.xls). Clinical data can be accessed
   and downloaded from the CPTAC Data Portal. Demographics of patients can
   be viewd at ProTrack ([496]http://ccrcc-conf.cptac-data-view.org/).
   Patients with any prior history of other malignancies within twelve
   months or any systemic treatment (chemotherapy, radiotherapy, or
   immune-related therapy) were excluded from this study.

Method details

Sample processing

   The CPTAC BCR manufactured and distributed biospecimen kits to the TSS
   located in the US, and Europe. Each kit contains a set of
   pre-manufactured labels for unique tracking of every specimen
   respective to TSS location, disease, and sample type, used to track the
   specimens through the BCR to the CPTAC proteomic and genomic
   characterization centers. Tissue specimens averaging 200 mg were
   snap-frozen by the TSS within a 30 min cold ischemic time (CIT) (CIT
   average = 15 min) and an adjacent segment was formalin-fixed paraffin
   embedded (FFPE) and H&E stained by the TSS for quality assessment to
   meet the CPTA tissue requirements. Routinely, several tissue segments
   for each case were collected. Tissues were flash-frozen in liquid
   nitrogen (LN2) and then transferred to a liquid nitrogen freezer for
   storage until approval for shipment to the BCR. Specimens were shipped
   using a cryoport that maintained an average temperature of under −140°C
   to the BCR with a time and temperature tracker to monitor the shipment.
   Receipt of specimens at the BCR included a physical inspection and
   review of the time and temperature tracker data for specimen integrity,
   followed by barcode entry into a biospecimen tracking database.

   Specimens were again placed in LN2 storage until further processing.
   Acceptable non-ccRCC tumor tissue segments were determined by TSS
   pathologists based on the percent viable tumor nuclei (>80%), total
   cellularity (>50%), and necrosis (<20%). Segments received at the BCR
   were verified by BCR and Leidos Biomedical Research (LBR) pathologists
   and the percent of the total area of tumor in the segment was also
   documented. Additionally, disease-specific working group pathology
   experts reviewed the morphology to clarify or standardize specific
   disease classifications and correlation to the proteomic and genomic
   data. The cryopulverized specimen was divided into aliquots for DNA
   (30 mg) and RNA (30 mg) isolation and proteomics (50 mg) for molecular
   characterization. Nucleic acids were isolated and stored at −80°C until
   further processing and distribution; cryopulverized protein material
   was returned to the LN2 freezer until distribution. Shipment of the
   cryopulverized segments used cryoports for distribution to the
   proteomic characterization centers and shipment of the nucleic acids
   used dry ice shippers for distribution to the genomic characterization
   centers; a shipment manifest accompanied all distributions for the
   receipt and integrity inspection of the specimens at the destination.

Sample cohort details

   In this study, we performed proteogenomics profiling of 194 tumor and
   NAT samples from the discovery cohort[497]^27 (110 tumors profiled with
   proteomics and RNA-seq, 84 NATs profiled with proteomics and 73 NATs
   profiled with RNA-seq), 4 samples from confirmatory[498]^13 (2 tumors
   and 2 NATs profiled with both proteomics and RNA-seq) and 56 samples
   from non-ccRCC cohorts (39 tumors profiled with proteins and RNA-seq,
   17 NATs profiled with proteomics and 14 NATs profiled with RNA-seq).
   Within the 110 tumor samples from the discovery ccRCC cohort,[499]^27
   103 were confirmed ccRCC and 7 were non-ccRCC.

   Across all three cohorts, we profiled 103 ccRCC tumor samples (all from
   the discovery cohort) and 48 non-ccRCC tumor samples (7 samples from
   the discovery cohort, 2 samples from the confirmatory cohort, 39 from
   the non-ccRCC cohort). Within the 48 non-ccRCC samples, we counted 15
   ROs (3 RO type 1, 8 RO type 2, 4 RO variant), 13 papillary RCC (pRCC, 8
   pRCC with the previous defined type 1 features (pRCC-1) and 5 without
   (pRCC-2)), 3 chromophobe RCC (chRCC), 2 angiomyolipoma (AML), 2
   eosinophilic solid and cystic RCC (ESCRCC), 1 Birt-Hogg-Dube
   syndrome-associated renal cell carcinoma (BHD), 1 mixed epithelial and
   stromal tumor of the kidney (MEST), 1 MTOR mutated RCC, 1 translocation
   RCC (TRCC), 8 molecularly divergent to histology RCC (MDTH), and 1
   plasmacytoid urothelial carcinoma (PUC). The following three samples
   were excluded from all downstream analysis: 2 NAT samples (C3N-00314-N
   and C3N-01524-N) that were found to be contaminated with tumor tissue
   and 1 PUC sample (C3L-02212-T) which is not a renal cell carcinoma. One
   tumor (C3N-02204-T) indicated with asterix in [500]Figure 1A is
   excluded from the wGIIanalysis since genomics data was not fully
   available at the time of data freeze.

Immunohistochemistry (IHC)

   Immunohistochemistry (IHC) was performed on 4-micron formalin-fixed,
   paraffin-embedded (FFPE) tissue sections. The Ventana Benchmark XT
   staining platform with Discovery CCI and CC2 (Ventana cat#950-500 and
   950-123) were used for antigen retrieval. The immune complexes were
   developed with either the ultraView or optiView Universal DAB
   (diaminobenzidine tetrahydrochloride) Detection Kit (Ventana
   cat#760-500 and cat#760-700). he details of the panel of primary
   antibodies utilized is as follows: polymeric immunoglobulin receptor
   (PIGR/Anti-SC; Santa Cruz, mouse monoclonal, catalog no. SC-374343),
   cyclin D1 (CCND1; Cell Marque, rabbit monoclonal, catalog no. 241R-18),
   transmembrane glycoprotein NMB (GPNMB, R&D systems, goat polyclonal,
   catalog no. AF2550), microtubule-associated protein RP/EB family
   member-3 (MAPRE3, Atlas antibodies, rabbit polyclonal, catalog no.
   HPA009263), and forkhead boxI1 (FOXI1, Origene antibodies, mouse
   monoclonal, catalog no. TA800146). Brown pigmentation within the
   subcellular component (cytoplasmic and or membranous for PIGR, GPNMB,
   MAPRE3 and nuclear for FOXI1 and CCND1) were taken as positive
   expressions. In addition for PIGR the presence and intensity of
   cytoplasmic staining were scored where the percentage of PIGR positive
   neoplastic cells and the staining intensity (none, 0; weak, 1;
   moderate, 2; strong, 3) were recorded for each tumor as described
   previously.[501]^8 Appropriate positive and negative control tissue
   were run in each assay batch.

RNA in situ hybridization (RNA-ISH)

   RNA-ISH was performed using the RNAscope 2.5 HD Brown kit (Advanced
   Cell Diagnostics, Newark, CA) and target probes against PIGR (472681
   Hs-PIGR targeting [502]NM_002644.3, 2-903nt), PYCR1 (509259 Hs-PYCR1
   targeting [503]NM_001282281.1, 64-1770nt), and SOSTDC1 469929
   Hs-SOSTDC1 targeting [504]NM_015464.2, 2-938nt) according to the
   manufacturer’s instructions. RNA quality was evaluated in each case
   utilizing a positive and a negative control probe against human
   housekeeping gene Peptidylprolyl Isomerase B (PPIB) (313901 for manual
   and 313909 for Ventana automated system) and bacillus bacterial gene
   DapB (310043 for manual and 312039 for Ventana automated system)
   respectively. The assay was run

   according to the protocol previously described.[505]^6^,[506]^9

   Stained slides were evaluated under a light microscope at ×100 and ×200
   magnification for RNA-ISH signals in neoplastic cells by multiple study
   investigators. Each RNA molecule in this assay’s result is represented
   as a punctate brown dot. The expression level was evaluated according
   to the RNAscope scoring criteria: score 0 = no staining or <1 dot per
   10 cells; score 1 = 1–3 dots per cell, score 2 = 4–9 dots per cell, and
   no or very few dot clusters; score 3 = 10–15 dots per cell and <10%
   dots in clusters; score 4 = >15 dots per cell and >10% dots in
   clusters. The H-score was calculated for each examined tissue section
   as the sum of the percentage of cells with score 0–4 [(A% × 0) +
   (B% × 1) + (C% × 2) + (D % × 3) + (E% × 4), A + B + C + D + E = 100],
   using previously published scoring criteria.[507]^6^,[508]^9

Sample processing for genomic DNA and total RNA extraction

   Our study sampled a single site of the primary tumor from surgical
   resections, due to the internal requirement to process a minimum of
   125 mg of tumor issue and 50 mg of adjacent normal tissue. DNA and RNA
   were extracted from tumor and blood normal specimens in a co-isolation
   protocol using Qiagen’s QIAsymphony DNA Mini Kit and QIAsymphony RNA
   Kit. Genomic DNA was also isolated from peripheral blood (3–5 mL) to
   serve as matched normal reference material. The Qubit dsDNA BR Assay
   Kit was used with the Qubit 2.0 Fluorometer to determine the
   concentration of dsDNA in an aqueous solution. Any sample that passed
   quality control and produced enough DNA yield to go through various
   genomic assays was sent for genomic characterization. RNA quality was
   quantified using both the NanoDrop 8000 and quality assessed using
   Agilent Bioanalyzer. A sample that passed RNA quality control and had a
   minimum RIN (RNA integrity number) score of 7 was subjected to RNA
   sequencing. Identity match for germline, normal adjacent tissue, and
   tumor tissue was assayed at the BCR using the Illumina Infinium QC
   array. This beadchip contains 15,949 markers designed to prioritize
   sample tracking, quality control, and stratification.

Preparation of libraries for cluster amplification and WGS sequencing

   An aliquot of genomic DNA (350 ng in 50 μL) was used as the input into
   DNA fragmentation (aka shearing). Shearing was performed acoustically
   using a Covaris focused-ultrasonicator, targeting 385bp fragments.
   Following fragmentation, additional size selection was performed using
   an SPRI cleanup. Library preparation was performed using a commercially
   available kit provided by KAPA Biosystems (KAPA Hyper Prep without
   amplification module) and with palindromic forked adapters with unique
   8-base index sequences embedded within the adapter (purchased from
   IDT). Following sample preparation, libraries were quantified using
   quantitative PCR (kit purchased from KAPA Biosystems), with probes
   specific to the ends of the adapters. This assay was automated using
   Agilent’s Bravo liquid handling platform. Based on qPCR quantification,
   libraries were normalized to 1.7 nM and pooled into 24-plexes.

Cluster amplification and sequencing (HiSeq X)

   Sample pools were combined with HiSeq X Cluster Amp Reagents EPX1,
   EPX2, and EPX3 into single wells on a strip tube using the Hamilton
   Starlet Liquid Handling system. Cluster amplification of the templates
   was performed according to the manufacturer’s protocol (Illumina) with
   the Illumina cBot. Flow cells were sequenced to a minimum of 15x on
   HiSeq X utilizing sequencing-by-synthesis kits to produce 151bp
   paired-end reads. Output from Illumina software was processed by the
   Picard data processing pipeline to yield BAMs containing demultiplexed,
   aggregated, aligned reads. All sample information tracking was
   performed by automated LIMS messaging.

Whole exome sequencing library construction

   Library construction was performed as described in Fisher
   et al.,[509]^123 with the following modifications: initial genomic DNA
   input into shearing was reduced from 3 μg to 20–250 ng in 50 μL of
   solution. For adapter ligation, Illumina paired-end adapters were
   replaced with palindromic forked adapters, purchased from Integrated
   DNA Technologies, with unique dual-indexed molecular barcode sequences
   to facilitate downstream pooling. Kapa HyperPrep reagents in 96-
   reaction kit format was used for end repair/A-tailing, adapter
   ligation, and library enrichment PCR. In addition, during the
   post-enrichment SPRI cleanup, elution volume was reduced to 30 μL to
   maximize library concentration, and a vortexing step was added to
   maximize the amount of template eluted.

In-solution hybrid selection

   After library construction, libraries were pooled into groups of up to
   96 samples. Hybridization and capture were performed using the relevant
   components of Illumina’s Nextera Exome Kit and following the
   manufacturer’s suggested protocol, with the following exceptions.
   First, all libraries within a library construction plate were pooled
   prior to hybridization. Second, the Midi plate from Illumina’s Nextera
   Exome Kit was replaced with a skirted PCR plate to facilitate
   automation. All hybridization and capture steps were automated on the
   Agilent Bravo liquid handling system.

Preparation of libraries for cluster amplification and sequencing

   After post-capture enrichment, library pools were quantified using qPCR
   (automated assay on the Agilent Bravo) using a kit purchased from KAPA
   Biosystems with probes specific to the ends of the adapters. Based on
   qPCR quantification, libraries were normalized to 2 nM.

Cluster amplification and sequencing

   Cluster amplification of DNA libraries was performed according to the
   manufacturer’s protocol (Illumina) using exclusion amplification
   chemistry and flowcells. Flowcells were sequenced utilizing
   sequencing-by-synthesis chemistry. The flow cells were then analyzed
   using RTA v.2.7.3 or later. Each pool of whole-exome libraries was
   sequenced on paired 76 cycle runs with two 8 cycle index reads across
   the number of lanes needed to meet coverage for all libraries in the
   pool. Pooled libraries were run on HiSeq 4000 paired-end runs to
   achieve a minimum of 150x on target coverage per each sample library.
   The raw Illumina sequence data were demultiplexed and converted to
   fastq files; adapter and low-quality sequences were trimmed. The raw
   reads were mapped to the hg38 human reference genome, and the validated
   BAMs were used for downstream analysis and variant calling.

Quality assurance and quality control of RNA analytes

   All RNA analytes were assayed for RNA integrity, concentration, and
   fragment size. Samples for total RNA-seq were quantified on a
   TapeStation system (Agilent, Inc. Santa Clara, CA). Samples with RINs
   >8.0 were considered high quality.

Total RNA-seq library construction

   Total RNA-seq library construction was performed from the RNA samples
   using the TruSeq Stranded RNA Sample Preparation Kit and bar-coded with
   individual tags following the manufacturer’s instructions (Illumina,
   Inc. San Diego, CA). Libraries were prepared on an Agilent Bravo
   Automated Liquid Handling System. Quality control was performed at
   every step and the libraries were quantified using the TapeStation
   system.

Total RNA sequencing

   Indexed libraries were prepared and run on HiSeq 4000 paired-end 75
   base pairs to generate a minimum of 120 million reads per sample
   library with a target of greater than 90% mapped reads. Typically,
   these were pools of four samples. The raw Illumina sequence data were
   demultiplexed and converted to FASTQ files, and adapter and low-quality
   sequences were quantified. Samples were then assessed for quality by
   mapping reads to the hg38 human genome reference, estimating the total
   number of reads that mapped, amount of RNA mapping to coding regions,
   amount of rRNA in sample, number of genes expressed, and relative
   expression of housekeeping genes. Samples passing this QA/QC were then
   clustered with other expression data from similar and distinct tumor
   types to confirm expected expression patterns. Atypical samples were
   then SNP typed from the RNA data to confirm the source analyte. FASTQ
   files of all reads were then uploaded to the GDC repository.

Single-nuclei RNA library preparation and sequencing

   About 20–30 mg of cryopulverized powder from ccRCC specimens was
   resuspended in Lysis buffer (10 mM Tris-HCl (pH 7.4); 10 mM NaCl; 3 mM
   MgCl2; and 0.1% NP-40). This suspension was pipetted gently 6–8 times,
   incubated on ice for 30 s, and pipetted again 4-6 times. The lysate
   containing free nuclei was filtered through a 40 μm cell strainer. We
   washed the filter with 1 mL Wash and Resuspension buffer (1X PBS +2%
   BSA +0.2 U/μL RNase inhibitor) and combined the flow through with the
   original filtrate. After 6-min centrifugation at 500 x g and 4°C, the
   nuclei pellet was resuspended in 500 μL of Wash and Resuspension
   buffer. After staining by DRAQ5, the nuclei were further purified by
   Fluorescence-Activated Cell Sorting (FACS). FACS-purified nuclei were
   centrifuged again and resuspended in a small volume (about 30 μL).
   After counting and microscopic inspection of nuclei quality, the nuclei
   preparation was diluted to about 1,000 nuclei/μL.

   About 20,000 nuclei were used for single-nuclei RNA sequencing (snRNA
   seq) by the 10X Chromium platform. We loaded the single nuclei onto a
   Chromium Chip B Single Cell Kit, 48 rxns (10x Genomics, PN-1000073),
   and processed them through the Chromium Controller to generate GEMs
   (Gel Beads in Emulsion). We then prepared the sequencing libraries with
   the Chromium Single Cell 3′ GEM, Library & Gel Bead Kit v3, 16 rxns
   (10x Genomics, PN 1000075) following the manufacturer’s protocol.
   Sequencing was performed on an Illumina NovaSeq 6000 S4 flow cell. The
   libraries were pooled and sequenced using the XP workflow according to
   the manufacturer’s protocol with a 28 × 8 × 98bp sequencing recipe. The
   resulting sequencing files were available as FASTQs per sample after
   demultiplexing.

Illumina Infinium methylationEPIC beadchip array

   The MethylationEPIC array uses an 8-sample version of the Illumina
   Beadchip capturing >850,000 DNA methylation sites per sample. 250 ng of
   DNA was used for the bisulfite conversation using Infinium
   MethylationEPIC BeadChip Kit. The EPIC array includes sample plating,
   bisulfite conversion, and methylation array processing. After scanning,
   the data was processed through an automated genotype calling pipeline.
   Data generated consisted of raw idats and a sample sheet.

Sample processing for protein extraction and tryptic digestion

   All samples for the current study were prospectively collected as
   described above and processed for mass spectrometric (MS) analysis at
   Johns Hopkins University. Tissue lysis and downstream sample
   preparation for global proteomic, phosphoproteomic and glycoproteomic
   analysis were carried out as previously
   described.[510]^24^,[511]^25^,[512]^124 Each of cryopulverized renal
   tumor tissues or NATs were homogenized separately in an appropriate
   volume of lysis buffer (8 M urea, 75 mM NaCl, 50 mM Tris, pH 8.0, 1 mM
   EDTA, 2 μg/mL aprotinin, 10 μg/mL leupeptin, 1 mM PMSF, 10 mM NaF,
   Phosphatase Inhibitor Cocktail 2 and Phosphatase Inhibitor Cocktail 3
   [1:100 dilution], and 20 μM PUGNAc) by repeated vortexing.

   Proteins in the lysates were clarified by centrifugation at 20,000 x g
   for 10 min at 4C, and protein concentrations were determined by BCA
   assay (Pierce). The proteins were diluted to a final concentration of
   8 mg/mL with a lysis buffer for the downstream reduction, alkylation
   and digestion. 1.2 mg of protein was reduced with 5 mM dithiothreitol
   (DTT) for 1 h at 37 C and subsequently alkylated with 10 mM
   iodoacetamide for 45 min at RT (room temperature) in the dark. Samples
   were then diluted by 1:4 with 50 mM Tris-HCl (pH 8.0) and subjected to
   proteolytic digestion with LysC (Wako Chemicals, at 1:50
   enzyme-to-substrate weight ratio for 2 h incubation at RT) followed by
   the addition of sequencing-grade modified trypsin (Promega, at a 1:50
   enzyme-to-substrate weight ratio for overnight incubation at RT). The
   digested samples were then acidified with 50% formic acid (FA, Fisher
   Chemicals) to pH < 3. Tryptic peptides were desalted on reversed-phase
   C18 SPE columns (Waters) and dried using a Speed-Vac (Thermo
   Scientific).

TMT labeling of peptides

   Tandem-mass-tag (TMT) quantitation utilizes reporter ion intensities to
   determine protein abundance and facilitate quantitative proteomic
   analysis.[513]^125 The samples from the discovery cohort were labeled
   with TMT-10plex as described in the ccRCC discovery paper,[514]^24
   while the samples from the non-ccRCC cohort were labeled with
   TMT-11plex reagents (Thermo Fisher Scientific). 70 non-ccRCC samples
   were co-randomized to 7 TMT 11-plex sets. The sample-to-TMT channel
   mapping is available in the PDC portal
   ([515]https://proteomic.datacommons.cancer.gov/). 300ug desalted
   peptides from each non-ccRCC and NAT sample were dissolved in 120 μL of
   100 mM HEPES, pH 8.5 solution. 5mg TMT reagent was dissolved in 500 μL
   of anhydrous acetonitrile, and 45 μL of each TMT reagent was added to
   the corresponding aliquot of peptides. After 1 h incubation at RT, the
   reaction was quenched by incubation with 5% hydroxylamine at RT for
   15 min. The reference sample used in the ccRCC discovery cohort
   study[516]^24 was included in all TMT 11-plexes as a reference channel
   in the non-ccRCC cohort study, labeled with the TMT-131 reagent.
   Following labeling, peptides were mixed according to the sample-to-TMT
   channel mapping, concentrated and desalted on reversed-phase C18 SPE
   columns (Waters), and dried using a Speed-Vac (Thermo Scientific).

Peptide fractionation by basic reversed-phase liquid chromatography

   To reduce the likelihood of peptides co-isolating and co-fragmenting in
   these highly complex samples, we employed extensive, high-resolution
   fractionation via basic reversed-phase liquid chromatography (bRPLC).
   The desalted and dried peptides from each TMT set were reconstituted in
   900 mL of 5 mM ammonium formate (pH 10) and 2% acetonitrile (ACN) and
   loaded onto a 4.6 mm × 250 mm RP Zorbax 300 A Extend-C18 column with
   3.5 μm size beads (Agilent). Peptides were separated at a flow-rate of
   1 mL/min using an Agilent 1200 Series HPLC instrument with Solvent A
   (2% ACN, 5 mM ammonium formate, pH 10) and a non-linear gradient of
   Solvent B (90% ACN, 5 mM ammonium formate, pH 10) as follows: 0%
   Solvent B (7 min), 0%–16% Solvent B (6 min), 16%–40% Solvent B
   (60 min), 40%–44% Solvent B (4 min), 44%–60% Solvent B (5 min), and
   holding at 60% Solvent B for 14 min. Collected fractions were
   concatenated into 24 fractions by combining four fractions that are 24
   fractions apart as described previously[517]^25; a 5% aliquot of each
   of the 24 fractions was used for global proteomic analysis, dried in a
   Speed-Vac, and resuspended in 3% ACN/0.1% formic acid prior to
   ESI-LC-MS/MS analysis. The remaining sample was utilized for
   phosphopeptide enrichment.

Enrichment of phosphopeptides by Fe-IMAC

   The remaining 95% of the sample was further concatenated into 12
   fractions before being subjected to phosphopeptide enrichment using
   immobilized metal affinity chromatography (IMAC) as previously
   described.[518]^25 In brief, Ni-NTA agarose beads (Qiagen) were
   conditioned and incubated with 10mM FeCl3 to prepare Fe3+-NTA agarose
   beads. Dried peptides from each fraction were reconstituted in 80%
   ACN/0.1% trifluoroacetic acid and incubated with 10 μL of the Fe3+-IMAC
   beads for 30 min. Samples were then centrifuged at 1000∗g for 1 min to
   collect the beads coupled with phophopeptides, and the supernatant
   containing unbound peptides was removed for the subsequent
   glycopeptides enrichment (Cao, PDA paper, cell, 2021). The beads were
   resuspended with 80% ACN/0.1% trifluoroacetic acid and then transferred
   onto equilibrated C-18 Stage Tips. Tips were washed twice with 80%
   ACN/0.1% trifluoroacetic acid followed by 1% formic acid.

   The flowthroughs were collected and combined with the supernatants for
   subsequent glycopeptides enrichments. Phosphopeptides were eluted from
   the Fe3+-IMAC beads onto the C-18 Stage Tips with 70 μL of 500 mM
   dibasic potassium phosphate, pH 7.0 three times. C-18 Stage Tips were
   then washed twice with 1% formic acid to remove salts, followed by
   elution of the phosphopeptides from the C-18 Stage Tips with 50%
   ACN/0.1% formic acid twice. Eluted phosphopeptides were dried down and
   resuspended in 3% ACN/0.1% formic acid prior to ESI-LC-MS/MS analysis.

Enrichment of intact glycopeptides by MAX columns

   All unbound peptides from phosphopeptide enrichment were desalted on
   reversed phase C18 SPE column (Waters). The glycopeptides were enriched
   with OASIS MAX solid-phase extraction (Waters). The MAX cartridge was
   conditioned with 3 × 1 mL ACN, then 3 × 1 mL of 100 mM triethylammonium
   acetate buffer, followed by 3 × 1 mL of water, and finally 3 × 1 mL of
   95% ACN (1% TFA). The peptides were loaded twice. The cartridge was
   washed with 4 × 1 mL of 95% ACN (1% TFA) to remove non-glycosylated
   peptides. The glycopeptide fraction was eluted with 50% ACN (0.1% TFA),
   dried down, and reconstituted in 3% ACN, 0.1% FA prior to ESI-LC-MS/MS
   analysis.

ESI-LC-MS/MS for global proteome, phosphoproteome, and glycoproteome analysis

   The TMT-labeled global proteome, phosphoproteome, and glycoproteome
   fractions were analyzed using Orbitrap Fusion Lumos mass spectrometer
   (Thermo Scientific). Approximately 0.8 μg of peptides were separated on
   an in-house packed 28 cm × 75 mm diameter C18 column (1.9 mm
   Reprosil-Pur C18-AQ beads (Dr. Maisch GmbH); Picofrit 10 mm opening
   (New Objective)) lined up with an Easy nLC 1200 UHPLC system (Thermo
   Scientific). The column was heated to 50°C using a column heater
   (Phoenix-ST). The flow rate was set at 200 nL/min. Buffer A and B were
   3% ACN (0.1% FA) and 90% ACN (0.1% FA), respectively. The peptides were
   separated with a 6%–30% B gradient in 84 min. Peptides were eluted from
   the column and nanosprayed directly into the mass spectrometer. The
   mass spectrometer was operated in a data-dependent mode.

   Parameters for global proteomic and phosphoproteomic samples were set
   as follows: MS1 resolution - 60,000, mass range – 350 to 1800 m/z, RF
   Lens – 30%, AGC Target – 4.0e5, Max injection time – 50 ms, charge
   state include – 2–6, dynamic exclusion – 45 s. The cycle time was set
   to 2 s, and within this 2 s the most abundant ions per scan were
   selected for MS/MS in the orbitrap. MS2 resolution – 50,000,
   high-energy collision dissociation activation energy (HCD) – 34,
   isolation width (m/z) – 0.7, AGC Target – 2.0e5, Max injection time –
   100 ms. Parameters for glycoproteomic samples were set as follows: MS1
   resolution - 60,000, mass range – 500 to 2000 m/z, RF Lens – 30%, AGC
   Target – 5.0e5, Max injection time – 50 ms, charge state include – 2–6,
   dynamic exclusion – 45 s. The cycle time was set to 2 s, and within
   this 2 s the most abundant ions per scan were selected for MS/MS in the
   orbitrap. MS2 resolution – 50,000, high-energy collision dissociation
   activation energy (HCD) – 35, isolation width (m/z) – 0.7, AGC Target –
   1.0e5, Max injection time – 100 ms.

Metabolomic acquisition

   To extract metabolites, a solution consisting of 80% (v/v) mass
   spectrometry-grade methanol and 20% (v/v) mass spectrometry-grade water
   were used to extract the metabolites from the tissue samples as
   described previously.[519]^126^,[520]^127^,[521]^128 The metabolite
   samples then underwent speed vacuum processing to evaporate the
   methanol and lyophilization to remove the water. The dried metabolites
   were re-suspended in a solution consisting of 50% (v/v) acetonitrile
   and 50% (v/v) mass spectrometry-grade water before data acquisition.
   Data acquisition was performed using a Vanquish ultra-performance
   liquid chromatography (UPLC) system and a Thermo Scientific Q Exactive
   Plus Orbitrap Mass Spectrometer.

   The samples were kept at 4°C inside the Vanquish UPLC auto-sampler. The
   injection volume for each sample was 2 μL. A Discovery HSF5 reverse
   phase HPLC column (Sigma) kept at 35°C with a guard column was used for
   reverse-phase chromatography. The mobile aqueous phase was mass
   spectrometry-grade water containing 0.1% formic acid, while the mobile
   organic phase was acetonitrile containing 0.1% formic acid. Mass
   calibration was performed prior to data acquisition to ensure the
   sensitivity and accuracy of the system. The total run time for each
   sample was 15 min, for which 11 min was used for data acquisition. Full
   MS data were acquired to quantify the metabolites while Full MS/ddMS2
   data were also acquired to identify the metabolites based on
   fragmentation matching.

Quantification and statistical analysis

Somatic mutation calling

   WES reads were aligned FASTQ files to the GRCh38 references, including