Abstract

   Correctly identifying perturbed biological pathways is a critical step
   in uncovering basic disease mechanisms and developing much-needed
   therapeutic strategies. However, whether current tools are optimal for
   unbiased discovery of relevant pathways remains unclear. Here, we
   create “Benchmark” to critically evaluate existing tools and find that
   most function sub-optimally. We thus develop the “Pathway Ensemble
   Tool” (PET), which outperforms existing methods. Deploying PET, we
   identify prognostic pathways across 12 cancer types. PET-identified
   prognostic pathways offer additional insights, with genes within these
   pathways serving as reliable biomarkers for clinical outcomes.
   Additionally, normalizing these pathways using drug repurposing
   strategies represents therapeutic opportunities. For example, the top
   predicted repurposed drug for bladder cancer, a CDK2/9 inhibitor,
   represses cell growth in vitro and in vivo. We anticipate that using
   Benchmark and PET for unbiased pathway discovery will offer additional
   insights into disease mechanisms across a spectrum of diseases,
   enabling biomarker discovery and therapeutic strategies.

   Subject terms: Cancer genomics, Tumour biomarkers, Computational
   biology and bioinformatics, Prognostic markers, Drug discovery
     __________________________________________________________________

   Multiple cellular pathways are altered in cancer and identifying them
   is relevant for prognosis and therapy. Here, the authors develop
   Benchmark and Pathway Ensemble Tool (PET), two computational approaches
   to optimise pathway discovery in cancer and predict related biomarkers
   and therapeutic avenues.

Introduction

   The pathogenesis of complex diseases is often underpinned by
   dysregulation of several cellular pathways. This is well exemplified by
   neoplastic diseases, where abnormalities in multiple pathways
   contribute to cancerous transformation or in which cancer-disposing
   events perturb multiple cellular pathways. Despite notable successes in
   some cancers, limited therapeutic gains have been made by
   identification and targeting of single contributory genes. Precisely
   pinpointing key dysregulated celluar pathways in cancer and ranking
   them by order of importance is, therefore, of major biological and
   clinical significance. Doing so would enable molecular subtyping,
   determination of diagnostic/prognostic biomarkers and inform choice of
   effective and cancer-specific drug regimens. This is especially
   important at the present time because the incidence of cancer is
   increasing^[78]1.

   The contemporary approach to elucidate aberrations in biological
   pathways is dominated by omics methods, such as RNA-sequencing
   (RNA-seq). As expected, vast amounts of omics data have been generated
   for most cancers. However, unlocking their full potential has remained
   challenging. Biological pathway analysis, a computational search to
   identify and rank by importance the key relevant and/or dysregulated
   pathways among the thousands of known potential biological pathways, is
   a critical means for deciphering these data. Numerous in silico methods
   (reviewed here^[79]2,[80]3), for example GSEA^[81]4 and Enrichr^[82]5,
   have been developed and broadly applied to conduct pathway analyses.
   One common application of these methods has been to test whether
   specific pathways hypothesized by researchers are, in fact, enriched in
   a dataset. In this approach, pathway enrichment analyses have generally
   proven successful, but pre-suppose a priori knowledge or hypothesis. An
   alternative, highly desired application, especially in the context of
   exponentially growing omics datasets, is the adoption of unbiased
   discovery-based approaches. Here, the objective is to rank the “correct
   pathways” above all other pathways in the reported results without
   imparting bias from a priori knowledge. However, the performance and
   suitability of existing pathway analysis methods for this form of
   blinded or unbiased discovery remains largely unknown. In other words,
   it is not known how often the top pathway(s) returned by an analysis
   method represent(s) the most important perturbation. This is likely due
   to the fact that most methods were developed and evaluated
   (“benchmarked”) using computer simulated pathways^[83]6 or verified
   only in very small scale biological datasets from historical,
   low-resolution microarray technology^[84]2,[85]7–[86]9. Therefore, a
   key limitation is lack of proper benchmarking and assessment of the
   capacity of these tools for pathway discovery under experimental
   settings, which often contain substantial “noise” due to stochastic
   events and biological variations. Such lack of a suitable evaluation
   platform predicated on large-scale experimental data significantly
   impedes informed deployment of pathway analysis tools for blinded
   discovery of perturbations in cancer.

   Here, we designed an extensive “Benchmark” to explicitly measure the
   performance of many commonly used pathway analysis tools for unbiased
   discovery under experimental settings. Benchmark was constructed from
   large and diverse high throughput experimental data generated by
   ENCODE^[87]10, such that expected pathways are well-defined. Using
   Benchmark, we found that commonly used pathway analysis tools
   consistently underperformed in unbiased pathway discovery, even after
   extensive optimization. Given these findings, we next developed another
   method for biological pathway discovery, termed Pathway Ensemble Tool
   (PET), which statistically combines the rank metrics from multiple
   input methods. PET significantly outperformed all existing tools for
   unbiased identification of dysregulated pathways with high accuracy and
   was resistant to biological noise. Importantly, we applied PET to
   systematically determine biological pathways associated with favorable
   or unfavorable prognosis in 12 distinct cancer types. Combinations of
   leading genes from PET-identified prognostic pathways offered superior
   prognostication power compared to existing biomarkers, dividing
   patients into highly resilient and highly vulnerable categories. As a
   way of validation, these findings were confirmed in independent cancer
   datasets and found to be consistent with established aggressive
   molecular subtypes. We next carried out computational screening for
   repurposing existing drugs to normalize genes from PET-derived
   prognostic pathways for each cancer type. This was done by specifically
   looking for drugs that can down-regulate the leading genes of pathways
   associated with unfavorable prognosis or up-regulate the leading genes
   of pathways associated with favorable prognosis. This resulted in the
   identification of both established effective drugs and additional drug
   candidates. To explore these, we validated the top predicted drug,
   CCT068127, a CDK2/9 inhibitor, for bladder and cervical cancer in vitro
   and in vivo settings. We confirmed significant repression of cancer
   growth, which was superior to that seen with known clinically
   efficacious drugs in this category. We demonstrated that the predicted
   drug indeed restricts cancer cells by normalizing the expression of
   genes belonging to predicted prognostic pathways. We anticipate that
   Benchmark will enable development of highly accurate, unbiased pathway
   discovery methods such as PET, which can lead to tangible insights into
   molecular mechanisms and identify additional diagnostic/prognostic
   biomarkers and therapeutics.

Results

Benchmark enables identification of optimal settings and best practices for
systematic pathway discovery in experimental datasets

   We developed Benchmark as an evaluation platform to explicitly evaluate
   performance of pathway/geneset analysis tools (Fig. [88]1a, b; please
   note that the terms ‘pathway’ and ‘geneset’ are used interchangeably
   throughout the manuscript). Briefly, Benchmark comprises three
   components: 1) “input genesets” (IGS), representing genesets to be
   investigated – these are most commonly user lists of genes derived from
   high throughput assays, such as RNA-seq or Chromatin
   Immunoprecipitation sequencing (ChIP-seq) performed during projects of
   interest; 2) “target genesets” (TGS), representing curated biological
   pathways (e.g. in the KEGG and Gene Ontology databases) to be examined
   for their relationship to the IGS; 3) pathway enrichment analysis tools
   which test statistical relationships between IGS and TGS (Fig. [89]1a).
   The genesets used in Benchmark were extracted from large numbers
   (~1000) of high-throughput sequencing experimental assays from
   ENCODE^[90]10 (Supplementary Data [91]1a). Every geneset in Benchmark
   was composed of genes bound in ChIP-seq by a defined transcription
   factor (TF), in eCLIP-seq by a defined RNA binding protein (RBP), or in
   RNA-seq as genes differentially expressed after small interfering or
   hairpin RNA-mediated knockdown of a specified target (gKD). For each
   TF, RBP or gKD, at least two genesets from distinct cell lines (e.g.
   K562 and HepG2) or species (e.g. human and mouse) were represented.
   Importantly, Benchmark was designed to simulate a common scenario where
   the investigated genesets (i.e. IGSs) are derived from different
   sources (e.g. cell lines, species, conditions) than established
   biological pathways (i.e. TGSs). To achieve this, Benchmark was created
   such that genesets from the same TF, RBP or gKD but from distinct cell
   lines are better matched to each other than any other genesets. As an
   example, the IGS that contains STAT1-bound genes in K562 cells is more
   closely related to the TGS that contains STAT1-bound genes in HepG2
   cells than any other TGS.

Fig. 1. Benchmark for evaluating and optimizing existing pathway analysis
methods.

   [92]Fig. 1
   [93]Open in a new tab

   a Schematic representation of Benchmark for each pathway analysis
   method. For every input geneset (IGSi), the outputs, of pathway
   enrichment analysis against all target genesets (TGSi) are recorded in
   the score matrix M. Benchmark is created such that TGSi is the most
   related geneset to IGSi compared to all other TGSs. Thus, the rank of
   diagonal entries in M represent how well the correct genesets are
   scored for each IGS. The best score is 1. b, c Cumulative distribution
   function (CDF) plots of correct pathway ranks comparing indicated
   pathway analysis methods on shRNA-seq genesets. Median ranks for
   correct pathways, the fraction of correct pathways among top 10
   reported pathways (precision@10/P@10, and average precision@10/AP@10)
   are shown. d Plots showing the correct pathway ranks of the indicated
   methods in shRNA-seq genesets. Boxes span from the first to the third
   quartile, with the median highlighted, and whiskers covering the
   minimum to maximum values. The precision statistics of each method at
   finding the correct pathway amongst reported pathways are reported
   below each bar. Target genesets in b–d were defined as the top 200
   upregulated genes by DESeq2 in K562 cells; input genesets in b–d for
   default and optimized settings were extracted using signal2noise
   (default setting of GSEA) and DESeq2 in HepG2 cells, respectively. e
   Plots showing the performance of multiple methods as increasing
   percentage of unrelated random genes (“noise”) are introduced to target
   genesets. This evaluates the robustness of methods to inaccuracies (or
   noise) in target genesets. Lines and error bars show median +95%
   confidence interval at each noise level. *p < 0.05; **p < 0.01;
   ****p < 0.0001 by two-sided Wilcoxon rank sum test corrected for
   multiple hypothesis. ora! and ora are essentially identical, with the
   only difference being that they are implemented using different
   programming languages. Specifically, ora! is implemented in R by egsea,
   while ora is implemented in Python by PET. Source data are provided as
   a Source Data file.

   Consequently, for each IGS, an ideal pathway analysis method should
   identify the most closely matched (i.e. “correct”) TGS and to rank it
   as number one compared to all other TGSs. Thus, in Benchmark, the
   enrichment of TGS[1 to n] is calculated for each IGS using the user’s
   pathway analysis method of choice to generate a score matrix, from
   which the rank of correct pathways can be determined for that method
   (Fig. [94]1a). Thus, by evaluating how accurately pathway analysis
   methods rank matched IGSs and TGSs compared to all other TGSs,
   Benchmark can assess the performance of different methods in
   identifying and ranking correct biological pathways in experimental
   settings.

   Using Benchmark, we first evaluated the performance of different
   pathway analysis tools, namely, decoupler^[95]11, piano^[96]12,
   egsea^[97]6, ssgsea^[98]4, camera^[99]13, ora^[100]6, enrichment
   browser^[101]14 (e.browser), zscore^[102]15, gsva^[103]16,
   padog^[104]17, roast^[105]18, plage^[106]19, gage^[107]20 and
   safe^[108]21. We compared three statistics for all pathway analysis
   methods. First, we determined the median rank of the correct pathway
   returned by each method. As experimental biologists typically focus on
   the top 10 reported pathways, we also determined the frequency by which
   the correct pathway is among the top 10 reported pathways, termed
   “Precision@10” (P@10). We also adopted another reliable performance
   metric commonly used in information retrieval, known as Average
   Precision at 10 (AP@10). AP@10 is the average (mean) of the precision
   scores at each of the first 10 positions, i.e. P@1, P@2,…, P@10. We
   tested four different settings for how the genes can be ranked and
   genesets can be generated in these methods (i.e. ratio of classes or
   signal2noise or edgeR- or DESeq2-based ranking (Fig. [109]1b,
   Supplementary Fig. [110]S1a). The genes generated from matched IGSs and
   TGSs using these settings exhibited a significantly higher degree of
   overlap compared to randomly generated control IGSs and TGSs
   (Supplementary Fig. [111]S1b). The top three methods,
   decoupler^[112]11, piano^[113]12, egsea^[114]6, notably all ensemble
   approaches, achieved a median correct pathway rank of 1–8, P@10 of
   52–76%, and AP@10 of 44–69%, under the best setting (Fig. [115]1b).
   This creates a scenario where pathways with significant biological
   importance are regularly not among the top 10 reported pathways,
   hindering the utility of these methods for unbiased discovery.

   We next focused on evaluating the performance of the three most used
   individual pathway analysis methods ora^[116]6, GSEA^[117]4 and
   Enrichr^[118]5 in more detail. We tested the four different settings
   described above (Supplementary Fig. [119]S1c) as well as 3 different
   options for sizes of genesets (i.e. 200, 500 and 1000 genes)
   (Supplementary Fig. [120]S1d). Specifically, under the default settings
   (i.e. signal2noise ranking), the median ranks of the correct pathway
   (7–14) and P@10 (45–54%) values were similar to the top three
   performing methods above (Fig. [121]1c, Supplementary Fig. [122]S1c),
   indicating that ora, GSEA and Enrichr are not ideal for unbiased
   pathway discovery under default settings. We found that optimization of
   input parameters did significantly improve pathway discovery - a
   geneset size of 200 genes was near optimal for both GSEA and Enrichr
   (Supplementary Fig. [123]S1d) and using DESeq2-ranked lists of
   differentially expressed genes returned the correct pathway at a median
   rank of 3 and 63–68% of the time among the top 10 reported pathways
   (Fig. [124]1c, Supplementary Fig. [125]S1c). Similar improvement could
   be achieved when selecting the optimal binding intensity statistics in
   ChIP-seq or eCLIP-seq. For example, using the signal value of binding
   in ChIP-seq data to rank genes and create genesets usually leads to an
   improved pathway discovery performance compared to utilizing adjusted
   p-values (or q-values) of binding signals (Supplementary
   Fig. [126]S1e). It is worth noting that the overlap size of matched
   IGSs and TGSs generated from ChIP-seq and eCLIP-seq data was
   significantly larger compared to that of shRNA-seq data (Supplementary
   Fig. [127]S1f), which could explain the overall trend of better
   performance observed in pathway analysis methods on these genesets.

   Collectively, these data indicate that Benchmark can be used for
   evaluating existing and future pathway analysis methods for discovery.
   Notably, although the most widely used individual methods can be
   optimized, they remain sub-optimal for unbiased pathway discovery, and
   ensemble approaches typically outperform these individual methods.

Pathway Ensemble Tool (PET) enables unbiased pathway discovery

   We next sought to develop a high precision method to enable unbiased
   systematic pathway discovery. Leveraging the accuracy and robustness of
   ensemble approaches, we created Pathway Ensemble Tool (PET). PET works
   by calculating the unweighted mean of the rank statistics from GSEA,
   ora, and Enrichr, all operating under the optimal settings determined
   above (see Methods). We tested the accuracy of PET using Benchmark. PET
   significantly improved the median rank of the correct genesets to 1,
   indicating that PET tends to rank the correct pathway as number 1
   (Fig. [128]1d). PET also significantly enhanced P@10 and AP@10
   statistics, with 73–82% of the time the correct pathways being among
   the top 10 pathways reported by PET (Fig. [129]1d).

   Pathway analysis tools are expected to conform to the false discovery
   rate (FDR) estimates, meaning that at 5% FDR levels, 5% of reported
   pathways are expected to be false. To determine the actual false
   discovery rates of pathway analysis methods, we randomly sampled and
   divided replicated RNA-seq datasets into two groups and ran pathway
   analysis methods comparing the two groups for enrichment of biological
   pathways (see Methods). Since the samples were randomly selected
   biological replicates, any reported pathways significantly different
   between the groups represent false discoveries. Surprisingly, we found
   that some of the pathway analysis methods had much higher than the
   expected false discovery rate, in some cases as high as 25%
   (Supplementary Fig. [130]S1g, h). In fact, specific pathways, such as
   the KEGG Ribosome pathway, were common false discoveries (Supplementary
   Fig. [131]S1i and Supplementary Data [132]1b), suggesting that caution
   should be exercised when these pathways are identified as significant
   during unbiased discovery approaches. However, we found that PET
   operated at the expected 5% false discovery rate (Supplementary
   Fig. [133]S1g, h).

   To further validate PET reported results and ensure its ability to
   discern pathways associated with alterations that drive differences in
   transcriptional programs within groups, we applied PET to compare
   fibroblast growth factor receptor 3 (FGFR3) mutant vs. wild-type (WT)
   bladder cancer cell lines, estrogen receptor (ER)^+ vs. ER^– breast
   cancer cell lines, and epidermal growth factor receptor (EGFR) mutant
   vs. WT lung cancer cell lines, available in CCLE^[134]22 (Supplementary
   Data [135]1c), assessing all canonical pathways. In all three
   comparisons, the top reported pathways were supported by the literature
   (Supplementary Fig. [136]S2). For instance, the top two enriched
   pathways in ER^+ breast cancer cell lines were “Estrogen dependent gene
   expression” and “Estrogen receptors (ESR)-mediated signaling”, which
   corresponds with the established understanding of estrogen receptor
   signaling in ER^+ breast cancer^[137]23,[138]24. In EGFR mutant lung
   cancer cell lines, the top two enriched pathways were “Surfactant
   metabolism” and “Integrin5 pathway”, which are both recognized for
   their cross-talk with EGF signaling in lung cancer^[139]25–[140]29.
   Moreover, the “MYC activity pathway” and its regulated “Ribosomal RNA
   processing” pathway were among the top enriched pathways in FGFR3
   mutant bladder cancer cell lines, aligning with known interactions
   between FGFR3 and MYC in bladder cancer, as supported by existing
   literature^[141]30–[142]34 (Supplementary Fig. [143]S2). These data
   demonstrate that PET is able to discern subtle yet significant
   variations in pathway activities across different genetic backgrounds.

   Because most pathways are defined under real experimental conditions,
   there are multiple variables, such as cell type, species and
   stimulation, that contribute a degree of differences or inaccuracies in
   genesets. This “noise” manifests as missing or unrelated genes among
   matched IGSs and TGSs. To test the robustness of common pathway
   analysis tools, we evaluated their performance under conditions of
   simulated noise by replacing genes from the target genesets with
   increasing proportions of unrelated genes selected randomly from the
   genome. We found most available pathway analysis methods to be highly
   sensitive to such differences as their performance significantly
   dropped after >30% of unrelated random genes were introduced into
   existing genesets (Fig. [144]1e). Importantly, however, PET was highly
   robust to even as high as 60% added noise (Fig. [145]1e). Collectively,
   these results indicate that PET is robust, has controlled false
   discovery rate and high precision, thus can enable reliable and
   systematic identification of key dysregulated pathways in omics
   datasets.

PET identifies prognostic pathways in 12 distinct cancer types

   We next sought to deploy PET for unbiased pathway discovery in cancer.
   We selected 12 distinct cancer types from the cancer genome atlas
   (TCGA) that had cancer tissue RNA-seq data from the earliest stages of
   disease (based on the pathological stage or primary tumor size/extent)
   in at least 20 subjects that had unfavorable prognosis (i.e. succumbed
   to death) during the follow-up period (Fig. [146]2a; Supplementary
   Data [147]2a, b). We used PET to identify the pathways enriched in
   samples from patients with favorable (survivors) or unfavorable
   (deceased) prognosis for each cancer type, since these pathways could
   act as potential biomarkers of outcomes. To ensure that age, sex and
   cancer stage were not confounders, patients with favorable prognosis
   were randomly matched with those with unfavorable prognosis (see
   Methods). We confirmed that the groups were matched in age and sex and
   that age and sex were not determinants of outcome (Supplementary
   Fig. [148]S3). Since tumor infiltrating lymphocytes, classically
   depicted by CD8-expression, are known determinants of outcome, we also
   noted that CD8 expression did not differ between subjects with
   favorable and unfavorable outcomes and that CD8 expression could not
   distinguish survivors from deceased subjects (Supplementary
   Fig. [149]S4a). We next examined >1000 canonical biological pathways
   curated in MSigDB^[150]4 that could potentially differentiate cancer
   patients that survived from those that did not using PET. Out of these
   pathways, PET identified tens of biological pathways significantly
   associated with favorable or unfavorable prognosis for each cancer type
   (Fig. [151]2b, Supplementary Data [152]2c). Examples of the top
   favorable and unfavorable prognostic pathways in each cancer types are
   shown in Supplementary Fig. [153]S4b and Fig. [154]2c, respectively. As
   proof-of-principle, we validated the reported pathways associated with
   unfavorable outcome in bladder cancer in three independent
   cohorts^[155]35–[156]37 (Supplementary Fig. [157]S4c). Some pathways
   were significantly dysregulated in multiple different cancers (e.g.
   cell cycle in bladder carcinoma (BLCA), cervical squamous cell
   carcinoma (CESC), kidney renal papillary cell carcinoma (KIRP) and
   pancreatic adenocarcinoma (PAAD)) whereas others were cancer-specific
   (e.g. glycerolipid metabolism in breast invasive carcinoma (BRCA) and
   arginine and proline metabolism in KIRP. Some of these pathways have
   been shown to have biological significance in previous studies but have
   not yet been formally recognized as prognostic markers. For example, we
   identified triacylglyceride synthesis as a significant poor prognostic
   pathway in breast carcinoma (Fig. [158]2c). This is consistent with
   fatty acid synthesis as a requisite for breast cancer metastasis to
   brain^[159]38 and association of upregulated cellular triacylglycerol
   with enhanced survival of human breast cancer cells^[160]39. Another
   identified poor prognostic pathway that was shared between multiple
   cancer types including bladder and pancreatic cancers was extracellular
   matrix (ECM) organization (Fig. [161]2c). This is in line with prior
   knowledge about the role of ECM components, such as laminin, collagen
   and fibronectins in invasion, progression, and metastasis of bladder
   and pancreatic cancers and restriction of antitumor drug
   delivery^[162]40,[163]41. Similarly, we found that Vitamin D (VitD)
   receptor pathway to be a strong favorable prognostic pathway in stomach
   cancer (Supplementary Fig. [164]S4b). This is congruent with the fact
   that in gastric cancer patients, VitD levels serve as an independent
   prognostic factor and VitD supplementation can reduce the risk of
   developing gastric cancer^[165]42,[166]43.

Fig. 2. PET identifies pathways and gene combinations associated with
unfavorable prognosis.

   [167]Fig. 2
   [168]Open in a new tab

   a Cancers used in this study, along with tissue type, pathological
   stage (roman numerals) or pathological primary tumor TNM T score (e.g.
   T1, T2, T3) and number of samples. BLCA bladder carcinoma, BRCA Breast
   invasive carcinoma, CESC Cervical squamous cell carcinoma, COAD Colon
   adenocarcinoma, HNSC Head and Neck squamous cell carcinoma, KIRC Kidney
   renal clear-cell carcinoma, KIRP Kidney renal papillary cell carcinoma,
   LIHC Liver hepatocellular carcinoma, LAUD Lung adenocarcinoma, LUSC
   Lung squamous cell carcinoma, PAAD Pancreatic adenocarcinoma, STAD
   Stomach adenocarcinoma. To adjust for confounders, samples from later
   stages of diseases have also been included in cancers denoted by #. b
   Plot showing the number of significantly enriched pathways identified
   using PET among >1000 MSigDB canonical pathways in patients with
   favorable (blue) or unfavorable (red) prognosis in the indicated
   cancers. c Top pathways from each cancer associated with unfavorable
   survival. FDR values are from PET. d Area under the curve (AUC)
   distribution of indicated sets used to predict deceased status in each
   cancer type. Leading genes were selected from top 20 significant
   pathways associated with unfavorable prognosis. DEGs, differentially
   expressed genes; error bars show mean ± s.d. AUCs of leading gene
   combinations are higher than all genes and the AUCs of 5-gene
   combinations are higher than individual leading genes (Wilcoxon rank
   sum test p-value < 1e−9) (Supplementary Data [169]2d). e Elbow plots
   showing significant AUCs (low to high) from all 1–5 leading gene
   combinations. Statistics were based on FDR-adjusted logrank test of
   Kaplan–Meier (KM) survival curves. BRCA and LIHC did not yield any
   significant unfavorable predictors. KM plots showing of the top
   combination biomarker (darker lines in f), the top DEG (lighter lines
   in f) of unfavorable survival, PAI-1 expression (orange lines in g) and
   uPA expression (blue lines in g) to stratify overall survival. Samples
   in f and g were evenly split into two groups according to
   z-score/expression values of the indicated biomarker. Hazard ratio (HR)
   and logrank test p-values are reported. *p < 0.01; **p < 0.001;
   ***p < 0.0001. See Supplementary Fig. [170]S4 for information on
   favorable biomarkers. Source data are provided as a Source Data file.

PET-identified biological pathways guide combination biomarker discovery

   We next explored the genes in significant prognostic pathways to
   identify a subset of genes with high prognostic power that might have
   clinical utility. Single gene expression values have provided some
   level of prognostic power for different cancers^[171]44,[172]45.
   However, expression of multiple biomarkers are consistently more
   accurate diagnostic/prognostic tools to stratify patients and improve
   cancer management^[173]46,[174]47. This has fueled a growth in studies
   identifying gene signatures associated with response to therapy and/or
   survival. However, these signatures are typically derived from lists of
   differentially expressed genes and characteristically consist of
   hundreds of genes, making clinical application of these large
   signatures challenging^[175]48. The alternative brute force strategy,
   which involves iteratively uncovering combinations of genes with
   prognostic power among ~20,000 protein-coding genes in the genome, is
   computationally challenging (or impossible) and statistically
   unfeasible. For example, an exhaustive search space for a 3-gene
   combination has 8 trillion permutations, which is statistically
   unfeasible given the number of cancer samples available for study. We
   postulated that limiting the search space of combinatorial biomarker
   discovery to leading genes from prognostic pathways might alleviate
   this problem. We therefore selected up to 20 of the most significant
   prognostic pathways reported by PET and extracted their leading genes
   from the corresponding GSEA results. We then generated all possible 1–5
   gene combinations from the leading genes and quantified prognostic
   power by calculating the area under the curve (AUC) statistic of the
   receiver operating characteristic (ROC) curve for the ability of a gene
   or gene combinations to distinguish survivors from deceased subjects
   (see Methods). Combinations of 3–5 leading genes from prognostic
   pathways were found to be the best predictor of prognosis. Notably,
   none of the individual gene biomarkers displayed a higher AUC than the
   combinations identified. These are shown in Fig. [176]2d, e,
   Supplementary Fig. [177]S4d, e and Supplementary Data [178]2d, e,
   together with their corresponding overall survival plots (Fig. [179]2f,
   Supplementary Fig. [180]S4f). As an example, high expression of two
   collagen genes (COL5A1, COL3A1) and vitronectin (VTN) provides a
   significantly unfavorable overall survival outcome (<20% 10-year
   survival; HR 9.5; p < 0.0001) in renal papillary cancers (KIRP;
   Fig. [181]2e, f). The ability of this 3-gene combination to distinguish
   survivors from non-survivors over a prolonged period of follow-up is
   superior to any other biomarkers of renal cancer in current
   use^[182]49. Previous studies have linked the expression of individual
   collagen genes such as COL5A1 and COL23A1 to renal
   cancer^[183]50,[184]51. Collagen expression has also been associated
   with tumor grade^[185]52 and plays a key role in promoting tumor cell
   invasion^[186]53 in renal cancers. Furthermore, vitronectin is linked
   to inducing the differentiation of cancer stem cells and the formation
   of tumors^[187]54. Thus, it is biologically conceivable that high
   expression of this 3-gene combination biomarker provides the observed
   dramatically unfavorable overall survival outcome, surpassing the
   prognostic power of individual collagen genes alone.

   Moreover, the combination biomarkers outperformed any individual gene
   including differentially expressed genes (DEGs; Supplementary
   Data [188]2f) in all cancers, and no individual gene was found to be
   the most effective on its own (Fig. [189]2d, e, Supplementary Fig.
   [190]S4d, e). Illustrative examples are indicated in Fig. [191]2f and
   Supplementary Fig. [192]S4f, showing Kaplan–Meier overall survival
   curves of the top combination biomarker compared to the top DEG in each
   cancer type. As further illustration, we compared the performance of
   the PET-derived combination biomarkers (Fig. [193]2f) to that of PAI-1
   (Plasminogen activator inhibitor 1) and uPA (Urokinase plasminogen
   activator) (Fig. [194]2g), because these two genes are frequently
   investigated as potential cancer prognostic markers^[195]55. PAI-1 is
   associated with poor prognosis in some cancer types, such as breast,
   colon, and lung cancer. uPA is a protease involved in the degradation
   of the extracellular matrix, and linked to tumor invasiveness,
   angiogenesis, and poor prognosis in some cancers, including breast,
   prostate, and ovarian cancers. We found that although elevated
   expression of PAI-1 in CESC, LUSC, PAAD, and STAD, and uPA in PAAD may
   have prognostic value (Fig. [196]2g), they are both inferior to the
   identified top combination biomarkers (Fig. [197]2f). Moreover, the
   identified combination biomarkers accurately predicted outcomes across
   independent cohorts of patients. For example, >70% of biomarkers in
   TCGA cohorts exhibited consistent favorable or unfavorable overall
   survival outcomes in several independent cohorts of
   bladder^[198]35–[199]37, breast^[200]56, pancreatic^[201]57, and
   stomach^[202]58,[203]59 cancers (Supplementary Fig. [204]S4g;
   Supplementary Data [205]2g). We also investigated the percentage of
   combination biomarkers that consistently exhibited predictive power in
   later-stage samples. We found that in certain cancer types (e.g. HNSC,
   PAAD), the early-stage biomarkers consistently predict outcomes even in
   later stages of cancer. However, in other cancer types (e.g. BLCA,
   KIRP), only a small fraction of the early-stage biomarkers demonstrate
   consistency in their predictive ability (Supplementary Data [206]2h,
   i).

   Mutations play an important role as biomarkers in clinical settings,
   therefore we sought to compare the common cancer mutations versus gene
   expression-based biomarkers for their power in determining overall
   survival outcome. To this end, we focused on the three most frequently
   mutated genes in cancer samples, BRAF, TP53 and KRAS, that often confer
   poor prognosis^[207]60–[208]62. For each TCGA cancer type with
   sufficient number of samples exhibiting mutations (n > 15), we split
   the samples into two groups based on their mutational status: wild-type
   samples without the mutation and samples containing mutant oncogenes.
   To assess whether these mutations could serve as indicators of
   survival, we conducted log-rank tests between the wild-type (WT)
   samples and mutated (mut) samples for each cancer type. Among all the
   cancer types examined, pancreatic cancer (PAAD) was the only one in
   which the presence of KRAS mutation served as a significant indicator
   of poorer overall survival. (Hazard ratio = 1.8; logrank
   p-value = 0.02) (Supplementary Fig. [209]S4h). Nevertheless, the KRAS
   mutation exhibited less discriminatory power compared to gene
   expression-based biomarkers in distinguishing between favorable
   (Supplementary Fig. [210]S4f) and unfavorable (Fig. [211]2f) survival
   outcomes.

   We also assessed the susceptibility of the identified biomarkers or
   conventional DEGs and their combinations to overfitting, a commonly
   encountered pitfall in the search for trustworthy biomarkers that leads
   to poor performance when tested on different samples. We found that
   combination biomarkers were significantly less prone to overfitting
   compared to those derived from conventional DEGs (Supplementary
   Fig. [212]S4g). This indicates that combination biomarkers have a
   better balance between complexity and robustness, which makes them more
   suitable for real-world applications. Collectively these data indicate
   that combination biomarkers identified using PET have great potential
   to stratify patient outcomes and perform better than existing known
   biomarkers.

   Identifying resilient and vulnerable cancer patients is crucial as it
   allows healthcare providers to personalize treatment plans, improve
   prognosis predictions, prioritize resources, and match patients with
   appropriate clinical trials. This leads to more effective and efficient
   care, better outcomes, and improved quality of life for patients. We
   thus considered whether combining the top biomarkers of favorable with
   the top biomarkers of unfavorable outcome would provide added
   prognostication power. To do this, we divided the samples into two
   groups based on expression of the top favorable combination biomarker
   and two groups based on expression of the top unfavorable combination
   biomarker. By overlapping these groups, patients can be further divided
   into subgroups according to simultaneous low and high expressions of
   the top biomarkers of favorable and unfavorable survival. The resilient
   category was defined as having high levels of the top favorable
   combination biomarker and low levels of the top unfavorable combination
   biomarker, while the vulnerable category was defined as having low
   levels of the top favorable combination biomarker and high levels of
   the top unfavorable combination biomarker. Indeed, this approach
   identified highly resilient and highly vulnerable cancer patients
   (Fig. [213]3a, Supplementary Fig. [214]S5a, Supplementary
   Data [215]3a). For example, resilient patients with papillary or clear
   cell kidney cancer (KIRP and KIRC) had nearly 90–100% survival, while
   vulnerable patients had only 10–20% overall survival over a prolonged
   period of follow-up (~4000 days) (Fig. [216]3b). This prognostication
   strategy also corroborated with established independent prognostic
   biomarkers in these patients^[217]63. Specifically, vulnerable patients
   have more than two-fold higher frequency of repressed CDKN2A expression
   via copy number loss or DNA methylation when compared to resilient
   patients (Fig. [218]3c, Supplementary Data [219]3b). The same approach
   in other cancers also yielded added benefit, because in almost all
   cancers combining the top biomarker of favorable and the top biomarker
   of unfavorable outcome provided highly accurate prognostication
   (Supplementary Fig. [220]S5a, Supplementary Data [221]3a). For
   instance, the 10-year overall survival rate for resilient patients with
   bladder cancer is nearly 9 times higher than that of vulnerable
   patients (HR: 4.6; p < 0.0001; Supplementary Fig. [222]S5a). Such
   accurate prognostication can have a major influence on clinical
   decision making and treatment course and can potentially help
   rationalize less or more aggressive individualized treatment for
   patients within different vulnerability.

Fig. 3. Juxtaposition of favorable and unfavorable prognostic biomarkers
offers precise overall survival stratification.

   Fig. 3
   [223]Open in a new tab

   a Venn diagrams showing the distribution of patients with kidney renal
   papillary cell carcinoma (KIRP) (left) and kidney renal clear cell
   carcinoma KIRC (right) based on expression of the top favorable
   (labeled in Supplementary Fig. [224]S4e) and the top unfavorable
   (labeled in Fig. [225]2e) combination prognostic markers on RNA-seq
   from tumor samples. b Kaplan–Meier (KM) overall survival curves for 4
   subgroups of KIRP or KIRC patients whose tumors express indicated
   combination of prognostic biomarkers. The four subgroups are
   highlighted with a bold font in a. Hazard ratio (HR) and logrank test
   p-values denote comparisons with the lines in red. ns not significant;
   *p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001. c Plots showing
   the fraction of KIRP (left) and KIRC (right) tumors which have
   alterations (methylation or copy number loss) in CDKN2A^[226]63 in each
   group. Source data are provided as a Source Data file.

   The molecular subtype of tumors, particularly basal and luminal
   subtypes, is one of the factors implicated in predicting the outcome of
   bladder cancer. The basal subtype is associated with muscle
   invasiveness and worse overall prognosis but is more responsive to
   platinum-based chemotherapy^[227]64. To verify if identified prognostic
   markers corroborate with this known prognostic factor, we sourced three
   independent cohorts of bladder cancer samples^[228]35,[229]64,[230]65
   that was annotated with the two major molecular subtypes. Indeed, we
   found that combination biomarkers of unfavorable prognosis were
   expressed at high levels in the basal molecular subtype across all
   three cohorts (Supplementary Fig. [231]S6a, Supplementary
   Data [232]3c).

   We had previously demonstrated using Benchmark that PET is highly
   robust to noise (Fig. [233]1e). To formally test this under real-world
   settings, we obtained RNA-seq datasets from a different species of
   mammal. We selected canine bladder cancers that included annotations
   for luminal and basal subtypes^[234]66 as these sub-types behave
   similarly to humans and are considered as having good and poor
   prognosis, respectively. The genes associated with poor prognosis in
   human bladder cancer effectively differentiated luminal and basal
   subtypes of canine bladder cancer as well (Supplementary
   Fig. [235]S6b), demonstrating the robustness of these biomarkers even
   across different species. Collectively, these data indicate that our
   biomarker discovery approach can identify combination prognostic
   biomarkers with significant clinical value and to stratify molecular
   subtypes of cancer with differential prognosis.

PET-identified biological pathways guide drug repurposing

   Since prognostic markers are derived from dysregulated pathways, we
   postulated that manipulating gene expression to promote the biomarkers
   of favorable prognosis and demote the biomarkers of poor prognosis by
   re-purposing known drugs might provide clinical benefit. Major
   consortium efforts have created a valuable library of integrated
   cellular signatures (LINC) that includes cellular transcriptional
   responses to large numbers of pharmacologic perturbagens^[236]67.
   Similar signatures have also been derived from mining large numbers of
   publicly available sequencing read archive (SRA) data^[237]68. Recently
   we showed that computational screening for drugs that normalize
   expression of genes in highly prognostic and pathogenic pathways using
   cellular signatures of those drugs provides significant therapeutic
   potential^[238]69,[239]70. Thus, we computationally screened >1600
   drugs (mostly FDA-approved) for which the up- or down-regulated drug
   targets are known^[240]68 (Supplementary Data [241]4a). Drugs were
   ranked by their ability to suppress unfavorable and to promote
   favorable prognostic biomarkers. The rank was calculated based on the
   significance of overlap between genes up/down regulated by a drug and
   genes from favorable/unfavorable prognostic pathways in cancer
   (Fig. [242]4a; see Methods). This in silico screening predicted several
   drugs already in clinical use for each cancer type, as well as many
   drugs that have not yet been tested (Fig. [243]4b and Supplementary
   Data [244]4b). Key interactions between highly significant unfavorable
   prognostic pathways and drugs that can normalize these are summarized
   for each cancer type in Fig. [245]4c. By way of validation, the top
   predictions included drugs currently in clinical practice or under
   trial for some of these cancers. For example, doxorubicin, a commonly
   used chemotherapy agent for the management of metastatic breast
   cancer^[246]71,[247]72, was the second top prediction for breast
   cancer. Celastrol, the top prediction for colon cancer, was recently
   shown to inhibit proliferation of colorectal cancer cells and migration
   via the PI3K/AKT pathway^[248]73. Ruxolitinib, the top prediction for
   liver cancer, is known to have marked tumoricidal effects on
   hepatocellular carcinoma by inhibiting JAK/STAT signaling^[249]74.
   Interestingly, irinotecan, a topoisomerase inhibitor, which regulates
   the cell cycle, was the top prediction for pancreatic cancer (PAAD), a
   tumor type in which “cell cycle” is among the top pathways associated
   with unfavorable prognosis (Fig. [250]2c). Last year, the FDA approved
   investigational use of liposomal irinotecan for first line treatment of
   pancreatic cancer patients^[251]75–[252]77. Two other pathways strongly
   associated with unfavorable outcome from pancreatic adenocarcinoma are
   the Erbb1 (also known as EGFR) and MYC pathways; both of these have
   previously been associated with worse outcomes from this cancer
   type^[253]78,[254]79 and are also inhibited by irinotecan^[255]80.
   These data indicate that matching drug signatures and prognostic
   markers can be used for repurposing existing therapeutics.

Fig. 4. PET-derived prognostic pathways guide drug repurposing.

   [256]Fig. 4
   [257]Open in a new tab

   a Schematic of drug screening aims; computational screen searches for
   drugs that upregulate expressions of genes from pathways with favorable
   prognosis and downregulate expressions of genes from pathways with
   unfavorable prognosis. b Elbow plots showing the significance of
   overlap between drug up (or down) regulated genes with leading genes
   from pathways with favorable (or unfavorable) prognostic potential.
   Drugs are sorted based on the significance (see Methods). Top two drugs
   for each cancer type are highlighted. c Sankey plot showing top
   pathways associated with unfavorable prognosis for each cancer type on
   the left and top two predicted drugs that demote expression of genes in
   indicated pathway on the right. The thicknesses of links are
   proportional to the significance of the pathway (E-value > 5; left; see
   Methods) or the number of genes affected by drug (right). Only drugs
   that have a significant impact on the top prognosis pathways for each
   cancer are displayed. As a result, cancers without drugs affecting
   their top predicted prognosis pathways are not displayed. Source data
   are provided as a Source Data file.

The CDK2/9 inhibition restricts cervical and bladder cells in vitro and/or in
vivo

   We next sought to test the efficacy of one of our predicted drugs. The
   top predicted drug for bladder and cervical cancer was a
   cyclin-dependent kinase (CDK)2/9 inhibitor named 0175029
   (Fig. [258]4b). This was somewhat surprising because inhibitors of
   other CDKs, specifically CKD4/6 inhibitors, have previously been
   proposed as monotherapy for bladder cancer. The use of CDK4/6
   inhibitors have been justified because elevated expressions of either
   CDK4 or CDK6 are associated with poor survival in bladder cancer,
   whereas high expression of CDK2/9 have been associated with better
   survival (Supplementary Fig. [259]S7a). However, CDK4/6 inhibitors
   including Palbociclib did not meet the primary endpoint during phase II
   clinical trials for bladder cancer^[260]81–[261]84. Moreover, taking
   the conventional DEG-based approach, where one normalizes the most
   differentially expressed genes between cancer survivors and
   non-survivors, does not identity CDK2/9 inhibitor as a significant lead
   in either bladder or cervical cancers (Supplementary Fig. [262]S7b,
   Supplementary Data [263]4c). However, it is worth noting that although
   (CDK)2/9 inhibitor emerges as the top prediction for bladder cancer
   using DEG-based approach, it does not reach the statistical
   significance threshold (Supplementary Fig. [264]S7b). We therefore
   scrutinized whether PET-based drug prediction specifically advocates
   CDK2/9 or CDK4/6 inhibition in bladder cancer. Since, the drug database
   used for screening above^[265]68 did not include any CDK4/6 inhibitors,
   we computationally screened an additional 5425 compounds from the
   LINC1000 database^[266]85 that included the CDK4/6 inhibitor
   Palbociclib. PET-based drug predictions did not advocate Palbociclib
   (p-value = 1; rank 1601) but returned 0175029, a CDK2/9 inhibitor as
   the number 1 scoring drug that can significantly normalize genes of
   prognostic pathways (Supplementary Fig. [267]S7c, Supplementary
   Data [268]4d).

   For experimental validation, we then tested the ability of a panel of
   CDK inhibitors to restrict the growth of bladder and cervical cancers
   in vitro, including Palbociclib (a CDK4/6 inhibitor), GW8510 (a CDK2
   inhibitor), and Seliciclib (a CDK2/7/9 inhibitor) (Supplementary
   Fig. [269]S7d). Because 0175029 is not commercially available we used
   its recently identified analog CCT068127 which is also a potent CDK2/9
   inhibitor^[270]86. Consistent with PET-based drug prediction,
   CCT068127, but not Palbociclib, significantly inhibited the growth of
   both cervical and bladder cancer cell lines, whereas the others either
   did not, or had minimal effect (Fig. [271]5a). We also found that in
   two out of three independent cohorts of bladder cancer samples, the
   expression of CDK9 (but not CDK2) was higher in the basal subtypes of
   bladder cancers compared to luminal subtypes (Supplementary
   Fig. [272]S7e), suggesting that CDK9 inhibition might be more
   beneficial for the basal subtype of bladder cancer. CDK9 expression
   also showed a significant correlation (Pearson r = 0.71;
   p-value = 0.006) with unfavorability scores across bladder cancer cell
   lines available in the CCLE database^[273]22 (see Fig. [274]5b and
   Supplementary Data [275]5a). The unfavorability score for each sample
   was calculated as the mean (z-score) expression of genes associated
   with an unfavorable outcome (Supplementary Data [276]2e). To expand on
   this, we referenced publicly available data from the GDSC
   database^[277]87 where a range of bladder cancer cell lines were
   treated with a different CDK9 specific inhibitor (i.e. CDK9_5038) and
   IC50 values were calculated. Despite the difference in inhibitors, we
   observed a trend where cell lines with higher unfavorability scores
   required higher concentrations of the inhibitor to achieve similar
   inhibition of proliferation (Pearson r = 0.54, p < 0.06)
   (Fig. [278]5b). These associations suggest a broader efficacy of CDK9
   inhibitors across bladder cancer cell lines, with efficacy correlating
   inversely with unfavorability scores.

Fig. 5. Predicted CDK2/9 inhibition restricts the growth of cervical and
bladder cancers in vitro and in vivo.

   [279]Fig. 5
   [280]Open in a new tab

   a Plots showing the survival rate, measured by CCK-8 assay, of cervical
   (Hela) and bladder (T24 and UMUC3) cancer cell lines treated with
   different concentration of vehicle or indicated drugs for 72 h. The
   selective targets of CDK inhibitors are shown in parenthesis.
   Mean +/− sem from n > 3 biological replicates. b Plot showing
   expression of the CDK9 gene versus the unfavorability score for each
   bladder cancer cell line in the CCLE database^[281]22. The color scale
   represents the IC50 values reported by the GDSC database^[282]87 for
   the CDK9-specific inhibitor CDK9_5038. Cell lines for which IC50 values
   are not available are highlighted in gray. The Spearman correlation
   value and corresponding p-value are shown in red. The center line and
   error bars indicate the trend line and 95% confidence interval.
   Selected cell lines are labeled. c Heatmap showing differentially
   expressed genes (DEGs; 3-fold at adjusted p-value < 0.05) between DMSO
   or CCT068127 (1uM) treated T24 cells for 48 h in n = 3 independent
   biological replicates. GSEA plots showing enrichment of genes repressed
   by the CDK4/6 inhibitor Palbociclib (top) or genes repressed by the
   CDK9 inhibitor VIP152 (bottom) in T24 cells treated 48 h with the
   CDK2/9 inhibitor CCT068127 (1 uM) or DMSO control. Palbociclib
   (SRP404373) or VIP152^[283]101 inhibited genes are publicly sourced. d
   GSEA plot showing enrichment of transcriptomes of CCT068127 (1 uM) or
   DMSO treated T24 cells in favorable (top) or unfavorable (bottom) PET
   identified prognosis genes in bladder cancer. Shown are the FDR value
   and normalized enrichment score (NES). e Schematic of xenograft
   experiment and treatment plan with 30 mg/kg daily i.p. injections of
   CCT068127 or DMSO carrier. p-values compare DMSO-treated to
   CCT068127-treated groups at each time-point. f Plot showing daily tumor
   volumes during course of experiment. g Bar plots showing the mass of
   tumors at explant on day 23. Error bars show mean ± s.d. p-values in a
   and f are by two-way ANOVA with multiple comparisons adjustment and in
   g by two-sided Wilcoxon rank sum test. ns non-significant; *p < 0.01;
   **p < 0.001; ***p < 0.0001. Panel e Created with BioRender.com released
   under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0
   International license
   ([284]https://creativecommons.org/licenses/by-nc-nd/4.0/deed.en).
   Source data are provided as a Source Data file.

   At present, the first line of treatment for bladder cancer consists of
   chemotherapy regimens that feature cisplatin. As expected, cisplatin
   also significantly inhibited the growth of bladder cancer cell lines
   (Fig. [285]5a). It is worth noting that our approach also predicted
   that cisplatin desirably normalizes prognostic genes (Supplementary
   Fig. [286]S7c). To further investigate the impact CCT068127 on bladder
   cancer cells, we performed RNA-sequencing on T24 bladder cancer cells
   treated with carrier or CCT068127 (see Methods). 822 genes were found
   to be up-regulated and 490 genes down-regulated (RPKM > 2, Fold
   change ≥ ±3 at adjusted p-value ≤ 0.05; Fig. [287]5c; Supplementary
   Data [288]5b). CCT068127 significantly reduced the expression of genes
   associated with CDK9 inhibition, while showing no significant effect on
   genes associated with CDK4/6 inhibition (Fig. [289]5c). To test whether
   CCT068127 can normalize the predicted prognostic signature, we gathered
   all the genes linked to favorable or unfavorable prognosis from the
   prognostic pathways identified by PET and conducted a gene set
   enrichment analysis. As anticipated from the drug predictions, genes
   that were upregulated by CCT068127 were significantly enriched in those
   with PET-predicted favorable prognosis. Conversely, genes that were
   downregulated by CCT068127 were significantly enriched in in those with
   PET-predicted unfavorable prognosis (Fig. [290]5c, d). The up-regulated
   genes included PRDX4 and DIO3, which were determined by PET to be key
   markers of a favorable prognosis (Supplementary Fig. [291]S4e).
   Conversely, the down-regulated genes including XAF1, KPNB1 and PAS1
   which were found to be significant markers of an unfavorable prognosis
   (Fig. [292]2e). These data corroborated the expectation that the top
   predicted drug normalized identified prognostic genes.

   We next tested the in vivo efficacy of CCT068127 in mice.
   Pharmacokinetic studies identified daily administration of up to
   30 mg/kg of CCT068127 for two weeks to be well-tolerated by mice, as
   evidenced by stable weight during that time (Supplementary
   Fig. [293]S7e) and lack of gross pathology on necropsy. We then
   implanted bladder cancer cells subcutaneously in immunodeficient mice
   and after two weeks, when tumors were approximately 100 mm^3 in size,
   randomly assigned mice to control or treatment groups, injecting them
   with 30 mg/kg/day of DMSO or CCT068127 i.p. for 10 days, respectively
   (Fig. [294]5e). CCT068127-treated mice demonstrated a significant
   reduction in both tumor volumes and weights (Fig. [295]5f, g),
   indicating that CCT068127 was effective at restricting tumor growth in
   vivo.

   Taken together, these data indicate that, despite being unexpected from
   traditional methods, CDK9 inhibition predicated on favorable and
   unfavorable prognostic biomarkers suppress tumor cell growth both in
   vitro and in vivo. Furthermore, they corroborate the notion that a
   PET-based approach can guide repurposing drugs that prove effective in
   the treatment of cancer.

Discussion

   The contemporary investigation of the underlying mechanisms in cancer
   relies disproportionately on obtaining genome-wide omics data.
   Biological pathway analysis is a crucial tool in interpreting these
   data. As such, several pathway analysis methods have been developed,
   with the implicit assumption that they identify and prioritize the most
   important pathways among their top-ranked outputs (reviewed
   here^[296]2,[297]3). Indeed, these methods have been successful in
   confirming some of the speculated and/or dysregulated pathways. Many of
   these methods have also been tested on simulated data^[298]6 or
   anecdotally verified in small scale biological
   datasets^[299]2,[300]8,[301]9. However, the performance and limitations
   of these methods for an unbiased discovery of related biological
   pathways has been largely unknown.

   Here, we developed a platform named Benchmark to explicitly measure
   performance of pathway analysis tools for discovery under experimental
   settings. It is worth noting that Benchmark has several key
   advantageous features, making it superior to existing gold-standard
   evaluation platforms^[302]7. First, IGSs and TGSs that are used for
   evaluation in Benchmark are derived from large numbers (~1000) of
   experimental datasets, where expected pathways are well-defined. This
   is in clear contrast to the few existing evaluation platforms that
   utilize small experimental datasets^[303]2,[304]7, computer-simulated
   pathways^[305]88, and/or subjective measures for defining
   pathways^[306]2. Second, since IGSs and TGSs in Benchmark are derived
   from experimental data, they eliminate systematic biases introduced by
   computer simulations. For example, IGSs and TGSs in Benchmark are free
   from biases introduced due to the assumption by most simulations that
   genes within each pathway are independent of each other. Third, IGSs
   and TGSs recapitulate a realistic setting where genesets under
   investigation (i.e., IGSs) are often derived under conditions (e.g.,
   cell lines, species, conditions) that are different from those of known
   biological pathways (TGSs). Fourth, because Benchmark uses experimental
   data, it contains noise inherent to biological data, therefore provides
   a platform for a faithful readout of the performances of pathway
   analysis tools under real-life settings. Lastly, we present two
   evaluation measures, namely P@10 and AP@10, which accurately reflect
   the practical application of pathway analysis methods for biologists.

   Our findings from Benchmark show that while common pathway analysis
   methods are effective for confirming speculated pathways, they are not
   optimal for unbiased and systematic discovery of the most disrupted
   pathways. Our study found that PET, which integrates the ranking
   statistics of several pathway analysis tools, is significantly more
   effective for discovery and has high accuracy and noise resistance. It
   is worth noting that ensemble approaches, such as decoupler^[307]11,
   piano^[308]12, and egsea^[309]89, have been previously developed for
   pathway analysis and have generally resulted in improved outcomes
   compared to single method approaches. In our evaluation using
   Benchmark, we found that these ensemble methods, when employed with
   default settings and when the underlying methods and input parameters
   were chosen ‘blindly’ without careful consideration, either performed
   marginally better or were comparable to their underlying methods. Our
   results demonstrated that optimizing the input parameters of individual
   pathway analysis methods not only significantly improves their
   performance but also enhances the results of the ensemble approach
   overall. Here we have optimized and combined results from the three
   most common pathway analysis methods. However, additional research is
   required to assess how to effectively integrate statistics from other
   pathway analysis methods to improve pathway discovery. The number of
   replicates in each group is an important consideration in controlling
   for false discovery rates^[310]90. Our Benchmark is generally
   constructed from 2 replicates per group available in ENCODE^[311]10.
   However, the average number of overlapping genes among related genesets
   (n = 16) compared to unrelated genesets (n = 1.3) across cell lines,
   obtained from 2 replicates (Supplementary Fig. [312]S1b, f), suggests
   that this sample size is sufficient for accurately distinguishing
   related from unrelated genesets. Future studies will extend the
   Benchmark to include additional genesets and replicates, and to
   evaluate the suitability of pathway analysis methods for increasingly
   available single cell data.

   We next applied PET to systematically determine pathways associated
   with prognosis in 12 different well-characterized cancer types. The
   predicted pathways from PET were correlated with prognostic features
   hypothesized from biological experiments and were confirmed in
   independent datasets. We further leveraged genes from the prognostic
   pathways to identify single gene and combination prognostic biomarkers.
   It is important to note that exhaustive genome-wide searches to
   identify combinations of a few genes as biomarkers (even as few as 3)
   is computationally and statistically unfeasible. Thus, many studies
   extract large gene signatures (>50 genes) associated with favorable or
   unfavorable prognosis by identifying differentially expressed genes
   (DEGs). Although useful, these signature sets are challenging to apply
   in a clinical setting^[313]48. Our data indicate that a search space of
   only 3–5 leading edge genes from prognostic pathways yeilds gene
   combinations with superior prognostic power when compared to single
   genes or DEGs. Importanly, combining leading biomarkers of favorable
   and unfavorablle prognoses resulted in the categorization of patients
   into highly resilient and highly vulnerable groups, providing a
   guidance for determining the level of treatment intensity. Moreover,
   our data show that genes from prognostic pathways distingush known
   molecular subtyes of cancer with high precision. Thus it is likely that
   these genes will also identify previously unknown molecular subtypes,
   but further investigations are needed to provide proof. It is important
   to recognize that the selection of samples can influence the
   identification of prognostic pathways and biomarkers. In our analyses,
   we classified patients with unfavorable outcomes as those in early
   cancer stages who passed away, and age and sex matched patients with
   favorable outcomes as those in later stages who were marked as alive in
   the TCGA cohort. A more precise alternative selection criterion would
   have been to classify deceased patients with survival times less than
   the median, and age and sex matched alive individuals with days to last
   follow-up more than the median of all samples as unfavorable and
   favorable selections, respectively.

   Pathways with prognostic potential could provide a means for
   identification of effective therapeutics. Indeed, our data indicate
   that a computational search for therapeutics that can normalize the
   expression of genes from prognostic pathways can be used for drug
   repurposing. Proof-of-concept in vitro and in vivo validations of the
   top predicted repurposed drug in bladder and cervical cancer cells
   indicate the potential of our approach. The prediction of CDK9 but not
   CDK4/6 by our approach was somewhat unexpected but aligned with the
   observation that Palbociclib (a CDK4/6 inhibitor) did not meet the
   primary endpoint during phase II clinical trials in bladder
   cancer^[314]81–[315]84. Previous studies have indicated that
   alterations in the CDK4/6 pathway could potentially drive bladder
   cancer^[316]91–[317]93 but the cell lines and assays used in these
   studies are slightly different than ours. Consistently, our cell
   viability measurements, conducted using the CCK-8 assay, also
   demonstrated a modest yet significant effect of Palbociclib on T24
   cells. Further in vivo investigation using patient-derived bladder
   cancer xenografts (PDX) models are warranted^[318]94, to compare the
   effects of CDK9 inhibitors and CDK4/6 inhibitors on primary bladder
   tumors in future studies. Of note, our data driven approach will also
   provide the mechanistic rationale (i.e., the involved pathways) for
   identified therapeutics and offer insights into basic mechanisms that
   can be exploited for drug development. However, whether predicted
   prognostic pathways contribute directly to the pathophysiology (as
   opposed to merely representing biomarkers) can only be determined via
   experimentation. It is also important to note that prognostic
   biomarkers, which predict patient outcomes, and predictive biomarkers,
   which determine the efficacy of individual drugs, are often distinct
   from each other. With increasing omics data from responders and
   non-responders, future research will be needed to apply pathway
   enrichment analysis to identify not only prognostic pathways but also
   predictive pathways.

Methods

Ethics statement

   All procedures involving mice were conducted in accordance with the
   guidelines established in the Guide and Use of Laboratory Animals of
   the National Institutes of Health, USA. The protocol was approved by
   the Institutional Animal Care and Use Committee at the Purdue
   University (PACUC Protocol #1908001941). Animals were housed in Hansen
   mouse facility at Purdue University. All mice are fed and watered ad
   libitum, with consistent access to food and water, and housed in a
   facility maintained at ambient temperature and humidity with 12-h
   light/12-h dark cycles. All mice were sacrificed prior to the tumors
   reaching a mean diameter of 2.0 cm or in the event of ulceration with
   necrosis/infection as outlined in the PACUC policy on “Human endpoint
   criteria for rodent tumor studies”.

Cell culture

   T-24 and UMUC3, human urinary bladder cancer cells were acquired from
   ATCC (HTB-4^TM and CRL-1749^TM, respectively). Both the cells were
   maintained in Roswell Park Memorial Institute medium (RPMI 1640; Gibco)
   supplemented with 10% fetal bovine serum (FBS; Hyclone Lab, Logan,
   Utah), 1% penicillin/streptomycin (PS) (Invitrogen, Carlsbad, CA, USA)
   in humidified chamber with 5% CO[2] at 37 °C. HeLa, human cervical
   carcinoma cells were acquired from ATCC (CCL-2™) and maintained in
   complete Eagle’s Minimum Essential Medium (MEM, Gibco) at similar
   conditions. The cell lines used here were not independently
   authenticated but were regularly tested for mycoplasma contamination
   using the Mycoplasma Detection Kit (abm, cat#G238).

Cell viability assay

   CDK inhibitors GW8510 (CDK2), Palbociclib (CDK4/6), Seliciclib
   (CDK2/7/9), CCT068127 (CDK2/9) were purchased from AOBIOUS, INC
   (Gloucester, MA, USA) and were dissolved in dimethyl sulfoxide (DMSO)
   to make stock solutions of 10 mM and stored at −20 °C. Working
   concentrations, as indicated, were further diluted with either cell
   culture media or saline. Hela, UMUC3 and T-24 cells (approximately
   20,000 per well) were seeded in a 96-well plate 24 h before prior to
   treatment with various concentrations of the drugs for 72 h at 37 °C.
   Cell viability was measured using Cell Counting Kit-8 (APExBIO;
   Houston, USA) reagent following manufacturer’s recommendations.
   Briefly, 10 ul of CCK-8 reagent per 100 ul media was added to each well
   and kept at 37 °C for 2 h. The absorbance of WST-8 formazan was
   recorded using a Veritas Microplate Luminometer at 450 nm (BioTec;
   Epoch2). The viability (%) was calculated by using formula: Cell
   viability (also called survival rate) (%) = [(Absorbance of tested
   compound − Absorbance of blank)/(Absorbance of control − Absorbance of
   blank)] × 100.

RNA-sequencing and data analysis

   T-24 cells (approximately 1 × 10^6 per well) were seeded in a 6 well
   plate. Next day, cells were treated with either vehicle (DMSO) control
   or CCT068127 at 1uM concentration for 48 h. Cells were stored with TRI
   Reagent (ZYMO Research International, Tustin, CA) at −80 °C. Genomic
   RNA was extracted from cells using a Direct-zol RNA MiniPrep Plus (ZYMO
   Research International; Tustin, CA). An Agilent 4200 TapeStation system
   was used to determine quality and quantity of RNA. Samples having RNA
   integrity number (RIN) value above 8.5 were used for library
   preparation. 500 ng total RNA for each sample was subjected to NEBNext
   Poly(A) mRNA Magnetic Isolation (E7490) and resulting mRNA libraries
   were prepared for RNA-seq by NEB Ultra II kit (New England Biolabs)
   following manufacturer’s protocols and sequenced by NovaSeq (Illumina).
   All sequenced reads were aligned to hg38 genome using RSEM^[319]95 with
   parameters “rsem-calculate-expression --bowtie-n 1 --bowtie-m 100
   --seed-length 28 --bowtie-chunkmbs 1000”. Differential expression
   analysis was performed using edgeR. Average RPKM ≥2 in at least one
   condition, Fold change ≥3 and adjusted p-value ≤0.05 was used to
   determine DEGs.

   The TCGA data analysis used read count, normalized expression profiles,
   FPKM, and clinical information from samples of 12 cancer types (as
   shown in Fig. [320]2a). Stage selection was based on
   ajcc_pathologic_stage (NCI Thesaurus Code: [321]C28257) and
   ajcc_pathologic_t (NCI Thesaurus Code: [322]C48881 and [323]C48739)
   from clinical data. In cases where both indicators were present but
   contradictory, the indicator that corresponds to an earlier stage of
   tumor staging is prioritized. To account for age as a potential
   confounder, all deceased samples and those annotated as alive with an
   age higher than the first quantile of all deceased patients in the
   dataset were included. The selected samples were tested to confirm that
   age and sex is not a significant confounder for all cancers.

   For canine data analysis, we employed PET to discover prognostic
   pathways in later stages of human Bladder Urothelial Carcinoma (BLCA)
   patients (annotated as T3 and T4). The ability to predict outcomes for
   single genes and combinations of genes from the identified pathways was
   determined (see below). Finally, the genes with the strongest
   prognostic abilities were used to distinguish molecular subtypes of
   canine tumors.

Xenograft mouse model

   The original strain of all mice used in this study were purchased from
   Jackson Laboratory, Strain# 025216, NOD.Cg-Prkdc^scid H2-K1^b-tm1Bpe
   H2-Ab1^g7-em1Mvw H2-D1^b-tm1Bpe Il2rg^tm1Wjl/SzJ and were acclimatized
   and maintained in Hansen mice facility at Purdue University. Due to
   availability, 10–11 weeks-old female mice were used in all experiments.
   Tumors were implanted into the flanks of these mice. Briefly, cultured
   UMUC3 cells were resuspended at 1 × 10^6 cells/100 ul in PBS. Mice were
   anesthetized with isoflurane. Cells were injected subcutaneously into
   left and right flank of the mouse using 29G1/2 needle. Animals were
   monitored and weighed every day. Tumor volumes were measured by using
   digital caliper and calculated by the formula: Tumor
   volume = (length × width × height)0.52. On 14^th day post inoculation,
   mice developed about 100 mm^3 size tumors. Mice were randomized into
   two groups and treated for the next 10 days with vehicle (DMSO in
   100 ul saline) and CCT068127 in saline (30 mg/kg). All treatments were
   administered by intra-peritoneal injections.

Pharmacokinetic study

   Groups of NSG female mice (n = 4) were injected intraperitoneally every
   day for two weeks with doses of 10 mg/kg or 20 mg/kg or 30 mg/kg of
   body weight of CCT068127 in saline. Mice were monitored daily for
   weight changes and clinical signs of toxicity. After completion of the
   experiment, mice were euthanized by CO[2] inhalation. A necropsy study
   was done in a systemic manner to identify any changes in anatomical
   features due to drug toxicity. We selected a concentration of 30 mg/kg
   of CCT068127 for further experiment.

Benchmark data

   We curated raw and processed data from ENCODE project^[324]10 listed in
   Supplementary Data [325]1a. The expected counts and normalized gene
   expression values, FPKM, and statistics from differential expression
   analysis were extracted from HepG2 and K562 shRNA-seq datasets. For
   ChIP-seq, we annotated the processed peak to the nearest gene using
   HOMER annotatePeaks.pl^[326]96 from A549, CH12.LX, GM12878, HEK293,
   HepG2, K562 and MEL cell lines. RNA protein binding profiles (i.e.
   eCLIP-seq) were processed similarly to ChIP-seq for HepG2 and K562.

Pathway ensemble tool

   PET combines the outputs from three underlying pathway analysis
   methods: GSEA, ora and Enrichr. Reported pathways were ordered by
   normalized enrichment score (NES), ora p-values by Fisher exact test,
   Enrichr “combined scores”, respectively, and the ranks were determined.
   For each pathway, an average rank between the three methods was
   calculated and a combined p-value was estimated by Stouffer’s Method
   using scipy.stats.combine_pvalues with method = “stouffer”. Finally,
   all pathways were ordered based on the average rank.

   Ora was re-implemented using scipy.stats.fisher_exact function. Enrichr
   was re-implemented in Python as described in Computing enrichment
   section of the original manuscript^[327]5 using N = 1000 permutations.
   GSEA was invoked using different indicated ranking metrics in the
   manuscript. For “signal2noise” and “ratio of classes”, the normalized
   expression values were used with GSEA parameters set as -collapse
   No_Collapse -set_max 500 -set_min 20”. For other modes (e.g. DESeq2,
   edgeR), a pre-ranked gene list was generated from statistics of
   differential expression analysis or peak calling results, where
   -log10(p-value)*sign(log(fold change)) was used as the gene weight. The
   “prerank” mode parameters were set as -collapse No_Collapse -set_max
   500 -set_min 20”. All the other parameters were kept as default. PET
   implementation in python incorporates the option to use DESeq2^[328]97
   or PyDESeq2^[329]98, as well as GSEA^[330]4 or GSEApy^[331]99,
   providing flexibility in selecting the platform for their analysis.

Running ensemble methods

   To conduct pathway analyses using e.browser^[332]14, we utilized the
   built-in function deAna (expr_mat, de.method = “DESeq2”) to make the
   differential gene expression analyses consistent with PET. Every
   underlying pathway analysis method supported by e.browser was executed
   (except padog which had technical errors), and their results were
   combined using “combResults” and then ranked by “gsRanking” function.

   To conduct pathway analyses using piano^[333]12, the geneLevelStats
   were provided as the p-values from DESeq2 analyses for all the methods
   allowed or log2 fold change as required by the remaining methods. Every
   underlying pathway analysis method supported by piano was executed, and
   their results were combined using consensusScores (res_list,
   class = ‘distinct’, direction = ‘up’).

   To conduct pathway analyses using decoupler^[334]11, we utilized the
   function run_consensus. The DESeq2 statistics, defined as
   -log10(p-value) for each gene were provided as the required ‘mat’
   argument. All Benchmark genesets were provided as the required ‘net’
   argument, with each geneset ‘connected’ to its genes. Then, all
   genesets were ranked based on ‘consensus_estimate’ score returned by
   run_consensus function.

   All other indicated pathway analysis methods were executed using egsea
   Bioconductor toolkit^[335]89 provided with read count matrix. The
   “average rank” setting for egsea resulted in the best overall median
   rank, precision and average precision scores among all its other
   ranking metrics and thus was chosen to be reported here.

Simulated noise and false-discovery rate calculation

   To simulate inaccuracies and noise in biological pathways, a portion of
   genes in each pathway were substituted with an equal number of randomly
   chosen genes from the entire human genome. To calculate false discovery
   rates, we first collected raw read counts and normalized gene
   expression data from TCGA breast invasive carcinoma (BRCA), TCGA lung
   adenocarcinoma (LUAD), TCGA prostate adenocarcinoma (PRAD), and GTEx
   breast, lung and prostate tissues. We randomly selected 2 N samples
   (i.e. N = 2–10, 15, 20, 25, 30) from each tissue/cancer and split them
   into two groups, and ran pathway analysis methods on the expression
   matrix of the two groups against all canonical pathways
   (c2.cp.v7.1.symbols.gmt), where the permutations are repeated 100 times
   for each N. For methods requiring a pre-ranked gene list or set of
   genes, we utilized DESeq2 to perform the differential expression
   analysis and used –log10(p-value) or top 200 DEGs as input for those
   methods. Since the samples were randomly selected biological
   replicates, the expectation is that any reported pathway significantly
   different between the groups represents false discovery, where
   significance is defined as the adjusted p-value ≤ 0.05. Thus, the
   proportion of significant pathways among all the pathways is defined as
   the false-discovery rate.

Prognostic power of individual genes, gene combinations and independent
validation

   The prognostic power of each biomarker was quantified using the area
   under the curve (AUC) based on the ability of a gene or a gene
   combination biomarker to distinguish different sample groups, e.g.,
   alive vs. deceased or progression vs. no-progression to the next tumor
   stage. The gene expression values were first transformed to z-scores
   (across samples). For gene combination biomarkers, the average z-score
   of the participating genes were calculated. The z-scores were then used
   to calculated AUC values by sklearn.metrics.roc_auc_score. To calculate
   the significance of individua genes or combination biomarkers, the
   p-value for logrank test of survival were generated by LIFELINES
   [10.21105/joss.01317] and were FDR-adjusted for multiple testing by
   statsmodels.stats.multitest.fdrcorrection. Adjusted FDR value less than
   0.05 were considered as significant.

   To identify gene combination biomarkers from prognostic pathways
   identified by PET or differentially expressed genes (DEGs) in each
   cancer type in TCGA dataset, the following steps were taken: For PET,
   we identified the top 20 pathways associated with unfavorable prognoses
   and the top 20 pathways associated with favorable prognoses according
   to PET p-values. From each pathway, we extracted up to 10 leading genes
   based on optimized GSEA results and generated all possible combinations
   of 1–5 leading genes (n = 637). Finally, we calculated the AUC values
   for all the combinations as mentioned above. For individual DEGs, we
   obtained the top 100 up and top 100 down-regulated DEGs (sorted by
   DESeq2 adjusted p-value) between alive vs. deceased groups and
   calculated the AUC values for these DEGs as mentioned above
   (Fig. [336]2d, Supplementary Fig. [337]S4d). Since generating
   combinations of 1–5 genes from 100 DEGs would be impractical, for
   combination biomarkers derived from DEGs, we focused on the top 10 most
   upregulated and top 10 most downregulated genes separately. We
   enumerated all possible combinations of 1–5 genes from the upregulated
   genes (n = 637) and another set of combinations from the downregulated
   genes (n = 637). Finally, we calculated the AUC values for these
   combination biomarkers, following the same method mentioned above. Only
   significant combination biomarkers based on FDR-adjusted logrank of
   survival (see above) were kept for visualization (e.g. Fig. [338]2e,
   Supplementary Fig. [339]S4e) and further analysis.

   To independently validate the combination biomarkers derived from
   either DEGs or PET analysis, we extracted the top 200 significant
   combination biomarkers based on their AUC values. We then tested the
   performances of these biomarkers on several independent datasets
   (Supplementary Fig. [340]S4g and Supplementary Data [341]2g).
   Specifically, for each cancer type, we calculated the proportion of
   identified favorable or unfavorable markers identified from
   corresponding TCGA cancer dataset that remained as favorable or
   unfavorable markers in the indicated independent dataset.

Computational drug prediction

   The drug-target dataset was downloaded from DSigDB V1^[342]68. For each
   drug, the down-regulated and up-regulated target genesets were
   designated as drug_repressed_geneset and drug_induced_geneset. The gene
   biomarkers that are intended to be upregulated or downregulated were
   designated as favorable_geneset and unfavorable_geneset respectively.
   To identify the “favorable” or “unfavorable” geneset in each type of
   cancer from PET analysis, we extracted the top 20 pathways associated
   with either favorable or unfavorable prognoses. From these pathways, we
   selected up to 10 leading genes based on optimized GSEA results and
   compiled them as the respective “favorable” or “unfavorable” geneset.
   For DEG-based drug prediction, we extracted the top 200 upregulated and
   the top 200 downregulated DEGs according to DESeq2 adjusted p-value
   between the alive vs. deceased patient groups as the corresponding
   “favorable” or “unfavorable” geneset. For each pair of drug and cancer,
   we then performed the following four Fisher exact tests with respect to
   the gene universe and calculated the corresponding p-values using
   scipy.stats.fisher_exact: a) drug_induced_geneset and
   unfavorable_geneset, b) drug_repressed_geneset and favorable_geneset,
   c) drug_induced_geneset and favorable_geneset, d)
   drug_repressed_geneset and unfavorable_geneset. The drug effect on a
   prognostic marker is then calculated as
   E = −log10(c) − log10(d) + log10(a) + log10(b). Finally, all the drugs
   were ranked by the predicted effect E.

Reporting summary

   Further information on research design is available in the [343]Nature
   Portfolio Reporting Summary linked to this article.

Supplementary information

   [344]Supplementary Information^ (1.5MB, pdf)
   [345]Peer Review File^ (6.5MB, pdf)
   [346]41467_2024_51859_MOESM3_ESM.pdf^ (113.8KB, pdf)

   Description of Additional Supplementary Files
   [347]Supplementary Data 1^ (233KB, xlsx)
   [348]Supplementary Data 2^ (2.5MB, xlsx)
   [349]Supplementary Data 3^ (84.7KB, xlsx)
   [350]Supplementary Data 4^ (1.8MB, xlsx)
   [351]Supplementary Data 5^ (1.2MB, xlsx)
   [352]Reporting summary^ (2.6MB, pdf)

Source data

   [353]Source Data^ (341.4KB, xlsx)

Acknowledgements