Abstract Identification of clinically applicable molecular subtypes of pancreatic ductal adenocarcinoma (PDAC) is crucial to improving patient outcomes. However, the traditional tissue-dependent transcriptional subtyping strategies are invasive and not amenable to routine clinical evaluation. In this study, we developed a circulating extracellular vesicle (cEV) long RNA (exLR)-based PDAC subtyping method and provided exLR-derived signatures for predicting immunogenic features and clinical outcomes in PDAC. We enrolled 426 individuals, among which 227 PDACs served as an internal cohort, 118 PDACs from two other medical centers served as an independent validation cohort, and 81 healthy individuals served as the control. ExLR sequencing was performed on all plasma samples. We found that PDAC could be categorized into three subtypes based on plasma exLR profiles. Each subpopulation showed its own molecular features and was associated with patient clinical prognosis. The immunocyte-derived cEV fractions were altered among PDAC subtypes and interconnected with tumor-infiltrating lymphocytes in cancerous tissue. Additionally, we found a significant concordance of immunoregulators between tissue and blood EVs, and we harvested potential PDAC therapeutic targets. Most importantly, we constructed a nine exLR-derived, tissue-applicable signature for prognostic assessment of PDAC. The circulating exLR-based features may offer an attractive platform for personalized treatment and predicting patient outcomes in multiple types of cancer. Keywords: circulating EVs, extracellular vesicle long RNA, tissue-cellular origin, immune checkpoint genes, tumor microenvironment Graphical abstract graphic file with name fx1.jpg [47]Open in a new tab __________________________________________________________________ This is the first study to use large-scale cEV transcriptome analysis for cancer risk stratification and uncover a complex interaction network of immunogenic components with clinical implications between circulatory particles and primary nidus from PDAC cases. Introduction Pancreatic ductal adenocarcinoma (PDAC) is currently the fourth leading cause of cancer-related death worldwide.[48]^1 The treatment efficiency of PDAC is restricted by the aggressive tumor-intrinsic characteristics and suppressive tumor microenvironment (TME).[49]^2 Previous studies showed that personalized treatment approaches based on PDAC genotypic and phenotypic characteristics may potentially improve patient outcome.[50]^3^,[51]^4 However, most of these reports focused on gene subtyping and exploring biomarkers in resectable tumors. Resection is possible only in a minority of patients, as most patients present at an advanced stage at the time of diagnosis,[52]^5 and safely obtaining sufficient quantities of pancreatic tumor tissue for molecular analysis is difficult at unresectable stages. Therefore, the development of specific, robust, and noninvasive molecular subtyping strategies for the stratification and prognostic evaluation of PDAC are required. Extracellular vesicles (EVs) have become a hotspot in liquid biopsy for cancer precision medicine. EVs are nanometer-sized membrane-bound particles that are shed from both tumor and nonneoplastic cells into the peripheral circulation. EVs deliver a milieu of cargo material, such as proteins, genomic DNA, and lipids, as well EV long RNAs (exLRs), from parent cells to receptive cells.[53]^6 Circulating EVs (cEVs) contain approximately 10,000 exLRs, mainly mRNAs, long non-coding RNAs, and circular RNAs, which are stably expressed and have shown clinical significance.[54]^7 Tissue- and tumor-originated exLRs are specifically enriched in cEVs and reflect the biological activities, metabolic conditions, and immunogenic features of their host cells.[55]^8 Previous studies showed that PDAC tumor-derived EVs regulate the immunosuppressive TME and deliver anti-tumoral transcriptomic signals, and that plasma-originated exLRs are critical indicators of PDAC antitumor immune activities.[56]^9^,[57]^10 Additionally, exLRs are involved in critical processes of PDAC carcinogenesis, including tumor growth, metastasis, drug resistance, and immune modulation.[58]^11^,[59]^12 Therefore, the mixture of non-enriched cEVs containing both tumor-derived and non-tumor-derived exLRs[60]^13 may reflect tumor-intrinsic and non-tumor signatures, and evaluation of exLRs may represent a new strategy for noninvasive molecular stratification, prognostic evaluation, and treatment monitoring in PDAC. We previously determined the abundance of exLRs in the plasma of cancer patients[61]^14 and developed a strategy for exLR sequencing (exLR-seq) in human plasma.[62]^15 Our study indicated that cEVs contain approximately 10,000 exLRs, mainly mRNAs, which are stably expressed, harbor tissue-cellular specificity, and could serve as biomarkers for PDAC early screening.[63]^13 In this study, we evaluated the potential usefulness of the circulating exLR profile for PDAC subtyping. We traced the tissue-cellular source contribution of cEVs and explored the correlation of the immune landscape between tumors and cEVs. Our study established a prognostic signature enabling noninvasive prediction of immunogenic features and clinical outcome for PDAC. Results Circulating exLR-based subtyping of PDAC The exLR-seq approach was implemented on plasma samples from an internal cohort of 227 PDAC patients ([64]Table S1). Approximately 14,000 exLRs were reliably detected, including protein-coding genes (10,253, 74.91%), circRNAs (1,322, 9.66%), pseudogenes (1,218, 8.9%), lncRNA genes (764, 5.58%), and other small non-coding RNAs (130, 0.95%) ([65]Figure S1A). We included the top 60% aberrantly expressed exLRs as candidate targets to explore underlying exLR-based PDAC subtypes. We assessed clustering stability using the consensus-clustering algorithm, which indicated the existence of three major clusters ([66]Figures S1B–S1F; see [67]Materials and methods); 60, 97, and 70 cases were stratified to clusters C-I, C-II, and C-III, respectively ([68]Figure 1A). Principal-component analysis demonstrated a clear distinction between each subgroup ([69]Figure 1B). The number of detected exLR species shows a significant difference between three clusters. In particular, the amount of enriched protein-coding genes in C-III was greater than in other group sets (C-I and C-II), which indicates that plasma EVs of the third cases were additionally enriched with exLRs, which could conduct exclusive biological functions ([70]Figure 1C). Figure 1. [71]Figure 1 [72]Open in a new tab PDAC prognostic subtyping based on circulating exLR-seq profile (A) Consensus clustering matrix of exLR-seq data from the internal cohort for aggregating number k (k = 3), representing the clustering stability using 500 iterations of the hierarchical clustering approach. The heatmap depicts the relative abundance of signature exLRs. (B) Two-dimensional scatterplot of principal-component analysis (PCA); first two principal components derived from expression intensities of variable exLRs, with samples connected by centroids according to exLR-based PDAC subsets (red, C-I; blue, C-II; yellow, C-III), are shown. The ellipse indicates the 90% confidence interval for each clustered group. (C) The relative distribution of exLR species per sample among PDAC clusters. The diagram shows exLRs reliably enriched in plasma vesicles, with a frequency > 0.1 and mean expression value > 1. The center of the boxplots indicates median values; the boundaries of the boxes show 25% and 75% quantiles. (D) Alluvial diagram shows differences among PDAC clusters according to overall survival (OS), patient status, tumor stage, and carbohydrate antigen 19-9 (CA19-9). CA19-9 level greater than 100 U/mL was defined as high. ∗∗p < 0.01; n.s., not significant. (E) Kaplan-Meier curves for OS in the 227 PDAC patients in the internal exLR-seq cohort indicate that exLR-based phenotypes strongly correlated with PDAC prognosis (log-rank test, p < 0.001). Evaluation of the clinical characteristics of the three clusters revealed a significantly increased number of patients with advanced tumor stage and shorter overall survival (OS) time in C-III; the other characteristics (age, sex, and CA19-9) were not associated with exLR-based stratification ([73]Figure 1D). Survival analysis showed that C-III correlated with the shortest OS, while C-I correlated with the longest OS (log-rank test, p < 0.001; [74]Figure 1E). Therefore, our results suggested that circulating exLR-based subtyping enables noninvasive prediction of clinical outcome for pancreatic cancer. Molecular features and pathway annotation of exLRs in exLR-based clusters To assemble these dysregulated circulating exLRs and explore their underlying biological functions in each exLR-based cluster, the weighted correlation network analysis (WGCNA) approach was used to generate co-expressing exLR modules based on circulating exLR-seq expression profiles, and it correlated each module with phenotypes and clinical parameters (see [75]Materials and methods and [76]Figure S2A). The variably expressed exLRs were distributed in eight independent modules according to their degree of connectivity. Six gene modules (M2–M7) were significantly associated with C-III and one module (M1) was particularly referred to C-I (Spearman’s correlation analysis, p < 0.001, Rho > 0.45; [77]Figures 2A and [78]S2B). The exLRs that could not be contained in any modules were placed into M8. All C-III-related modules were positively correlated with tumor aggressiveness, whereas the inverse was observed with the C-I-specific module; none of the modules showed any demographic bias ([79]Figure 2A). Figure 2. [80]Figure 2 [81]Open in a new tab Systematic annotation of molecular features and pathway activation of exLR-based clusters (A) The module-trait correlating matrix displays the correlation coefficients and significance levels of PDAC clusters and clinical and demographical factors for each exLR module. (B) Systematic Gene Ontology (GO) annotation network for genes in the eight modules in biological process, cellular component, and molecular function aspects. We filtered the top 10 significant annotating terms of each module and displayed these terms on the network diagram. The annotated GO terms are shown on each ellipse; the size of the ellipse reflects the degree of enrichment (−log10[p value]), and the transparency of the text indicates the number of exLR genes from each enriched GO term (log[2] transferred). The gray connection communicates each gene module with its corresponding annotation information. (C) Unsupervised clustering heatmap of ssGSEA enrichment score of 191 cluster-specific signaling pathway signatures in the exLR-seq data from the internal cohort. Hierarchical clustering was implemented with Euclidean distance and Ward linkage. Rows represent the signaling pathways derived from the current MSigDB 7.2 database that includes three major gene set collections (Hallmark, Reactome, and KEGG sets), and columns represent plasma samples. The right panels display the average normalized enrichment score (NES) of represented signaling pathways in PDAC clusters I and III. (D) The scatterplot demonstrates the tissue-cEV correlated signaling pathways from 61 PDAC patients with cEV and matched tissue transcriptomic expression profiles. Gene Ontology (GO) enrichment analysis was conducted to annotate exLR genes in each module ([82]Figure S2B). The exLRs of M1 (corresponding to C-I) were involved in the adhesion and migration of hemopoietic cells and represented favorable indicators for PDAC prognosis ([83]Figures 2B and [84]S2C). Conversely, the exLRs of M2–M7 (C-III) were involved in the activation of tumoral and immunogenic features, including RNA anabolic and catabolic processes, energy metabolism, cellular and immune responses, and DNA replication ([85]Figures 2B and [86]S2C; [87]Tables S2–S8). The pre-ranked gene set enrichment analysis (GSEA) strategy identified 191 pathways that were significantly different in each cluster (C-I, n = 57; C-II, n = 42; C-III, n = 92; false discovery rate [FDR] < 0.1, normalized enrichment score > 1.5; [88]Table S9). Unsupervised hierarchical clustering of the absolute enrichment score determined two distinct pathway signatures enriched in C-I and C-III samples ([89]Figure 2C). C-I was exclusively enriched in vesicle-mediated molecule transport, GPR signal transduction, and cell motility-related pathways, while C-III was enriched in cell proliferation and metastasis, metabolism, and immune-related pathways ([90]Table S9). The activation of chemokines, T cell receptor (TCR)/B cell receptor (BCR) and antigen processing and presentation signaling, as well as complement component genes overexpressed in cEVs suggests that patients in the C-III group showed primary and adaptive immune reactions in response to PDAC development. Correlation analysis revealed 77 signaling pathways that showed similar patterns in cEVs and tissue profiles, including PDAC- and immune-related hallmark signatures, such as JAK-STAT, transforming growth factor β (TGF-β), and WNT/β-catenin pathways and TLR-9/10 cascades ([91]Figure 2D; [92]Table S10). These results suggest that circulating exLRs serve as noninvasive indicators to jointly reflect tumor-intrinsic and antitumor immune signatures. Comprehensive landscape of tissue-cellular origins for cEVs We previously developed a deconvoluting approach, termed EV-origin, as a computational framework to infer the tissue-cellular origins of cEVs and cell type-specific genes from transcriptomic profiles of cEVs.[93]^16 In this study, we expanded the number of traceable hemopoietic cell fractions. A total of 39 types of tissue-cellular fractions were relatively or absolutely quantified, including 23 hematopoietic components and 16 types of tissue constituents. Detailed information of the candidate tissue/blood components with their hierarchical relationships are shown in [94]Table S11. Using the updated EV-origin strategy, we portrayed an integrated atlas of a traceability system of cEVs in exLR-seq samples from the internal cohort. In evaluation of hemopoietic components, cEVs predominantly originated from the hematopoietic stem cell fraction (platelets, 57%; red blood cells, 13%), followed by immunocyte components (30%) ([95]Figure 3A, left). Concerning the proportion of immune cell sources, lymphocytes accounted for 81%, and the remaining myeloid component was 19% ([96]Figure 3A, right). These immune cell constituents were categorized into adaptive (71%) and innate (29%) immune-related components ([97]Figure 3A, right; [98]Table S11). CD8/CD4^+ T cells accounted for 66% of all immunocyte-derived constituents, and there was also a considerable abundance of immunosuppressive cell components (regulatory T cells, 5%; monocytes, 16%). We also evaluated the abundance of tissue/solid organ-originated cEVs in PDAC plasma. Adipose tissue was the most prominent tissue contributor to plasma EVs (72%), followed by muscle (10%), lung (4%), and liver (4%); the pancreas accounted for only 0.24% ([99]Figure 3B; [100]Table S8). Figure 3. [101]Figure 3 [102]Open in a new tab Comprehensive landscape of tissue-cellular origins for cEVs (A) Relative distribution of all blood cell components estimated using the EV-origin approach from exLR-seq data from the internal PDAC cohort. The bar chart on the left displays the relative abundance of cEVs from erythrocytes, immune cells, and platelets. The multilayer pie chart on the right shows the relative proportions of immunocyte-originated cEV fractions. The diagram represents the categorization of immune cells belonging to each hemopoietic constituent and indicates the relative proportion of each subset in the total immune fraction. (B) Relative distribution of tissue/solid organ-derived EV fractions from the internal PDAC exLR-seq cohort. n.s., not significant. (C) Integral comparison of hemopoietic- and tissue-derived cEV absolute fractions from healthy control and PDAC plasma exLR-seq samples. The mean and upper and lower standard deviation values are indicated by bars in each panel; the pink and yellow boxes show the absolute scores of hemopoietic- and tissue-derived EV fractions for healthy control and PDAC exLR-seq samples, respectively. (D) Overview of the results of tissue-cellular originated cEVs in the PDAC and healthy control exLR-seq samples. The y axis shows the mean fold change of all absolute fractions for PDAC versus healthy individuals. We additionally evaluated the cEV fractions from 81 healthy individuals. In the comparison of absolute fractions of components, there was no statistically significant difference in blood cell/tissue-derived fractions between healthy and PDAC individuals (Mann-Whitney U test, p > 0.05; [103]Figure 3C). Source contribution heterogeneity analysis revealed that 80% (12/15) of the tissue-derived components were increased in the PDAC group, while 65% (15/23) of hemopoietic-sourced fractions were highly enriched in the healthy cohort (absolute score, mean fold change > 1; [104]Figure 3C). The enriched level of the pancreatic tissue-sourced cEV fraction demonstrated a modest upregulation in the PDAC cohort ([105]Figure 3D). Immunocyte-derived EVs perturbed in PDAC subtypes and interconnected with tumor-infiltrating lymphocytes (TILs) in tissue We next investigated the underlying heterogeneity of cEVs and their immunogenic source contribution among the PDAC phenotypes. The immunocyte-derived EV fractions can be categorized into innate and adaptive immunity categories to decode complex TME components. The three clusters showed significant differences in four types of innate immune constituents (i.e., monocytes, natural killer [NK] cells, dendritic cells [DCs], and γδ T cells). C-III exhibited the highest proportion of all four innate immune cell types while C-I showed the lowest proportions (Kruskal-Wallis rank-sum test, p < 0.05; [106]Figure 4A, top). Among the adaptive immunity constituents, T helper (Th)2 and Th17 cell fractions were upregulated in C-I, while high proportions of T regulatory (Treg) cells, naive Th cells, central memory CD8 T cells, memory B cells, and plasmablasts were observed in C-III ([107]Figure 4A, bottom). We further examined the absolute numbers to determine the true abundance of cEVs derived from each immune cell population. The immunogenic composition of cEVs gradually increased from C-I to C-III (mean absolute score: C-I, 0.38; C-II, 0.40; C-III, 0.51; [108]Figure 4B). The levels of innate and adaptive immunogenic cell fractions were both significantly upregulated in C-III compared with C-I and C-II (Mann-Whitney U test, p < 0.05; [109]Figure 4B). Figure 4. [110]Figure 4 [111]Open in a new tab Immunocyte-derived EVs perturbed in PDAC subtypes and interconnect with TILs in tissue (A) The graph indicates the relative enrichment results of EV fractions from 21 types of immunocytes in the three PDAC subtypes from the discovery exLR-seq dataset. The graph shows 7 types of innate immunity components on the top and 14 adaptive immune subsets on the bottom. The significantly different immune constituents are highlighted in red color. (B) Comparison of the absolute fractions of cEVs from innate and adaptive immunocytes in the three exLR-based subtypes. The points and error bars indicate the mean value and standard deviation of the absolute score, respectively. ∗p < 0.05; n.s., not significant. (C) OS in high- and low-enriched groups for absolute immunocyte fractions by calculating the hazard ratio (HR) through Cox regression analysis. High and low expression was defined by the optimal cutoff of the quantitative score. The dots represent the HR of univariate Cox test and the error bars show the 95% confidence intervals. (D) The panel displays the correlation between exLR-based PDAC classifications and immune cell-originated EV fractions. The dots show the odds ratios of univariate logistic regression for immune cells in innate and adaptive categories. The error bars show 95% confidence levels of odds ratios. (D) Venn diagram illustrating the number of immunocyte-originated EV constituents that are significantly perturbed in PDAC subtypes and may function as prognostic indicators for PDAC. The bottom panel lists the three hub immune cell components along with their cell origin and prognostic impact. (F) Co-enrichment network of tissue/cEV immune-relevant signatures. The three candidate immunocyte-originated EV constituents are shown in the inside track (yellow triangles). The outside panel refers to the TILs in paired PDAC tissue. Each line represents the correlation coefficient of the immune cell signature between tissue and circulating vesicles among the same PDAC cases. The bottom panel shows the prognostic value (high level, orange line; low level, purple line) of the three hub TIL signatures in TCGA cohort. In addition, we dichotomized the absolute enrichment score of each fraction and identified six innate immune subpopulations, including five risk components (monocytes, basophils, DCs, NK cells, and neutrophils; hazard ratio [HR] > 1) and one potential protective factor (γδ T cells; HR < 1), that were significantly related to OS in PDAC (p < 0.05; [112]Figure 4C, left). In addition, 8 of 12 adaptive immune populations were significantly correlated with OS in PDAC, including five unfavorable (naive Th cells, Treg cells, plasmablasts, effector memory CD8 T cells, and terminal effector CD8 T cells) and three protective cell subsets (memory B, naive CD8 T, and Th2 cells) ([113]Figure 4C, right). Univariate regression analysis indicated that 4 of 7 innate cell fractions and 9 of 12 adaptive-lymphocyte-derived components were likely to be associated with exLR-based PDAC phenotypes (p < 0.05; [114]Figure 4D). We then used a stepwise variable selection procedure from Cox’s model to identify the most prognostic immunogenic fractions as variables, including Th2 cells, monocytes, Treg cells, neutrophils, and effector memory CD8 T cells. By taking the intersection of the variables and the significant cell sets with regard to OS ([115]Figure 4C) and subtypes ([116]Figure 4D), we identified three hub immunocyte-originated cEV fractions, including two adaptive immune components (Treg and Th2 cells) and one innate immune fraction (monocytes) that displayed critical roles for PDAC clinical outcome, and these components showed significant differences among PDAC subsets ([117]Figure 4E). The enrichment results of immunogenic cEVs raises questions regarding whether EVs from blood could represent the abundance of TILs in PDAC tissue. By deciphering the detailed portraits of immunocyte disaggregation from transcriptomic profiles of cEVs and paired tissues, we generated a full-scale correlation network between TILs and hub cEV fractions ([118]Figure 4F). The Treg cell-originated EV component positively correlated with DC and CD8^+ T cell infiltration, whereas it was inversely associated with Th2 cell abundance in paired tissue. There was a parallel enrichment pattern between the congenital immunity cell-derived vesicle components (monocytes) and the infiltration of NKT cells in tissues. We externally validated the prognostic significance of the three hub constituents on PDAC patient survival using The Cancer Genome Atlas (TCGA) dataset. The enrichment of immunogenic signatures from Treg and monocyte cells exhibited equivalent prognostic value in tissue and cEVs, which is consistent with their immunosuppressive properties, whereas the Th2 cell-derived fraction showed opposite results ([119]Figure 4F, bottom). The two candidate immune cell (Treg cells and monocytes)-originated cEV components could jointly serve as unprotective factors to forecast the clinical outcome for PDAC. Taken together, these data suggested perturbed immunogenic cEVs that were associated with immunosuppressive TILs in the TME. Concordance of immunoregulatory targets in cEVs and cancer tissues Immunomodulators (IMs) are critical regulators for cancer chemoradiotherapy and immunotherapy, and numerous IM agonists and antagonists have been explored and applied in clinical oncology.[120]^17^,[121]^18 We collected a list of 76 IMs from the literature that exhibit multiple effects on the tumor immunoreaction, with functions in antigen presentation, cell adhesion, and immune checkpoint responses[122]^19 ([123]Table S12). The exLR level of IMs perturbed and unsupervised clustering results demonstrated that the IM profiles in cEVs could distinctly segregate patients into three groups, which precisely matched the exLR subtypes aforementioned in this study ([124]Figure 5A). Eight immunogenic exLRs that are upregulated in C-I show antitumor effects such as major histocompatibility complex (MHC) class I antigen presentation and stimulatory immune checkpoint responses. C-III was over-enriched with both antitumor and immunosuppressive IMs, with an expression pattern that was completely different from that of C-I. In addition, we compared candidate immune-related targets in paired cEV and tumor transcriptomes from 61 patients and observed close correlations of expression between the two IM profiles (Rho = 0.65; [125]Figures 5B and 5D, top two tracks). Figure 5. [126]Figure 5 [127]Open in a new tab Concordance of gene expression of immunoregulatory targets in cEVs and neoplastic tissues (A) The heatmap of the expressional heterogeneity of immunomodulator (IM) genes in exLR-seq data from the internal PDAC cohort, scaled by row. Each row refers to an immune-related gene and columns represent exLR-seq samples. Detailed classification and underlying functions of these immunoregulatory targets are listed in [128]Table S12. (B) Scatterplot displaying the relationship of IM gene expression values between cEVs and paired tissue in 61 PDAC patients. (C) Correlation of ligand-receptor immune checkpoint genes in cEVs and corresponding paired tissues. (D) Circular chart represents the differently expressed IMs among the three PDAC subtypes. From the outside to the inside, the diagram displays the average gene expression level of IMs in cEVs and matched tumor tissue, the mean fold-change enrichment of IMs among PDAC subtypes, the p value (−log transformed), and the hazard ratio from univariate Cox analysis. The innermost arc-shaped curve highlights the prognostically consistent targets for tissue and plasma vesicles. (E) Bubble plot indicates the prognostic concordance of 17 IM genes in cEVs, paired tumor tissue, and TCGA expression profiles. Three of the IMs (dotted box marked) predicted PDAC prognosis (p < 0.05) in both blood and tissue datasets. Considering that the ligand-receptor pairs of immune checkpoint genes (ICGs) are key factors that reflect the magnitude and duration of the immune checkpoint response,[129]^20 we evaluated the expression correlation of 19 ICG pairs in cEVs and matched cancer tissue. All 19 pairs were detected in tumors and cEVs ([130]Table S13), and 7 of 19 ICG pairs showed coincident expression patterns (p < 0.01; [131]Figure 5C). The expression levels of nine ligands displayed a significantly positive correlation with their receptors in cEV profiles (p < 0.05; [132]Figure 5C). Overall, four ligand-receptor ICG pairs were activated both in cEVs and tissue, including three inhibitory pairs (LGALS9-HAVCR2, CD80-CTLA4, and PVR-TIGIT) and one stimulatory pair (TNFRSF14-CD160) ([133]Table S14). No significant expression correlation was observed between CD274 (which encodes PD-L1) and PDCD1LG2 (PD-L2) and their co-receptor PDCD1 (PD1) in cEVs ([134]Figure 5C). The inhibitory immune checkpoint receptor HAVCR2 (TIM-3) and its ligand LGALS9 (GAL9) represented the most significant co-expressed targets activated both in cancer tissue and cEVs ([135]Figures 5C, [136]S3A, and S3B). We next investigated whether these immunogenic targets can be noninvasively used as biomarkers to track patient clinical outcomes. We identified 52 IM targets that were differentially expressed among the three PDAC clusters (Kruskal-Wallis rank-sum test, p < 0.05; [137]Figure 4A) and we correlated their expression level with OS in PDAC using HRs from univariate Cox regression model. Higher levels of 34 (65%) IMs indicated an increased risk of death in PDAC (HR > 1), whereas 18 (35%) IMs showed the opposite prognostic trend (HR < 1; [138]Figure 5D, last track). Notably, HAVCR2 (TIM-3) was significantly upregulated in C-III and represented the best survival predictor (p < 0.05; [139]Figures 5D and [140]S3C). [141]Figure 5E summarizes the risk assessment power of all IMs in PDCA from both internal and independent cohorts. Seventeen IMs showed similar prognostic patterns between cEVs and cancer tissue. Among the 17 IMs, two immunosuppressive targets, TIM-3 and VEGFA, were significantly correlated with poor prognosis, while one stimulatory target, SELP, was correlated with good prognosis ([142]Figures 5D, [143]S3D, and S3E). Construction of a tissue/cEV prognostic signature for tracking PDAC clinical outcome We next tried to simplify the above stratification strategy. The sample sizes of our included internal and external cohorts enable adequate power (1-β, more than 0.9) to predict patient survival and prognosis (see [144]Materials and methods). The workflow used to identify a tissue/cEV prognostic signature for PDAC is shown in [145]Figure 6A. We first eliminated exLRs with an expression frequency less than 25% and evaluated the differentially expressed genes among the three PDAC subtypes from exLR-seq profiles in the internal cohort (Kruskal-Wallis rank-sum test, FDR < 0.05; [146]Table S15). We further harvested markers with prognostic concordance in blood and lesions; the univariate Cox approach was used to investigate the blood-tissue prognostic valuable markers in patients with cEV and paired-tissue transcriptomic profiles (p < 0.05, see [147]Materials and methods). We obtained 986 exLRs, most of which (n = 802) were associated with poor prognosis of PDAC (HR > 1); a small number of candidate exLRs (n = 184) were indicated as protective prognostic factors (HR < 1) ([148]Table S16). GO enrichment analysis revealed that the exLRs were involved in regulation of lymphocyte activation and differentiation, innate/adaptive immune responses, RNA transport, and activation of cancer-related cell signaling ([149]Table S17). Figure 6. [150]Figure 6 [151]Open in a new tab Construction of a tissue/cEV prognostic signature for tracking PDAC clinical outcome (A) Workflow of prognostic marker selection and prognostic model construction. DEG, differentially expressed gene. (B) Characteristics of the nine markers with their coefficient and hazard ratios of a multivariate Cox model for PDAC prognosis prediction. (C) Comparison of cp-scores among the exLR-based PDAC subtypes. (D and E) OS curves of PDAC patients were stratified according to high- and low-cp-score in the discovery (D) and independent datasets (E). (F) Multivariate Cox analysis of clinicopathologic factors and cp-score with OS in the entire cohort. Variables with independently prognostic significance are highlighted in red. (G) Analysis of the tissue prognostic score (tp-score) using the paired tissue of the internal cohort; the number of patients in the high-tp-score and low-tp-score groups are n = 31 and n = 30, respectively. (H and I) Kaplan-Meier curve shows the OS (H) and time to recurrence (I) in patients according to tp-score in the TCGA-PDAC cohort. All of the high and low classifications were determined for each patient based on the dichotomized p-score. We further conducted a variable constriction in the internal exLR-seq dataset; the random-forest algorithm and the LASSO-Cox method were used to reduce the dimensionality. Nine exLR targets (HAVCR2, ADK, TP53I11, TBL3, ACD, LGALS9, CANT1, LYL1, and PKIG) were used to construct a model for predicting PDAC prognosis. The immunoregulatory factors HAVCR2 and LGALS9, which were identified as receptor-ligand ICGs, were also selected as candidate prognostic factors ([152]Figure 6B). Using the multivariate Cox algorithm, we established a prognostic model and generated an aggregate circulating exLR-based prognostic score for PDAC (cp-score; [153]Figure 6B, left). C-III exhibited a high median cp-score compared with the other exLR-based clusters (C-I, −0.39; C-II, −0.05; CIII, 0.40; Kruskal-Wallis test, p < 0.001). Kaplan-Meier curves were constructed for PDAC patients separated into high- and low-risk subgroups relative to the optimal cutoff for cp-score (see [154]Materials and methods). The median OS in the low-risk group was significantly higher than that of the high-risk group in the internal and external independent exLR-seq cohorts (p < 0.001, internal cohort, n = 227; p = 0.013, external independent cohort, n = 118; log-rank test; [155]Figures 6D and 6E). Multivariate Cox regression analysis showed that the cp-score was highly associated with the risk of death and was an independent prognostic factor of survival in all exLR-seq datasets (p < 0.001; [156]Figure 6F). We next explored whether the cp-score can be used for evaluation of prognosis using resected tissue. Using the same markers, we reconstructed a prognostic model and generated a tissue-specific p-score (tp-score) from internal paired tissue profiles. The nine candidates showed equivalent power for risk estimation in tissue consistent with that in cEVs ([157]Figure 6B, right). Patients with a high tp-score showed a shortened survival time (p < 0.001, HR = 2.9; 95% confidence interval [CI], 1.66–5.09; [158]Figure 6G). The predictive ability of the tp-score was externally validated in tissue transcriptomic profiles from three independent datasets (TCGA-PDAC, p < 0.01, HR = 2.08; ICGC-PDAC-AU, p < 0.05, HR = 2.99; ICGC-PDAC-CA, p < 0.05, HR = 1.64; [159]Figures 6H, [160]S4A, and S4B). In addition, we evaluated whether the p-score could predict PDAC therapeutic effect by assigning the patients into high or low p-score groups both in cEVs and tissue datasets. The cp-score predicted tumor recurrence for resectable PDAC (stage I/II) cases after surgery (TTR, HR, 1.71; 95% CI, 1.02–2.85; p < 0.05; [161]Figure S4C). Consistent with these findings, a low tp-score in patients showed significant association with prolonged TTR both in the internal (HR, 2.69; 95% CI, 1.49–4.83; p < 0.001; [162]Figure S4D) and independent tissue cohort (HR, 1.94; 95% CI, 1.17–3.22; p < 0.01; TCGA dataset; [163]Figure 6I). Collectively, our results suggested that the cEV-derived prognostic signature harbor equivalent as that of tumor tissue derived for predicting PDAC clinical outcome, including for predicting risk of recurrence after surgery. Discussion PDAC is characterized by a high degree of intra-tumoral heterogeneity and an immunosuppressive TME, which represent the main obstacles to accurate stratification and treatment efficiency.[164]^21^,[165]^22 The difficulty in obtaining tumor specimens from most PDAC patients as well as the molecular heterogeneity of PDAC tumors make liquid biopsy-based biomarker assays crucially important to individualize management.[166]^5^,[167]^13^,[168]^23 In this study, we deciphered the cEV transcriptomic information using exLR-seq profiles from 345 PDAC plasma samples, representing the first study of noninvasive prognostic subtyping of PDAC based on large-scale plasma EV transcriptome analysis. We comprehensively characterized the heterogeneity of tissue-cellular origins and the immunomodulatory landscape from cEV profiles, identified novel therapeutic targets, and constructed a nine exLR-based prognostic signature for PDAC. We identified three PDAC subtypes based on plasma exLR profiles. Each subtype has its own molecular features and is related to patient prognosis. C-III patients had the worst prognosis and exhibited high expression of multiple cancer/immune-related exLRs in plasma. In addition, many signaling pathways shared similar concordance both in cEVs and cancer tissue, which includes PDAC- and immune-related hallmark signatures. Wnt/β-catenin signaling plays a critical role in pancreatic carcinogenesis and therapeutic resistance and regulates cell cycle progression, apoptosis, epithelial-mesenchymal transition, angiogenesis, stemness, and the tumor immune microenvironment.[169]^24 TGF-β signaling pathways play a central role in the immunosuppressive response to cancer.[170]^25 PDAC-derived EVs could transform TGF-β signaling relative signatures into Kupffer cells and are involved in liver pre-metastatic niche formation and TME re-construction to increase PDAC metastatic burden.[171]^9 The crosstalk between TGF-β and WNT/β-catenin signaling in EVs promotes PDAC carcinogenesis. Thus, targeting these signaling pathways is an actively pursued strategy in PDAC treatment.[172]26, [173]27, [174]28 We previously demonstrated that plasma exLRs derived from pancreatic cancer were significantly over-enriched in PDAC cases compared with benign samples and healthy controls.[175]^13 Based on the enumerating results of our EV-origin approach, the quantification of cEVs from specific tissue sources may reflect organ status during disease progression. In this study, cEVs in PDAC patients had significantly dysregulated tissue/organ sources, such as pancreas and liver, and were more likely severely affected by neoplastic invasion from primary and metastatic lesions, suggesting that the tissue/solid organ tracing strategy of our approach may noninvasively monitor the occurrence and development of PDAC in real time. We also observed that the cytotoxic immune cell components with a tumor-killing effect in PDAC were significantly reduced while immunosuppressive-related components were increased, reflecting neoplastic education to the immune system in response to carcinogenesis.[176]29, [177]30, [178]31 PDAC comprises tumor cells within a microenvironment of extracellular matrix, stroma cells, endothelial cells, and immune cells.[179]^32 The therapeutic limitations of chemotherapy and immunotherapy have been attributed to the PDAC-specific TME.[180]^21 We hypothesize that analysis of the immune cell source contribution of cEVs could partially trace the body’s anti-tumor immune response and the dynamic alternation of the TME. A previous study reported that the PDAC-specific TME was predominantly affected by immunosuppressive cells, which may lead to the dysfunction of infiltrated CD8^+ T cells.[181]^33 These findings are consistent with our results and show that although cEV components from anti-tumor immune cells (such as terminal effector CD8/CD4^+ T cells) were enriched in C-III, the patient prognosis and clinical outcome were still dismal. At the same time, monocyte-derived suppressor components, as well as Treg cells, are partnering factors that jointly contribute to local immunosuppression and tumor immune escape.[182]^34^,[183]^35 In agreement with this, we found that monocyte- and Treg cell-originated EV fractions were significantly associated with aggressive phenotype and indicated poor patient outcome. In addition, pancreatic neoplasia cells may produce high levels of cytokines to recruit Treg cells, which in turn inhibit the anti-tumor effects of cytotoxic T lymphocytes.[184]^36 Treg cells engage in extended interactions with tumor-infiltrated DCs and restrain their immunogenic function by suppressing the expression of costimulatory ligands necessary for CD8^+ T cell activation.[185]^37 Consistent with the results observed in the TME, we found that the abundance of EVs derived from Treg cells was positively correlated with DCs and CD8 T cells infiltrating in tissues, suggesting that these interactions might be relevant to the immunosuppressive function of Treg cells. The immunosuppressive cEV components partially explain the poor prognosis in C-III. Therefore, exploration of these immune cell-derived components can provide a noninvasive and joint approach to dissect TME alterations and provide evidence for PDAC treatment. Advances in immune checkpoint blockade have brought a renewed appreciation of the value for reversing the immunosuppressive TME,[186]^38 and recent studies have focused on targeting PD-1/PDL1 and CTLA-4 in cancer treatment.[187]^39^,[188]^40 However, it appears that other key regulators also suppress the host’s immune system and function in immunotherapy resistance in cancers, including PDAC. By determining the heterogeneous expression pattern of ICGs within PDAC subtypes, we identified inhibitory immune checkpoint receptors, including TIM-3 and TIGIT, which were upregulated in cEVs and cancer tissue. TIM-3 and TIGIT expressed on the surface of T lymphocytes are bound by their ligands, Galectin-9 (encoded by LGALS9) and CD155 (PVR), to induce T cell exhaustion and immune escape.[189]^41^,[190]^42 Several anti-TIM-3 agents have been evaluated in clinical trials for solid and hemopoietic malignances.[191]^43 Targeting the TIM-3/LGALS9 and PVR/TIGIT axis may represent a promising approach for improvement of current immune checkpoint therapies.[192]^44^,[193]^45 We confirmed that this pattern of co-expression activation of ligand-receptor pairs can be detected in cEVs and harbored prognostic significance as in tissue. In addition, these suppressive ICGs were strictly enriched in the aggressive subsets, which may also explain why most patients with advanced PDAC are resistant to immunotherapies.[194]^46 Our study had several strengths. First, this is the first PDAC subtyping study using cEV long RNA profiling. Second, the EV-origin deconvolution study has provided the first comprehensive landscape of tissue-cellular origins for cEVs in PDAC and suggested that immune cell-derived cEV components could reflect the TME heterogeneity, thus allowing dynamic monitoring of TME remodeling in a real-time, noninvasive fashion. However, this study has some limitations. The subtyping study was based on retrospective cohorts, and therefore most patients did not have enough gene mutation and cancer tissue-based molecular information to annotate exLR-based molecular subtypes. Moreover, the tumor samples matched with plasma EVs included in our study were derived from early resectable PDAC tumor tissue, which may insufficiently explain the characteristic correlation in advanced PDAC. Conclusions Our findings demonstrated the potential of circulating exLR for PDAC subtyping. Based on a deconvolution study and paired cancer tissue profiling, we identified a profound correlation of the immune landscape between tumors and cEVs and identified novel PDAC therapeutic targets. Importantly, we constructed a robust nine exLR-based tissue-applicable signature for prognostication assessment of PDAC. Therefore, cEV long RNA-based subtyping and deconvolution enable noninvasive prediction of immunogenic signatures and clinical outcomes for PDAC. Materials and methods Study design This study employed a two-phase design. First, the subtyping, deconvoluting, and biomarker discovery phase was conducted, where we comprehensively analyzed the circulating exLR-seq transcriptomic data from patients with PDAC in the internal cohort for the identification of exLR-based PDAC phenotypes and clinical significance. We deconvoluted the tissue-cellular origin of cEVs between healthy controls and PDAC samples and investigated the potential clinical efficacy of immunogenic components at tissue and circulatory levels. In the second phase, we identified a clinically translatable signature from cEVs that predicts patient survival and tumor relapse. We performed validation of the discovered biomarkers both in cEVs and tissue profiles from multiple clinical cohorts of patients with PDAC. Patient cohorts, sample characteristics, and data generation A total of 426 individuals were enrolled in this study, including 345 patients with pathologically confirmed PDAC and 81 healthy controls receiving routine healthcare. Among the 345 PDAC cases, 227 plasma specimens were collected from Fudan University Shanghai Cancer Center (Shanghai, China) (as an internal cohort) between January 31, 2012 and June 6, 2017, and 118 PDAC cases (as an external independent cohort) were recruited from Shanghai Hospital of the Second Military Medical University (Shanghai, China; n = 100) and Xi’an Jiaotong University Affiliated Medical Center (Xi’an, China; n = 18) between September 11, 2018 and April 2, 2019. Plasma was collected from PDAC patients before surgery for patients with resectable tumors and before chemotherapy for patients with inoperable tumors. The available plasma EV transcriptomic profiles originated from our previously published study[195]^13 and can be downloaded from the Gene Expression Omnibus (GEO) dataset with the access no. GEO: [196]GSE133684. For tissue specimen collection, we obtained 61 formalin-fixed, paraffin-embedded exLR-seq-matched PDAC tumor tissues from the internal cohort. The transcriptomic expression profiles of plasma EVs and matched tissue specimens from 61 PDAC patients from the internal cohort with OS and TTR information were determined. We downloaded tissue transcriptomic profiles from three external PDAC datasets (TCGA-PDAC, n = 150; ICGC-PDAC-CA, n = 131; ICGC-PDAC-AU, n = 73) to further validate the candidate exLR signature at the tissue level. The demographics and clinicopathologic characteristics for all patient cohorts are shown in [197]Table S1. We isolated plasma EVs by affinity-based binding to a spin column using the exoRNeasy serum/plasma kit (QIAGEN, Hilden, Germany). A SMARTer stranded total RNA-seq kit (Clontech, Palo Alto, CA, USA) was used to prepare the strand-specific RNA-seq libraries. The library quality of the 10 plasma exLR-seq samples per lane was sequenced with adding 20% PhiX to control base balance. The previously published Assembling Splice Junctions Analysis (ASJA) approach was used to conduct quality control, reads mapping, and transcripts quantification.[198]^47 The raw count matrixes were transformed into transcripts per kilobase million (TPM) values with gencodeV29 annotation to make gene expression profiles more comparable between cEVs and paired tissues. The expression of a circRNA gene for each exLR-seq sample was calculated as spliced reads per billion mapping (SRPBM = number of circular reads/total mapped reads [units in billion]). For the external TCGA-PDAC cohort, the RNA sequencing data and updated clinicopathological information were downloaded from UCSCXenaTools package ([199]https://github.com/cran/UCSCXenaTools). Digital quantification of tissue-cellular-originated cEVs and TILs We used the EV-origin approach to quantify the abundance of EV fractions from blood and tissue origins from plasma exLR expression data.[200]^16 The EV-origin method was constructed from two representative signature matrices and a core SVM regression algorithm to deconvolute 7 hemopoietic cells and 16 tissue components from exLR-seq transcriptomic profiles. In this study, we expanded the number of traced hemopoietic fractions to 23, including 21 types of immunocytes and 2 hematopoietic stem cell constituents (see [201]Table S11). The new hemopoietic signature matrix was constructed by the normalized expression profiles of isolated blood cells from two datasets.[202]^16^,[203]^48 We selected cell-specific genes and re-built the comparable blood signature matrix by the same method as EV-origin. The relative proportions and absolute fractions of 23 hemopoietic cells and 16 types of tissue/solid organ-derived cEV constituents were detected in each exLR-seq sample. To determine the abundance of infiltrating immune cells in PDAC tissue, we used a single-sample GSEA (ssGSEA) integrated with TIL gene signatures,[204]^49^,[205]^50 which allows for reliably discriminating the enrichment of 17 human immune cell phenotypes from tissue expression profiles. exLR-based PDAC subtyping We used the ConsensusClusterPlus package to select the optimal cluster number and determined the clustering stability for the internal cohort.[206]^51 To capture stabilized exLR-based subtypes of PDAC, we first removed the exLRs with expression frequency no more than 50% to reduce the influence of low-frequency noises. Then we used a median absolute deviation (MAD) algorithm (a robust measure of sample bias for univariate numerical data) to fully characterized the exLR expressional viability in the internal exLR-seq cohort. We finally harvested 8,922 exLR targets (top 60% targets of MAD quantitative ranking) as input markers for consensus clustering. These exLR-seq samples were grouped using hierarchical agglomerative clustering based on Manhattan distance and the strategy of partition around medoids (PAM). We selected the iteration number as 500 and the subsampling ratio as 0.8 to identify optimal clusters from a consensus similarity matrix generated by consensus clustering. Three completely segregated subtypes from the exLR-seq data in the internal cohort were obtained. These discovered clusters were used as representative subpopulations for further analysis. Building a predictive model for PDAC clinical outcome On the basis of the exLR-seq analysis, we developed a combined circulating prognostic score (cp-score) for noninvasively estimating PDAC prognosis. We used the 227 cases in the internal cohort for discovery and 118 cases in the external independent cohort for validation. In the exLR-seq data from the internal cohort, we removed exLRs with a frequency less than 25% to obtain reliably expressed variables. A Kruskal-Wallis test was used to assess the differential expression of exLRs among the three clusters, and the p value of each marker was adjusted by the Benjamini-Hochberg method to control the FDR.[207]^52 We retained exLRs with an FDR less than 0.05 and carried out a consecutive model-based variable selection approach to identify prognostic valuable markers for predicting survival outcome.[208]^53 We first performed a univariate pre-screening procedure to remove excessive noise and obtain cancer-related targets to facilitate the subsequent computational analysis. We used whole-transcriptome markers as covariates, and the univariate Cox proportional hazards model was fitted to explore prognostic targets both in cEVs and paired tumor tissue profiles. Only markers with a p value <0.05 from the Wald statistic were retained. Overall, we obtained 986 exLRs that concurrently showed prognostic concordance in blood and tissue, in which the expression level indicates the same prognostic risk both in patient cEVs and tissues. We then used LASSO-Cox and the random forest algorithm to further shrink the marker number to a reasonable range.[209]^54^,[210]^55 For the LASSO-Cox constraint, 80% of exLR-seq samples from the training cohort were randomly subsampled to conduct LASSO regularization in 1,000 repetitions. We used 10-fold cross-validation and the Akaike information criterion to select an optimal value of “1-se” lambda parameter to construct an adaptive prognostic model for marker selection. The Boruta algorithm (a wrapper built around the random forest classification algorithm) was used to concurrently conduct dimension reduction and screen prognostically significant exLRs. This process generated nine final markers for a prognostic signature in the internal and validation cohorts. By fitting a multivariable Cox proportional hazards model on these nine targets from the internal EVs and tissue dataset, we determined the coefficients of each marker and generated two combined prognostic scores (p-score, designated as the cp-score for blood EVs and the tp-score for PDAC tumor tissue) for each individual. We then validated the significance of the prognostic scores in the independent datasets for the assessment of clinical outcome of PDAC. Functional and pathway enrichment analysis Weighted gene co-expression network analysis (WGCNA) was performed to construct co-expressed networks and identify co-expression of exLR modules in the exLR-seq data from the internal cohort.[211]^56 We used median absolute deviation to estimate the degree of variation for exLR expression and applied WGCNA based on the exLR-seq expression data constructed by the top 5,000 aberrantly expressed exLRs. We first created a matrix of adjacencies by conducting the Pearson correlation analysis between gene pairs to determine the concordances of gene expression; this matrix was then transformed into a topological overlap matrix. The resulting topological overlap is a biologically meaningful measure of gene similarity based on co-expression relationships for the comparison of each gene pair. To identify the relatively balanced scale independence and mean connectivity of the WGCNA, we carefully selected the scale-free topology fit index to find an optimal soft-thresholding power. As shown in [212]Figure S2A, the power value 16 was used to produce a hierarchical clustering dendrogram of 5,000 variable exLR genes. The module eigengenes were determined as the primary key components computed by principal-component analysis that represents the manifestation of exLRs of a functional module into a unitary characteristic quantitative value. The manifestation configurations of exLR modules related to the traits of internal individuals were quantified by gene significance (GS) and module significance (MS). The GS measure was defined as the results of the Spearman correlation analysis among the ith gene profile xi and the sample trait T, and MS was defined as the average GS for all genes in the module: [MATH: GSi =|cor(xi,T)|< /mo>. :MATH] The exLR genes were separated into eight co-expression modules according to their degree of connectivity. Each subunit was autonomously authenticated with another, which revealed a high level of individuality among the modules and the comparative independence of gene manifestation in each module. The exLRs that could not be contained in any modules were placed into module 8, which was removed in the subsequent GO analysis. All modules revealed a high correlation with PDAC phenotypes (C-I and C-III) and stage ([213]Figure 2A). We represented seven modules (M1–M7) as the most significant modules to reflect cEV inner transcriptomic heterogeneity among PDAC clusters. Gene annotation enrichment results were estimated from the clusterProfiler package[214]^57 for each module-specific exLR gene. We used a strict cutoff of FDR <0.05 to identify GO terms that were significantly enriched in each module ([215]Tables S2–S8). We identified the aberrantly enriched signaling pathways among PDAC subtypes by running a pre-ranked GSEA of the expression data for exLR genes. Gene sets were downloaded from the MSigDB platform of the Broad Institute that includes current annotations of Hallmark, Reactome, and KEGG datasets (Molecular Signatures Database v7.2).[216]^58^,[217]^59 The p value calculations were based on 5,000 permutations of GSEA, and the multiple testing was subsequently adjusted by Benjamini-Hochberg method to regulate the FDR.[218]^52 Statistical analysis We used the Shapiro-Wilk normality test to examine the normality of quantitative variables. For the comparison of variables in two groups, the statistical significance for normally distributed variates was evaluated by a two-side Student’s t tests, and the group with non-normal distributions was tested by a Mann-Whitney U test (Wilcoxon rank-sum test). For the comparison of unreplicated data in more than two groups, the Kruskal-Wallis test and Friedman rank-sum test were implemented as non-parametric statistical approaches. We applied the Benjamini-Hochberg strategy to adjust the p value of multiple DEG analysis to control the FDR.[219]^52 The logistic regression was used to determine the statistical significance of the correlation between qualitative variables and PDAC subtypes. The optimal cutoff values of each dataset were calculated based on the correlation between patient survival data and continuous variables in each separate dataset using the survminer package. The Kaplan-Meier approach was used to construct cumulative survival curves for subgroups of each cohort, and the log-rank test was implemented to determine the statistical difference between groups. We used the univariate Cox proportional hazards regression model to analyze HRs for univariate prognostic analysis. We hypothesized that the power of prognostic evaluation could be more than 0.9 with the introduction of exLRs, EV-derived components, and p-signature. With a two-side alpha (significance level) at 0.05, an overall event rate set at 0.93 for the internal cohort and 0.72 for the independent cohort, and an HR at 1.5, the included sample size was sufficient in the internal and validation cohorts to ensure a power of more than 95% for predicting PDAC prognosis (according to the PASS 11.0 software of power [1−β] calculation for the Cox proportional hazards regression analysis with covariates).[220]^60 The prognostic independence of the p-score was determined by the multivariate Cox regression model. All statistical analyses were conducted by R program (v3.6.1, [221]https://www.r-project.org/). The p values were two-sided, and a p value less than 0.05 was considered statistically significant. Ethical approval and consent to participate This study was approved by the Ethics Committee of the Fudan University Shanghai Cancer Center, Shanghai, China (approval no. 050432-4-1212B), and written informed consent was obtained from each participant in accordance with institutional guidelines. Consent for publication All of the authors have read and approved the final manuscript for publication. Availability of supporting data The transcriptomic data have been deposited at the GEO. The raw sequencing reads and integrated count matrix can be fully accessed under GEO: [222]GSE133684 and [223]GSE181273. Acknowledgments