Abstract

Background

Pre-mRNA processing factor 40 homolog A (PRPF40A) is an important protein involved in pre-mRNA splicing and is expressed in a variety of cell types. However, the function of PRPF40A in pancreatic cancer remains unclear. This study therefore aimed to investigate the role of PRPF40A in the pathogenesis of pancreatic cancer.

Materials and methods

We extracted expression data and clinical information for PRPF40A from several online databases, including the Cancer Genome Atlas (TCGA), Oncomine and the Gene Expression Omnibus (GEO). Subsequently, samples were collected from patients to validate gene expression using qPCR, Western blotting and immunohistochemical (IHC) analyses. Receiver operating characteristic (ROC) and Kaplan-Meier curves were used to evaluate the diagnostic and prognostic potential. Colony formation assays and CCK-8 assays were performed to measure the proliferative capacity of pancreatic cancer cells. Finally, gene ontology (GO) and pathway enrichment analyses of the genes co-expressed with PRPF40A were conducted using the Database for Annotation, Visualization and Integrated Discovery (DAVID).

Results

We found that PRPF40A was upregulated based on data from both the online databases and our samples. PRPF40A possessed a significant diagnostic value, and its overexpression was associated with poor prognosis. PRPF40A knockdown inhibited cell proliferation in pancreatic cancer. GO and pathway analysis showed that the co-expressed genes were mainly involved in viral processing, mRNA splicing and the AMPK signaling pathway.

Conclusion

The results suggest that PRPF40A is an oncogene and can serve as a diagnostic and prognostic biomarker for pancreatic cancer. However, the underlying mechanisms remain to be elucidated.
Keywords: PRPF40A, pancreatic cancer, diagnosis, prognosis, biomarker

Introduction

Pancreatic cancer is often not diagnosed until an advanced stage because of its anatomical location and aggressive nature, which results in a high mortality rate. The prognosis of pancreatic cancer is typically worse than that of other digestive tumors, and the 5-year survival rate is less than 5%.^1 Chemotherapy is not always effective, and radical resection is currently the only curative treatment for pancreatic cancer.^2,3 Therefore, more effective methods for early diagnosis and prognostic assessment are required to improve pre-operative evaluation and personalized treatment for patients with pancreatic cancer. At present, targeted therapy for pancreatic cancer is a topic of increasing interest.^4 Previous studies have discovered several clinically useful markers of pancreatic cancer. For instance, CA19-9 and CA125 are well-established diagnostic and prognostic markers for pancreatic cancer.^5,6 Moreover, MIC-1, PAM4, S100A6 and OPN can serve as diagnostic markers for pancreatic cancer, while SPARC, CISD2 and SMAD4 have been identified as promising prognostic markers.^7–13 PRPF40A is one of the two putative homologs of Pre-mRNA processing protein 40, which plays a vital role in the initiation of pre-mRNA splicing.^14,15 It has been reported that these two homologs are associated with genetic diseases such as Rett syndrome and Huntington’s disease.^16,17 In addition, PRPF40A may be one of the potential target genes of p53, which can cause malignant human cancer when it is mutated.^18 Recent studies found that PRPF40A is associated with the hypoxia response in lung cancer and is overexpressed in pancreatic ductal adenocarcinoma.^19,20 However, the role of PRPF40A in pancreatic cancer and its underlying molecular mechanisms remain unclear.
To study the expression pattern of PRPF40A and the biological mechanisms involved in tumorigenesis and the progression of pancreatic cancer, we extracted gene expression data and clinical information from online databases (TCGA, Oncomine and GEO) and obtained the genes co-expressed with PRPF40A from GEPIA. Subsequently, gene ontology and pathway analyses were performed using DAVID, and a co-expression network was constructed using GeneMANIA and Cytoscape. Colony formation assays and CCK-8 assays were also performed. In addition, specimens and clinicopathological data were collected from patients treated at our hospital and analyzed to validate the results of the bioinformatic analysis.

Materials and methods

Bioinformatic analysis

Expression and survival data extraction

The results of the differential expression analysis performed on tumor and normal tissues from patients with cancer, including pancreatic cancer, were obtained and analyzed using the Gene Expression Profiling Interactive Analysis (GEPIA) server (http://gepia.cancer-pku.cn/index.html),^21 which is based on the Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) project and contains 171 normal and 179 cancer samples in total. Overall survival and disease-free survival curves were also generated based on this database.
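The overall and disease-free survival curves mentioned above are Kaplan-Meier estimates. As a minimal sketch of how such a curve is computed (this is not GEPIA's implementation; the function name and the follow-up times and event flags are invented for illustration), the product-limit estimator can be written as:

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort: months of follow-up and event indicator.
toy_times = [5, 8, 8, 12, 16, 20]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients contribute to the risk set up to their last follow-up but never lower the curve themselves, which is why the step at month 8 in the toy data reflects one death among five patients still at risk.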
Differential expression analysis was performed using the Oncomine database (http://www.oncomine.com)^22 with data from three datasets that were uploaded by Segara (6 normal and 11 cancer tissues),^23 Pei (16 normal tissues and 36 cancer tissues)^24 and Badea (39 tumor tissues and paired normal tissues).^25 In addition, three datasets (GSE74629, GSE71729 and GSE62452) were obtained from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/).^26 GSE74629 contains 14 normal and 36 tumor samples, GSE71729 contains 46 normal and 145 tumor samples, and GSE62452 comprises 69 tumor and paired normal samples. Six GEO datasets were used to determine the diagnostic power of PRPF40A: GSE1542 (24 tumor vs 25 adjacent normal tissues), GSE15471 (39 tumor and paired adjacent normal tissues), GSE16515 (36 tumor vs 16 adjacent normal tissues), GSE28735 (45 tumor and paired adjacent normal tissues), GSE62452 (69 tumor vs 61 adjacent normal tissues) and GSE71729 (145 tumor vs 46 adjacent normal tissues).

Gene ontology (GO) and pathway enrichment analysis

The top 100 genes co-expressed with PRPF40A in pancreatic cancer were screened using GEPIA and imported into the Database for Annotation, Visualization and Integrated Discovery (DAVID) (http://david.abcc.ncifcrf.gov/)^27 to perform the GO and pathway analysis. GO analysis is a bioinformatics method that is mainly used to annotate genes and categorize them according to biological process (BP), cellular component (CC) and molecular function (MF).^28 Pathway analysis was performed using the Kyoto Encyclopedia of Genes and Genomes (KEGG), which is a database that contains various kinds of data from large molecular datasets generated using high-throughput experimental technologies.^29 The results were visualized using GraphPad Prism 7.
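Enrichment analyses of this kind score each GO term or pathway by asking how surprising its overlap with the submitted gene list is. The following is a minimal sketch that assumes a plain one-sided hypergeometric test; DAVID itself reports a more conservative modified Fisher exact P-value (the EASE score), so this illustrates the idea rather than its exact calculation. The function name and all counts are invented.

```python
from math import comb


def enrichment_p(hits, list_size, term_size, background):
    """P(X >= hits) when list_size genes are drawn without replacement
    from `background` genes, of which term_size carry the annotation."""
    max_hits = min(list_size, term_size)
    total = comb(background, list_size)
    tail = sum(
        comb(term_size, k) * comb(background - term_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    # Exact integer arithmetic up to this point; one division at the end.
    return tail / total


# Hypothetical numbers: 8 of 100 submitted genes carry a term that is
# annotated to 300 of 20000 background genes (about 1.5 expected by chance).
p_value = enrichment_p(hits=8, list_size=100, term_size=300, background=20000)
```

In practice the per-term P-values are then corrected for the number of terms tested, since hundreds of terms are evaluated against the same list.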
Protein-protein interaction (PPI) network construction and analysis

The data for PRPF40A and its co-expressed genes were uploaded into GeneMANIA (http://genemania.org/), an online database that predicts protein-protein interactions based on data collected from GEO, BioGRID, IRefIndex and I2D,^30 and then imported into Cytoscape^31 to generate a PPI network. GO analysis of these genes was performed and visualized using the BiNGO plugin in Cytoscape.^32

Clinical samples

Forty-four surgical specimens of pancreatic cancer, together with adjacent normal tissues located at least 2 cm from the tumor margin, were collected from patients who were diagnosed on the basis of clinical features and histopathological examination at Ruijin Hospital, which is affiliated with Shanghai Jiao Tong University. Twenty-nine males and 15 females, with an average age of 63 years (range: 40–78), were included in this study. The histological type was identified by two expert pathologists independently. The tumor samples used in this study were all derived from pancreatic ductal adenocarcinoma. The clinicopathological data were gathered from medical records, and the tumor stages were classified according to the AJCC pancreatic cancer TNM staging system (2017).^33 Written informed consent was obtained from all participating patients, and the experiments involving human specimens were conducted in compliance with the Ethics Committee of Human Experimentation and with the Declaration of Helsinki as revised in 2013. The study protocol was approved by the Ethics Committee of Shanghai Ruijin Hospital (No.179 in 2017) for the scientific research of these clinical materials.

Immunohistochemical analysis

The specimens were fixed in formalin, embedded in paraffin and cut into 4 μm sections. The sections were incubated overnight at 4 °C in the presence of primary antibody against PRPF40A (rabbit polyclonal; ab204371, 1:500; Abcam, USA).
Negative controls were generated by incubating without the primary antibody. After the slides were stained with DAB and counterstained with hematoxylin, the PRPF40A staining was examined using Image-ProPlus (version 6.0, Media Cybernetics, Rockville, USA) as previously described.^34 The specimens with negative or weak intensity staining (-/+) of PRPF40A were considered to have low expression of PRPF40A, while those with moderate or strong (++/+++) intensity staining were considered to have high expression.

Cell culture and Western blotting

Seven cell lines derived from pancreatic tumors (sw1990, Panc-1, patu8988, Bxpc3, Aspc1, CFPAC1, and Capan1) and the human pancreatic ductal epithelial (HPDE) cell line were acquired from the Cell Bank of the Chinese Academy of Sciences and were cultured in RPMI 1640, DMEM or IMDM supplemented with 10% fetal bovine serum (FBS) and antibiotics. Western blotting was used to analyze protein expression, and the results were quantified using ImageJ. Briefly, each type of cell was collected in cold PBS and lysed using RIPA buffer (Sigma-Aldrich; R0278) supplemented with protease and phosphatase inhibitor cocktails (Sigma-Aldrich; P8340). The proteins were transferred onto nitrocellulose membranes following sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Antibodies against PRPF40A (rabbit polyclonal; ab204371, 1:500; Abcam, USA) and GAPDH (rabbit polyclonal; D16H11, 1:1000; CST, USA) were used to detect the corresponding proteins.

Cell proliferation assays

Colony formation assays were performed by plating pancreatic cancer cells (Capan1) treated with PRPF40A-siRNA or an empty vector in 6-well plates (1000 cells/well) with 5 mL culture medium. After 14 days of incubation, the colonies were stained with 0.1% crystal violet solution. The cell proliferation capacity was also measured using a Cell Counting Kit-8 (CCK-8) assay (Dojindo, Japan).
Numbers of viable cells were quantified at 24 h intervals by measuring the OD450 using a microplate reader (Epoch; BioTek, Winooski, VT). The experiments were performed independently in triplicate.

Quantitative real-time PCR

Total RNA from 12 pairs of pancreatic cancer tissues was extracted using Trizol reagent (Invitrogen, Carlsbad, CA, USA), and reverse transcription (RT) reactions were performed using random primers and an M-MLV Reverse Transcriptase kit (Invitrogen). Real-time PCR was performed using the standard protocol included with the SYBR Green PCR kit (Toyobo, Osaka, Japan) on a Rotor-Gene RG-3000A (Corbett Life Science, Sydney, NSW, Australia). GAPDH was used as a reference. The primer sequences used for the detection of PRPF40A and GAPDH are shown in Table 1. The ΔCt values were normalized based on the GAPDH levels. Each sample was analyzed in triplicate.

Table 1. Primer sequences used for mRNA detection

| Primer name | Forward primer | Reverse primer |
| PRPF40A | 5′-ACACCTGCTGAGCAACTCTTA-3′ | 5′-TGGCCCAGCGAGATTCTTTTG-3′ |
| GAPDH | 5′-GGAGCGAGATCCCTCCAAAAT-3′ | 5′-GGCTGTTGTCATACTTCTCATGG-3′ |

Statistical analysis

SPSS version 20.0 (SPSS Inc., Chicago, IL, USA) was used to perform the statistical analysis. The correlation between PRPF40A expression and the clinicopathological characteristics of the patients was analyzed using the Chi-squared test. ROC curves were plotted to assess the diagnostic value, and the area under the ROC curve (AUC) was used to determine the diagnostic power. An AUC value of 0.5–0.7, 0.7–0.9 or 0.9–1.0 represented a poor, moderate or high diagnostic value, respectively. Overall survival was determined using Kaplan-Meier analysis with a log-rank test. Cox regression analysis was used to assess the hazard ratio (HR) and 95% confidence intervals (CI). A value of P<0.05 was considered to denote statistical significance.
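As an illustration of the Chi-squared test named above, the sketch below computes the Pearson statistic and its P-value for a 2×2 table. The counts are the low and high expression counts reported for tumor and adjacent tissue in Table 2. The helper name is hypothetical, and omitting the Yates continuity correction is an assumption on our part, chosen because it is the variant that matches the reported statistic.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2.0))
    return stat, p_value


# Rows: pancreatic cancer tissue, adjacent tissue.
# Columns: low expression, high expression (counts from Table 2).
chi2, p = chi2_2x2(12, 32, 23, 21)
```

With these counts the sketch gives a statistic of about 5.74 and a P-value of about 0.017, in line with the values listed in Table 2.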
Results

The expression of PRPF40A is upregulated in patients with pancreatic cancer

To measure the differential expression of PRPF40A in cancer, we extracted expression data from various types of cancers using Gene Expression Profiling Interactive Analysis (GEPIA), which showed that PRPF40A was overexpressed in many digestive cancers, such as gastric cancer, esophageal cancer, pancreatic cancer and colorectal cancer (Figure 1A). To further investigate the expression of PRPF40A in pancreatic cancer, differential expression analysis was performed in Oncomine using three datasets, all of which showed significant upregulation of PRPF40A (Figure 1B–D). GEPIA analysis demonstrated that PRPF40A was significantly overexpressed in tumor tissues compared with normal tissues (Figure 1E). Similarly, significant upregulation of PRPF40A was also detected in all three datasets obtained from the Gene Expression Omnibus (GEO) (Figure 1F).

Figure 1. Differential expression of PRPF40A. Notes: Expression of PRPF40A (A) in various cancers, (B–D) in three separate Oncomine datasets that were uploaded by Segara, Pei and Badea, (E) in the Gene Expression Profiling Interactive Analysis (GEPIA) database and (F) in three Gene Expression Omnibus (GEO) datasets (GSE74629, GSE71729 and GSE62452). *P<0.05, **P<0.01, ****P<0.0001. Abbreviations: N, normal; T, tumor.

Validation of the PRPF40A expression level in pancreatic tissues and cell lines

To validate the PRPF40A expression observed in the online databases, immunohistochemical analysis was performed on the 44 paired samples. The results are shown in Table 2 and indicate that pancreatic cancer tissues express more PRPF40A protein than adjacent tissue. Representative images are shown in Figure 2A.
In addition, RNA expression was detected in 12 paired tissues using qRT-PCR, and we found that PRPF40A was significantly upregulated in tumor tissues (2.968±3.265 vs 6.323±4.885; P<0.05) (Figure 2B). Moreover, the expression level of PRPF40A was also studied in cell lines using qRT-PCR and Western blot, and the results showed that PRPF40A was upregulated in patu8988 and Capan1 cells (Figure 2C and D). All of the results mentioned above indicate that PRPF40A may be associated with carcinogenesis or the progression of pancreatic cancer.

Table 2. Expression of PRPF40A in pancreatic cancer tissue and paired normal tissue by immunohistochemical analysis (n=44)

| Tissue | Cases | Low expression | High expression | χ2 | P-value |
| Pancreatic cancer tissue | 44 | 12 (27.3%) | 32 (72.7%) | 5.74 | 0.017* |
| Adjacent tissue | 44 | 23 (52.3%) | 21 (47.7%) | | |

Notes: *P<0.05 was considered to denote statistical significance.

Figure 2. Biological validation of PRPF40A expression. Notes: (A) Left panel: Representative images of PRPF40A expression in normal and pancreatic cancer tissues generated using immunohistochemical analysis. Right panel: Percentage of positive cells on each slide, measured by Image-ProPlus. (B) PRPF40A expression in 12 patient samples, as measured by qRT-PCR. (C and D) PRPF40A expression in various cell lines according to qRT-PCR and Western blotting. The results were quantified using ImageJ. *P<0.05.

The correlations between the expression of PRPF40A and clinicopathological parameters

To investigate the clinical relevance of PRPF40A expression in pancreatic cancer, baseline clinical characteristics of the 44 patients included in this study were collected and analyzed. As shown in Table 3, the expression of PRPF40A was correlated with the level of tumor differentiation (P=0.0157) but not with the TNM stage, tumor location, the degree of nerve and vascular invasion or distant metastasis.
Moreover, we extracted and analyzed clinical and mRNA expression data for pancreatic cancer from the Cancer Genome Atlas (TCGA) database. The results in Table 4 suggest that no significant association exists between PRPF40A expression and the pathological stage, TNM stage or R stage.

Table 3. The correlation between PRPF40A expression and clinicopathological features of patients with pancreatic cancer

| Characteristics | n=44 | Low expression | High expression | P-value |
| Gender: Male | 29 (65.9%) | 8 (27.6%) | 21 (72.4%) | 0.948 |
| Gender: Female | 15 (34.1%) | 4 (26.7%) | 11 (73.3%) | |
| Age (years): ≥60 | 29 (65.9%) | 9 (31%) | 20 (69%) | 0.500 |
| Age (years): <60 | 15 (34.1%) | 3 (20%) | 12 (80%) | |
| Tumor size (cm): ≥3 | 30 (68.2%) | 7 (23.3%) | 23 (76.7%) | 0.475 |
| Tumor size (cm): <3 | 14 (31.8%) | 5 (35.7%) | 9 (64.3%) | |
| Location: Head and neck | 34 (77.3%) | 10 (29.4%) | 24 (70.6%) | 0.701 |
| Location: Body and tail | 10 (22.7%) | 2 (20%) | 8 (80%) | |
| Differentiation: Poorly differentiated | 22 (50.0%) | 2 (9%) | 20 (91%) | 0.0157* |
| Differentiation: Moderately and well differentiated | 22 (50.0%) | 10 (45.4%) | 12 (54.6%) | |
| TNM stage: Ⅰ+Ⅱ | 41 (93.2%) | 12 (29.3%) | 29 (70.7%) | 0.550 |
| TNM stage: Ⅲ+Ⅳ | 3 (6.8%) | 0 (0%) | 3 (100%) | |
| Lymph node metastasis: Yes | 17 (38.6%) | 4 (23.5%) | 13 (76.5%) | 0.739 |
| Lymph node metastasis: No | 27 (61.4%) | 8 (29.6%) | 19 (70.4%) | |
| Nerve invasion: Yes | 28 (63.6%) | 9 (32.1%) | 19 (67.9%) | 0.486 |
| Nerve invasion: No | 16 (36.4%) | 3 (18.8%) | 13 (81.2%) | |
| Vascular invasion: Yes | 8 (18.2%) | 1 (12.5%) | 7 (87.5%) | 0.412 |
| Vascular invasion: No | 36 (81.8%) | 11 (30.6%) | 25 (69.4%) | |
| Distant metastasis: Yes | 2 (4.5%) | 0 (0%) | 2 (100%) | 0.375 |
| Distant metastasis: No | 42 (95.5%) | 12 (28.6%) | 30 (71.4%) | |

Notes: *P<0.05 was considered to denote statistical significance.

Table 4.
The association between PRPF40A expression and clinicopathological features of patients with pancreatic cancer in TCGA database

| Characteristics | n=163 | Low expression | High expression | P-value |
| Gender: Male | 91 (55.8%) | 44 (48.4%) | 47 (51.6%) | 0.7 |
| Gender: Female | 72 (44.2%) | 37 (51.4%) | 35 (48.6%) | |
| Age (years): ≥60 | 115 (70.6%) | 59 (51.3%) | 56 (48.7%) | 0.524 |
| Age (years): <60 | 48 (29.4%) | 22 (45.8%) | 26 (54.2%) | |
| Pathologic stage: I | 19 (11.7%) | 11 (57.9%) | 8 (42.1%) | 0.457 |
| Pathologic stage: II | 136 (83.4%) | 68 (50%) | 68 (50%) | |
| Pathologic stage: III | 3 (1.8%) | 1 (33.3%) | 2 (66.7%) | |
| Pathologic stage: IV | 5 (3.1%) | 1 (20%) | 4 (80%) | |
| T stage: T1 | 6 (3.7%) | 1 (16.7%) | 5 (83.3%) | 0.158 |
| T stage: T2 | 23 (14.1%) | 15 (65.2%) | 8 (34.8%) | |
| T stage: T3 | 131 (80.4%) | 64 (48.9%) | 67 (51.1%) | |
| T stage: T4 | 3 (1.8%) | 1 (33.3%) | 2 (66.7%) | |
| N stage: N0 | 47 (28.8%) | 26 (55.3%) | 21 (44.7%) | 0.656 |
| N stage: N1 | 68 (41.7%) | 32 (47.1%) | 36 (52.9%) | |
| N stage: N2 | 48 (29.5%) | 23 (47.9%) | 25 (52.1%) | |
| M stage: M0 | 77 (47.2%) | 38 (49.4%) | 39 (50.6%) | 0.383 |
| M stage: M1 | 5 (3.1%) | 1 (20%) | 4 (80%) | |
| M stage: Mx | 81 (49.7%) | 42 (51.9%) | 39 (48.1%) | |
| R stage: R0 | 103 (63.2%) | 56 (54.4%) | 47 (45.6%) | 0.208 |
| R stage: R1 | 52 (31.8%) | 20 (38.5%) | 32 (61.5%) | |
| R stage: R2 | 4 (2.5%) | 2 (50%) | 2 (50%) | |
| R stage: Rx | 4 (2.5%) | 3 (75%) | 1 (25%) | |

Abbreviation: TCGA, The Cancer Genome Atlas.

The diagnostic and prognostic value of PRPF40A

To evaluate the diagnostic value of PRPF40A, ROC curves were plotted using data from the six GEO datasets mentioned above. The AUC values were 0.617, 0.811, 0.877, 0.732, 0.733 and 0.686, respectively (Figure 3A–F), suggesting that PRPF40A has an overall moderate diagnostic value for pancreatic cancer (four of the six values fall in the 0.7–0.9 range and two fall below 0.7). To evaluate the prognostic potential of PRPF40A, survival data for patients with pancreatic cancer were analyzed and visualized using GEPIA. The results indicated that high expression of PRPF40A was an unfavorable prognostic factor for overall survival (P=0.037) and disease-free survival (P=0.013) (Figure 4A and B).
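The AUC of a ROC curve equals the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample, which gives a compact way to compute it. The sketch below uses hypothetical helper names and invented expression values for the AUC calculation. The grading thresholds are those given in the statistical analysis section, the six AUC values are the ones reported above in the order of the datasets listed in the methods, and the handling of values that fall exactly on a threshold is an assumption.

```python
def auc_from_scores(tumor, normal):
    """AUC as the fraction of (tumor, normal) pairs in which the tumor
    sample has the higher score; ties count as half a pair."""
    wins = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(normal))


def grade_auc(auc):
    """Grade an AUC with the 0.5-0.7 / 0.7-0.9 / 0.9-1.0 scheme.
    Assumption: a boundary value is assigned to the higher grade."""
    if auc >= 0.9:
        return "high"
    if auc >= 0.7:
        return "moderate"
    if auc >= 0.5:
        return "poor"
    return "below chance"


# AUC values reported in the text, keyed by dataset.
reported = {
    "GSE1542": 0.617, "GSE15471": 0.811, "GSE16515": 0.877,
    "GSE28735": 0.732, "GSE62452": 0.733, "GSE71729": 0.686,
}
grades = {name: grade_auc(value) for name, value in reported.items()}

# Invented expression values, only to exercise auc_from_scores.
toy_auc = auc_from_scores([6.1, 7.4, 5.0, 8.2], [4.9, 5.5, 6.0])
```

Applied to the reported values, this grading labels four datasets as moderate and two (GSE1542 and GSE71729) as poor.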
To further assess the prognostic value of PRPF40A, we collected follow-up information from the 44 patients with pancreatic cancer who were treated at our hospital and divided them into two groups (high expression and low expression). The Kaplan-Meier analysis showed that patients with high expression of PRPF40A exhibited poor overall survival (Figure 4C). In addition, a Cox proportional hazard regression model was used to screen prognostic factors for patients in the TCGA database and those who were treated at our hospital. As shown in Table 5, in the univariate analysis the N stage (P<0.001), R stage (P=0.02) and expression of PRPF40A (P=0.016) were significantly correlated with the survival of pancreatic cancer patients. In the multivariate analysis, the N stage (P=0.006) and expression of PRPF40A (P=0.041) were identified as independent prognostic factors for survival of pancreatic cancer patients. Similarly, as shown in Table 6, the univariate analysis showed that tumor differentiation (P=0.04), distant metastasis (P=0.003) and the expression of PRPF40A (P=0.014) were correlated with patient survival. The multivariate analysis showed that distant metastasis (P=0.014) and the expression of PRPF40A (P=0.038) could independently predict survival status in pancreatic cancer patients. All of the above results suggest that PRPF40A could serve as a potential prognostic biomarker for pancreatic cancer.

Figure 3. Diagnostic value assessment of PRPF40A by ROC method using GEO datasets. Notes: (A–F) The ROC curves generated using data from GSE1542, GSE15471, GSE16515, GSE28735, GSE62452 and GSE71729, respectively. An AUC value of 0.5–0.7, 0.7–0.9 or 0.9–1.0 indicated a poor, moderate or high diagnostic value, respectively. Abbreviations: ROC, Receiver operating characteristic; GEO, the Gene Expression Omnibus; AUC, Area under the ROC curve.

Figure 4.
Kaplan-Meier survival analysis of PRPF40A in pancreatic cancer. Notes: (A) Correlations of overall survival and (B) disease-free survival with PRPF40A in pancreatic cancer, based on data obtained from the GEPIA database. (C) Correlation between overall survival and PRPF40A based on data for 44 patients from our center. The patients were stratified based on high expression and low expression of PRPF40A. The ratio next to each curve shows the number of deaths in that group. P<0.05 was considered to denote statistical significance. *P<0.05, **P<0.01.

Table 5. Univariate and multivariate analysis of clinicopathological parameters of patients with pancreatic cancer in TCGA database by Cox regression

| Characteristics | Univariate P-value | Univariate HR | Univariate 95% CI | Multivariate P-value | Multivariate HR | Multivariate 95% CI |
| N stage | <0.001* | 1.6 | 1.21–2.116 | 0.006* | 1.499 | 1.121–2.006 |
| R stage | 0.02* | 1.39 | 1.054–1.832 | 0.071 | 1.337 | 0.999–1.789 |
| Expression of PRPF40A | 0.016* | 1.703 | 1.103–2.63 | 0.041* | 1.578 | 1.019–2.445 |

Notes: *P<0.05 was considered to denote statistical significance. Abbreviations: TCGA, The Cancer Genome Atlas; HR, hazard ratio; CI, confidence interval.

Table 6. Univariate and multivariate analysis of clinicopathological parameters of patients with pancreatic cancer by Cox regression

| Characteristics | Univariate P-value | Univariate HR | Univariate 95% CI | Multivariate P-value | Multivariate HR | Multivariate 95% CI |
| Differentiation | 0.04* | 0.43 | 0.192–0.963 | 0.323 | 0.605 | 0.277–1.526 |
| Distant metastasis | 0.003* | 12.804 | 2.447–67.003 | 0.014* | 8.324 | 1.549–44.738 |
| Expression of PRPF40A | 0.014* | 4.604 | 1.365–15.523 | 0.038* | 3.788 | 1.078–13.306 |

Notes: *P<0.05 was considered to denote statistical significance. Abbreviations: HR, hazard ratio; CI, confidence interval.

Downregulation of PRPF40A inhibits pancreatic cancer cell proliferation

To further investigate the role of PRPF40A in the progression of pancreatic cancer, cell lines in which PRPF40A was stably downregulated were established.
Colony formation assays and CCK-8 assays were performed. As shown in Figure 5A, after culturing for 14 days, Capan1 cells with downregulated PRPF40A expression formed significantly fewer colonies than control cells. Similarly, the CCK-8 assays indicated that the downregulation of PRPF40A inhibited pancreatic cancer cell proliferation (Figure 5B). Overall, these results suggest that PRPF40A could enhance the proliferative capacity of pancreatic cancer cells.

Figure 5. Cell proliferation assay in the presence of PRPF40A knockdown in a pancreatic cancer cell line. Notes: (A) The colony formation assay was performed using a pancreatic cancer cell line (Capan1) treated with empty vector and three siRNAs for 14 days. (B) The cell proliferation capability of Capan1 cells treated with empty vector and three siRNAs was measured using CCK-8 assays. *P<0.05.

Gene ontology and pathway enrichment analysis

To reveal the molecular mechanisms operating downstream of PRPF40A in pancreatic cancer, the top 100 genes co-expressed with PRPF40A were identified using GEPIA and then imported into the Database for Annotation, Visualization and Integrated Discovery (DAVID) to perform gene ontology and pathway analysis. The results are presented in Figure 6. In the biological process (BP) category, the co-expressed genes were mainly involved in regulation of serine/threonine kinase activity, mRNA splicing, gene expression, regulation of cell differentiation and cell adhesion. In the cellular component (CC) category, they were mainly located in the cytosol, membrane and nucleoplasm. In the molecular function (MF) category, the co-expressed genes were mainly associated with RNA and protein binding. Pathway analysis using the KEGG database indicated that the enriched genes were mainly involved in the AMPK signaling pathway, endocytosis and the mRNA surveillance pathway.

Figure 6.
Gene ontology and pathway enrichment analysis of the genes co-expressed with PRPF40A. Notes: The top 10 genes in the biological process (BP), cellular component (CC) and molecular function (MF) categories are listed. The results of the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis are also shown.

Co-expression network construction and analysis

To investigate the protein-protein interactions mediated by the co-expressed genes, a protein-protein interaction (PPI) network was constructed (Figure 7A). Similarly, a PRPF40A-associated PPI network was constructed to identify the potential molecular targets (Figure 7B). In addition, a network based on the gene ontology analysis was constructed using the BiNGO plugin (Figure 8).

Figure 7. Protein-protein interaction (PPI) network of PRPF40A and its co-expressed genes. Notes: (A) PPI network showing the top 100 co-expressed genes. (B) PRPF40A-associated PPI network showing potential molecular targets. The node size indicates the significance of the interaction.

Figure 8. Gene ontology analysis network of co-expressed genes. Notes: (A) Biological process and (B) molecular function. The color and size of each node indicate the significance of the interaction.

Discussion

In the present study, we used bioinformatic methods to extract PRPF40A expression data from public databases, including the Cancer Genome Atlas (TCGA), Oncomine and the Gene Expression Omnibus (GEO), and revealed that PRPF40A was significantly overexpressed in pancreatic cancer tissues. Furthermore, we collected 44 paired tissue samples from pancreatic cancer patients for immunohistochemical analysis, and used 12 of them for the qRT-PCR study. The results indicated that PRPF40A was upregulated in pancreatic cancer. It was also upregulated in the patu8988 and Capan1 cell lines.
All of the above results are consistent with previous evidence suggesting that PRPF40A is upregulated in pancreatic cancer and may be involved in carcinogenesis and/or the progression of pancreatic cancer.^20,35 Several molecules are significantly upregulated in pancreatic cancer and are well established as diagnostic markers, such as CA19-9, VEGFR2 and IGF-I.^36 Our study has identified PRPF40A as a potential diagnostic marker for pancreatic cancer using data from six GEO datasets. The AUC values were 0.617, 0.811, 0.877, 0.732, 0.733 and 0.686, respectively, indicating that PRPF40A is, to some extent, capable of distinguishing pancreatic cancer tissue from normal tissue, which may contribute to preoperative diagnosis and assessment. Combined with the serum level of CA19-9, the sensitivity and specificity of diagnosis by PRPF40A could be even higher, although this requires validation in a large patient cohort. The 5-year survival of patients with pancreatic cancer has remained almost unchanged for many decades.^37 Previous studies have identified several prognostic biomarkers in pancreatic cancer, such as TRPM8, CD74 and LKB1.^38–40 In our study, we investigated the association between clinicopathological features and PRPF40A expression in pancreatic cancer using samples from our patients and the TCGA database. We found that expression of PRPF40A was associated with tumor differentiation, and that the overexpression of PRPF40A was significantly correlated with poor overall survival and disease-free survival in pancreatic cancer patients. In addition, multivariate Cox regression analysis revealed that the expression of PRPF40A could serve as an independent prognostic factor in pancreatic cancer.
The results above suggest that PRPF40A is a promising prognostic biomarker that could help to individualize treatment for pancreatic cancer patients, as well as a potential therapeutic target to improve patients’ long-term survival. To sum up, our study revealed for the first time the diagnostic and prognostic value of PRPF40A in pancreatic cancer. However, a study in a larger sample of patients is required for further validation.

Malignant tumors are characterized by uncontrolled cell proliferation that is caused by the mis-regulation of multiple proteins. Previous research has demonstrated that PRPF40A is involved in cell migration in untreated acute wounds and chronic wounds treated via debridement.^41 In this study, we performed colony formation assays and CCK-8 assays using the Capan1 cell line. The results showed that inhibition of PRPF40A reduced the proliferation of pancreatic cancer cells, indicating that PRPF40A may be positively correlated with proliferative capacity and may act as an oncogene in pancreatic cancer. However, the underlying mechanisms remain to be elucidated.

To uncover the probable molecular mechanisms underlying the involvement of PRPF40A in the initiation and development of pancreatic cancer, we identified the top 100 genes co-expressed with PRPF40A in the Gene Expression Profiling Interactive Analysis (GEPIA) and imported them into the Database for Annotation, Visualization and Integrated Discovery (DAVID) to perform gene ontology and pathway analysis. In the BP category of gene ontology, the results were enriched in genes involved in regulation of serine/threonine kinase activity, mRNA splicing, gene expression, regulation of cell differentiation and cell adhesion. Each of these processes could contribute to the pathogenesis or progression of pancreatic cancer.
Pathway analysis indicated that the enriched genes were mainly involved in the AMPK signaling pathway, which is known as a vital mediator in maintaining cellular energy homeostasis and has recently been found to be associated with invasion and metastasis of pancreatic cancer.^42,43 As shown in Figure 7B, we used GeneMANIA to predict genes that physically interact with PRPF40A. Among the top 5 related genes, FMNL3 is well established as an oncogene in many cancers and regulates cytoskeletal remodeling, invasion and metastasis of cancer cells.^44 We are therefore interested in the relationship between PRPF40A and FMNL3. This axis may be subject to crosstalk with the AMPK signaling pathway during the regulation of pancreatic cancer cell proliferation or other biological processes, and we will focus on it to further reveal the underlying molecular mechanisms. Interestingly, according to a previously published study, PRPF40A is significantly associated with hypoxia markers (HIF-1α, CYGB and VEGFa) in non-small cell lung cancer,^19 while in our study we found that two of the top 100 genes co-expressed with PRPF40A, PGM2 (Spearman’s rho=0.632, P<10^−4) and RPE (Spearman’s rho=0.751, P<10^−4), were associated with the pentose phosphate pathway (not significant enough to be listed in Figure 6), which initiates glycolysis in response to hypoxic stress. This may account for the hypoxia-induced upregulation of PRPF40A in non-small cell lung cancer. However, the specific underlying mechanisms remain to be discovered.

There are some limitations in our study. First, we only collected 44 tumor and adjacent tissue samples from patients with pancreatic cancer for immunohistochemical analysis, and only 12 of these were used for qRT-PCR. Because of the small number of samples, we cannot exclude the possibility that sampling error contributed to the conclusion that PRPF40A is upregulated in pancreatic cancer.
Second, the 44 samples were all harvested from patients with pancreatic ductal adenocarcinoma, which accounts for almost 90 percent of pancreatic carcinomas; most previous research has focused on this type of pancreatic cancer.^20 Therefore, the expression of PRPF40A in other types of pancreatic cancer, such as intraductal papillary mucinous neoplasms or neuroendocrine neoplasms, requires further investigation. Third, according to the clinical correlation analysis, the expression of PRPF40A was correlated with the pathological grade (Table 3) only when the data from TCGA (Table 4) were excluded, so the two data sources yielded contradictory results. Hence, a larger sample would be required to reveal the correlation of PRPF40A with clinical attributes. Fourth, to measure the effects of PRPF40A on cell proliferation, we performed the colony formation assays and CCK-8 assays in only one pancreatic cancer cell line (Capan1), which is not sufficient to draw a solid conclusion. Thus, further functional testing is needed to validate the results.

Conclusion

In conclusion, using bioinformatic approaches followed by biological validation, this study showed that PRPF40A was significantly upregulated in pancreatic cancer, exhibited diagnostic value and was associated with poor prognosis. Downregulation of PRPF40A inhibited cell proliferation in pancreatic cancer. Overall, the results suggest that PRPF40A is an oncogene and can serve as a diagnostic and independent prognostic biomarker in pancreatic cancer. However, the underlying molecular mechanisms require further elucidation.

Acknowledgments