Abstract
Prostate cancer (PCa) is one of the most common malignancies in men. New biomarkers are
needed to facilitate its management. The role of the pinin protein
(encoded by the PNN gene) in PCa has not been thoroughly explored yet.
Using The Cancer Genome Atlas (TCGA-PCa) dataset validated with Gene
Expression Omnibus (GEO) and protein expression data retrieved from the
Human Protein Atlas, the prognostic and diagnostic values of PNN were
studied. Genes highly co-expressed with PNN (HCEG) were identified for
pathway enrichment analysis and drug prediction. A prognostic signature
based on methylation status using HCEG was constructed. Gene set
enrichment analysis (GSEA) and the TISIDB database were utilised to
analyse the associations between PNN and tumour-infiltrating immune
cells. The upregulated PNN expression in PCa at both transcription and
protein levels suggests its potential as an independent prognostic
factor of PCa. Analyses of the PNN co-expression network indicated
that PNN plays a role in RNA splicing and spliceosomes. The prognostic
methylation signature demonstrated good performance for
progression-free survival. Overall, our results show that the PNN
gene is involved in splicing-related pathways in PCa and identify it as
a potential biomarker for PCa.
Keywords: prognosis signature, PNN, immune infiltration, drug
prediction, methylation status, prostate cancer
Introduction
Prostate cancer (PCa) is the third most common cancer overall ([38]Pan
et al., 2017) and the most common malignant tumour in the male
genitourinary system ([39]Ren et al., 2017; [40]Caggiano et al., 2019;
[41]Jambor et al., 2019). Its prevalence and mortality vary greatly
depending on race and geographic location ([42]Lindberg et al., 2013).
At present, PCa is usually screened and diagnosed through digital
rectal examination (DRE), prostate-specific antigen (PSA) value,
Gleason score by prostate biopsy, and magnetic resonance imaging (MRI)
of the prostate ([43]Patil and Gaitonde, 2016). New biomarkers combined
with techniques such as liquid biopsy and imaging have also been used
for clinical diagnosis ([44]Kim et al., 2016; [45]Li et al., 2018;
[46]Law et al., 2020). Nevertheless, metastatic PCa remains incurable
despite promising advances in biomedical research. Therefore, a good
prognosis currently depends on early detection. Conventional
non-surgical options for PCa therapy include androgen deprivation
therapy (ADT), radiotherapy (RT), ablation therapy, chemotherapy, and
emerging immunotherapy. However, although the efficacy of drugs such as
abiraterone and enzalutamide has been established clinically, their
benefit is limited and temporary.
Identifying new biomarkers for diagnosis and treatment requires a deeper
understanding of the underlying mechanisms. In the past two decades, several mechanisms of PCa
have been continuously reported, including novel associations of
androgen signalling ([47]Caggiano et al., 2019; [48]Cioni et al.,
2020), TP53 signalling ([49]Ecke et al., 2010; [50]Liu et al., 2021),
and the Wnt signalling pathway ([51]Murillo-Garzón et al., 2018;
[52]Datta et al., 2020) with the disease. In fact, it is now believed
that various cytokines and intercellular signals regulate PCa during
its development ([53]Cucchiara et al., 2017). Thus, many potential
mechanisms of PCa remain to be explored, which may lead to new
diagnostic techniques or therapeutic strategies, especially for
metastatic PCa.
The pinin protein, reported as a desmosome-associated protein encoded
by the PNN gene, is a phosphoprotein rich in serine and arginine with a
molecular size of 140 kDa. Recently, it has been suggested that pinin
is associated with cell adhesion ([54]Tang et al., 2020; [55]Yao and
Ma, 2020). It serves as a putative tumour promoter by reversing the
expression of E-cadherin ([56]Simon et al., 2015). The upregulation of
pinin has been reported to enhance metastasis in colorectal cancer
([57]Wei et al., 2016), triple-negative breast cancer cells ([58]Kang
et al., 2020), pancreatic cancer ([59]Yao and Ma, 2020), and
nasopharyngeal carcinoma cells ([60]Tang et al., 2020). As an oncogenic
factor, PNN can protect hepatocellular carcinoma cells from apoptosis
([61]Yang et al., 2016) and promote cell adhesion in ovarian cancer
([62]Zhang et al., 2016), as well as renal cell carcinoma ([63]Jin et
al., 2021). These studies indicate the critical role of PNN in
metastasis; thus, it could be a potential biomarker for some tumours.
However, the role of pinin in PCa progression has not been thoroughly
studied yet. Since the tumour microenvironment (TME) has emerged as a
critical factor in metastasis ([64]Yin et al., 2019; [65]Yuan et al.,
2022), there may also be a functional linkage between TME and PNN in
PCa, but this hypothesis remains to be investigated.
Since the PNN gene has not been comprehensively deciphered in PCa, we
conducted a series of studies on its roles in patients’ survival and
prognosis, as well as in immune infiltration in PCa through various
bioinformatic approaches. We explored the expression pattern of the PNN
gene and its potential prognostic value for PCa. We also investigated
the relationship between PNN and the tumour immune microenvironment
(TIME), which could facilitate understanding the mechanism of
immunotherapy for PCa and lead to the discovery of a prognosis
signature or novel therapeutic targets.
Materials and methods
To illustrate the function of PNN in PCa, we conducted a comprehensive
bioinformatic analysis using multiple datasets. The whole analysis
pipeline performed here is displayed in [66]Figure 1.
FIGURE 1.
Analysis pipeline of PNN performed in this study.
Data source
The transcriptome data [the level 3 mRNA expression data (FPKM),
normalized using $\log_2(\mathrm{FPKM}+1)$] of normal tissues (52 cases)
and tumour tissues with complete
clinical information (379 cases) were extracted from The Cancer Genome
Atlas (TCGA) database of prostate adenocarcinoma (PRAD). The mRNA
expression profiles contained in the [69]GSE116918 ([70]Jain et al.,
2018), [71]GSE29079 ([72]Börno et al., 2012), and [73]GSE6956
([74]Wallace et al., 2008) datasets, which were normalized by their
corresponding providers, were downloaded from the Gene Expression Omnibus
(GEO) database. A total of 248 PCa samples with clinical
information were included in the [75]GSE116918 dataset. The
[76]GSE29079 dataset contained 48 normal samples and 47 PCa samples,
while the [77]GSE6956 dataset had 18 normal samples and 69 PCa samples.
However, neither [78]GSE29079 nor [79]GSE6956 contains clinical
information. The BioGRID database provided 253 unique interactors of
pinin supported by experimental evidence ([80]Oughtred et al., 2021).
TSVdb provided expression data for PNN splicing variants ([81]Sun et al., 2018).
For PNN expression in pan-cancer, we downloaded the standardised
pan-cancer dataset TCGA TARGET GTEx (PANCAN, N = 19131, G = 60499) from
the UCSC Xena ([82]https://xenabrowser.net/) database and extracted
the expression data of the PNN gene in each sample. We then
filtered out samples with zero expression, transformed each
expression value with log2(x + 0.001), and excluded cancer types
with fewer than three samples.
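The three pan-cancer preprocessing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the authors' own code is not shown in the paper, and the input layout (a list of `(cancer_type, expression)` pairs), the function name `preprocess_expression`, and the toy values are all hypothetical.

```python
import math
from collections import defaultdict


def preprocess_expression(samples, pseudocount=0.001, min_samples=3):
    """Apply the three filters described in the text.

    samples: iterable of (cancer_type, expression_value) pairs
             (hypothetical layout, not the Xena file format).
    Returns {cancer_type: [log2(x + pseudocount), ...]}.
    """
    by_type = defaultdict(list)
    for cancer_type, value in samples:
        if value == 0:  # step 1: drop samples with zero expression
            continue
        # step 2: log2(x + 0.001) transformation
        by_type[cancer_type].append(math.log2(value + pseudocount))
    # step 3: drop cancer types with fewer than three remaining samples
    return {t: v for t, v in by_type.items() if len(v) >= min_samples}


# Toy data (made up): PRAD keeps 3 non-zero samples, ACC has only 2.
toy = [("PRAD", 7.0), ("PRAD", 0.0), ("PRAD", 3.0), ("PRAD", 1.0),
       ("ACC", 2.0), ("ACC", 5.0)]
result = preprocess_expression(toy)
```

In this sketch the zero filter runs before the sample-count filter, following the order given in the text.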
Protein expression analysis with the Human Protein Atlas database
The Human Protein Atlas (HPA) provides the protein expression of pinin
in normal prostate (via
[83]https://www.proteinatlas.org/ENSG00000100941-PNN/tissue/prostate)
and tumour tissues (via
[84]https://www.proteinatlas.org/ENSG00000100941-PNN/pathology/prostate
+cancer) ([85]Uhlén et al., 2015). All tissue images in the HPA
database are stained by immunohistochemistry. We extracted the
immunohistochemistry images directly from the HPA database.
Independent prognostic analysis
Correlation analysis of PNN expression and clinicopathological
characteristics was performed. The expression of PNN between the
subgroups was compared based on the following clinicopathological
features: age (<60 or ≥60 years old), N stage (N0, N1), M stage (M0,
M1), T stage (T2, T3, T4), surgical margin (R0, R1, R2, RX),
prostate-specific antigen (PSA) level (<10 or ≥10 ng/mL), and Gleason
score (6, 7, 8, 9, 10). Univariate and multivariate Cox regression
analyses were implemented to identify independent predictors of
survival in the TCGA-PRAD and [86]GSE116918 datasets.
Expression profiles of PNN gene in primary and metastatic prostate cancer
We downloaded the [87]GSE38241 ([88]Aryee et al., 2013) and [89]GSE25136
([90]Sun and Goodison, 2009) datasets (normalised by the original
authors) from GEO. To merge these datasets, we used the ComBat
method ([91]Johnson et al., 2007), implemented in the R
package inSilicoMerging ([92]Taminau et al., 2012), to obtain the
expression matrix. Finally, PNN expression was compared using the
Kruskal-Wallis test.
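For readers unfamiliar with the Kruskal-Wallis test, the following Python sketch computes its H statistic with the standard tie correction. It is a teaching example rather than the authors' implementation (the paper does not say which routine was used for this comparison), and the function names and toy values are assumptions.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more groups."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    return h / (1 - ties / (n ** 3 - n))  # undefined if all values are equal


# Toy expression values (made up) for three hypothetical sample groups.
h_stat = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
```

The p-value is then obtained by comparing H with a chi-square distribution with (number of groups minus 1) degrees of freedom.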
Construction of the PNN co-expression network
We calculated the Pearson correlation of all genes (RNA-seq) in the
TCGA dataset with PNN using the LinkedOmics database
([93]http://www.linkedomics.org/) and selected the genes with
correlation coefficients > 0.8 and p < 0.05 as PNN co-expressed genes.
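As a rough illustration of this selection step, the sketch below computes Pearson correlation coefficients against a target gene and keeps genes above the 0.8 cut-off. It does not reproduce LinkedOmics; the gene labels, expression values, and the function names `pearson_r` and `coexpressed_genes` are hypothetical, and the p-value filter is omitted for brevity.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def coexpressed_genes(target, expression, r_cutoff=0.8):
    """Return {gene: r} for genes whose correlation with target exceeds r_cutoff.

    expression: {gene_name: [value per sample]} (hypothetical layout).
    """
    hits = {}
    for gene, values in expression.items():
        r = pearson_r(target, values)
        if r > r_cutoff:
            hits[gene] = r
    return hits


# Toy data (made up): five samples, three candidate genes.
pnn = [1, 2, 3, 4, 5]
expression = {
    "GENE_A": [2, 4, 6, 8, 10.5],  # almost perfectly correlated
    "GENE_B": [5, 3, 4, 1, 2],     # negatively correlated
    "GENE_C": [1, 3, 2, 4, 5],     # r = 0.9
}
hits = coexpressed_genes(pnn, expression)
```

In a real analysis the significance of each coefficient would also be tested (for example with a t statistic on n - 2 degrees of freedom) before applying the p < 0.05 filter.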
Functional and pathway enrichment analysis
The “clusterProfiler” R package was utilised to conduct Gene Ontology
(GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis
([94]Yu et al., 2012). GO enrichment analysis mainly described the
biological processes (BP), cellular components (CC), and molecular
functions (MF) correlated with genes. The threshold for significant
enrichment was set as a p-value < 0.05 or FDR < 0.05, as stated. Single
sample gene set enrichment analysis (ssGSEA) enrichment scores were
calculated in each sample using the “GSVA” package of R ([95]Hänzelmann
et al., 2013).
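Over-representation analysis of GO or KEGG terms is commonly based on a hypergeometric test followed by multiple-testing correction. The Python sketch below shows that calculation and a Benjamini-Hochberg adjustment on made-up numbers. It is an assumption-labelled illustration of the general technique, not a reimplementation of clusterProfiler, and the function names are hypothetical.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """Hypergeometric upper-tail probability P(X >= k).

    k: genes in the query list annotated to the term
    n: size of the query list
    K: genes in the background annotated to the term
    N: size of the background
    """
    upper = min(n, K)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example (made up): 2 of 3 query genes hit a term with 4 of 10 genes.
p = enrichment_p_value(k=2, n=3, K=4, N=10)
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Terms would then be reported as enriched when the adjusted value falls below the chosen threshold (0.05 in this study).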
Identification of potential drugs
In this research, potential drugs (or molecules) were predicted using the
Drug Signatures database (DSigDB) via Enrichr
([96]https://maayanlab.cloud/Enrichr/) based on the PNN gene as well as
the genes positively co-expressed with PNN (correlation coefficient >
0.8 and p < 0.05) ([97]Chen et al., 2013; [98]Kuleshov et al., 2016;
[99]Xie et al., 2021).
DNA methylation analysis and construction of the prognostic signature
The CpG sites in the promoter of PNN and PNN’s co-expressed genes were
obtained from the MEXPRESS database ([100]Koch et al., 2015; [101]Koch
et al., 2019). A univariate Cox analysis in R was used to determine the
association between methylation levels at each CpG site and
progression-free survival (PFS) for each patient, and p < 0.01 was
considered statistically significant. Candidate prognostic CpG sites
were selected using the Least Absolute Shrinkage and Selection Operator
(LASSO) algorithm. Based on the candidate CpG sites generated from the
above algorithm, a multivariate Cox regression model was used to
construct a prognostic signature. The RiskScore of each patient was
calculated using the following formula:
$$\mathrm{RiskScore} = \sum_{i=1}^{n} \beta_i \times \mathrm{Meth}_i$$
in which $\beta_i$ refers to the coefficient and $\mathrm{Meth}_i$ refers
to the level of methylation of the $i$-th CpG site.
Patients were divided into the high-risk ($\mathrm{RiskScore} \geq \mathrm{median}$)
and low-risk ($\mathrm{RiskScore} < \mathrm{median}$) groups in the TCGA dataset. Then, we
performed ROC analysis using the R software package pROC (version
1.17.0.1) to obtain the AUC. The R package “survival” was used to
perform Kaplan-Meier (KM) survival analysis of the two risk groups.
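The RiskScore formula and the median split can be illustrated with a short Python sketch. The coefficients, methylation levels, and patient identifiers below are invented for demonstration and are not the values fitted in this study; the function names are hypothetical.

```python
from statistics import median


def risk_score(betas, meth):
    """RiskScore = sum over CpG sites of beta_i * Meth_i."""
    return sum(b * m for b, m in zip(betas, meth))


def split_by_median(scores):
    """Assign 'high' if score >= median of all scores, otherwise 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s >= cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients for two CpG sites and made-up methylation levels.
betas = [1.5, -0.8]
methylation = {
    "P1": [0.2, 0.9],
    "P2": [0.8, 0.1],
    "P3": [0.5, 0.5],
    "P4": [0.6, 0.2],
}
scores = {pid: risk_score(betas, m) for pid, m in methylation.items()}
groups = split_by_median(scores)
```

With an even number of patients, `statistics.median` averages the two middle scores, so the cut-off need not coincide with any single patient's score.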
Gene set enrichment analysis
To inspect the different signalling pathways between the PNN low- and
high-expression groups in the TCGA-PRAD dataset, Gene Set Enrichment
Analysis (GSEA) was conducted by the “clusterProfiler” package in R
software ([102]Subramanian et al., 2005). Pathways with a p-value <
0.05 were considered significantly enriched.
TISIDB database
The Tumor and Immune System Interaction Database (TISIDB)
([103]http://cis.hku.hk/TISIDB) was utilised to analyse the
associations between PNN and tumour-infiltrating lymphocytes (TIL),
immunosuppressors, and chemokines ([104]Ru et al., 2019).
Statistical analysis
Statistical analysis was performed using the R software package
(version 3.6.1). The differential mRNA expression of PNN between tumour
tissues and normal controls was compared using Student’s t-test. The
expression of PNN among the clinicopathological parameters groups was
compared using Student’s t-test and ANOVA. The area under the curve
(AUC) of receiver operating characteristic (ROC) was utilised to
determine the diagnostic ability of PNN and was calculated using the
“pROC” R package ([105]Malone et al., 2015). KM curves of disease-free
or progression-free survival (DFS or PFS) were generated using the
“survival” R package, with the median expression of PNN as the cut-off.
The log-rank test was used to assess statistical differences, and a
p-value < 0.05 was deemed statistically significant.
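Two of the quantities named above, the ROC AUC and the Kaplan-Meier survival estimate, can be written compactly in Python. The sketch below is meant only to make the definitions concrete. It is not the pROC or survival implementation, and all function names and data values are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a positive outscores a negative.

    Ties count as one half; this equals the area under the empirical ROC curve.
    """
    wins = sum((p > q) + 0.5 * (p == q) for p in scores_pos for q in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    Returns [(t, S(t))] at each distinct time with at least one event.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Toy data (made up): expression in tumour vs normal, and five follow-ups.
auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.4])
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients leave the risk set without lowering the survival estimate, which is why the curve only steps down at times with observed events.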
Results
Prognostic and diagnostic value of PNN in prostate cancer
The expression levels of PNN in PCa and control samples were
compared in the TCGA-PRAD dataset and validated
with the [106]GSE29079 and [107]GSE6956 datasets. As shown in the violin
plots, the mRNA expression level of PNN was significantly higher in the
PCa group in all datasets ([108]Figures 2A–C). Next, we used the same
datasets to evaluate the diagnostic value of the PNN gene. The accuracy
of the diagnostic model was evaluated by ROC curve analysis
([109]Figure 2D). As a result, the AUC of the PNN diagnostic model was
greater than 0.7 in all three datasets, indicating that the PNN gene
can be used to discriminate cancer from normal tissues. Moreover, we
also observed that the abundance of pinin protein was higher in PCa
tissue than in normal tissue ([110]Figures 2E,F).
FIGURE 2.
PNN expression profile and its diagnostic value in Prostate Cancer
(PCa). (A–C) Comparison of PNN expression levels in the TCGA-PRAD,
[113]GSE29079, and [114]GSE6956 datasets. (D) The diagnostic value of
PNN as evaluated by ROC curve. (E,F) Immunohistochemistry results of
normal (two cases) and PCa tissue (four cases) from the HPA database.
To explore the relationship between PNN expression and the
clinicopathological characteristics in PCa, we compared PNN
expression levels according to the clinical information of the samples.
Higher PNN expression was found in advanced-stage PCa ([115]Figure 3B),
and the Gleason score was strongly positively correlated with PNN
expression in PCa patients in both the TCGA-PRAD
(p = $6.3 \times 10^{-9}$) and [116]GSE116918 (p = 0.001) datasets
([117]Figures 3E,I). PNN expression also differed among the surgical
margin groups (R0/R1/R2/RX) ([118]Figure 3D). In addition, PNN
expression was significantly higher in metastatic tumours than in
primary tumours ([119]Figure 3J, data processing in
[120]Supplementary Figure S1), suggesting that this gene may be useful for
diagnostics in metastatic patients. Age ([121]Figures 3A,F), T stage
([122]Figures 3C,H), and PSA level ([123]Figure 3G) were not
significantly associated with PNN expression.
FIGURE 3.
[124]FIGURE 3
Comparison of PNN expression and clinical information in TCGA (A) Age,
(B) N stage, (C) T stage, (D) Surgical margin, and (E) Gleason score.
Comparison of PNN expression and clinical information of [126]GSE116918
(F) Age, (G) PSA level, (H) T stage, and (I) Gleason score. The t-test
was used to evaluate the difference between two groups, and analysis of
variance (ANOVA) was used to compare data divided into more than two
groups. (J) Comparison of PNN gene expression between primary and
metastatic PCa using [127]GSE38241 and [128]GSE25136 datasets following
batch effects removal.
Univariate and multivariate Cox analyses were conducted to investigate
the independent prognostic factors in TCGA-PRAD and validated with
[129]GSE116918 datasets. The univariate analysis in the TCGA-PRAD
dataset indicated that the surgical margin, T stage, N stage, Gleason
score, and PNN expression were associated with the prognosis of PCa
patients ([130]Figure 4A). In contrast, multivariate Cox regression
analyses in the same dataset demonstrated that only the Gleason score
could be used independently to predict the prognosis of patients
([131]Figure 4B). Similarly, the PSA levels, Gleason score, T stage, and
PNN expression were found to be significant risk factors by univariate
Cox analysis in the [132]GSE116918 dataset ([133]Figure 4C). In the
same dataset, multivariate Cox regression analyses demonstrated that T
stage and PNN expression could be used independently to predict the
prognosis of patients ([134]Figure 4D). We then validated these
findings by analysing the DFS curves of the PNN high- and
low-expression groups, which showed that the PNN high-expression group
had remarkably worse survival rates than the low-expression group in
both the TCGA-PRAD and the [135]GSE116918 datasets ([136]Figures 4E,F).
The hazard ratio of PNN was greater than 1 in both datasets. Taken
together, these results suggest that PNN is a risk factor in the prognosis of
PCa. However, the independent prognostic value of PNN needs further
investigation and confirmation.
FIGURE 4.
[137]FIGURE 4
PNN prognostic value in the TCGA-PRAD and the [139]GSE116918 cohorts.
Forest plots of univariate and multivariate Cox regression analysis for
the TCGA cohort (A) univariate, (B) multivariate and the [140]GSE116918
cohort (C) univariate, (D) multivariate. (E,F) DFS curves plotted
according to the KM method for the TCGA-PRAD and [141]GSE116918 cohorts
using the log-rank test.
PNN co-expression network and potential drug targets in prostate cancer
To identify pharmaceutical molecules using the DSigDB database and to further
uncover the biological processes in which PNN participates, the co-expression
pattern of PNN in PCa was explored. All co-expressed genes are listed
in [142]Supplementary Table S2.
BioGRID hosted 243 proteins interacting with pinin, extracted from
published literature. A total of 368 genes were co-expressed with pinin
under the criteria of r > 0.6 and p < 0.05; of these, twenty-five
genes overlapped with the 243 pinin-interacting proteins (25UC for
short). These 25UC genes were enriched in RNA splicing and RNA/mRNA
processing based on GO enrichment analysis ([143]Figure 5A) and
in the spliceosome, mRNA surveillance pathway, and RNA transport based
on KEGG enrichment analysis ([144]Figure 5B). These results suggest
that PNN is mainly linked to RNA processing and RNA transport in PCa.
PNISR, RBM39, DDX39B, SF3B1, SRSF11, CPSF6, CLK2, and SNRPB2
function in RNA splicing or processing; ACIN1 and NKTR participate in
cell apoptosis and immune response. The protein-protein interaction
network is shown in [145]Figure 5C.
FIGURE 5.
Co-expression network of PNN. Enrichment results filtered with FDR <
0.05 based on the 25 genes that both interact with and are co-expressed
with PNN: (A) GO and (B) KEGG. (C) Protein-protein interaction (PPI) network
constructed using Cytoscape 3.8.2 based on PNN and 25UC.
To explore the potential therapeutic targets in PCa, we focused on
genes that were strongly positively correlated (r > 0.8 and p < 0.05)
with upregulated PNN, including FNBP4, TCERG1, RBM39, DDX39B and DMTF1.
Ten possible pharmaceutical molecules were identified from the DSigDB
database using Enrichr, based on their p-values. [147]Table 1
lists the suggested drugs from the DSigDB database for PCa.
TABLE 1.
List of the suggested drugs for PCa patients with PNN expression.
Drug | p-value | Drug indication | Drug stage (approved or not) | Targeted gene | References