Abstract The purpose of this study was to assess the relationship between vitamin D receptor gene (VDR) expression and prognostic factors in papillary thyroid cancer (PTC). mRNA sequencing and somatic mutation data from The Cancer Genome Atlas (TCGA) were analyzed. VDR mRNA expression was compared to clinicopathologic variables by linear regression. Tree-based classification was applied to find cutoff and patients were split into low and high VDR group. Logistic regression, Kaplan–Meier analysis, differentially expressed gene (DEG) test and pathway analysis were performed to assess the differences between two VDR groups. VDR mRNA expression was elevated in PTC than that in normal thyroid tissue. VDR expressions were high in classic and tall-cell variant PTC and lateral neck node metastasis was present. High VDR group was also associated with classic and tall cell subtype, AJCC stage IV and lower recurrence-free survival. DEG test reveals that 545 genes were upregulated in high VDR group. Thyroid cancer-related pathways were enriched in high VDR group in pathway analyses. VDR mRNA overexpression was correlated with worse prognostic factors such as subtypes of papillary thyroid carcinoma that are known to be worse prognosis, lateral neck node metastasis, advanced stage and recurrence-free survival. Keywords: VDR mRNA, papillary thyroid carcinoma, TCGA data, vitamin D Introduction Thyroid carcinoma is the most common endocrine malignancy worldwide, the incidence of which is increasing. The most common subtype of thyroid carcinoma is papillary carcinoma (PTC), accounting for 80–90% of all cases ([40]1). Although this type of cancer has an excellent prognosis, the prognosis significantly worsens when the tumor grows and metastasizes ([41]2). For this reason, it is important to understand the characteristics of the tumor at the early stage. Several epidemiological reports show that higher levels of vitamin D3 are associated with a lower risk of developing cancer ([42]3). The active form of vitamin D3, 1,25-dihydroxyvitamin D3 (1,25D) exerts antitumor activity by binding to the vitamin D receptor (VDR). The antitumor activities of 1,25D include the inhibition of cancer cell proliferation and angiogenesis, promotion of cell differentiation and apoptosis ([43]4, [44]5, [45]6, [46]7, [47]8). The VDR is a receptor expressed by epithelial cells in both normal and malignant thyroid glands. Human VDR gene is located on chromosome 12q13.1 ([48]9). In cancer cells, VDR expression is a response to 1,25-dihydroxyvitamin D3 (1,25D) by decreasing proliferative activity in vitro. Izkhakov and coworkers reported that expression of VDR mRNA in malignant thyroid tissues is higher than that in normal thyroid ([49]2). They also reported correlations between VDR and other genes such as ECM1 (extracellular matrix protein-1) and TMPRSS4 (type II transmembrane serine protease-4), which are associated with tumor progression. Positive correlation was also observed between VDR and ECM1, as well as between VDR and TMPRSS4 ([50]2). Our study was designed to evaluate the correlation between the VDR mRNA expression and prognostic factors of PTC using ‘The Cancer Genome Atlas’ (TCGA) data. TCGA kindly provides multiplatform genomics data such as sequence and read count data from next-generation sequencing, copy-number analysis, methylomics and proteomics data ([51]10). Genomic data were also combined with patient-matched clinical data to correlate the molecular findings with clinical characteristics. Materials and methods Data preparation We downloaded TCGA thyroid cancer data, including clinical information, somatic mutations and gene expression data derived from RNA sequencing. Pathologic data were re-evaluated using scanned images of the paper-written pathologic documents provided by TCGA-associated hospitals. PTC subtype classification and MACIS scores were referenced from the 2014 TCGA thyroid cancer paper’s Supplemental Tables ([52]10). The total number of TCGA samples comprised 59 normal tissues and 501 cancer tissues. Total 499 patient samples were assessed after joining the clinical data without missing attributes. Somatic mutations were provided by two different mutation calling files from the Illumina DNA-sequencing machine. Sequencing experiments were performed by the Baylor College of Medicine, the Broad Institute at MIT and Harvard Genome Sequencing Center. Mutation status of BRAF, RAS (NRAS, HRAS and KRAS) and VDR genes was identified from somatic mutation calling files: identical results in two different calling files were considered as a meaningful mutation. TCGA gene expression data from RNA sequencing were provided with level 3 RSEM (RNA-Seq by expectation–maximization) counts after upper quartile normalization to maintain the standardization of different platforms or housekeeping genes ([53]https://wiki.nci.nih.gov/). RNA sequencing was performed by the University of North Carolina using an Illumina HiSeq RNA Sequencing machine. We analyzed VDR gene expression according to tissue type (normal vs cancer) and clinical information. Next, all 20,531 genes were used to assess gene ontology and pathway analysis. Statistical analysis Paired t-test was used to assess the differences in gene expression between 59 paired donor-matched normal and cancer samples, whereas unpaired t-test was used to assess the global differences between the 59 normal tissues and 499 cancer tissues. Association between clinical variables and VDR gene expression was measured by univariable and multivariable linear regression analysis. To predict cancer recurrence based on VDR expression, continuous VDR expressions was converted into two binary groups (low VDR and high VDR group) using tree-based classification analysis with a maximized area under the ROC (receiver-operating characteristic) curve. Binary VDR groups were used for univariable and multivariable logistic regression analyses to assess the relationship between VDR expression and clinicopathologic variables. Backward selection method was used in both linear and logistic regression for multiple model fitting. Kaplan–Meier estimator with log-rank test was used for survival analysis. Differentially expressed genes (DEG) and gene ontology (GO) tests between two VDR groups were performed using ‘EdgeR’ package, which is one of the bioinformatics tools in Bioconductor ([54]https://www.bioconductor.org/) ([55]11). Pathway enrichment analysis was performed using The Database for Annotation, Visualization and Integrated Discovery (DAVID) (12, 13) and the Gene Set Enrichment Analysis (GSEA) program ([56]14). False discovery rate (FDR) correction was used to adjust false-positive rate from multiple testing. All statistical analyses were performed using R 3.2.4. (R Core Team (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL [57]https://www.R-project.org/). Results No VDR gene single-nucleotide variants were observed in either of the two different mutation calling files. [58]Figure 1 shows the difference in VDR mRNA expression between normal and PTC tissues. Both paired and unpaired tests revealed that VDR expression was higher in cancer tissue than that in normal tissue (VDR expression in 55 normal tissues = 238.971 ± 14.987; in 55 paired cancer tissues = 307.567 ± 26.808; in 501 cancer tissues = 395.276 ± 10.323, mean ± standard error of mean (s.e.m.)). Figure 1. [59]Figure 1 [60]Open in a new tab VDR mRNA expression counts in normal and papillary thyroid carcinoma tissue. (A) Fifty-nine paired normal and cancer tissues. (B) 59 normal vs 501 cancer tissues. Correlations of VDR mRNA expression according to clinicopathologic variables are shown in [61]Table 1. VDR gene expression was higher with statistical significance in classic and tall-cell variant PTC (TCVPTC) than that in follicular variant PTC (FVPTC). VDR expression was also significantly upregulated in cases with T4 (tumor stage), N1b (lateral neck node metastasis), high MACIS group and AJCC stage IV. Gender, PTC subtype and N stage were only included in multiple linear regression model. The results also showed an association between VDR expression with BRAF^V600E mutation and recurrence, although the results did not reach statistical significance. Table 1. Association between VDR mRNA expression and clinicopathologic characteristics of 499 PTC patients by linear regression analysis. Variable Number Mean ± s.e.m. P value (univariate) P value (multivariate) Age <45 228 390.989 ± 14.257 0.669 ≥45 271 399.888 ± 14.838 Gender M 134 372.818 ± 17.539 0.179 0.058 F 365 404.267 ± 12.590 Subtype Classic PTC 355 408.001 ± 12.670 Reference Reference FVPTC^† 101 318.546 ± 19.165 <0.001 0.009 TCVPTC^‡ 35 494.194 ± 32.023 0.033 0.059 Others 8 400.611 ± 98.383 0.928 0.955 Size ≤2 cm 200 399.179 ± 15.865 0.791 >2 cm 299 393.576 ± 13.657 Lymphovascular invasion No 392 393.643 ± 11.827 0.688 Yes 107 403.803 ± 21.394 T stage T1 144 377.999 ± 17.633 Reference T2 154 382.805 ± 20.737 0.858 T3 184 411.481 ± 16.151 0.193 T4 17 495.216 ± 61.736 0.048 N stage N0 278 363.768 ± 12.188 Reference Reference N1a 120 410.893 ± 23.858 0.059 0.381 N1b 101 466.142 ± 24.895 <0.001 0.002 M stage M0 490 396.254 ± 10.494 0.759 M1 9 372.300 ± 58.740 MACIS Low 345 386.576 ± 12.243 Reference Intermediate 78 383.270 ± 24.077 0.909 High 76 450.675 ± 29.909 0.029 Stage I 289 379.548 ± 12.535 Reference II 44 367.811 ± 35.599 0.753 III 98 410.995 ± 26.292 0.243 IV 68 461.242 ± 30.243 0.009 BRAF mutation No 258 378.672 ± 15.039 Reference V600E 236 415.103 ± 14.299 0.081 Others 5 370.710 ± 82.387 0.939 RAS mutation No 446 399.870 ± 11.187 0.257 Yes 53 361.755 ± 25.071 Recurrence No 459 390.370 ± 10.905 0.075 0.155 Yes 40 458.378 ± 30.682 Survival Alive 485 394.442 ± 10.422 0.433 Dead 14 443.627 ± 78.110 [62]Open in a new tab ^† Follicular variant PTC; ^‡tall cell variant PTC. To predict recurrence using VDR gene expression, we divided continuous VDR expression values into binary variables using the tree-based classification and ROC curve. Optimal VDR cutoff value obtained was 466.5, with a maximal AUC of 0.624. Based on the cutoff, patients were divided into a low VDR group and a high VDR group. Low VDR group comprised 342 patients and high VDR group comprised 157 patients. [63]Table 2 shows the logistic regression results according to binary VDR group. Both the univariable and multivariable regression models revealed that high VDR group was associated with classic PTC and TCVPTC, advanced AJCC stage (stage IV) and recurrence. Table 2. Logistic regression analysis using binary VDR mRNA expression (cutoff value = 466.5; AUC = 0.624; low VDR group, n = 342; high VDR group, n = 157). Variables Univariate Multivariate Odds ratio P value Odds ratio P value Age (<45) ≥45 1.108 0.597 Gender (M) Female 1.057 0.801 Subtype (classic) FVPTC 0.515 0.016 0.576 0.051 TCVPTC 2.783 0.004 2.540 0.013 Others 0.696 0.660 0.764 0.747 Size (≤2 cm) >2 cm 0.923 0.683 Lymphovascular invasion (No) Yes 1.263 0.309 T stage (T1) T2 0.823 0.458 T3 1.507 0.086 T4 2.826 0.046 N stage (N0) N1a 1.431 0.128 N1b 1.962 0.006 M stage (M0) M1 1.091 0.903 MACIS (low) Intermediate 1.156 0.592 High 1.782 0.027 Stage (I) II 0.649 0.275 0.807 0.596 III 1.341 0.239 1.080 0.772 IV 2.244 0.003 1.925 0.020 BRAF mutation (No) V600E 1.349 0.123 Others 1.689 0.570 RAS mutation (No) Yes 0.538 0.080 Recurrence (No) Yes 2.625 0.004 2.329 0.014 Survival (alive) Dead 1.217 0.729 [64]Open in a new tab Total 40 patients showed recurrence after surgery. 21 patients had locoregional recurrence including central and lateral neck nodes, whereas 15 patients had distant recurrence such as bone, lung, etc. There is no information for four patients. In high VDR group, 11 patients had locoregional and six patients had distant recurrence (P value from Fisher’s exact test = 0.002). [65]Figure 2 illustrates recurrence-free survival difference for two VDR groups using the Kaplan–Meier estimator. The high VDR group showed worse recurrence-free survival than the low VDR group (P value = 0.018, follow-up duration: median 647 days, range 1–5150 days). There’s no statistical significance in overall survival between the two VDR groups. Figure 2. [66]Figure 2 [67]Open in a new tab Recurrence-free survival in the low and high VDR expression groups. We performed DEG tests for the low and high VDR groups using EdgeR package. The differential gene expression pattern is shown in [68]Fig. 3. DEG boundaries were defined as follows: average log CPM (counts per million) >0 and | logFC (fold change) | >1, and FDR q-value <0.01 (indicated as vertical and horizontal lines in [69]Fig. 3). A total of 545 genes were highly expressed in high VDR group, whereas 91 genes were highly expressed in low VDR group. [70]Table 3 lists the top 20 highly expressed DEGs in high VDR group. Figure 3. Figure 3 [71]Open in a new tab MA plot of genes differentially expressed in the low and high VDR expression groups. DEG cutoff consisted of average logCPM (count per million) >0, | logFC (fold change) | >1 and FDR q-value <0.01. Table 3. Top 20 upregulated genes in the high VDR group. Gene ID Gene symbol logFC logCPM P value FDR 7421 VDR 1.282 4.312 2.61E-149 4.39E-145 165904 XIRP1 4.100 0.524 3.83E-93 3.22E-89 256076 COL6A5 6.088 0.755 4.51E-92 2.53E-88 57016 AKR1B10 7.436 2.483 1.73E-86 7.27E-83 1646 AKR1C2 4.052 3.393 1.47E-78 4.95E-75 2877 GPX2 5.053 0.296 4.00E-77 1.12E-73 146802 SLC47A2 3.199 1.098 1.38E-59 3.32E-56 8038 ADAM12 2.429 3.250 3.35E-56 7.05E-53 8111 GPR68 1.868 2.199 4.08E-53 6.24E-50 729238 SFTPA2 4.175 4.587 9.36E-53 1.31E-49 10468 FST 2.823 0.579 1.06E-51 1.37E-48 1281 COL3A1 2.048 9.347 4.38E-50 4.91E-47 6947 TCN1 4.253 1.294 9.78E-50 1.03E-46 1278 COL1A2 1.894 9.490 2.00E-48 1.87E-45 1293 COL6A3 1.761 7.459 3.02E-48 2.67E-45 3909 LAMA3 2.289 2.671 1.02E-47 8.32E-45 92747 BPIFB1 4.529 0.566 1.04E-47 8.32E-45 10699 CORIN 1.836 0.284 1.21E-47 9.22E-45 1462 VCAN 2.082 5.740 6.74E-47 4.93E-44 10631 POSTN 2.160 6.867 1.07E-46 7.52E-44 [72]Open in a new tab [73]Supplementary Table 1 (see section on [74]supplementary data given at the end of this article) shows the pathways that are significantly enriched by 545 upregulated DEGs in high VDR group, using DAVID. Cellular interaction, autoimmune thyroid disease, immune receptor-related signal pathways, cancer-related chemokine signaling and the JAK-STAT pathway were significantly enriched in high VDR group. The GSEA pathway analysis showed similar results: chemokine signaling, immune cell-related pathways, cell adhesion, cellular interactions and the JAK-STAT pathway, all of which are associated with carcinogenesis, were significantly enriched in the high VDR group compared to low VDR group ([75]Supplementary Table 2). The ‘chemokine signaling pathway’ that was enriched in both DAVID and GSEA analysis harbors thyroid cancer related ‘MAPK signaling pathway’ composed with RAS-RAF-MEK-ERK1/2 cascade. GO analysis conducted with EdgeR also showed increased immune-related biological processes, enrichment of cell membrane components and upregulation of receptor signaling activity in the high VDR group. Top 10 enriched gene ontologies in high VDR group are listed in [76]Supplementary Table 3. Discussion Although most thyroid carcinomas have a good prognosis, some types of cancer have intractable features and a significantly unfavorable prognosis. Current prognosis and treatment guidelines are based on clinicopathologic data. Recently, additional information using immunohistochemistry and molecular methods was introduced to improve the sensitivity and specificity of the diagnosis and to predict disease prognosis. Molecular markers such as BRAF, RAS, RET/PTC rearrangement, RAX8/PPARγ and galectin-3 may be useful for these purposes according to the American Thyroid Association guidelines ([77]15). The hallmark of thyroid cancer, besides aggressive behavior, is a loss of uptake and trapping of radioactive iodine, which means resistance to the best systemic therapy for thyroid cancer. The standard chemotherapy regimens approved for thyroid cancer have poor efficacy and relatively high toxicity compared to their benefit. Therefore, the discovery of alternative therapeutic targets for advanced differentiated thyroid cancer is important ([78]16). Vitamin D receptor is a member of the nuclear hormone receptor superfamily. It binds to DNA at vitamin D3 response elements upon ligand binding to alter the transcription of vitamin D3-responsive genes ([79]17). Vitamin D3 appears to exert anticancer activity by decreasing proliferation, promoting re-differentiation, inhibiting angiogenesis and accentuating the effects of standard chemotherapy ([80]18, [81]19, [82]20, [83]21). Clinical interest in the association of vitamin D levels with cancer, and in the potential of vitamin D receptor as a therapeutic target, has dramatically increased in recent years. A correlation between vitamin D and cancer prevention has been shown in breast, prostate and colorectal cancer ([84]22). It rises to a consensus conference by the World Health Organization evaluating the evidence associated with vitamin D and cancer ([85]23). Observational, preclinical and clinical studies strongly suggest that vitamin D3 deficiency increases the risk of developing multiple malignancies, whereas other studies do not support this hypothesis ([86]24). However, several studies suggest that adequate vitamin D3 levels may provide protection against chronic diseases such as cancer and may improve cancer prognosis ([87]25, [88]26, [89]27). Expression of the VDR gene and 25-hydroxyvitamin D3 1-α-hydroxylase, the rate-limiting enzyme in the production of active vitamin D3, has been found in several tissues and tumor types ([90]28). Active vitamin D3 is important for regulating differentiation and cell growth in many different organs, and expression of 1a-hydroxylase/VDR is regarded as an important response against tumor progression. A prior report by Izkhakov and coworkers showed that VDR mRNA expression was higher in malignant thyroid tissues than that in normal thyroid tissues, which indicates a possible correlation between VDR gene expression and thyroid cancer prognosis ([91]2); however, there were no clinicopathological differences between patients with low levels of gene expression and those with high levels of gene expression. Clinckspoor and coworkers reported that lower vitamin D3 metabolism was associated with the progression of thyroid cancer of follicular origin ([92]29). Here, we found that overexpression of VDR mRNA was correlated with aggressive PTC subtypes – classic PTC and TCVPTC, lateral neck node metastasis, advanced tumor (T4) and AJCC stage (IV) and worse recurrence-free survival. Our results also showed an association between high VDR mRNA and BRAF^V600E mutant PTC, although did not gain statistical significance. Chemokine signaling pathway enrichment that harbors MAPK signaling cascade derived from two different pathway analyses also suggest that high VDR expression may be associated with thyroid cancer progression by enhancing MAPK pathway. Overall, results show that overexpression of VDR mRNA is associated with poor prognostic factors in PTC, aggressive subtype, advanced T and AJCC stage, lateral neck node metastasis and worse recurrence-free survival. One possible explanation for these results is impaired expression of vitamin D receptor on the cell membrane: VDR mRNA may be overexpressed in response to decreased vitamin D receptor protein expression in damaged cellular membranes. Another explanation may be genomic discordance between the expression levels of mRNA and the expression of protein. Guo and coworkers reported on the usefulness of mRNA to predict protein expression, but they described that the prediction of protein expression using mRNA was far from perfect ([93]30). Because of limitations in our study design, the correlation between VDR mRNA expression and vitamin D receptor protein expression could not be clearly assessed in this report. Further study in thyroid cancer is required. In conclusion, we used public multigenomics data to demonstrate that elevated VDR mRNA levels are associated with aggressive PTC subtypes, worse prognostic factors and recurrence-free survival. Further experimental validation should be performed to prove the biologic influence of VDR and circulating vitamin D. Supplementary Material Table S1. Pathways significantly enriched by 545 up-regulated DEGs in the high VDR group. Analysis was performed by DAVID. [94]Click here for additional data file.^ (47.8KB, pdf) Table S2. Top 20 significantly enriched pathways using all 20,531 genes. Analysis was performed using the Gene Set Enrichment Analysis. [95]Click here for additional data file.^ (42.5KB, pdf) Table S3. Top 10 enriched gene ontology terms in the high VDR group. [96]Click here for additional data file.^ (49.5KB, pdf) Declaration of interest The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported. Funding This study was supported by the Research Grant Number CB-2011-03-01 of the Korean Foundation for Cancer Research. References