Abstract Background: TSPX is an X-linked tumor suppressor that was initially identified in non-small cell lung cancer (NSCLC) cell lines. However, its expression patterns and downstream mechanisms in NSCLC remain unclear. This study aims to investigate the functions of TSPX in NSCLC by identifying its potential downstream targets and their correlation with clinical outcomes. Methods: RNA-seq transcriptome and pathway enrichment analyses were conducted on the TSPX-overexpressing NSCLC cell lines, A549 and SK-MES-1, originating from lung adenocarcinoma and squamous cell carcinoma subtypes, respectively. In addition, comparative analyses were performed using the data from clinical NSCLC specimens (515 lung adenocarcinomas and 502 lung squamous cell carcinomas) in the Cancer Genome Atlas (TCGA) database. Results: TCGA data analysis revealed significant downregulation of TSPX in NSCLC tumors compared to adjacent non-cancerous tissues (Wilcoxon matched pairs signed rank test p < 0.0001). Notably, the TSPX expression levels were inversely correlated with the cancer stage, and higher TSPX levels were associated with better clinical outcomes and improved survival in lung adenocarcinoma, a subtype of NSCLC (median survival extended by 510 days; log-rank test, p = 0.0025). RNA-seq analysis of the TSPX-overexpressing NSCLC cell lines revealed that TSPX regulates various genes involved in the cancer-related signaling pathways and cell viability, consistent with the suppression of cell proliferation in cell culture assays. Notably, various potential downstream targets of TSPX that correlated with patient survival (log-rank test, p = 0.016 to 4.3 × 10^−10) were identified, including EGFR pathway-related genes AREG, EREG, FOSL1, and MYC, which were downregulated. 
Conclusions: Our results suggest that TSPX plays a critical role in suppressing NSCLC progression by downregulating pro-oncogenic genes, particularly those in the EGFR signaling pathway, and upregulating tumor suppressors, especially in lung adenocarcinoma. These findings suggest that TSPX is a potential biomarker and therapeutic target for NSCLC management.

Keywords: TSPX, DENTT, lung adenocarcinoma, non-small cell lung cancer, RNA-seq, X-linked tumor suppressor, EGFR signaling pathway

1. Introduction

Lung cancer is the leading cause of cancer death worldwide, accounting for approximately 1.76 million deaths each year (18.4% of all cancer deaths) [1]. Most lung cancer cases (~85%) are classified as non-small cell lung cancer (NSCLC), which consists of three major subtypes, i.e., adenocarcinoma, squamous cell carcinoma, and large cell carcinoma [2,3]. NSCLC is considered a biologically aggressive cancer type with rapid growth and progression and a low overall 5-year survival rate of approximately 20% [4]. The key issues in NSCLC are its tumor heterogeneity and highly variable responses to clinical treatments [2,5,6]. Thus, effective identification of different histological subtypes and prognostic markers such as driver mutations would greatly improve the outcomes and survival of patients [3,6,7]. For instance, activating mutations of the epidermal growth factor receptor (EGFR) gene (EGFR-mut) have been observed in 15% (Europe) to 62% (Asia) of lung adenocarcinoma cases [2,7]. Excessive activation of the EGFR signaling cascade is key for the development of NSCLC [2,6]. Patients harboring EGFR-mut are responsive to treatment with EGFR tyrosine kinase inhibitors (EGFR-TKIs); hence, EGFR-TKIs are recommended as a first-line treatment for EGFR-mut patients [2,3,7].
Identifying prognostic biomarkers could not only help in further classification but also provide an opportunity to elucidate their biological roles in NSCLC, thereby improving personalized precise therapies for patients [3,6,8,9].

TSPX (also known as DENTT, CDA1, and TSPYL2) is an X-linked tumor suppressor gene originally identified as a TGF-β responsive member of the TSPY/TSPY-like/SET/NAP-1 superfamily in a human NSCLC cell line [10]. While it is ubiquitously expressed and likely serves as a housekeeping gene [11,12], it is frequently downregulated in various types of cancer, including liver cancer, glioma, prostate cancer, thyroid cancer, and lung cancer [13,14,15,16,17]. Downregulation of TSPX has been linked to aberrant DNA hypermethylation, as treatment with the demethylating agent 5-aza-2′-deoxycytidine restores its expression in NSCLC and glioma cell lines [15,18]. In addition, TSPX mutations in endometrial tumors and uterine leiomyomas have been reported [19,20]. TSPX protein harbors a SET/NAP domain critical for chromatin/histone modification, gene regulation, and cell-cycle regulation [21,22,23,24,25]. Previous studies, including ours, have demonstrated that TSPX inhibits cyclin-B/CDK1 kinase activity, activates the p53 pathway, suppresses oncogenes such as MYC and MYB, inhibits sirtuin 1 (SIRT1), and represses androgen receptor transactivation [16,17,26,27,28,29,30]. It is also an essential component of the REST/HDAC repressor complex, suggesting a multi-functional role in gene regulation [26]. However, the downstream targets of TSPX in lung cancer are still largely unknown. Furthermore, the correlation between TSPX expression and clinical features of lung cancer, such as cancer grade, prognosis, and survival, remains to be elucidated.
In the present study, we examined the expression levels of TSPX in the transcriptomes of lung adenocarcinoma (n = 515) and lung squamous cell carcinoma (n = 502) specimens in the Cancer Genome Atlas (TCGA) [31]. Our results demonstrated that TSPX expression is downregulated in cancer specimens compared to adjacent non-tumor lung specimens, as determined by the Wilcoxon matched pairs signed rank test. Notably, higher TSPX expression levels were significantly correlated with lower tumor grades and better survival rates in lung adenocarcinoma patients, as assessed by the log-rank test. To evaluate its potential effects on cellular properties, TSPX was overexpressed in the lung adenocarcinoma cell line A549 and the lung squamous cell carcinoma cell line SK-MES-1, followed by analysis of cell proliferation and cell survival. The results showed that TSPX overexpression inhibited cell proliferation in both A549 and SK-MES-1 cell lines. Consistently, transcriptome and pathway enrichment analyses of the TSPX-overexpressing A549 and SK-MES-1 cells, performed using RNA-seq and DAVID bioinformatics resources [32], indicated that TSPX regulates genes involved in oncogenic and other key signaling pathways, including the NF-κB, Wnt, and MAPK signaling pathways. Further comparative analyses of the transcriptomes of A549 and SK-MES-1 cells overexpressing TSPX and of clinical lung adenocarcinoma specimens with high or low TSPX expression from TCGA datasets identified potential downstream target genes of TSPX associated with patient survival. These targets included downregulated genes such as the EGFR ligands AREG and EREG, the oncogenic transcription factors MYC and FOSL1, the apoptosis inhibitor BIRC3, and the receptor ligands DKK and PLAU. Of note, AREG, EREG, FOSL1, MYC, and PLAU are involved in the EGFR signaling pathway, which is a key pathway in lung cancer development [2,6]. Additionally, the tumor suppressor CACNA2D2 was upregulated.
Since high TSPX expression levels are associated with lower tumor grades and better survival in lung adenocarcinoma patients, our findings suggest that TSPX plays a critical role as an X-located tumor suppressor that represses the development and progression of lung adenocarcinoma. Its reduced expression could serve as an indicator/prognostic marker of the aggressiveness of lung adenocarcinoma. This is the first study to establish a correlation between TSPX expression levels and clinical outcomes and to provide a comprehensive analysis of TSPX downstream genes and pathways in NSCLC. These findings have the potential to improve personalized and precise therapeutic strategies for NSCLC patients.

2. Materials and Methods

2.1. Cell Culture and Lentiviral Transduction

The human lung adenocarcinoma A549 cell line and lung squamous cell carcinoma SK-MES-1 cell line were obtained from ATCC (Manassas, VA, USA) through the Cell and Genome Engineering Core (CGEC) at the University of California San Francisco (UCSF) and verified by short tandem repeat (STR) analysis [33]. The cells were thawed and cultured in RPMI 1640 medium or Dulbecco’s modified Eagle’s medium (DMEM) containing 10% Tet-system approved fetal bovine serum (FBS, Clontech/Takara Bio USA, Mountain View, CA, USA) and an antibiotics cocktail (100 U/mL penicillin and 100 µg/mL streptomycin). They were used immediately in the experiments described in this study. FUW-tetO-TSPX, a lentiviral vector capable of expressing both full-length TSPX and EGFP under the control of doxycycline (Dox), was prepared as described previously [16]. FUW-tetO-EGFP, a lentiviral vector for expression of EGFP alone, was used as a negative control. Replication-incompetent lentiviruses were generated following previously reported methods [16,29]. Cells were transduced with lentiviral particles containing the expression vectors, FUW-tetO-TSPX or FUW-tetO-EGFP, together with FUW-M2rtTA.
The transduced cells were cultured in the absence of Dox until analysis. To induce the expression of TSPX and/or EGFP in the transduced cells, cells were cultured in the presence of 0.5 µg/mL Dox (Sigma-Aldrich, St. Louis, MO, USA). The Institutional Bio Safety Subcommittee has reviewed and approved all recombinant DNA and lentiviral transduction experiments.

The migratory properties and morphological changes were measured by a scratch wound assay [34]. Briefly, cells were plated at 5 × 10^5 cells/60 mm dish and cultured for 24 h in the absence of Dox. Before treatment with Dox, dishes were scratched at 3 sites with a sterile 1000 μL pipette tip. Cells were then cultured in fresh medium with or without 0.5 µg/mL Dox. Images of the scratched areas were recorded at 0 and 48 h after scratching, and changes at a representative site are shown in the figure.

2.2. Cell Proliferation Assay

For cell proliferation analysis, cells were seeded at 1000 cells/well in 96-well plates and cultured in the presence or absence of 0.5 µg/mL Dox. Cell viability was analyzed at the indicated time points using the CellTiter 96 Aqueous One Cell Proliferation Assay kit (Promega, Madison, WI, USA), according to the manufacturer’s instructions. Each experimental group consisted of 6 wells per time point, and statistical comparisons were performed using Student’s t-test for each time point. Differences with a p-value < 0.05 were considered statistically significant.

2.3. Immunofluorescence and Annexin-V Binding Assay

Immunofluorescence was performed after fixation with 4% paraformaldehyde and permeabilization with methanol as described previously [16], using anti-GFP goat IgG (Abcam, Cambridge, MA, USA) and anti-TSPX mouse monoclonal IgG (generated in the lab). The Annexin-V binding assay was used to detect apoptotic and dead cells, as previously described [16]. The experiments were repeated at least twice to verify the results.

2.4. Western Blot

Western blot was performed as described previously [35], using anti-GFP goat IgG, anti-TSPX rabbit IgG (Bethyl Laboratories, Montgomery, TX, USA), and anti-β-actin mouse monoclonal IgG (clone AC-15, Sigma-Aldrich). In brief, transduced cells were cultured in 12-well plates in the presence of 0.5 µg/mL Dox for 24 h, lysed in 100 µL SDS sample buffer, and denatured at 100 °C for 10 min. Ten µL of each sample was subjected to Western blot analysis. The experiment was performed in triplicate, and a representative result is shown in the figure.

2.5. RNA Preparation and RNA-Seq Transcriptome Analysis

Total RNA was isolated from the A549 cells and SK-MES-1 cells and subjected to RNA-seq transcriptome analysis using the Illumina NextSeq 500 sequencer, as previously described [16,29]. In brief, each experimental group consisted of 3 wells in a 6-well plate, with cells cultured in the presence of 0.5 µg/mL Dox for 24 h. Sequencing libraries were independently prepared from cells in each well and sequenced separately, representing biological triplicates. After initial quality assessment with the FastQC program (version 0.11.4) [36], the sequence reads were mapped onto the human reference genome GRCh37/hg19 using the TopHat program (version 2.1.0) [37]. The mapped reads were summarized into per-gene read counts, which serve as a measure of expression level, using the featureCounts program (version 1.6.0) [38]. Normalization of data and differential gene expression analysis were performed using the TCC/edgeR software package (version 1.46.0) [39]. Genes with a TCC/edgeR FDR < 0.005, Student’s t-test p-value < 0.05, log2(gene expression level) > 4, and |log2(fold change)| > 0.85 were considered differentially expressed genes (DEGs). Pathway enrichment analyses were performed using DAVID bioinformatics resources [32].
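The four DEG thresholds stated in Section 2.5 combine as a simple conjunctive filter. The following is a minimal sketch of that filter only, not the authors' pipeline: the `GeneStat` record, its field names, and the placeholder gene entries are hypothetical, and it assumes that the FDR, t-test p-value, log2 expression level, and log2 fold change have already been computed upstream (for example by TCC/edgeR). Which expression value the log2 expression cutoff refers to (for instance, the mean across samples) is not specified in the text and is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class GeneStat:
    """Per-gene summary statistics (hypothetical record layout)."""
    gene: str
    fdr: float        # FDR from the differential expression analysis
    t_pvalue: float   # Student's t-test p-value
    log2_expr: float  # log2 of the gene expression level
    log2_fc: float    # log2 fold change (TSPX vs. EGFP control)


def is_deg(g: GeneStat, fdr_max: float = 0.005, p_max: float = 0.05,
           expr_min: float = 4.0, fc_min: float = 0.85) -> bool:
    """Return True only if all four thresholds from the text are met."""
    return (g.fdr < fdr_max
            and g.t_pvalue < p_max
            and g.log2_expr > expr_min
            and abs(g.log2_fc) > fc_min)


def split_degs(stats):
    """Split DEGs into upregulated and downregulated gene lists."""
    up = [g.gene for g in stats if is_deg(g) and g.log2_fc > 0]
    down = [g.gene for g in stats if is_deg(g) and g.log2_fc < 0]
    return up, down


# Placeholder values for illustration only; these are not data from the study.
example = [
    GeneStat("GENE_A", fdr=0.001, t_pvalue=0.01, log2_expr=6.2, log2_fc=-1.40),
    GeneStat("GENE_B", fdr=0.001, t_pvalue=0.01, log2_expr=5.0, log2_fc=1.10),
    GeneStat("GENE_C", fdr=0.020, t_pvalue=0.01, log2_expr=7.5, log2_fc=-2.00),  # fails FDR
    GeneStat("GENE_D", fdr=0.001, t_pvalue=0.01, log2_expr=3.1, log2_fc=1.50),   # fails expression
    GeneStat("GENE_E", fdr=0.001, t_pvalue=0.01, log2_expr=6.0, log2_fc=0.50),   # fails fold change
]
up_genes, down_genes = split_degs(example)
```

Because the criteria are joined by "and", a gene with a large fold change is still excluded if it fails the expression-level or FDR cutoff, which is why the placeholder entries `GENE_C` to `GENE_E` drop out.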
Enriched pathways with a p-value < 0.05 were considered statistically significant. The datasets used and/or analyzed for the present study are available on request.

2.6. Dataset and Data Mining Analysis of Lung Adenocarcinoma and Lung Squamous Tumor Specimens from the TCGA Database

The RNA-seq gene expression data and associated clinical information of lung adenocarcinoma and lung squamous cell carcinoma cases at the Cancer Genome Atlas (TCGA) data portal were downloaded from the UCSC Xena Browser [31,40]. The dataset of lung adenocarcinoma (LUAD) included 59 non-tumor samples and 515 tumor samples, and the dataset of lung squamous cell carcinoma (LUSC) included 51 non-tumor samples and 502 tumor samples. The expression levels were calculated as RSEM normalized read counts [41]. The survival information of the respective patients in the TCGA database was obtained from the Human Protein Atlas (HPA) data portal [31,42], except for the classification of the high TSPX-expressing patients and the low TSPX-expressing patients. Fifteen cases in the LUAD dataset and seven cases in the LUSC dataset lacked survival information. Statistical analyses were performed with the Prism 10 program (version 10.4.1) (GraphPad Software, La Jolla, CA, USA). Differences in gene expression levels (RSEM normalized read counts) were evaluated by one-way ANOVA followed by Tukey’s multiple comparison test. Differences with a p-value < 0.05 were considered statistically significant. The clinical transcriptome and patient survival data were downloaded from the public domain repository TCGA portal in a de-identified manner. The human studies were approved with a waiver by the Institutional Human Research Committee.

2.7. Quantitative RT-PCR (RT-qPCR) Analysis

Total RNA was isolated from the Dox-induced A549 cells and SK-MES-1 cells using the TRIZOL-plus kit (Thermo Fisher Scientific/Invitrogen, Carlsbad, CA, USA) and analyzed by RT-qPCR using GoTaq qPCR Master Mix (Promega, Madison, WI, USA) or TaqMan Fast Advanced Master Mix (Thermo Fisher Scientific) on a QuantStudio3 real-time PCR detection system (Thermo Fisher Scientific). In brief, each experimental group consisted of 3 wells, with cells cultured for 24 h in the presence of 0.5 µg/mL Dox. Reverse transcription products were independently prepared from cells in each well (biological triplicate) and analyzed by quantitative PCR in technical triplicates. The expression levels of the respective genes were normalized to that of the GAPDH gene. Statistical significance was evaluated using Student’s t-test. Differences with a p-value < 0.05 were considered statistically significant. The primer sequences are described in Supplementary Table S1.

3. Results

3.1. The TSPX Expression Level Is Associated with the Clinical Outcomes of Lung Adenocarcinoma

To explore the expression pattern of TSPX in lung cancer, we obtained the RNA-seq transcriptome data and clinical information of lung adenocarcinoma (n = 515) and lung squamous cell carcinoma (n = 502) samples from TCGA. Of the 58 cases with paired tumor and non-tumor lung adenocarcinoma samples, TSPX was downregulated in 47 cases (81%) as compared to the adjacent non-tumor specimens (Figure 1A), indicating that TSPX was significantly downregulated in lung adenocarcinoma (Wilcoxon matched pairs signed rank test p-value < 0.0001). Next, we correlated the TSPX expression level in cancer with clinical information such as pathological stages/tumor grades and patient mortality.
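The paired tumor/non-tumor comparison reported above rests on two quantities: the fraction of pairs in which the tumor value is lower, and a signed-rank test on the paired differences. The sketch below illustrates both in plain Python under stated assumptions. It uses the large-sample normal approximation without tie or continuity correction, whereas the study used Prism, which may report an exact p-value, so numbers from this sketch need not match the published ones. The function names and the toy expression values are hypothetical and are not data from the study.

```python
import math


def fraction_downregulated(tumor, normal):
    """Fraction of pairs in which the tumor value is lower than its paired non-tumor value."""
    return sum(t < n for t, n in zip(tumor, normal)) / len(tumor)


def wilcoxon_signed_rank(tumor, normal):
    """Two-sided Wilcoxon signed-rank test on paired samples.

    Normal approximation, no tie or continuity correction (an assumption of
    this sketch). Returns (W+, W-, p-value).
    """
    diffs = [t - n for t, n in zip(tumor, normal) if t != n]  # drop zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return w_plus, w_minus, p_value


# Toy paired expression values for illustration only.
toy_tumor = [5, 3, 8, 2, 6, 4, 7, 1, 9, 3]
toy_normal = [10, 9, 12, 8, 5, 11, 13, 7, 15, 10]
frac_down = fraction_downregulated(toy_tumor, toy_normal)
w_plus, w_minus, p_value = wilcoxon_signed_rank(toy_tumor, toy_normal)
```

The same fraction calculation applied to the reported counts gives 47/58, which rounds to the 81% stated in the text.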
Among the 515 lung adenocarcinoma cases, the top 25% of cases (n = 129) expressed TSPX at the highest level and were classified as the TSPX-high group, the bottom 25% of cases (n = 129) expressed TSPX at the lowest level and were classified as the TSPX-low group, and the rest (n = 257) were classified as the TSPX-mid group (Figure 1B). Our analysis showed that patients of the TSPX-high group were diagnosed at earlier pathologic/tumor stages than those of the TSPX-low group, i.e., while 66% of patients of the TSPX-high group were at stage-I, only 45% of patients of the TSPX-low group were at stage-I (p-value = 0.0027) (Figure 1C). The proportion of stage-I patients in the TSPX-mid group was intermediate between those of the TSPX-high and TSPX-low groups (Figure 1C). Similarly, the survival rate of the TSPX-high group was significantly higher than those of the TSPX-low and TSPX-mid groups (log-rank test p-value = 0.0025 and 0.0273, respectively) (Figure 1D). The median survival time for the TSPX-high group was 1798 days, which was 510 days longer than that of the TSPX-low group (1288 days) and 377 days longer than that of the TSPX-mid group (1421 days). Additionally, the 5-year survival rate was 49% in the TSPX-high group, compared to 33% in the TSPX-low group and 40% in the TSPX-mid group. The log-rank hazard ratio between the TSPX-high and TSPX-low groups was 0.5171 (95% CI: 0.3404 to 0.7854), indicating a significantly lower risk in the TSPX-high group. There was no significant difference in survival rate between the TSPX-low group and the TSPX-mid group (Figure 1D). These observations suggest that reduced levels of TSPX expression could be directly or indirectly associated with the progression and malignancy of lung adenocarcinoma.

Figure 1. TSPX expression levels in relation to the pathologic stage and survival ratios of lung adenocarcinoma and squamous cell carcinoma patients. (A) TSPX expression levels in 58 lung adenocarcinoma tumor (T)/non-tumor (NT) paired samples from TCGA. Expression values (RSEM normalized count values) were plotted, with paired samples linked by a solid line; blue, decrease; red, increase. The p-value of the Wilcoxon matched pairs signed rank test is shown. (B) TSPX expression levels in 59 NT and 515 lung adenocarcinoma cases. The latter were divided into the TSPX-high (highest 25% cases, n = 129), TSPX-low (lowest 25% cases, n = 129), and TSPX-mid (n = 257) groups. (C) Distributions of pathologic stages (I-IV) across the TSPX-low, TSPX-mid, and TSPX-high groups. Chi-squared test p-value is indicated. (D) Survival curves for the TSPX-high (red), TSPX-mid (brown), and TSPX-low (blue) groups. Log-rank test p-value is indicated. (E) TSPX expression levels in 51 lung squamous cell carcinoma tumor/non-tumor paired samples from TCGA, similar to A. (F) TSPX expression levels in 51 NT and 502 lung squamous cell carcinoma cases, categorized into TSPX-high (highest 25% cases, n = 126), TSPX-low (lowest 25% cases, n = 126), and TSPX-mid (n = 246) groups. (G) Distributions of pathologic stages between the TSPX-low, TSPX-mid, and TSPX-high groups for lung squamous cell carcinoma, similar to C. Red indicates Stage-IV. Chi-squared test p-value is indicated. (H) Survival curves for the TSPX-high (red), TSPX-mid (brown), and TSPX-low (blue) groups in lung squamous cell carcinoma. Log-rank test p-value is indicated.

Similar analyses of the transcriptome and clinical data of lung squamous cell carcinoma (LUSC) in TCGA datasets showed that TSPX was significantly downregulated in lung squamous cell carcinoma (Figure 1E). TSPX was downregulated in 48 out of 51 cases (94%) with paired tumor and non-tumor lung squamous cell carcinoma samples (Wilcoxon matched pairs signed rank test p-value < 0.0001).
Although the lung squamous cell carcinoma samples showed a similar, but less robust, trend in the distribution of pathological stages from the TSPX-low to the TSPX-high group as the lung adenocarcinoma samples (Figure 1G), there was no significant difference in survival rate between the TSPX-low group and the TSPX-high group (Figure 1H). These observations suggest that the TSPX expression level could be correlated with cancer aggressiveness in lung adenocarcinoma, but less markedly in lung squamous cell carcinoma, making it a potential differentiating diagnostic/prognostic marker for the two subtypes of NSCLC.

3.2. Overexpression of TSPX Inhibits Cell Proliferation in NSCLC Cell Lines

To examine the effects of TSPX in NSCLC cells, A549 lung adenocarcinoma cells and SK-MES-1 lung squamous cell carcinoma cells were transduced with the tet-ON lentiviral vector system expressing EGFP and TSPX under the control of doxycycline (Dox) (Figure 2A). The resultant cells were designated as A549-tetON-TSPX and MES1-tetON-TSPX, respectively. A549-tetON-EGFP cells and MES1-tetON-EGFP cells that expressed EGFP alone were used as references in these experiments. Western blot