Abstract The identification of prognostic and therapeutic biomarkers is essential to reduce morbidity and mortality from lung adenocarcinoma (LUAD). This study aimed to identify a reliable prognostic and therapeutic biomarker for LUAD using integrated bioinformatics. Based on the cancer genome atlas (TCGA) and genome-tissue expression (GTEx) analyses, KIF20A has been identified as the hub gene. Following validation using a series of cohorts, survival analysis, meta-analysis, and univariate Cox analysis was conducted. ESTIMATE and CIBERSORT algorithms were then used to study the association of KIF20A with the tumor microenvironment (TME) and the percentage of tumor-infiltrating immune cells (TICs). In vitro experiments were conducted to determine the function of KIF20A. Finally, there was a negative association between the expression of the KIF20A and overall survival, progression-free survival, and disease-free survival, which was confirmed by meta-analysis and COX analysis. Furthermore, KIF20A also had a potential role of altering the TME and TICs proportions in LUAD. Validations in vitro were performed on A549 and PC-9 cell lines, and we found that the knockdown of KIF20A exhibited inhibitory effects on cell proliferation, resulted in cell cycle arrest during the G2/M phase, and induced cellular apoptosis. Our study demonstrated that KIF20A could be utilized as a reliable prognostic marker and treatment target for LUAD. However, further studies are required to validate these findings. Keywords: lung adenocarcinoma, KIF20A, bioinformatics analysis, prognosis, therapeutic target 1 Introduction Lung adenocarcinoma (LUAD) is the most common histological subtype of lung cancer. In the case of LUAD, the prognosis is extremely poor, with a five-year survival rate of less than 10% ([32]Zhao et al., 2018; [33]Fu et al., 2020). Various studies found that the aberrant expression of oncogenes, particularly those that drive cellular proliferation, increase resistance to therapy, and protect tumor cells against immune surveillance, play an essential role in tumor progression, subsequently leading to worse clinical outcomes ([34]Zhu et al., 2015; [35]Schabath et al., 2016; [36]Chuang et al., 2017; [37]Ma et al., 2019). Furthermore, tumor initiation and progression are influenced by the tumor microenvironment (TME), particularly the immune microenvironment ([38]Gajewski et al., 2013; [39]Hinshaw and Shevde, 2019). However, the mechanism by which the TME is involved in tumor development has not been elucidated. It is essential for the identification of potential oncogenes, alterations in the TME, and other cancer drivers to understand tumor pathogenesis, and accordingly, develop personalized, targeted therapies. Kinesin family member 20A (KIF20A), also known as RAB6KIFL, belongs to the kinesin superfamily-6. There has been evidence that members of the superfamily are involved in vital cellular processes, such as spindle assembly, intracellular transport, and cellular mitosis ([40]Shen et al., 2019; [41]Xiong et al., 2019; [42]Zhang et al., 2019). Furthermore, KIF20A dominantly functioned in chromosome partitioning and mitotic spindle formation ([43]Xiong et al., 2019). Recent studies have demonstrated that KIF20A is overexpressed in many malignant tumors, including gastric cancer, glioma, bladder cancer, breast cancer, and prostate cancer ([44]Li et al., 2019; [45]Shen et al., 2019). Additionally, Shen et al. demonstrated that high KIF20A expression is associated with high tumor grade, advanced stage, and poor prognosis for bladder cancer patients ([46]Shen et al., 2019). The prognostic significance of KIF20A has also been evaluated in other solid tumors, where it has been demonstrated that it plays a vital role in cell proliferation and tumor metastasis. However, there was limited knowledge about potential mechanisms of KIF20A in LUAD. Integrative bioinformatics analysis combined with molecular biology and information technology is increasingly used to identify novel diagnostic and therapeutic biomarkers. This study aims to determine whether KIF20A can function as a prognostic and therapeutic biomarker for LUAD via integrated bioinformatics analysis. The flowchart of the work is shown in [47]Figure 1. FIGURE 1. [48]FIGURE 1 [49]Open in a new tab Flowchart for the work in lung adenocarcinoma (LUAD). 2 Materials and methods 2.1 Data collection and preprocessing RNA-sequence data for LUAD was extracted from The Cancer Genome Atlas (TCGA) database ([50]https://portal.gdc.cancer.gov/), whereby 54 normal lung samples and 497 LUAD cases were identified. Another mRNA expression profile for 288 normal tissue samples was retrieved from the Genotype-Tissue Expression (GTEx) database ([51]https://www.gtexportal.org/home). [52]GSE19188 and [53]GSE33532 profiles were downloaded from the Gene Expression Omnibus (GEO) database ([54]https://www.ncbi.nlm.nih.gov/geo/). [55]GSE19188 profile contained 65 normal samples and 91 LUAD samples. [56]GSE33532 profile included 20 normal samples and 80 LUAD samples. The R package “limma” was used to combine TCGA and GTEx datasets for the following study. 2.2 Weighted gene co-expression network analysis and differential analysis Co-expression network analyses using the R package “WGCNA” were conducted to detect high tumor-related modules and identify potential genes involved in these modules. This technique involved transforming the adjacency matrix into a topological overlap matrix (TOM) based on the soft threshold power. According to the TOM-based dissimilarity measure, the genes were then grouped into distinct modules. The cutoff threshold for the minModuleSize and mergeCutHeight were 30 and 0.25, respectively. Additionally, the genes with gene significance (GS) greater than 0.5 and a module membership (MM) above 0.8 were classified as candidate genes. The R package “edgeR” was used for differential expressed gene (DEG) identification between normal lung and LUAD cases. The parameters used for the DEGs analysis were an adjusted p-value < 0.5 and a |log[2](fold change)| > 1. Finally, the candidate hub genes were codetermined by the promising genes in WGCNA and the DEGs analysis. 2.3 Function and pathway enrichment analysis Gene oncology (GO) enrichment analysis was conducted using “clusterProfiler” in R, and candidate hub genes were determined by KEGG enrichment analyses. Statistical significance was defined as an adjusted p-value of less than 0.05. Gene set enrichment analysis (GSEA) and gene set variation analysis (GSVA) were used to identify the dominant pathways associated with the hub genes. The “c2.cp.kegg.v7.1.symbols.gmt” gene set was identified from the molecular signature database (MSigDB) ([57]http://software.broadinstitute.org/gsea/msigdb/index.jsp). Pathway expression was profiled using the R package “pheatmap”. 2.4 Protein-protein interaction (PPI) construction network and COX analysis For the candidate hub genes, the interaction information among proteins with a combined score above 0.9 was retrieved using the retrieval of interacting genes database (STRING) search tool ([58]https://string-db.org/). To visualize the PPI network and carry out a clustering analysis of the network, we used Cytoscape software (version 3.6.0) and its molecular complex detection plugin (MCODE). For the top two clusters with the highest MCODE scores, the connectivity degrees of the nodes were calculated. Moreover, the R package “survival” was adopted to conduct univariate COX regression analysis for the corresponding genes within the two clusters. The common DEGs overlapped of nodes with connectivity degrees above 30 in the clustering network and a p-value below 0.001 in the COX analysis were identified as hub genes. 2.5 Hub gene validation A quantitative real-time polymerase chain reaction (qRT-PCR) was used to measure the expression of hub genes. In addition, the hub gene was validated through several differential analyses utilizing the TCGA and GTEx data and the profiles [59]GSE19188 and [60]GSE33532. Further validation was performed using the Oncomine meta-analysis platform ([61]https://www.oncomine.org). The protein expression of the corresponding gene was also examined using Western blot analysis, differential analysis based on the Clinical Proteomic Tumor Analysis Consortium (CPTAC) database ([62]https://cptac-data-portal.georgetown.edu/), and immunohistochemical analysis of images obtained from the Human Protein Atlas (HPA) database ([63]https://www.proteinatlas.org). 2.6 Survival analysis and meta-analysis We used the R packages “survival” and “survminer” to analyze the correlation between KIF20A expression and overall survival (OS) as well as progression-free survival (PFS) among patients with LUAD. The survival analyses for KIF20A were also conducted utilizing the Gene Expression Profiling Interactive Analysis (GEPIA) database ([64]http://gepia.cancer-pku.cn) and Kaplan-Meier plotter platform ([65]https://kmplot.com/analysis/). Meta-analysis of the five cohorts was conducted using R packages “survival” and “meta”. Pearson’s method was used to analyze the correlation between each term using the GraphPad Prism software (version 7.0). 2.7 Association between hub gene and tumor microenvironment (TME) Using the “ESTIMATE” package in R, the proportion of immune and stromal cells in LUAD tissues was calculated. The total ESTIMATE score was calculated by combining the two scores. Based on the CIBERSORT algorithm, we determined the relative community of tumor-infiltrating immune cells (TICs). The R packages “circlize” and “corrplot” were developed to visualize the correlation among TICs. 2.8. Cell culture and plasmid transfection We obtained the cell lines HBE, A549, PC-9, and NCI-H1395 from the Shanghai Cell Bank of the Chinese Academy of Medical Sciences (Shanghai, China). Cells were cultured in high glucose Dulbecco’s Modified Eagle’s medium (DMEM; Hyclone, Logan, Utah, United States) containing 10% fetal bovine serum (FBS; Gibco, Grand Island, NY, United States) and 1% penicillin-streptomycin (MRC, Jintan, China) at 37°C and 5% Carbon dioxide (CO[2]). The KIF20A-targeted sequence of short hairpin RNAs (shRNA, 5′- GCC​ACT​CAC​AAA​TTT​ACC​TTT-3′) and the scrambled sequence (5′-CCT​AAG​GTT​AAG​TCG​CCC​TCG-3′) were cloned into pLKO.1 plasmid. Both pLKO.1-KIF20A-shRNA (shKIF20A) and pLKO.1-scramble-shRNA (shNC) were provided by the Chinese Public Protein Plasmid Library (PPL, Nanjing, China). A549 and PC-9 cells were transfected using the X-tremeGENE HP DNA transfection reagent (Roche, Shanghai, China) according to the manufacturer’s guidelines. 2.9 Quantitative real-time PCR analysis The Total RNA Extraction Kit (Solarbo, Beijing, China) was used to isolate the total RNA. First-strand cDNA synthesis kit (Invitrogen, Carlsbad, CA, United States) and Premix Ex Taq SYBR Green PCR kit (Takara, Dalian, China) were used according to the manufacturer’s protocols for reverse transcription and quantitative PCR, respectively. The primers were synthesized as follows: KIF20A, forward: 5′-TGC​TGT​CCG​ATG​ACG​ATG​TC-3′ and reverse: 5′-AGG​TTC​TTG​CGT​ACC​ACA​GAC-3'; GAPDH, forward: 5′-GGA​GCG​AGA​TCC​CTC​CAA​AAT-3′ and reverse: 5′-GGC​TGT​TGT​CAT​ACT​TCT​CAT​GG-3'. 2.10 Western blot analysis Extraction and quantification of total protein were conducted. Six percent SDS-PAGE gels were then used to separate and transfer the proteins onto polyvinylidene fluoride membranes. Membranes were blocked with five percent skimmed milk for two hours at room temperature and then incubated with the primary antibody against KIF20A (ThermoFisher, Waltham, MA, United States, 1:1000 dilution, Catalog #PA5-83359) and β-actin (Abcam, Cambridge, United Kingdom, 1:1000 dilution, ab8226) overnight at four degrees. Thereafter, the bands were incubated with a horseradish peroxidase-conjugated secondary antibody (Bioss, Beijing, China) at room temperature for an hour and were subsequently visualized using enhanced chemiluminescence reagents (Beyotime, Shanghai, China). 2.11 Assessment of cell proliferation Cell counting kit (CCK)-8 kit (Beyotime) and colony formation assay were conducted to assess cell proliferation ability. A total of 6,000 cells were cultured overnight in 96-well plates. On the following day, cells were transfected with the corresponding plasmid. The CCK-8 solution was added to each well after 24–48 h to determine the cells’ viability. The colony formation assay was performed by culturing 1,000 cells per well overnight in six-well plates, followed by transfection. A new culture medium was added the next day, and the cells were kept in culture for an additional 14 days. The colonies were then stained with Giemsa solution (Beyotime) and calculated using ImageJ (version 1.8.0). 2.12. Cell cycle and apoptosis assessment Overnight at 4°C, the transfected cells were fixed in 70% ethanol. Afterward, the cells were stained with 500 μL PI/RNase staining buffer (BD Pharmingen, San Diego, CA, United States) for 15 min at 37°C or else incubated with 5 μL FITC Annexin V (BD Pharmingen), 5 μL propidium iodide (PI, BD Pharmingen) and 400 μL binding buffer for 15 min at 25°C in the dark. A flow cytometer (FCM) (BD FACSVerse, San Jose, CA, United States) was used to analyze the cell cycle and apoptosis according to the manufacturer’s instructions. 2.13 Statistical analysis GraphPad Prism (version 7.0) and R software (version 3.6.0) were used for statistical analysis. Log-rank tests were used to calculate the statistical significance of the survival analysis. In the meta-analysis, heterogeneity was measured using Cochran’s Q test and Higgin’s I ^2 statistics. The results of qRT-PCR were analyzed by the 2^-△△Ct method. Student’s t-tests and one-way analyses of variance (ANOVA) were used to determine whether there was a significant difference between LUAD and normal cells. Experiments were conducted at least three times, and the quantitative data are presented as mean ± standard deviation (SD). p-values below 0.05 were considered statistically significant. 3 Results 3.1 Identification of candidate hub genes A total of 839 samples (342 normal samples and 497 LUAD samples) were clustered according to their mRNA expression similarity. No outliers within the samples were removed, and the distribution of clinical traits within the samples was shown in [66]Supplementary Figure S1. In the WGCNA, a soft-thresholding power (β) of five and a scale-free R^2 of 0.95 were adopted for a scale-free network, as shown in [67]Supplementary Figure S2. Based on this method, 16 modules were identified, as shown in [68]Figure 2A. The turquoise module showed the strongest negative association with tumor samples, as shown in [69]Figure 2B, and had a higher module significance (MS) than the other modules, as shown in [70]Supplementary Figure S3. In general, MS was defined as the average gene significance (GS) for all genes within a particular module, indicating the overall correlation between a trait and the module. Additionally, the yellow module was the most positively associated with tumor samples. There were 1,055 and 144 genes with GS > 0.5 and MM > 0.8, respectively, in the turquoise and yellow modules, as shown in [71]Figure 2C and [72]Figure 2D. The differential analysis between normal and LUAD tissues revealed a total of 2,933 DEGs, containing 1,327 upregulated DEGs and 1,606 downregulated DEGs, as shown in [73]Figure 2E. There were 801 common genes between all DEGs and turquoise module-related genes and 103 common genes between DEGs and yellow module-related genes, as shown in [74]Figure 2F and [75]Figure 2G. The 904 intersected DEGs were defined as candidate hub genes and used for the following analysis. FIGURE 2. [76]FIGURE 2 [77]Open in a new tab Weighted gene co-expression network analysis (WGCNA) and differential analysis. Image (A) illustrates a clustering dendrogram for the GTEx and TCGA datasets based on the dissimilarity measure (1-TOM). Heatmap (B) shows the correlation between module eigengenes and clinical traits in LUAD. The scatter plots illustrate the eigengenes for the turquoise (C) and yellow (D) modules. A module membership above 0.8 and a gene significance greater than 0.5 were considered statistically significant. The volcano plot (E) shows 2,933 differentially expressed genes (DEGs) in LUAD samples versus normal lung tissues. An adjusted p-value < 0.05 and |log[2](fold change)| > 1 were set as the cutoff criterion. The Venn plots (F) illustrate 801 genes overlapped by the turquoise module eigengenes and differential analysis, and the 103 genes were codetermined by the yellow module eigengenes and differential analysis (G). 3.2 GO and KEGG enrichment analyses According to GO analysis, the 904 common DEGs were mainly enriched in chromosome segregation, mitotic nuclear division, cell cycle checkpoint, and other biological processes, as shown in [78]Figure 3A. The chromosomal region, cytosolic ribosome, and kinetochore were the primarily enriched cellular component terms. For molecular function enrichment, DEGs were mainly involved in the structural constituent of the ribosome, protein-threonine kinase activity, microtubule-binding, and other terms. More detailed information is provided in [79]Supplementary Table S1. In addition, KEGG analysis revealed that DEGs were predominantly associated with the ribosome, cell cycle, oxidative phosphorylation, and other biological pathways, as shown in [80]Figure 3B. The information is further illustrated in [81]Supplementary Table S2. FIGURE 3. [82]FIGURE 3 [83]Open in a new tab Gene ontology (GO) and Kyoto encyclopedia of genes and genomes (KEGG) enrichment analyses. Bubble plot (A) shows the GO function analysis for the 904 common DEGs. The circle plot (B) shows the KEGG pathway analysis for the common DEGs. The higher the z-score, the higher the expression of enriched terms. 3.3 KIF20A was identified as a hub gene PPI network constructed from candidate DEGs contained 391 nodes and 2,794 edges, as shown in [84]Supplementary Figure S4. Meanwhile, the top six clusters in the PPI network were developed via the MCODE plugin, as shown in [85]Table 1, of which clusters one and two had the highest MCODE scores and were therefore selected as the key clusters, as shown in [86]Figure 4A. The connectivity degrees of the nodes within key clusters were calculated and visualized in the histograms, as shown in [87]Figure 4B. Among the DEGs in key clusters, univariate COX analysis was carried out to determine the significant factors (p < 0.001) for the overall survival (OS) in LUAD patients, as shown in [88]Figure 4C. According to COX analysis, five common genes, including CDK1, BUB1B, PLK1, KIF20A, and CCNA2, were shared by leading nodes with a connectivity degree above 30 and a p-value below 0.001 in the second cluster. However, no genes were identified in the first cluster, as shown in [89]Figure 4D. Amongst the five common factors, KIF20A was further evaluated and classified as the hub gene for the following study. TABLE 1. The top six MCODE clusters in the PPI network. MCODE cluster Nodes Edges MCODE scores MCODE cluster Nodes Edges MCODE scores cluster 1 37 646 36.541 cluster 4 12 66 12.000 cluster 2 41 625 31.250 cluster 5 9 36 9.000 cluster 3 12 66 12.000 cluster 6 9 36 9.000 [90]Open in a new tab PPI, protein-protein interaction; MCODE, molecular complex detection. FIGURE 4. [91]FIGURE 4 [92]Open in a new tab Identification of hub genes. Image (A) illustrates the network of the top two clusters in the protein-protein interaction network containing 904 common DEGs. Green and red colored spots indicate downregulated and upregulated DEGs, respectively. Image (B) illustrates the degrees of connectivity within the nodes in the corresponding cluster networks. Image (C) shows the results of the univariate COX analysis for the significant factors in selected clusters (p < 0.001). Image (D) shows the five common genes overlapped by the nodes with connectivity degrees above 30 in the cluster network (n > 30) and a p-value below 0.001 in the COX analysis. 3.4 KIF20A exhibited high expression in LUAD tissues In LUAD samples, KIF20A expression was significantly higher than in normal samples, as shown in [93]Figures 5A–D. According to a meta-analysis using the Oncomine database, LUAD tissues had a higher expression of KIF20A than normal lung tissues in five cohorts (Hou lung, Landi lung, Okayama lung, Selamat lung, and Su lung), as shown in [94]Figure 5E ([95]Hou et al., 2010; [96]Landi et al., 2008; [97]kayama et al., 2012; [98]Selamat et al., 2012; [99]Su et al., 2007). Furthermore, higher levels of KIF20A proteins were detected in LUAD tissues as compared with normal lung tissues, as shown in [100]Figure 5F. The observation that KIF20A had increased expression in LUAD was further assessed in A549, PC-9, and NCI-H1395 cell lines by qRT-PCR and Western blot analyses, as shown in [101]Figure 5H. FIGURE 5. [102]FIGURE 5 [103]Open in a new tab Verification of KIF20A expression. LUAD samples had higher KIF20A expression in both unpaired (A) and paired (B) differential analyses, and the observations were verified using [104]GSE19188 (C) and [105]GSE33532 (D) profiles. Meta-analysis (E), protein differential analysis (F), and immunohistochemical images (G) validated the overexpression of the KIF20A gene in LUAD. Quantitative real-time PCR (qRT-PCR) (H) and Western blot analysis (I) illustrate the significant upregulation ofKIF20A expression in A549, PC-9, and NCI-H1395 cells compared with HBE cells. 3.5. Identification of KIF20A as a prognostic factor Kaplan-Meier survival analysis revealed that high KIF20A expression is a significant indicator of poor OS and unfavorable PFS in LUAD patients, as shown in [106]Figures 6A–E. Besides, KIF20A showed a negative association with DFS in the [107]GSE50081 profile, as shown in [108]Figure 6F. These were further confirmed in the GEPIA database and Kaplan-Meier Plotter platform, as shown in [109]Supplementary Figure S5. As there was no heterogeneity among the five datasets (I ^2 < 50% and p > 0.05), we chose a fixed model to conduct the meta-analysis. The results of the meta-analysis suggested that KIF20A served as a high-risk factor for the survival of LUAD patients (HR = 1.41, 95% CI: 1.28–1.56), as shown in [110]Figure 6G. Subsequently, univariate Cox analysis demonstrated that KIF20A had a significantly negative correlation with the survival of LUAD patients in several datasets, as shown in [111]Table 2. According to the above results, KIF20A may serve as a prognostic factor for LUAD patients. FIGURE 6. [112]FIGURE 6 [113]Open in a new tab Prognostic value of KIF20A expression in LUAD. Kaplan-Meier survival analysis showed KIF20A expression significantly indicated worse OS in LUAD patients in the TCGA dataset (A), [114]GSE68465 (B), and [115]GSE50041 (C) profiles. Kaplan-Meier survival analysis showed that KIF20A expression was negatively associated with progression-free survival (PFS) in the TCGA dataset (D) and [116]GSE68465 profile (E), as well as disease-free survival (DFS) in the [117]GSE50081 profile (F). The meta-analysis (G) further confirmed the correlation of KIF20A expression with OS in the five cohorts. TABLE 2. Univariate Cox regression analysis for KIF20A expression on OS, PFS, and DFS in LUAD patients. Parameter Univariate analysis Multivariate analysis HR 95% CI P-value HR 95% CI P-value TCGA (OS) KIF20A 1.718 1.356–2.178 <0.001 1.685 1.299–2.185 <0.001 age 1.000 0.983–1.019 0.958 1.014 0.996–1.033 0.133 gender 1.091 0.774–1.539 0.618 0.971 0.685–1.377 0.87 stage 1.581 1.350–1.852 <0.001 1.422 0.943–2.145 0.093 T classification 1.598 1.313–1.945 <0.001 1.205 0.961–1.512 0.107 M classification 1.900 1.069–3.378 0.029 0.910 0.317–2.618 0.862 N classification 1.716 1.411–2.087 <0.001 1.156 0.810–1.650 0.425 [118]GSE42127 (OS) KIF20A 1.404 1.074–1.835 0.013 1.452 1.079–1.953 0.014 age 1.043 1.008–1.079 0.016 1.032 0.996–1.070 0.085 gender 1.804 0.959–3.392 0.067 1.308 0.671–2.551 0.43 stage 1.571 1.094–2.256 0.014 1.431 0.970–2.110 0.071 TCGA (PFS) KIF20A 1.876 1.463–2.407 <0.001 1.788 1.366–2.340 <0.001 age 1.001 0.984–1.019 0.868 1.013 0.995–1.031 0.162 gender 1.127 0.800–1.589 0.494 1.024 0.723–1.449 0.895 stage 1.573 1.339–1.847 <0.001 1.394 0.935–2.076 0.103 T classification 1.592 1.318–1.923 <0.001 1.235 0.989–1.543 0.062 M classification 2.053 1.151–3.662 0.015 0.939 0.342–2.582 0.903 N classification 1.603 1.319–1.948 <0.001 1.054 0.749–1.483 0.764 [119]GSE68465 (PFS) KIF20A 1.460 1.153–1.848 0.002 1.403 1.070–1.841 0.014 gender 1.225 0.871–1.721 0.243 1.186 0.840–1.673 0.333 age 1.010 0.991–1.030 0.301 1.016 0.997–1.036 0.106 grade 1.492 1.125–1.979 0.006 1.169 0.863–1.585 0.313 N classification 1.575 1.267–1.959 <0.001 1.582 1.255–1.995 <0.001 T classification 1.497 1.174–1.908 0.001 1.318 1.020–1.703 0.035 [120]GSE50081 (DFS) KIF20A 1.414 1.105–1.809 0.006 1.339 1.025–1.751 0.033 age 1.018 0.989–1.047 0.238 1.017 0.985–1.049 0.296 gender 1.480 0.846–2.589 0.170 1.565 0.880–2.783 0.128 stage 2.514 1.415–4.465 0.002 2.191 0.357–13.435 0.397 T classification 2.749 1.461–5.173 0.002 1.697 0.799–3.606 0.169 N classification 2.219 1.236–3.982 0.008 0.984 0.163–5.935 0.986 [121]Open in a new tab LUAD, lung adenocarcinoma; HR, hazard ratio; CI, confidence interval; OS, overall survival; PFS, progress-free survival; DFS, disease-free survival. 3.6 Prognostic significance of KIF20A in LUAD Based on TCGA data, both correlation test and differential analysis indicated that KIF20A had a dominantly positive association with T-stage and N-stage in LUAD patients, as shown in [122]Figures 7A–F. Still, there was no significant difference in KIF20A expression based on age (above or below 65), gender (female and male), and distant metastasis, as shown in [123]Supplementary Figure S6. Additionally, KIF20A significantly predicted poor OS in LUAD patients with certain clinical characteristics; stage I and II (p = 0.0015), stage III and IV (p = 0.0417), T1 and T2 (p = 0.0008), and N0 (p < 0.0001), but not in LUAD patients with T3 and T4 (p = 0.1251) and N1 and N2 (p = 0.1191), as shown in [124]Figures 7G–L. FIGURE 7. [125]FIGURE 7 [126]Open in a new tab Correlation of KIF20A expression with clinical characteristics according to the TCGA dataset. Scatter plots show the correlation of KIF20A expression with tumor stage (A), T classification (B), and N classification (C). Boxplots show the differential analysis of KIF20A expression in different tumor stages (D), T classification (E), and N classification (F). Kaplan-Meier survival analysis shows the correlation of KIF20A expression with OS in LUAD patients according to tumor stage: stage I&II (G) and stage III and IV (H), primary tumor: T1 and 2 (I) and T3 and 4 (J), and lymph node metastasis: N0 (K) and N1&2 (L). 3.7 KIF20A was involved in the cell cycle To identify the biological functions of KIF20A in LUAD, we conducted GSVA and GSEA for the DEGs between high and low KIF20A expression subgroups according to the median expression of KIF20A. The results of GSVA found that cell cycle pathways, oocyte meiosis, and DNA replication were markedly upregulated in high KIF20A expression samples compared with low KIF20A expression samples, as shown in [127]Figure 8A. The results of GSEA found that oocyte meiosis and p53 signaling pathway were the mainly enriched cell cycle pathways in the KEGG collection and the E2F targets, G2M checkpoint, and mitotic spindle in the HALLMARK gene set collection, as shown in [128]Figure 8C, [129]Figure 8D, and [130]Supplementary Table S3. The above results suggest that KIF20A was mainly associated with cell cycle in LUAD progression. FIGURE 8. [131]FIGURE 8 [132]Open in a new tab Gene set variation analysis (GSVA) and gene set enrichment analysis (GSEA). Image (A) illustrates the identification of differentially expressed pathways among LUAD patients by GSVA. LUAD patients were divided into low and high KIF20A expression subgroups based on the median KIF20A expression level. An adjusted p-value < 0.05 and log[2]|fold change| > 0.25 were set as the cutoff criterion. Image (B) shows the clustering heatmap for the differentially expressed pathways. Image (C) shows the enriched gene sets in KEGG, while image (D) shows the HALLMARK collections in the high KIF20A expression samples. 3.8 KIF20A was associated with TME [133]Supplementary Figure S7 showed that KIF20A had a significant negative association with the immune score, stromal score, and total ESTIMATE score. CIBERSORT analysis results revealed eight TICs among 21 kinds of TICs prominently associated with KIF20A, codetermined by the differential analysis in the histogram and the correlation test scatter plot, respectively, as shown in [134]Figures 9A–C. The correlation among the eight different types of TICs, as well as the association of eight TICs with OS, shows that levels of uncommitted macrophages (M0) and activated CD4 memory T-cells had a distinctively negative association with OS in patients with LUAD, as shown in [135]Figure 9D, [136]Figure 9E and [137]Supplementary Figure S8. FIGURE 9. [138]FIGURE 9 [139]Open in a new tab Correlation of KIF20A expression with tumor-infiltrating immune cells (TICs) proportion. Differential analysis (A) for the proportions of 21 kinds of TICs in LUAD samples with low or high KIF20A expression levels relative to the median KIF20A expression level. Scatter plots (B) show the correlation between the levels of the ten kinds of TICs and the KIF20A expression (p < 0.05). Venn diagram (C) shows the eight kinds of TICs correlated with KIF20A expression overlapped by the differential analysis and correlation test. Circle plot (D) shows the correlation among the eight different types of TICs. Image (E) illustrates the Kaplan-Meier overall survival (OS) analysis for the proportions of two types of TICs (p < 0.05). 3.9 KIF20A knockdown inhibited cell proliferation and induced cell cycle arrest Compared with shNC group, the shKIF20A efficiently reduced the expression of KIF20A in A549 and PC-9 cells after transfection, as shown in [140]Figure 10A. At 48 h after the transfection, the shKIF20A group had lower cell viability than the shNC group, as shown in [141]Figure 10C. The number of colonies was distinctively reduced in the shKIF20A group compared to the shNC group, as shown in [142]Figure 10D. The FCM results demonstrated that the cell numbers of the G0/G1 phase were remarkably decreased 48 h after transfection, while the cell proportions at the G2/M phase were dominantly increased, as shown in [143]Figure 11A. Moreover, the higher percentage of apoptosis was significantly associated with shKIF20A transfection, as shown in [144]Figure 11B. FIGURE 10. [145]FIGURE 10 [146]Open in a new tab Effects of KIF20A knockdown on the proliferation of A549 and PC-9 cells. Efficacy of KIF20A deletion measured by qRT-PCR (A) and Western blot analysis (B). KIF20A deletion inhibited cell growth in the CCK-8 test (C) and colony formation assay (D). FIGURE 11. [147]FIGURE 11 [148]Open in a new tab Effects of KIF20A knockdown on cell cycle and apoptosis inA549 and PC-9 cells. KIF20A knockdown induced G2/M arrest (A) and promoted apoptosis (B) 48 h after transfection. 4 Discussion Because of cancer metastasis and recurrence, LUAD has posed a significant obstacle to clinical outcomes. Accumulating findings indicate that multiple signaling pathways are activated during LUAD progression, including the MAPK signaling pathway, PI3K/AKT signaling axis, FOXM1 signals, WNT signaling pathway, JAK/STAT pathways, and others ([149]Zhan et al., 2017; [150]Gao et al., 2018; [151]Hou et al., 2019; [152]Shan et al., 2019; [153]Sun and Zhao, 2019). These pathways have been found to contribute to tumor development by inducing cell cycle arrest at the resting-first growth (G0/G1) transition phase or the second growth mitosis transition (G2/M) phase. The FOXO3/FOXM1 axis is an essential downstream signal of the PI3K/AKT, Ras/ERK, and JNK/p38MAPK pathways, which are also involved in the tumor growth mediated by controlling the cell cycle ([154]Yao et al., 2018). From these observations, we can conclude that the cell cycle plays an essential role in controlling tumor growth, suggesting that future therapeutic strategies for LUAD could target the key molecules involved in the cell cycle. In addition, numerous studies have shown that immunotherapy can potentially suppress tumor growth and metastasis by triggering the innate and adaptive immune systems ([155]Farhood et al., 2019; [156]Wculek et al., 2020). Studies have shown that the initiation and recurrence of LUAD are not only influenced by the oncogenic properties of cancer but also by its interaction with the TME, particularly the immune microenvironment. In this study, we have identified KIF20A as the hub gene via integrated bioinformatics analysis. This gene was linked with a worse OS and could be used as a prognostic indicator in LUAD patients. Furthermore, it was also linked with alterations in the TME and TICs communities. The KIF20A knockdown was also found to inhibit LUAD cell proliferation, arrest cells at the G2/M phase, and promote apoptosis. KIF20A, also known as MKLP2 and RAB6KIFL, is a type of protein belonging to the kinesin family. Most members of the kinesin family are characterized by conserved kinetic domain and ATP activity ([157]Li et al., 2019; [158]Xiong et al., 2019). The characteristics prompt them to participate in chromosome partitioning, cellular transport, and spindle formation. KIF20A was mainly involved in organelle dynamics and cell mitosis ([159]Fontijn et al., 2001; [160]Zhang et al., 2020). Aberrant expression of KIF20A might result in abnormal cellular behaviors in cancer cells. Xiong et al. distinctively demonstrated that KIF20A was highly expressed in colorectal cancer and was involved in pathological functions, including promoting cellular growth and increased resistance to chemotherapy, mediated by triggering the JAK/STAT3 signaling pathway. Li et al. indicated that KIF20A silencing significantly impaired the proliferative and invasive capabilities of ovarian cancer cells, and FOXO1 could target KIF20A to manipulate cellular behaviors ([161]Li et al., 2020). Xie et al. suggested that non-small cell lung cancer (NSCLC) had higher KIF20A expression than adjacent tissues. The deletion of the KIF20A remarkably alters cellular phenotypes, inhibits cell migration and invasion by reducing MMP-7 expression, as well as preventing cell proliferation by inactivating JNK signals, implying that KIF20A might function as a potential oncogene and a promising treatment target for NSCLC ([162]Xie et al., 2020). In addition to pathological functions and therapeutic value, KIF20A has been recognized as an independent prognostic factor for various malignant cancers, such as bladder cancer, colorectal cancer, breast cancer, gastric cancer, epithelial ovarian cancer, and NSCLC ([163]Ni et al., 2018; [164]Sheng et al., 2018; [165]Shen et al., 2019; [166]Xiong et al., 2019; [167]Nakamura et al., 2020). To our knowledge, no studies reported on the role of KIF20A as a prognostic indicator for LUAD patients. Furthermore, a growing body of studies demonstrated that cancer cells, to some extent, shaped their surrounding microenvironment by secreting various cytokines and chemokines ([168]Wu and Dai, 2017; [169]Hinshaw and Shevde, 2019). The functional microenvironment would reprogram the surrounding cells, especially immune cells, and prompt them to take part in the tumor initiation and progression ([170]Frankel et al., 2017). Additionally, the innate and adaptive immune cells in TME elicited crucial roles in tumor survival and maintenance ([171]Gajewski et al., 2013; [172]Cassim and Pouyssegur, 2019). In this study, we found that M0 macrophages and activated memory CD4^+ T cells were not only significantly associated with KIF20 expression but also negatively correlated with the survival of LUAD patients, indicating their potential actions in LUAD pathogenesis. Macrophages (Mφs) could polarize into inflammatory M1 Mφs or immune-suppressive M2 Mφs, of which TME preferred M2 Mφs compared with M1 Mφs ([173]Genin et al., 2015; [174]Nielsen and Schmid, 2017). Besides, Mφs were reported to foster the migration and invasion of cancer cells mediated by secreting epidermal growth factors ([175]Elbaz et al., 2015; [176]Zeng et al., 2019). High M0 Mφs infiltration was found in many solid tumors and commonly indicated unfavorable survival, such as colorectal cancer, digestive system cancer, and breast cancer ([177]Bense et al., 2016; [178]Ge et al., 2019; [179]Yang et al., 2019). CD4^+ central memory T cells mainly act on immune memory and immunoprotection during tumor metastasis ([180]Bhattacharyya et al., 2010). In general, high factions of activated memory CD4^+ T cells were correlated with good survival in esophageal cancer ([181]Wang et al., 2020), which as opposite to our findings. However, the pathogenesis of TME involved in cancer progression is still not fully understood. Further research on pathogenesis is recommended in order to develop suitable therapeutic interventions. The work contributed to the understanding of LUAD pathogenesis and the possible development of novel therapeutic strategies. However, the study has some limitations that need to be acknowledged. First of all, it was difficult to identify the specific molecular mechanisms of KIF20A in the development and progression of LUAD, highlighting the need for further studies to understand the underlying mechanism. Moreover, the inhibitory effects of KIF20A against cancer cell proliferation were not validated in vivo. Finally, the association of KIF20A with the TME, especially TICs communities, was not verified experimentally. In conclusion, KIF20A could be used as a reliable prognostic and therapeutic biomarker for LUAD. High KIF20A expression indicated worse OS, PFS, and DFS in LUAD patients. Knockdown of KIF20A efficiently suppressed cell proliferation, induced G2/M phase arrest, and promoted cellular apoptosis. However, more research is required to further validate the findings of this study. Acknowledgments