Abstract Background Mitophagy is the selective degradation of mitochondria by autophagy. It becomes increasingly clear that mitophagy pathways are important for cancer cells to adapt to their high-energy needs. However, which genes associated with mitophagy could be used to prognosis cancer is unknown. Methods We created a clinical prognostic model using mitophagy-related genes (MRGs) in lung adenocarcinoma (LUAD) patients for the first time, and we employed bioinformatics methods to search for biomarkers that affect the progression and prognosis of LUAD. Transcriptome data for LUAD were obtained from The Cancer Genome Atlas (TCGA) database, and additional expression data from LUAD patients were sourced from the Gene Expression Omnibus (GEO) database. Furthermore, 25 complete MRGs were identified based on annotations from the MSigDB database. Results A comparison of the mitophagy scores between the groups with high and low scores was done using receiver operating characteristic (ROC) curves, which also revealed the differential gene expression patterns between the two groups. Using Kaplan-Meier analysis, two prognostic MRGs from the groups with high and low mitophagy scores were identified: TOMM40 and VDAC1. Using univariate and multivariate Cox regression, the relationship between the expression levels of these two genes and prognostic clinical features of LUAD was examined further.The prognosis of LUAD patients was shown to be significantly correlated (P < 0.05) with the expression levels of these two genes. Conclusions Our prognostic model would improve the prognosis of LUAD and guide clinical treatments. Keywords: Mitophagy, Lung adenocarcinoma, Gene, Bioinformatics, Prognosis 1. Introduction Lung cancer has been identified as the most common cause mortality from cancer globally. Although its incidence rate has declined overall in recent years, its incidence rate among young people is on the rise [[29]1]. Non-small cell lung cancers (NSCLC) comprise approximately 85 % of lung cancers histologically. Lung adenocarcinomas (LUAD) are the most prevalent type of NSCLC, accounting for over 50 % of all lung cancer cases [[30]2]. Although immunotherapy and targeted therapies have made significant progress in the treatment of LUAD, the disease's insidiosity and lack of specificity result in the majority of patients being diagnosed in an advanced stage, and the overall 5-year survival rate remains below 20 % [[31]3,[32]4]. To enhance LUAD patient survival outcomes, it is imperative to investigate novel biomarkers and efficient prognostic prediction models. Mitophagy (MP) is the selective autophagy of mitochondria, a process that selectively targets damaged mitochondria for degradation [[33][5], [34][6], [35][7]]. Autophagy is a lysosomal-dependent protein degradation system that maintains the balance and equilibrium of the internal environment by degrading and recycling proteins and damaged organelles. A number of disorders, such as cardiovascular diseases, neurological diseases, and cancers, have been linked to aberrant autophagy, in accordance to evidence [[36]8,[37]9]. Mitophagy is closely related to LUAD. Impaired mitophagy can result from the cytoplasm's tumorigenic processes, such as PI3K/Akt, Akt/mTOR, or p53 protein aggregation. These activities can cause genomic instability, interfere with cell differentiation, influence the activation of cell aging programs, and interfere with cell metabolism [[38]10,[39]11]. Consequently, studies on the regulatory genes linked to MP can be used to determine the prognosis of lung cancer and comprehend the role of MP in both the onset and development of LUAD. By using multiple databases, this study explored the differentially expressed MRGs (MP-related genes) in LUAD, in combination with clinicopathological features and immunohistochemical protein levels, to construct a model for LUAD prognostic evaluation that will be valuable in guiding the evaluation and treatment of cancer. 2. Materials and methods 2.1. Data availability The “TCGAbioLinks” package of R was used to retrieve the LUAD dataset (TCGA-LUAD) from the TCGA database ([40]https://portal.gdc.cancer.gov/) [[41]12], and the sequencing data of 594 clinical samples were obtained after expelling those without clinical data. From the UCSC Xena database ([42]http://genome.ucsc.edu), clinical data pertaining to the samples was retrieved [[43]13]. Data from the cBioPortal database ([44]https://www.cbioportal.org/) were retrieved for tumor mutation burden (TMB) and microsatellite instability (MSI) [[45]14]. In addition, we retrieved the expression datasets [46]GSE40791 and [47]GSE27262 [[48]15,[49]16] of LUAD patients from the GEO database through the “GEOquery” package [[50]17], which included 50 and 194 samples, respectively. Details of the specific datasets are provided in [51]Table 1. In addition, we searched and collected 29 MRGs as annotated by the MSigDB database [[52]18] by using “mitophagy” as the key word. We found that 4 of the 29 MRGs did not have available transcriptomic data in the TCGA dataset and only retained the remaining 25 MRGs (TOMM22, ATG12, UBB, MFN2, RPS27A, MAP1LC3A, VDAC1, PINK1, TOMM20, CSNK2A2, UBA52, FUNDC1, TOMM40, SQSTM1, MTERF3, ATG5, ULK1, CSNK2B, MAP1LC3B, SRC, UBC, PGAM5, TOMM5, CSNK2A1, and MFN1), for further research. Table 1. Lung Adenocarcinoma Dataset Information list. TCGA-LUAD [53]GSE27262 [54]GSE40791 Platform / [55]GPL570 [56]GPL570 Species Homo sapiens Homo sapiens Homo sapiens Tissue Lungs Lungs Lungs Samples in the Normal group 59 25 100 Samples in LUAD group 535 25 94 Reference / PMID: [57]22726390 PMID: [58]23187126 [59]Open in a new tab TCGA: The cancer genome atlas; LUAD: Lung adenocarcinoma. 2.2. Gene mutations and the calculation of mitophagy score Using the LUAD dataset, we first examined the variations in the expression of the 25 MRGs and compared them across different groups. Subsequently, we preprocessed the somatic mutation data of LUAD patients in the TCGA-LUAD dataset using VarScan software, and visualized the somatic mutation status of 25 MRGs using the “maftools” package [[60]19]. We employed the R package “GSVA” [[61]20] to compute a gene-set enrichment mitophagy score for each LUAD/[62]GSE27262/[63]GSE40791 dataset sample. Using the expression profiles of 25 MRGs as a basis, the single-sample gene-set enrichment analysis (ssGSEA) algorithm was used to quantify the relative abundance of each gene in the dataset samples [[64]21]. 2.3. Differential analysis, function, and pathway enrichment analysis Using the TCGA-LUAD dataset and the “limma” package [[65]22], we conducted differential analysis to find the DEGs related with high and low MPs score grouping by obtaining DEGs between high and low MPs score groups. DEGs were defined as genes having an adjusted p-value (P.adj) < 0.05 and |logFC| > 1.0. We conducted GO and KEGG analyses [[66]23] of DEGs in the high and low score groups using the R “clusterProfiler” package to better understand their biological activities and roles in molecular pathways. 2.4. Differential analysis and correlation analysis of immune infiltration between patients in high and low-risk groups The ssGSEA method was employed to distinguish between multiple human immune cell phenotypes within the tumor microenvironment (TME) in order to approximate the immune cell count in LUAD samples [[67]24]. Using the ssGSEA analysis enrichment score in R's GSVA package, the infiltration levels of 28 immune cell types were represented for each sample. Next, a boxplot plot was employed to illustrate the differences in immune infiltration among the different groups. The TCGA-LUAD dataset's gene expression matrix was then combined to determine the association between immune cells and MRGs in various risk groups. The R package “pheatmap” was then used to plot the association between them, creating a correlation heatmap. We also quantitatively analyzed the level of immune infiltration between the groups and used the R package “ESTIMATE” [[68]25] to assess the immune activity of tumors based on expression profile data. Next, we examined the variations in immune infiltration characteristics between MPs in high- and low-scoring groups of LUAD patients. Additionally, we analyzed the differences in tumor mutation burden (TMB) and microsatellite instability (MSI) between high- and low-scoring groups of MPs in LUAD patients from the TCGA-LUAD dataset. 2.5. A prognostic risk model was constructed based on MRGs For the MRGs in the TCGA-LUAD dataset, we used the least absolute shrinkage and selection operator (LASSO) regression using the ‘glmnet’ R package [[69]26]. We created a predictive model based on these MRGs using ten-fold cross-validation and a random seed of “2021".This approach enabled us to create a model capable of predicting study outcomes. Regularization and dimensionality reduction analyses were also performed to further screen for prognostic MRGs. [MATH: riskSco re=i< mrow>Coeffic< /mi>ient(hubgene i)*mRNAExpression(hubgene i) :MATH] 2.6. Clinical assessment of prognostic MRGs The R package “rms” was utilized to generate a nomogram and conduct univariate and multivariate Cox regression analyses on the TCGA-LUAD dataset's predicted MRG expressions against clinical parameters. This allowed us to investigate the clinical prognostic value of MRGs on LUAD [[70]27]. The calibration curve was used to evaluate the nomogram's accuracy and resolution, and the R package “ggDCA” was utilized to create a decision curve analysis (DCA) diagram in order to evaluate the prognostic value of the Cox regression model [[71]28]. 2.7. Construction of mRNA-mRNA, mRNA-miRNA, mRNA-RBP, mRNA-TF interaction network The STRING database was used in this study to create a protein-protein interaction network (PPI network, minimum necessary interaction score: highest confidence (0.900)) from 25 MRGs. Cytoscape was then used to display the PPI network model [[72]29]. The PPI network's strongly connected local regions may serve as molecular complexes with specific biological functions. The maximum neighbor coefficient (MNC), degree, edge preserved component (EPC), differential metabolic network construction (DMNC), and Matthews Correlation Coefficient (MCC) are the five algorithms that make up the PPI network and are used to determine the connection scores of MRGs associated with other PPI network nodes. The top 15 MRGs with the highest score of the five algorithms were intersected to obtain the hub genes in LUAD. We used the ENCORI and miRDB datasets to predict miRNAs that interact with hub genes.We next intersected the Target Score>80 data segment of the miRDB database with the mRNA miRNA data in the ENCORI database to generate an mRNA-miRNA interaction network. Predicting RNA binding proteins (RBPs) that interact with key hub genes was another application of the ENCORI database. mRNA-RBP interaction pairings were screened using clipExpNum ≥ 5 and clipIDnum>10 as screening criteria, and the mRNA-RBP interaction network was shown. Finding transcription factors (TF) that bind to key genes (mRNA) was done using the hTFtarget and CHIPBase databases (version 3.0). 2.8. QPCR validation of the expression of hub genes cDNA was used as a template for PCR amplification, and the generation process of PCR products was monitored in real time using fluorescently labeled probes or fluorescent dyes such as SYBR Green. This makes it possible to measure the target genes' RNA expression levels in various cell types and to calculate and compare the target genes' relative expression levels in each cell type according to the qPCR assay's fluorescence signal strength. Human non-small cell lung cancer cells (A549), human non-small cell lung cancer cells (H1299), and normal human bronchial epithelial cells (BEAS-2B) were prepared, grown, and then exposed to fluorescence quantitative PCR detection to ascertain the relative expression of RNA using the following primers: MFN1 - F: TGGCTAAGAAGGCGATTACTGC. MFN1 - R: TCTCCGAGATAGCACCTCACC. MFN2-F: TCCTGTACGTGTCTTCAAGGAA. MFN2-R: CTGGAAGATGGACGTACTTTGTC. RPS27A –F: CGACGAAGGCGACTAATTTTGC. RPS27A –R: GCACCCCAATGTGATCTGC. SQSTM1-F: GCACCCCAATGTGATCTGC. SQSTM1-R: CGCTACACAAGTCGTAGTCTGG. UBA52-F: AAGACAAGGAGGGTATCCCAC. UBA52-R: TGTTGTAGTCTGAGAGAGTGCG. UBB-F: GGTCCTGCGTCTGAGAGGT. UBB-R: GGCCTTCACATTTTCGATGGT. UBC-F: CTGGAAGATGGTCGTACCCTG. UBC-R: GGTCTTGCCAGTGAGTGTCT. ULK1-F: GGCAAGTTCGAGTTCTCCCG. ULK1-R: CGACCTCCAAATCGTGCTTCT. VDAC1-F: ACGTATGCCGATCTTGGCAAA. VDAC1-R: TCAGGCCGTACTCAGTCCATC. 2.9. Statistical analysis This study used version 4.1.2 of the R software for all data processing and analysis. Two sets of continuous data were compared, and independent student t-tests were used to assess the statistical significance of regularly distributed variables. In contrast, the Mann-Whitney U test was employed to analyze the differences among non-normally distributed variables. When comparing three or more groups, we utilized the Kruskal-Wallis test. We compared two sets of category variables and used Fisher's exact test or the chi-square test to determine their statistical significance. Using Spearman correlation analysis, correlation coefficients were all calculated. For survival analysis, the R package “survival” was utilized. The log-rank test was used to compare the survival rates of the two patient groups, and Kaplan-Meier survival curves were employed to illustrate the differences in survival. Statistics always use two-sided P values; a value of P < 0.05 denotes statistical significance. 3. Results 3.1. Significant results of mitochondrial autophagy gene (MRGs) expression in the LUAD dataset Three datasets were used, and principal component analysis (PCA) was used to evaluate the variation between LUAD samples: TCGA-LUAD, [73]GSE27262, and [74]GSE40791 ([75]Fig. 1A–C). The PCA plots demonstrated marked differences between LUAD and normal samples in all three datasets. Fig. 1. [76]Fig. 1 [77]Open in a new tab Differences in the expression of mitophagy genes (MRGs) in lung adenocarcinoma (LUAD) data set. A-C. In the TCGA-LUAD dataset, PCA plots of mitophagy genes (MRGs) (A), [78]GSE27262 dataset (B), and [79]GSE40791 dataset (C) in FPKM format. D-F. Mitochondrial autophagy genes (MRGs) in the TCGA-LUAD dataset (D), [80]GSE27262 dataset (E), and [81]GSE40791 dataset (F) in FPKM format were shown in group comparison diagrams. PCA: Principal Component Analysis; LUAD: Lung adenocarcinoma; TCGA: The Cancer Genome Atlas; MRGs: Mitophagy-related genes. * denotes p < 0.05, ** denotes p < 0.01, and *** denotes p < 0.001. Next, the expression of the 25 MRGs in the TCGA-LUAD dataset was compared between tumor and normal samples. It was shown that the expression levels of 17 MRGs (CSNK2A1, FUNDC1, MFN1, MTERF3, RPS27A, SQSTM1, SRC, TOMM22, TOMM40, TOMM5, UBA52, VDAC1, ATG12, CSNK2B, PGAM5, TOMM20, ULK1) were significantly higher in the LUAD tumor tissues than in the normal tissues (p < 0.05). However, there was a significant difference (p < 0.05) in the expression levels of 5 MRGs (MAP1LC3A, MAP1LC3B, MFN2, PINK1, and UBB) between the tumor and normal tissues. ATG5, CSNK2A2, and UBC expression showed no difference between the normal tissues and the LUAD tumor tissues ([82]Fig. 1D). In the [83]GSE27262 dataset, 10 MRGs (CSNK2A1, MFN1, MTERF3, TOMM40, FUNDC1, SQSTM1, SRC, VDAC1, TOMM22, TOMM5) were significantly up-regulated (p < 0.05), while 5 genes (CSNK2A2, MAP1LC3A, MAP1LC3B, UBB, UBC) were significantly down-regulated (p < 0.05) ([84]Fig. 1E). The differential analysis of the [85]GSE40791 dataset revealed that out of the 25 MRGs, 15 genes (CSNK2A1, FUNDC1, MFN1, MTERF3, TOMM20, TOMM22, VDAC1, PGAM5, ULK1, RPS27A, TOMM40, CSNK2B, SQSTM1, SRC, TOMM5) were significantly up-regulated (p < 0.05). In contrast, 7 genes (CSNK2A2, MAP1LC3A, MAP1LC3B, MFN2, PINK1, UBB, UBC) were significantly down-regulated (p < 0.05) ([86]Fig. 1F). These 25 MRGs may be crucial in LUAD, according to the differential expression analysis of these 25 MRGs between the LUAD group and the Normal group in three LUAD datasets. 3.2. Multiple types of MRGs gene mutations in LUAD patients We analyzed the types of mutations in the 25 MRGs in LUAD patient samples. We found they mainly included missense (64 %), nonsense (11 %), and splicing mutations (3 %), in addition to a small number of frameshift insertions (1 %) and inframe insertions (1 %). At the nucleotide level, most of these mutations result from single nucleotide polymorphisms (SNPs), and a small percentage are insertion-deletions (indels). The C > A mutation (35 %) was the most common, followed by C > T (24 %) and C > G (14 %) ([87]Fig. 2A). Fig. 2. [88]Fig. 2 [89]Open in a new tab Mutation analysis of MRGs in lung adenocarcinoma (LUAD). A. The mutational status of MRGs in lung adenocarcinoma (LUAD). B. The mutational details of MRGs in lung adenocarcinoma (LUAD) were shown. C. Chromosomal map of MRGs. MRGs: mitophagy-related genes; LUAD: Lung adenocarcinoma. As shown in [90]Fig. 2B, the TCGA-LUAD dataset contains 18 somatic mutations in 25 MRGs. 84 LUAD patient tumor samples show somatic mutations, and 18 MRG mutations were present in 67 samples, accounting for 79.76 % of the total number of tumor samples with mutations. Among the 18 MRGs with mutations, most contained missense mutations, with the UBC gene having the highest mutation burden, accounting for 15 % of the total number of mutations in LUAD samples. [91]Fig. 2C shows that these 25 MRGs were mainly distributed on chromosomes 1, 5, 6, 19, 20, and X, among which chromosomes 1, 5, and 20 had the highest number of mutated MRGs with 3 MRGs on each chromosome. The many MRG mutation types imply that MRGs could be a major factor in developing LUAD disease. 3.3. The TCGA-LUAD samples were grouped according to mitochondrial autophagy score We used the ssGSEA method in the GSVA package to compute the MP score for each sample in the TCGA-LUAD dataset based on the expression levels of the 25 MRGs. LUAD tumor tissue scores were compared to normal tissue scores in the TCGA-LUAD dataset. There was a statistically very significant difference (p0.001)([92]Fig. 3A). The expression levels of these MRGs demonstrated a high degree of accuracy in predicting the diagnosis, as shown by the plotting of the ROC curves of 25 MRGs from the TCGA-LUAD dataset (AUC = 0.852, [93]Fig. 3B). We also performed the same analyses for the [94]GSE27262 and [95]GSE40791 datasets ([96]Fig. 3C–F). Fig. 3. [97]Fig. 3 [98]Open in a new tab Construction of mitophagy score. A-B. Mitophagy score group comparison (A) and ROC curve display (B) in the TCGA-LUAD dataset. C-D. Mitophagy scores in [99]GSE27262 data set group comparison (C) and ROC curve display (D). E-F. ROC curve display and group comparison(E) of mitophagy scores in the [100]GSE40791 dataset (F). G-H. The prognostic KM curve (G) of the mitophagy score between high and low mitophagy score in the TCGA-LUAD data set (Time = Day) and the group comparison (H) are shown. TCGA: The cancer genome atlas; LUAD: Lung adenocarcinoma; ROC: receiver operating characteristic curve; MRGs: mitophagy-related genes; KM: Kaplan–Meier. Next, using clinical data from dataset samples and a prognostic analysis of the MP score, we divided TCGA-LUAD patients into two groups: high MPs and low MPs. Results of the survival study indicated that patients with low MPs had a comparatively better prognosis than those with high MPs (P < 0.0001; [101]Fig. 3G), and there was a statistically significant difference in MP scores between the two groups (P < 0.001; [102]Fig. 3H). 3.4. DEGs were enriched in various biological functions and pathways In the TCGA-LUAD dataset, we conducted GO analysis ([103]Table 2, [104]Fig. 4A–D) and KEGG analysis ([105]Table 3, [106]Fig. 4A–E) on the 100 differentially expressed genes ([107]Table S1) that divide the high MP score groups from the low MP score groups. The result showed the DEGs were involved in biological processes including phagocytosis, endocytosis, and T-cell mediated cytotoxicity, maintaining cellular components including the external plasma membrane, endocytic vesicles, multivesicular bodies, and coated vesicles, and molecular functions including carbohydrate binding, cargo receptor activity, scavenger receptor activity, and monosaccharide binding. According to KEGG analysis, The DEGs were involved in five pathways, including complement and coagulation cascades, amoebiasis, pertussis, and ECM-receptor interaction. Table 2. GO enrichment analysis results of WGCNA module Differentially expressed genes. Ontology ID Description GeneRatio BgRatio pvalue p.adjust qvalue BP GO:0050764 regulation of phagocytosis 5/91 98/18670 1.17e-04 0.026 0.024 BP GO:0050766 positive regulation of phagocytosis 4/91 70/18670 3.79e-04 0.048 0.045 BP GO:0001914 regulation of T cell mediated cytotoxicity 3/91 33/18670 5.50e-04 0.060 0.056 BP GO:0045807 positive regulation of endocytosis 5/91 153/18670 9.14e-04 0.093 0.087 CC GO:0005771 multivesicular body 6/99 51/19717 2.06e-07 1.05e-05 9.63e-06 CC GO:0009897 external side of the plasma membrane 11/99 393/19717 4.50e-06 1.72e-04 1.57e-04 CC GO:0030139 endocytic vesicle 7/99 303/19717 8.42e-04 0.016 0.015 CC GO:0030135 coated vesicle 6/99 289/19717 0.003 0.057 0.053 MF GO:0038024 cargo receptor activity 5/87 85/17697 6.16e-05 0.005 0.005 MF GO:0005044 scavenger receptor activity 4/87 51/17697 1.14e-04 0.007 0.006 MF GO:0048029 monosaccharide binding 4/87 75/17697 5.08e-04 0.025 0.023 MF GO:0030246 carbohydrate binding 6/87 271/17697 0.002 0.061 0.055 [108]Open in a new tab WGCNA: Weighted gene co-expression network analysis; GO: Gene Ontology; BP: biological process; CC: cellular component; MF: molecular function. Fig. 4. [109]Fig. 4 [110]Open in a new tab Functional enrichment analysis (GO) and pathway enrichment (KEGG) analysis of DEGs. A. The histogram of DEGs shows the results of functional enrichment analysis (GO) and pathway enrichment analysis (KEGG) (A). B-D. Network diagrams illustrate the biological processes (BP) (B), molecular functions (MF) (C), and cellular components (CC) (D) from the GO functional enrichment analysis of DEGs. E. The KEGG pathway enrichment analysis results for DEGs are presented in a ring network diagram. F. Results of the GO/KEGG enrichment analysis of DEGs combined with logFC are displayed in a bubble chart. GO terms were shown in the abscissa in the histogram (A), and the height of the bar indicates the P value of GO terms. Specific genes are represented by the brown dots in the network diagram, and the dark blue circles represent pathways. The bubble diagram (F) shows the BP route as brown dots, the CC pathway as red circles, the MF pathway as dark cyan circles, and the KEGG pathway as light green circles. DEGs: Differentially expressed genes; GO: Gene Ontology; BP: Biological Process; CC: Cellular Component; MF: Molecular Function; KEGG: Kyoto Encyclopedia of Genes and Genomes. The criteria for screening GO and KEGG enrichment items were a P.value of less than 0.05 and an FDR value (q.value) of less than 0.10. Table 3. KEGG enrichment analysis results of WGCNA module Differentially expressed genes. Ontology ID Description GeneRatio BgRatio pvalue p.adjust qvalue KEGG hsa05146 Amoebiasis 6/56 102/8076 6.82e-05 0.007 0.006 KEGG hsa04974 Protein digestion and absorption 5/56 103/8076 6.98e-04 0.034 0.031 KEGG hsa05133 Pertussis 4/56 76/8076 0.002 0.059 0.054 KEGG hsa04610 Complement and coagulation cascades 4/56 85/8076 0.003 0.061 0.056 KEGG hsa04512 ECM-receptor interaction 4/56 88/8076 0.003 0.061 0.056 [111]Open in a new tab WGCNA: Weighted gene co-expression network analysis; KEGG: Kyoto Encyclopedia of Genes and Genomes. We also performed joint KEGG and GO analysis to determine each DEG's z-score. The bubble diagram ([112]Fig. 4F) demonstrated that most DEGs are implicated in biological processes and the maintenance of cellular components. 3.5. Significant immune characteristics differences were found between high and low mitophagy score groups The single-sample GSEA (ssGSEA) method was utilized in the TCGA-LUAD dataset in order to evaluate the differences in the infiltration of 28 distinct types of immune cells between the groups with high and low MP scores within the dataset ([113]Fig. 5A). The findings showed that there was a major difference in the infiltration of 20 different types of immune cells, including activated B cells, activated CD4 T cells, and activated dendritic cells, between the groups with high mitophagy scores and those with low score of mitophagy. Fig. 5. [114]Fig. 5 [115]Open in a new tab ssGSEA immune infiltration analysis between high and low mitophagy score groups in TCGA-LUAD dataset. A. The ssGSEA immune infiltration analysis results are shown in a group comparison chart comparing the TCGA-LUAD dataset's high and low mitophagy score groups. B–C. The correlation analysis results of immune cell infiltration abundance were compared between the low mitophagy score group (B) and the high mitophagy score group (C) in the TCGA-LUAD dataset. TCGA: The cancer genome atlas; LUAD: Lung adenocarcinoma; ssGSEA: single-sample gene-set enrichment analysis. Furthermore, we evaluated the relationship between 19 immune cell type infiltration and the 25 MRGs in the high mitophagy group and 20 immune cell types in the low mitophagy group. In the group with a low mitophagy score, a positive correlation was found between MAP1LC3B and the infiltration of 18 distinct types of immune cells, whereas a negative correlation was found between TOMM20, ULK1, and CSNK2A1 and the infiltration of 16, 17, and 18 different types of immune cells, respectively ([116]Fig. 5B). [117]Fig. 5C shows that in the group with a high mitophagy score, there was a positive correlation between TOMM5 and a negative correlation between ULK1 and 5 distinct types of immune cells, and 5 distinct types of immune cells with a negative correlation, and a positive correlation between MAP1LC3B and 10 distinct kinds of immune cells and ATG12. We obtained the StromalScore, ImmuneScore, ESTIMATEScore, and Tumor Purity scores and analyzed their correlations with our mitophagy scores in the TCGA-LUAD dataset. Results indicated that our mitophagy score had significant correlations (p < 0.001) with StromalScore, ImmuneScore, ESTIMATEScore, and Tumor Purity score ([118]Fig. 6A–H). Tumor Purity has a positive correlation, and the other three have a negative correlation. Furthermore, in terms of microsatellite instability (MSI), there was not a statistically significant difference detected between the groups that had high mitophagy scores and those that had low mitophagy scores(P > 0.05, [119]Fig. 6I). However, it is worth noting that there was a strong and statistically significant difference in the tumor mutation burden (TMB) between the groups with high mitophagy scores and those with low mitophagy scores (P < 0.001, [120]Fig. 6J). Fig. 6. [121]Fig. 6 [122]Open in a new tab ESTIMATE immune evaluation analysis between high and low mitophagy score groups in the TCGA-LUAD dataset. A. The StromalScore score results group comparison chart of the TCGA-LUAD dataset's high and low mitophagy score groupings. B. Scatter plot shows the correlation between the StromalScore scoring results and the mitophagy score in the TCGA-LUAD dataset. C. The ImmuneScore score group comparison chart of the TCGA-LUAD dataset shows the differences in mitophagy scores between the high and low groups. D. Scatter plot showing the correlation between the ImmuneScore scoring results and the mitophagy score in the TCGA-LUAD dataset. E. The TCGA-LUAD dataset's comparison chart of ESTIMATEScore score results between groups with high and low mitophagy scores. F. Scatter plot showing the relationship between the TCGA-LUAD dataset's mitophagy score and the ESTIMATEScore score results. G. The TCGA-LUAD dataset's group comparison chart for Tumor Purity score results shows which groups had higher and lower mitophagy scores. H. A scatter plot showing the relationship between the Tumor Purity scoring results and the mitophagy score in the TCGA-LUAD dataset. I–K. Microsatellite instability (MSI) (I), tumor mutation burden (TMB) (J), and TIDE immunotherapy score (K) between high and low mitophagy score groups in the TCGA-LUAD dataset are shown in the group comparison chart. When the absolute value is over 0.8, the correlation coefficient (r) in the correlation scatter plot is high; when the absolute value is between 0.5 and 0.8, it is moderately correlated; when the absolute value is between 0.3 and 0.5, it is weakly correlated; and when the absolute value is below 0.3, it is inconsequential. TCGA: The cancer genome atlas; LUAD: Lung adenocarcinoma; MSI: Microsatellite Instability; TMB: tumor mutation burden. In addition, given the important role of immunotherapy in treating LUAD patients, we analyzed the responses to immunotherapy between the high and low mitophagy score groups. [123]Fig. 6K demonstrates that immunotherapy response score was lower for LUAD patients in the high mitophagy score group compared to the low mitophagy score group (p < 0.001), suggesting that the immunotherapy response in the high mitophagy score group might be better than that in the low mitophagy score group. The abundance of multiple immune cell infiltrations varies significantly between samples from the groupings of high-risk and low-risk, and there is a significant correlation with 25 MRGs, indicating that these genes have great potential in tumor immune differentiation and immunotherapy. 3.6. Two characteristic genes were obtained and associated with clinical features We created a prognosis model using LASSO regression analysis and the 25 MRGs, and we also plotted a risk factor map and visualization map ([124]Fig. 7A–C). Two characteristic genes were obtained through the analysis TOMM40 and VDAC1, for which we drew prognostic survival Kaplan-Meier curves ([125]Fig. 7D and E). We then compared the differences in expression of TOMM40 and VDAC1 in the setting of different clinical features, such as OS, DSS, PFI, pathologic stage, gender, T stage, and N stage ([126]Fig. 7, [127]Fig. 8A–F). Both TOMM40 and VDAC1 expression differences were statistically significant (P < 0.05) across all settings, with the exception of the PFI group (P > 0.05) ([128]Fig. 8G–L). Fig. 7. [129]Fig. 7 [130]Open in a new tab Construction of MRGs prognostic model and clinical correlation analysis. A. Diagram of the LASSO regression prognostic model for MRGs. B–C. Variable trajectory plot (B) and risk factor plot (C) of the MRGs prognostic model. D-E. Prognostic survival KM curves of MRGs TOMM40, VDAC1. F. Correlation analysis between MRGs TOMM40 and clinical T stage. MRGs: mitophagy-related genes; LASSO: Least absolute shrinkage and selection operator. *** indicates p < 0.001. Fig. 8. [131]Fig. 8 [132]Open in a new tab Clinical correlation analysis of MRGs. Correlation analysis of MRGs TOMM40 with clinical N stage (A), Gender (B), clinical pathologic stage (C), OS (D), DSS (E), and PFI (F). G-L. Correlation analysis of MRGs VDAC1 with clinical T stage (G), clinical N stage (H), Gender (I), clinical pathologic stage (J), OS (K), and DSS (L). MRGs: mitophagy-related genes. * indicates p < 0.05, ** indicates p < 0.01, and *** indicates p < 0.001. 3.7. The prognostic model has a good performance In this study, we employed the MRGs that we generated to further validate the prognostic model by doing statistical analysis on the clinical data of LUAD patients found in the TCGA-LUAD dataset ([133]Table 4). Age, gender, pathologic stage, N stage, M stage, T stage, expression levels of the two MRGs, TOMM40 and VDAC1, and clinical factors were found to be directly linked (P < 0.05) with the prognosis of LUAD patients ([134]Table 5). A forest plot presents the results ([135]Fig. 9A). A nomogram was plotted. Nomogram analysis assessed the model's prognostic potential ([136]Fig. 9B). We plotted the calibration curves. We conducted a prognostic calibration analysis on the nomogram derived from the univariate and multivariate Cox regression models([137]Fig. 9C–E). Following that, we examined the capability of the created Cox regression prognostic model to predict the prognosis for a period of 1 year ([138]Fig. 9F), 3 years ([139]Fig. 9G), and 5years ([140]Fig. 9H) by employing the DCA technique. Predicting power was highest for 5 years, then 3 years, then 1 year. Table 4. Patient Characteristics of LUAD patients in the TCGA Datasets. Characteristic levels Overall n 535 T stage, n (%) T1 175 (32.9 %) T2 289 (54.3 %) T3 49 (9.2 %) T4 19 (3.6 %) N stage, n (%) N0 348 (67.1 %) N1 95 (18.3 %) N2 74 (14.3 %) N3 2 (0.4 %) M stage, n (%) M0 361 (93.5 %) M1 25 (6.5 %) Pathologic stage, n (%) Stage I 294 (55.8 %) Stage II 123 (23.3 %) Stage III 84 (15.9 %) Stage IV 26 (4.9 %) Gender, n (%) Female 286 (53.5 %) Male 249 (46.5 %) Age, n (%) ≤65 255 (49.4 %) >65 261 (50.6 %) OS event, n (%) Alive 343 (64.1 %) Dead 192 (35.9 %) DSS event, n (%) Alive 379 (76 %) Dead 120 (24 %) PFI event, n (%) Alive 309 (57.8 %) Dead 226 (42.2 %) [141]Open in a new tab TCGA: The cancer genome atlas; LUAD: Lung adenocarcinoma. Table 5. COX regression to identify clinical features of dataset TCGA-LUAD. Characteristics Total(N) Univariate analysis __________________________________________________________________ Multivariate analysis __________________________________________________________________ Hazard ratio (95 % CI) P value Hazard ratio (95 % CI) P value T stage 523 T1 175 Reference T2 282 1.521 (1.068–2.166) 0.020 1.340 (0.841–2.136) 0.218 T3 47 2.937 (1.746–4.941) <0.001 3.186 (1.554–6.529) 0.002 T4 19 3.326 (1.751–6.316) <0.001 1.820 (0.822–4.031) 0.140 N stage 510 N0 343 Reference N1 94 2.381 (1.695–3.346) <0.001 2.006 (1.041–3.865) 0.038 N2 71 3.108 (2.136–4.521) <0.001 2.266 (0.924–5.561) 0.074 N3 2 0.000 (0.000-Inf) 0.994 0.000 (0.000-Inf) 0.993 M stage 377 M0 352 Reference M1 25 2.136 (1.248–3.653) 0.006 1.994 (0.904–4.397) 0.087 Age 516 ≤65 255 Reference >65 261 1.223 (0.916–1.635) 0.172 Gender 526 Female 280 Reference Male 246 1.070 (0.803–1.426) 0.642 Pathologic stage 518 Stage I 290 Reference Stage II 121 2.418 (1.691–3.457) <0.001 0.966 (0.482–1.937) 0.922 Stage III 81 3.544 (2.437–5.154) <0.001 1.356 (0.502–3.663) 0.548 Stage IV 26 3.790 (2.193–6.548) <0.001 TOMM40 526 Low 264 Reference High 262 1.501 (1.121–2.008) 0.006 1.041 (0.710–1.527) 0.837 VDAC1 526 Low 265 Reference High 261 1.808 (1.347–2.427) <0.001 1.754 (1.189–2.587) 0.005 [142]Open in a new tab TCGA: The cancer genome atlas; LUAD: Lung adenocarcinoma. Fig. 9. [143]Fig. 9 [144]Open in a new tab Prognostic performance of MRGs prognostic model. Forest plot (A), nomogram (B) of univariate and multivariate Cox regression analysis of prognostic MRGs combined with clinical variables. C-E. The 1-year (C), 3-year (D), and 5-year (E) calibration curves of the nomogram analysis of the Cox regression prognostic model. F–H. 1-year (F), 3-year (G), and 5-year (H) DCA plots of the Cox regression prognostic model. MRGs: mitophagy-related genes; DCA: decision curve analysis. 3.8. The hub genes interact with multiple proteins, miRNAs, TFs We designed a PPI network that includes 25 MRDEGs ([145]Fig. 10A). The top 15 MRGs with the highest scores for each algorithm were chosen after we utilized the Cytoscape software's cytoHubba plugin to calculate the PPI network using the following five algorithms: MCC, DMNC, MNC, Degree, and EPC ([146]Fig. 10B–F). The elliptical blocks in the illustration gradually get redder in color as their scores increase from yellow to red. We then created a Venn plot to show the results by intersecting the top 15 MRGs identified by five different algorithms to identify the hub genes for LUAD disease ([147]Fig. 10G). A total of 9 hub genes were obtained, which were MFN1, MFN2, RPS27A, SQSTM1, UBA52, UBB, UBC, ULK1, and VDAC1. Subsequently, we conducted PPI analysis on 9 hub genes and constructed a PPI network ([148]Fig. 10H). Fig. 10. [149]Fig. 10 [150]Open in a new tab Construction of PPI network A. The PPI network of MRGs. B–F The top 15 MRGs interactive network was obtained using MCC (B), DMNC (C), MNC (D), Degree (D), and EPC (F) algorithms in the PPI network. The color of the elliptical block in the graph gradually increases from yellow to red, indicating a gradual increase in score. G. The top 15 MRGs Venn diagram results of the five algorithms in the PPI network, MCC, DMNC, MNC, Degree, and EPC, are displayed. H. The protein interaction network (PPI network) of key genes. MRGs: mitophagy-related genes; PPI network: Protein-protein interaction network; MCC: Matthews Correlation Coefficient metric; DMNC: differential metabolic network construction; MNC: the maximum neighborhood benefit; EPC: edge perched component. Using Cytoscape, we were able to visualize the network of mRNA-miRNA interactions ([151]Fig. 11A). Whereas the purple triangular dots indicate miRNAs, the sky-blue oval block represents mRNA. The network's 52 miRNA molecules and 7 hub genes (MFN1, MFN2, SQSTM1, UBA52, UBB, ULK1, VDAC1) form 72 pairs of mRNA-miRNA interactions. Fig. 11. [152]Fig. 11 [153]Open in a new tab Construction of mRNA-miRNA, mRNA-RBP, and mRNA-TF interaction network A-C. The mRNA miRNA (A), mRNA RBP (B), and mRNA TF (C) interaction network of key genes. The sky-blue oval block in the mRNA-miRNA interaction network represents mRNA; The purple triangular dots are miRNAs. The sky-blue oval block in the mRNA-RBP interaction network represents mRNA; The pink diamond-shaped dots represent RBP. The sky-blue oval block in the mRNA-TF interaction network represents mRNA; Light green circular dots represent specific TF. RBP: RNA binding protein; TF: Translation factors. We also constructed the network of interactions between RBP and mRNA ([154]Fig. 11B), where RBP is represented by the pink diamond dots and mRNA by the sky-blue oval block. We identified 23 RBPs corresponding to the 9 hub genes and found 21 types of RBPs targeted by the SQSTM1 gene in the network. The mRNA-TF interaction network, which consists of 8 hub genes (MFN1, MFN2, RPS27A, SQSTM1, UBA52, UBB, UBC, VDAC1), was eventually found by searching two databases for TFs that bind to the hub genes and 46 TFs ([155]Fig. 11C), where the light green circular dots in the network represent TF, and the sky-blue oval blocks in the network are mRNA. We found 30 pairs of UBC-TF interactions, making UBC the most frequent gene interacting with TFs. In summary, there are interactions between hub genes and various small molecules, which drive the progression of diseases. 3.9. The expression of the hub genes was validated by laboratory experiment We used the qPCR method to measure the relative expression levels of hub genes in LUAD cells and normal human bronchial epithelium cells. At least one of the two LUAD cells (A549, H1299) had significantly lower relative expression levels of MFN1, MFN2, RPS27A, SQSTM1, UBA52, UBB, UBC, and VDAC1 than the normal cells (BEAS-2B), while all of the LUAD cells had significantly higher relative expression levels of ULK1. Among the three cell types, there was no significant difference in SQSTM1 expression ([156]Fig. 12A–I). Fig. 12. [157]Fig. 12 [158]Open in a new tab Laboratory validation of hub genes in LUAD A-I. The relative expression of MFN1 (A), MFN2 (B), RPS27A (C), SQSTM1 (D), UBA52 (E), UBB (F), UBC (G), ULK1 (H), VDAC1 (I). * indicates p < 0.05, ** indicates p < 0.01, and *** indicates p < 0.001. 4. Discussion Accumulating evidence shows that mitophagy supports the metabolic plasticity of cancer cells [[159]30]. Mitophagy has long been recognized as a protective mechanism cancer cells use to resist mitochondria-induced apoptosis, the main pathway of cell death in cancer cells driven by metabolic stress [[160]31]. According to a report, the ability of lung cancer cells to undergo mitophagy significantly reduced their ability to invade and migrate. For example, mitophagy-related Drp-1 protein was reduced in advanced lung cancer, suggesting an association with cancer progression. The expression of Drp-1 has a direct effect on mitophagy and is related to the invasion, proliferation, and metastasis of LUAD [[161]32]. Therefore, exploring mitophagy-related genes and evaluating their functions and clinical significance in LUAD is necessary. Using mitophagy related genes (MRGs) in LUAD patients, we screened MRGs and constructed a clinical prognostic model that affect the progression and prognosis of LUAD using bioinformatics tools. The TCGA-LUAD dataset (594 samples), [162]GSE27262 dataset (50 samples), and [163]GSE40791 dataset (194 samples) were used for this analysis. It was discovered that there is a substantial correlation between MRG expression levels and a diagnosis of LUAD, particularly in the TCGA-LUAD and [164]GSE4079 datasets, and that the majority of the 25 MRGs had significantly different expressions in LUAD tumor tissue compared to normal tissue. Patients with lower scores for mitophagy had a better prognosis, based on Kaplan-Meier analysis (P < 0.0001). These findings indicate that MRGs are crucial for the development and prognosis of LUAD. Using MRGs, we created a LUAD clinical prognosis model using LASSO regression analysis. We found that the expression levels of two MRGs, TOMM40 and VDAC1, were significantly correlated with the prognosis of LUAD in patients. The prognosis model performed the best when predicting the prognosis in 5 years. These results highlight the significance of VDAC1 and TOMM40 in predicting patients' prognoses for LUAD. The process of importing proteins into the mitochondria depends on TOMM40, which is a channel-forming subunit of the mitochondrial outer membrane translocase [[165]33]. Reports on TOMM40's involvement in cancers, such as LUAD, are few, but it has been shown to play a significant role in Alzheimer's disease [[166][34], [167][35], [168][36]]. Shoshan-Barmatz et al. [[169]37] demonstrated that VDAC1 is an important regulator of mitochondrial function. It mediates the release of apoptotic proteins in mitochondria to regulate epigenomic elements and apoptosis. It also participates in endoplasmic reticulum (ER)-mitochondrial crosstalk, autophagy, and inflammation regulation. The biological processes and pathways influenced by MRGs were explored. GO analysis of the DEGs between the groups with high and low mitophagy scores showed that most DEGs are involved in T-cell-mediated cytotoxicity, endocytosis, and phagocytosis. KEGG analysis revealed most of the 100 DEGs are involved in the five pathways, including amoebiasis, pertussis, coagulation cascades, complement, and ECM-receptor interaction. These results suggest that the patients' immune system is affected by MRGs. Further investigation revealed a significant difference in the infiltration of 20 different types of immune cells between the high and low mitophagy score groups. Additionally, the groups with high and low mitophagy scores responded to immunotherapy in very different ways, and it's possible that the high mitophagy score group's immunotherapy response was better to that of the low mitophagy score group. The important roles of MRGs in the immune response may underly their predicting power for the prognosis of LUAD in patients. In our current study, we attempted to establish an interaction network between 25 MRGs using the STRING database. We found extensive interactions between these MRGs, reflecting the collaboration and interaction between genes to jointly complete a specific function. We attempted to use 5 algorithms to calculate the genes that interact most closely with other genes and took the intersection of 5 sets to obtain 9 hub genes, namely MFN1, MFN2, RPS27A, SQSTM1, UBA52, UBB, UBC, ULK1, VDAC1. In the comparative analysis of the three datasets, MFN1/SQSTM1/ULK1/VDAC1 was highly expressed in the LUAD samples of all three datasets. In comparison, UBB/UBC was lowly expressed in the LUAD samples of the three datasets. The expression of other genes in the three datasets is inconsistent, which may be related to the heterogeneity of the datasets. In the lab, we used qt-PCR technology to confirm the expression of these 9 genes. We found that the expression of MFN1/ULK1/VDAC1/UBB/UBC genes remained consistent with the expression in the three datasets. SQSTM1 expression did not differ between tumor and normal cells, however this could be due to sample heterogeneity. One of the mitochondrial membrane proteins encoded by MFN1/MFN2 is involved in mitochondrial fusion and plays a role in the maintenance and operation of mitochondrial networks [[170]38]. The main component of the mitochondrial outer membrane, VDAC1, encodes a voltage-dependent anion channel protein that promotes the exchange of ions and metabolites on the membrane [[171]39]. These three genes are all related to the function of mitochondria. These genes have not been found to be exclusively responsible for the onset and development of cancer. According to Wang Y et al.'s study [[172]40], MFN1 is likewise strongly expressed in LUAD tissue and is linked to a poor prognosis for patients.The study also noted that mitochondrial damage activates the apoptotic pathway and induces cell cycle arrest. In Wang G et al.'s study [[173]41], a neoptosis related (signature) was found, which includes VDAC1, an accurate gene group prognosticator for patients with LUAD. This is consistent with the role of VDAC1 in constructing predictive models for LUAD prognosis in this study. RPS27A encodes a ubiquitin-protein, a highly conserved protein [[174]42]. Li H et al.'s study [[175]43] pointed out that the low expression of RPS27A can regulate LUAD proliferation, apoptosis, and cell cycle by activating the RPL11-MDM2-p53 pathway. SQSTM1 encodes a protein that can bind to ubiquitin to activate and regulate the nuclear factors κB (NF-kB) signaling pathway, which has been discovered for the first time in the development of LUAD in this study [[176]44]. P62/SQSTM1 is also a selective autophagic receptor that initiates the core autophagy mechanism in binding to ubiquitin [[177]45]. In this study's analysis of the PPI network, SQSTM1 interacted with various RBPs, indicating that SQSTM1 plays an important role in regulating mitochondrial autophagy and is expected to become a targeted gene for mitochondrial autophagy. ULK1 is an autophagy-activated kinase. During autophagy initiation, it participates in regulating complexes and can convert various signals into the formation of autophagosomes. It is a key gene for autophagy initiation in all cancers [[178]46]. This study has several limitations. The sample sizes are small, particularly for the [179]GSE27262 and [180]GSE40791 datasets, which limited the statistical power of the analysis. The samples in each dataset are heterogeneous, and many other factors, including age, sex, and race, may also affect the statistical analysis. Moreover, the molecular mechanisms of MRGs' functions remain largely unsolved. Therefore, further exploration of MRGs in the progression and prognosis of LUAD is needed. 5. Conclusions We used bioinformatics tools to analyze the expressions of MRGs in LUAD patients and explored their roles in their prognosis. In accordance with the expression levels of 20 MRGs, the patients were divided into two distinct groups: those with high mitophagy scores and those with low mitophagy scores. We observed significant differences between these patient groups regarding survival prognosis, immune cell infiltration, and molecular interactions. Further investigation revealed an important correlation between the expression levels of two MRGs, TOMM40 and VDAC1, and patients' LUAD prognoses. Measuring the expression levels of MRGs, especially that of TOMM40 and VDAC1, could help make clinical decisions in treating LUAD. Ethics approval and consent to participate Not applicable. Data availability statement All pertinent data for the study are either uploaded as supplemental information or provided in the publication. Data are accessible in an open-access public repository upon reasonable request. The datasets used in this study are available at GE0 ([181]https://www.ncbi.nlm.nih.gov/) and TCGA ([182]https://www.cancer.gov/about.nci/organization/ccg/research/struct ural-genomics/tcga) CRediT authorship contribution statement Wu-Sheng Liu: Writing – review & editing, Writing – original draft, Validation, Funding acquisition, Conceptualization. Ru-Mei Li: Validation, Formal analysis. Yong-Hong Le: Validation, Software, Formal analysis. Zan-Lei Zhu: Validation, Conceptualization. Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Acknowledgements