Abstract

Introduction

Acute myocardial infarction (AMI) is a critical condition that can lead to ischemic cardiomyopathy (ICM), a subsequent heart failure state characterized by compromised cardiac function.

Methods

This study investigates the role of mitophagy in the transition from AMI to ICM. We analyzed AMI and ICM datasets from GEO and identified mitophagy-related differentially expressed genes (MRDEGs) using databases such as GeneCards and the Molecular Signatures Database, followed by functional enrichment and protein-protein interaction analyses. Logistic regression, Support Vector Machine, and LASSO (Least Absolute Shrinkage and Selection Operator) were employed to pinpoint key MRDEGs and develop diagnostic models, with risk stratification performed using LASSO scores. Subgroup analyses included functional enrichment and immune infiltration analysis, along with protein domain predictions and the integration of regulatory networks involving transcription factors, miRNAs, and RNA-binding proteins, leading to drug target identification.

Results

The TGFβ pathway showed significant differences between high- and low-risk groups in AMI and ICM. Notably, in the AMI low-risk group, MRDEGs correlated positively with activated CD4+ T cells and negatively with Type 17 T helper cells, while in the AMI high-risk group, RPS11 showed a positive correlation with natural killer cells. In ICM, MRPS5 demonstrated a negative correlation with activated CD4+ T cells in the low-risk group and with memory B cells, mast cells, and dendritic cells in the high-risk group. The diagnostic accuracy of RPS11 (area under the curve [AUC] of 0.794) was validated across diverse experimental approaches, including blood samples, animal models, and myocardial hypoxia/reoxygenation models.
Conclusions

This study underscores the critical role of mitophagy in the transition from AMI to ICM, highlighting RPS11 as a highly significant biomarker with promising diagnostic potential and therapeutic implications.

Keywords: mitophagy, acute myocardial infarction, ischemic cardiomyopathy, machine learning, diagnostic model

1. Introduction

Acute myocardial infarction (AMI), a severe form of coronary heart disease, results from sudden coronary artery occlusion, leading to myocardial ischemia and necrosis (1). Annually, it accounts for about 7 million new cases and roughly half of cardiovascular deaths globally. Ischemic cardiomyopathy (ICM), which often follows AMI or reflects advanced coronary disease, involves myocardial fibrosis from prolonged ischemia; it severely affects heart function and causes about 70% of heart failure cases (2). Although improved AMI management has increased survival, the prevalence of ICM is rising. In Western countries, the one-year mortality rate of ICM is around 16%, and the five-year rate is near 40%. Outcomes for patients with AMI-induced ICM are generally worse, with a higher risk of severe cardiac events, than for those with non-ischemic cardiomyopathy (3). To improve the treatment of AMI and reduce the incidence of subsequent ICM, it is crucial to explore the pathophysiological mechanisms that follow myocardial infarction, identify novel biomarkers for risk stratification, recognize high-risk patients, and discover potential therapeutic targets.

Mitochondria play a pivotal role in several cellular processes, including signal transduction, redox balance, and energy conversion. Cardiomyocytes, which are among the cells with the highest mitochondrial content, can undergo mitophagy in response to various stressors such as nutrient deficiency, hypoxia, DNA damage, inflammation, or mitochondrial membrane depolarization (4, 5). This process selectively removes damaged mitochondria to maintain cellular homeostasis (6, 7).
During ischemia-reperfusion (I/R) injury, mitophagy is beneficial because it clears defective mitochondria. Evidence indicates that mice deficient in Drp1 (dynamin-related protein 1) or Parkin show impaired mitophagy and a larger myocardial infarction area after I/R injury (8, 9). Conversely, stress-induced activation of mitophagy can lead to excessive clearance of mitochondria, resulting in inadequate ATP (adenosine triphosphate) synthesis and ultimately precipitating cardiomyocyte apoptosis. In experimental models, inhibition of mitochondrial fission and mitophagy by knocking down Drp1 or Mff (mitochondrial fission factor) has led to dilated cardiomyopathy (10, 11). These findings highlight the necessity of mitophagy for normal heart function and suggest that excessive mitochondrial division may be detrimental to cardiac health. The pathophysiological mechanisms of mitophagy in AMI and ICM are still unclear, and it remains uncertain whether the extent of mitophagy affects the prognosis of these diseases. Further investigation of its regulatory mechanisms is therefore important for their treatment.

Machine learning algorithms are increasingly employed in bioinformatics analysis because they can handle dynamic, voluminous, and complex datasets. These algorithms can detect trends and patterns that human analysis may overlook, thereby enhancing the reliability of diagnostic systems. Previous studies have applied machine learning to identify mechanisms and biomarkers for the development of ischemic heart failure following acute myocardial infarction (12). However, these studies often provide broad conclusions and do not specifically address mitophagy.
In research conducted by ZhiKai Yang and colleagues, various machine learning algorithms were used to study differences in mitophagy between patients with acute myocardial infarction and patients with stable coronary artery disease (13). While this research underscored the significant role of mitophagy in coronary artery disease, it did not address the subset of patients with the worst prognosis, namely those who progress from myocardial infarction to ischemic cardiomyopathy.

This study conceptualized AMI and ICM as stages of a single pathological process and used bioinformatics and machine learning to explore the role of mitophagy (Figure 1). We identified key mitophagy genes and signaling pathways influencing the transition from AMI to ICM, revealing potential biomarkers for diagnosis and risk stratification, and providing new insights into the treatment and prognosis of these cardiovascular conditions.

Figure 1. Flow chart for the comprehensive analysis of MRDEGs. AMI, Acute Myocardial Infarction; ICM, Ischemic Cardiomyopathy; DEGs, Differentially Expressed Genes; MRGs, Mitophagy-Related Genes; MRDEGs, Mitophagy-Related Differentially Expressed Genes; SVM, Support Vector Machines; LASSO, Least Absolute Shrinkage and Selection Operator; ROC, Receiver Operating Characteristic; GSEA, Gene Set Enrichment Analysis; GSVA, Gene Set Variation Analysis; PPI Network, Protein-Protein Interaction Network; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; TF, Transcription Factor; RBP, RNA-Binding Protein.

2. Materials and methods

2.1. Data collection and processing

Using the R package GEOquery (14), we downloaded two datasets each for AMI [GSE48060 (15) and GSE29532 (16)] and ICM [GSE116250 (17) and GSE46224 (18)] from the GEO (19) database (https://www.ncbi.nlm.nih.gov/geo/). Comprehensive details are available in Supplementary Tables S1, S2.
The R package sva (20) was used for batch correction and integration, producing the combined GEO datasets for AMI and ICM. The R package limma (21) was used for normalization and standardization, followed by principal component analysis (22). Mitophagy-related genes (MRGs) were sourced from the GeneCards database (23) (https://www.genecards.org/) and the Molecular Signatures Database (MSigDB) (24) (https://www.gsea-msigdb.org/gsea/msigdb), yielding a total of 1633 unique MRGs after merging and deduplication, as detailed in Supplementary Table S3.

2.2. Differentially expressed genes between AMI and ICM

Differential gene expression analysis was carried out for both AMI and ICM using the limma package in R. After reviewing the literature (25, 26) and testing various thresholds, we chose |logFC| > 0 and P < 0.05 to keep the results robust while retaining as many biologically relevant differentially expressed genes as possible. Genes with logFC above 0 and a p-value below 0.05 were categorized as up-regulated, whereas those with logFC below 0 and the same p-value threshold were categorized as down-regulated. Venn diagrams were used to depict the overlap between up-regulated and down-regulated genes, and further intersections with MRGs were analyzed to identify MRDEGs.

2.3. Protein-protein interaction network construction and hub gene selection

The STRING database (27) (https://string-db.org/) was used to construct a PPI network based on the MRDEGs, with a minimum interaction confidence score of 0.400 (medium confidence). Interactions with a confidence score above this threshold are considered sufficiently supported by evidence, so applying it filters out potential false-positive results.
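The confidence filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and scores are hypothetical, scores are expressed on a 0 to 1 scale, edges at exactly the cutoff are kept (treating 0.400 as a minimum), and node degree is shown simply as one basic way to rank nodes in the filtered network. It is not the STRING or Cytoscape workflow itself.

```python
from collections import Counter

MIN_CONFIDENCE = 0.400  # medium-confidence cutoff stated in the text


def filter_edges(edges, min_confidence=MIN_CONFIDENCE):
    """Keep interactions whose confidence score is at least the cutoff."""
    return [(a, b, score) for a, b, score in edges if score >= min_confidence]


def node_degrees(edges):
    """Count how many retained interactions each gene takes part in."""
    degrees = Counter()
    for a, b, _ in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


# Hypothetical interactions: (gene, gene, confidence score on a 0-1 scale).
edges = [
    ("GENE_A", "GENE_B", 0.90),
    ("GENE_A", "GENE_C", 0.45),
    ("GENE_B", "GENE_C", 0.30),  # below the cutoff, dropped
    ("GENE_C", "GENE_D", 0.40),  # exactly at the cutoff, kept
]

kept = filter_edges(edges)
degrees = node_degrees(kept)
```

Here `kept` holds the three interactions that pass the cutoff, and `degrees` counts the retained connections per gene.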
The CytoHubba (28) plugin within Cytoscape (29) software applied five algorithms (30), namely Maximum Neighborhood Component (MNC), Maximal Clique Centrality (MCC), Edge Percolated Component (EPC), Degree, and Closeness, to compute scores for the MRDEGs, and the top 20 MRDEGs were selected for each algorithm. The intersection of the results from these algorithms identified hub genes related to AMI and ICM. Screening with the STRING database and five algorithms in Cytoscape improved the reliability of the results and reduced the errors that might arise from relying on a single algorithm.

2.4. Protein domain prediction and regulatory network construction

AlphaFoldDB (31) (https://alphafold.com) was used to predict and display the protein structures of the hub genes, assessed by a predicted Local Distance Difference Test (pLDDT) score ranging from 0 to 100. The regulatory network between the mRNA of 9 hub genes and 48 transcription factors (TFs) was predicted using the ChIPBase (32) database (http://rna.sysu.edu.cn/chipbase/). Potential interactions between mRNA and miRNAs, as well as between mRNA and RNA-binding proteins (RBPs) (33), were screened using the StarBase v3.0 database (34) (https://starbase.sysu.edu.cn/), and the networks were visualized using Cytoscape software. This analysis included 4 hub genes and 27 miRNAs, as well as 10 hub genes and 43 RBPs. Furthermore, the Comparative Toxicogenomics Database (CTD) (35) (https://ctdbase.org/) was used to identify potential drugs or molecular compounds associated with the hub genes. The mRNA-drug regulatory network, comprising 8 hub genes and 15 drugs or molecular compounds, was constructed and visualized using Cytoscape software.

2.5. Hub gene expression difference and correlation analysis

Expression levels of MRDEGs in the Combined Datasets were compared using group comparison graphs.
Spearman correlation was used to analyze the correlation among hub gene expression levels, and the R packages igraph (36) and ggraph were used to draw the correlation and chord diagrams. Scatter plots drawn with the ggplot2 R package displayed the most strongly correlated hub genes.

2.6. Functional enrichment analysis of MRDEGs

Gene Ontology (GO) (37) and Kyoto Encyclopedia of Genes and Genomes (KEGG) (38) enrichment analysis of hub genes was performed using the R package clusterProfiler (39), selecting results based on an adjusted p-value < 0.05. The Pathview R package (40) was used to visualize the pathway enrichment results.

2.7. GSEA and GSVA analysis

Gene Set Enrichment Analysis (GSEA) (41) was performed on the combined datasets for AMI and ICM using the clusterProfiler package in R, with the following settings: a seed value of 2023, a gene set size range from 10 to 500, and the gene set c2.cp.all.v2022.1.Hs.symbols.gmt [All Canonical Pathways]. The threshold for significance was set at a p-value below 0.05. Additionally, Gene Set Variation Analysis (GSVA) (42) was applied to all genes within the combined datasets of AMI and ICM, using gene sets from MSigDB (24) and the same p-value criterion for selection.

2.8. Diagnostic model construction

We used multiple machine learning algorithms, including Logistic Regression, Support Vector Machine (SVM) (43), and Least Absolute Shrinkage and Selection Operator (LASSO) regression analysis, to identify key genes for constructing diagnostic models for AMI and ICM. This approach follows several bioinformatics studies (44, 45). The models were implemented using the R package glmnet, with the parameters set.seed(500) and family = "binomial" (46). The key genes selected for AMI and ICM were used to calculate the RiskScore, using the coefficients obtained from the LASSO regression analysis:
$$\text{RiskScore} = \sum_{i} \text{Coefficient}(\text{gene}_i) \times \text{mRNA Expression}(\text{gene}_i)$$

2.9. Diagnostic model validation and key gene ROC curve analysis

Receiver Operating Characteristic (ROC) curves were plotted for the diagnostic models of AMI Key Genes and ICM Key Genes using the pROC package in R. Additionally, nomograms (47) illustrating the relationships between Key Genes were generated with the rms package in R. Calibration analysis was conducted to evaluate the precision and discriminatory capacity of the diagnostic models for AMI and ICM. Decision Curve Analysis (DCA) (48) for predicting clinical outcomes using AMI Key Genes and ICM Key Genes was performed using the ggDCA package in R. Moreover, Functional Similarity (Friends) analysis was carried out with the GOSemSim R package (49).

2.10. High- and low-risk group differential expression analysis, GSEA, GSVA

To enhance the reliability of our methodology, we drew on the approach proposed by Zhang L et al. (50) and used the mitophagy-related RiskScore to divide the AMI group into subgroups for further analysis. Based on the formula outlined in section 2.8, we calculated the RiskScore for the AMI samples within the AMI Combined Datasets, using the regression coefficients derived from the LASSO model for AMI. The median RiskScore was used to divide the samples into HighRisk and LowRisk groups: samples with a risk score above the median were assigned to the HighRisk group, while those with a risk score equal to or below the median were assigned to the LowRisk group. The same method was applied to the ICM samples in the ICM Combined Datasets, using the LASSO regression coefficients for ICM, and these samples were likewise classified into HighRisk and LowRisk groups based on their median RiskScore.
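The scoring and median split described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the authors' R pipeline: the gene names, coefficients, and expression values are hypothetical, and no intercept term is included because the formula in section 2.8 does not show one.

```python
from statistics import median


def risk_score(coefficients, expression):
    """Sum of coefficient * expression over the key genes of one sample."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(coefficients, samples):
    """Score every sample, then split at the median RiskScore.

    Scores above the median are HighRisk; scores equal to or below
    the median are LowRisk, as stated in section 2.10.
    """
    scores = {sid: risk_score(coefficients, expr) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    groups = {
        sid: "HighRisk" if score > cutoff else "LowRisk"
        for sid, score in scores.items()
    }
    return scores, cutoff, groups


# Hypothetical LASSO coefficients and expression values (illustration only).
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
samples = {
    "S1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "S2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "S3": {"GENE_A": 6.0, "GENE_B": 4.0},
    "S4": {"GENE_A": 8.0, "GENE_B": 0.0},
}

scores, cutoff, groups = stratify(coefficients, samples)
```

Because samples tied with the median fall into the LowRisk group, the two groups need not be equal in size; in this toy input only `S4` ends up in HighRisk.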
These two sets of high- and low-risk classifications were used independently in the subsequent subgroup analyses. Differential analysis was carried out with the limma package in R, and the results were visualized with the ggplot2 and pheatmap packages in R. GSEA (41) was conducted on the AMI samples in the AMI Combined Datasets and the ICM samples in the ICM Combined Datasets with the clusterProfiler package in R. GSVA (42) was applied to the HighRisk and LowRisk groups of AMI and ICM samples, respectively. The same gene sets, parameters, and screening criteria were used as in the previous analyses.

2.11. Immune infiltration analysis of HighRisk and LowRisk groups

Immune cell infiltration matrices for the AMI and ICM samples were determined through single-sample gene set enrichment analysis (ssGSEA) (51). Group comparison graphs were created using ggplot2 to illustrate the differences in immune cell infiltration between the LowRisk and HighRisk groups in AMI and ICM. The subgroup classification method is described in section 2.10.

2.12. Validation of peripheral blood samples

The Ethics Committee of Shanghai East Hospital, affiliated with Tongji University, approved this study (Approval number 2024-175), which follows the Declaration of Helsinki guidelines. After written informed consent was obtained from all participants, peripheral blood samples were collected from six patients diagnosed with AMI, six patients diagnosed with ICM, and six normal subjects. Venous blood samples were collected into whole blood RNA preservation tubes (model ZXQX-10). After centrifugation and sedimentation, total RNA was extracted from peripheral blood mononuclear cells (PBMCs) using the TransZol Up Plus RNA kit (TransGen, China). The integrity and purity of the extracted RNA were evaluated using the GEN5 microplate reader (BioTek, USA). Quantitative real-time PCR (RT-qPCR) experiments were performed on the Q5 Real-Time PCR Detection System (Thermo, USA).
Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as the internal control to normalize the data. Relative expression levels of the target genes were determined using the 2^−ΔΔCt method.

2.13. Experimental validation

2.13.1. Experimental animals

For the animal experiments, we obtained eight-week-old male C57BL/6 mice from Shanghai Lingchang Biotechnology Co. (Shanghai, China). The mice were housed under controlled temperature (23°C) and humidity (65%) with a 12/12-hour light/dark cycle. All experimental procedures were conducted in strict compliance with national regulations regarding animal welfare and ethics. The study was approved by the Ethics Committee of Shanghai East Hospital, affiliated with Tongji University.

2.13.2. Establishment of myocardial infarction model

Twenty-four mice were randomly assigned to two groups, Myocardial Infarction (MI) and Sham, with 12 mice in each group. In each group, six mice were randomly selected for histological staining and immunohistochemistry, while the remaining six were used for molecular analyses. Myocardial infarction was induced in the MI group by ligating the left anterior descending (LAD) coronary artery. After ligation, the myocardium changed color from bright red to pale, accompanied by a gradual weakening of contraction. Electrocardiographic (ECG) monitoring confirmed the successful establishment of the MI model, as indicated by ST-segment elevation and the presence of a J wave following the ST-segment. The Sham group underwent the same surgical procedure without LAD ligation. Cardiac function was assessed via echocardiography on the day following surgery.

2.13.3. Hematoxylin & eosin, Masson staining, and immunohistochemistry

Hematoxylin and eosin (HE) staining and Masson staining were performed on cardiac tissue sections using the respective kits (Beyotime, C0105M). These staining procedures were used to observe and analyze the morphological characteristics of the cardiac tissues.
The heart tissue sections were deparaffinized, rehydrated, autoclaved in citrate buffer (pH 6.0) for 10 minutes for antigen retrieval, cooled to room temperature, and then treated with 3% H₂O₂ for 15 minutes to block endogenous peroxidase activity. Sections were washed with PBS (phosphate-buffered saline) and blocked with 10% goat serum for 30 minutes. They were then incubated overnight at 4°C with a primary antibody (anti-RPS11 antibody, 1:200, Proteintech, 17041-1AP). The following day, after washing with PBS, the sections were incubated sequentially with biotinylated secondary antibody (1:500) and streptavidin-HRP (horseradish peroxidase) at room temperature for 30 minutes each. The sections were then washed with PBS and stained with DAB (3,3'-diaminobenzidine tetrahydrochloride), and the nuclei were lightly counterstained with hematoxylin. The expression of RPS11 protein was indicated by a brownish-yellow signal, and the area and intensity of the positive signal were analyzed using Image-Pro Plus software.

2.13.4. RT-PCR

Total RNA was extracted from mouse heart tissues using Trizol reagent (Beyotime, R0016). The detailed methods, steps, and reagents follow the RT-PCR operations described in section 2.12.

2.13.5. Establishment of the myocardial H/R model and detection of apoptosis rate by flow cytometry

H9c2 cardiomyocytes were cultured at 37°C and 5% CO₂ in high-glucose DMEM (Dulbecco's Modified Eagle Medium) supplemented with 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin (P/S). When the cells reached the logarithmic phase, they were divided into two groups according to the experimental requirements: the normoxic control group (the Control group) and the hypoxia-reoxygenation group (the H/R group).
To construct the hypoxia-reoxygenation model, the cells in the H/R group were first placed in a three-gas incubator (1% O₂, 5% CO₂, 94% N₂) with glucose-free DMEM instead of the conventional medium for a 4-hour hypoxia treatment. The medium was then replaced with the conventional medium (high-glucose DMEM, 10% FBS, 1% P/S), and the cells were returned to a conventional incubator at 37°C and 5% CO₂ for a 4-hour reoxygenation. The cells of the Control group were kept in a conventional incubator throughout, without changing the culture medium. After the model was established, the cells in each group were analyzed by flow cytometry using the Annexin V-FITC/PI double staining kit (MCE, HY-K1073), and the apoptosis rate was determined according to the kit instructions.

2.13.6. Gene knockdown via plasmid transfection & western blot

To investigate the role of the RPS11 gene in the hypoxia/reoxygenation process of cardiomyocytes, H9c2 cardiomyocytes were divided into three groups: the hypoxia-reoxygenation group (the H/R group), the hypoxia-reoxygenation + RPS11 knockdown group (the H/R+siRPS11 group), and the hypoxia-reoxygenation + empty vector control group (the H/R+siCON group). Knockdown transfection was performed 24 hours before hypoxia treatment with siRNA Transfection Reagent (Sigma-Aldrich, SITRAN-RO) according to the instructions. The cells in the H/R+siRPS11 group were transfected with the siRPS11 plasmid (MCE, HY-RS12221); the cells in the H/R+siCON group were transfected with the empty plasmid (siCON). Twenty-four hours after transfection, the cells were placed in a three-gas incubator for a 4-hour hypoxia treatment (1% O₂, 5% CO₂, 94% N₂). After the medium was replaced with the standard medium, the cells were moved to a regular incubator for a 4-hour reoxygenation phase at 37°C and no CO₂. The hypoxia treatment and reoxygenation were performed as described previously.
After modeling, cellular protein samples were collected from each experimental group for Western blot assay. The cells were lysed with RIPA (radioimmunoprecipitation assay) buffer, the total protein concentration was determined using the BCA (bicinchoninic acid) protein concentration assay kit (Thermo Fisher Scientific, USA), and equal amounts of protein (30 µg per well) were loaded onto an SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis) gel for electrophoresis. After electrophoresis, proteins were transferred to PVDF (polyvinylidene fluoride) membranes (Millipore, USA), blocked with 5% skim milk for 1 hour at room temperature, and then incubated separately at 4°C overnight with primary antibodies: anti-GAPDH (MCE, HY-P80137), anti-β-actin (MCE, HY-P80438), anti-RPS11 (Proteintech, 17041-1AP), anti-BNIP3 (BCL2 protein-interacting protein 3) (MCE, HY-P80035), and anti-LC3II/I (microtubule-associated protein 1 light chain 3 II/I) (Aladdin, Ab112877). The next day, the membranes were washed three times, for 10 minutes each, with PBST (phosphate-buffered saline with Tween) and then incubated for one hour with HRP-conjugated secondary antibodies. Protein signals were detected using the ECL (enhanced chemiluminescence) kit (MACKLIN, E917966), and the gray values of the target proteins were analyzed using Image Lab software (Bio-Rad, USA) and normalized using GAPDH and β-actin as internal references. The experiment was repeated three times