Abstract

Acute myocardial infarction (AMI) is a major contributor to cardiovascular-related mortality, and early diagnosis is crucial for effective treatment and better outcomes. While several biomarkers have been explored for AMI, there remains a need for reliable, non-invasive biomarkers that can accurately differentiate AMI patients from healthy individuals. This study aims to identify potential mRNA biomarkers in peripheral blood that could aid in the diagnosis and monitoring of AMI. We performed transcriptomic analysis of blood samples from 81 individuals, including 16 healthy controls, 58 AMI patients, and 7 post-treatment AMI individuals. Through a combination of Sparse Partial Least Squares-Discriminant Analysis (sPLS-DA), random forest (RF), Weighted Gene Co-expression Network Analysis (WGCNA), and LASSO regression, we identified mRNA markers that are significantly correlated with AMI. Specifically, the mRNA expression of ANKRD52, ART1, NRP2, and PPP1R15A was elevated in AMI patients, whereas BAIAP2L1 and CCNE1 were downregulated. Although these mRNA biomarkers show potential for distinguishing AMI patients from healthy individuals, further studies are needed to confirm their clinical applicability.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-92757-4.

Keywords: Acute myocardial infarction (AMI), mRNA biomarkers, Machine learning, Diagnostic screening

Subject terms: High-throughput screening, Diagnostic markers, Cardiovascular biology

Introduction

Acute myocardial infarction (AMI) represents a significant global health challenge, with rising incidence and mortality rates, particularly in regions with low to middle socioeconomic indices^1. Early detection of AMI is imperative for the timely initiation of therapeutic interventions to reduce mortality rates.
Although troponin and creatine kinase-MB are the conventional markers for AMI diagnosis, their specificity is limited, leading to potential false positives in non-cardiac conditions^2. Consequently, the scientific community continues to explore new biomarkers that offer higher sensitivity and specificity for the early detection of AMI. Previous investigations have explored AMI-specific diagnostic markers through mRNA and single-cell transcriptomics; however, the progression of AMI and cardiac repair is a protracted and dynamic biological process. It spans from the initial transformation of healthy cardiomyocytes under hypoxic conditions, which alters energy metabolism and induces oxidative stress, to the eventual repair of myocardial injury. Biomarker levels fluctuate notably throughout these stages, reflecting the dynamic nature of AMI pathophysiology. A study monitoring AMI patients over a 20-week post-event observation period revealed that circulating levels of cardiac-specific miRNA-1 and endothelial-specific miRNA-126 were significantly elevated compared to healthy controls, with a subsequent decline during the observation period^3. In ischemic cardiac tissue, HIF-1α protein accumulates persistently, and vascular endothelial growth factor A (VEGFA) is induced in the acute phase of myocardial infarction but not during the chronic phase^4. Insulin-like growth factor 2 (IGF2) and angiopoietin 2 (ANGPT2) are upregulated in the infarcted region, while expression of VEGFA and fms-related tyrosine kinase 1 (FLT1, also known as VEGF receptor 1) is diminished^5. Single-cell sequencing of peripheral blood has revealed an upregulation of the IL-1 signaling pathway following ST-Elevation Myocardial Infarction (STEMI), indicating an ongoing heart failure process^6.
Analysis of two GEO high-throughput datasets reveals that, after myocardial infarction, hsa-miR-330-3p exerts regulatory effects on MMP2, leading to macrophage infiltration into atherosclerotic lesions and promoting the secretion of numerous factors such as MMP-2. These factors contribute to endothelial cell damage by cleaving extracellular matrix components, collagen, and elastin, thereby accelerating the formation of intravascular thrombi^7. These findings suggest that variability in AMI biomarker studies may originate from differing gene expression patterns in myocardial tissue and peripheral blood^7. During myocardial infarction, mononuclear cells, macrophages, lymphocytes, and smooth muscle cells accumulate and release various cytokines and inflammatory biomarkers^8. Single-cell transcriptomics provides the necessary resolution to distinguish cell-specific responses, which are often masked in analyses of bulk tissue^9. The complexity of myocardial infarction pathophysiology, the dynamic nature of gene expression patterns, and the heterogeneity of patient populations can all contribute to the observed variability of identified biomarkers. As myocardial infarction progresses, transcriptional activity undergoes selective regulation, so identifying more relevant markers requires assessing myocardial recovery at various post-treatment stages.

The advent of machine learning has markedly altered the stratification of risk and the prediction of mortality in AMI. Chang et al.^10 identified a five-feature model including troponin I, HDL cholesterol, HbA1c, anion gap, and albumin as potential biomarkers for the early detection and treatment of AMI. This model utilizes feature selection combined with machine learning techniques, building upon established risk factors for AMI and cardiovascular diseases.
A study has corroborated the expression specificity of immune-related genes in AMI, which can facilitate diagnosis in clinical settings through the development of various machine learning models^11. Moreover, the application of machine learning extends beyond biomarkers. Algorithms that integrate multimodal data, including electrocardiographic waveforms and demographic information such as gender and age, have been shown to enhance the diagnostic accuracy for myocardial ischemia or infarction^12. Despite these advances and promising results, characterized by high performance on receiver operating characteristic curves, these models require further clinical trials to confirm their applicability and efficacy in real-world medical settings.

In this study, we performed a transcriptomic analysis on peripheral blood samples from AMI patients, individuals seven days post-treatment, and healthy controls. This analysis revealed AMI-associated transcriptomic signatures, including specific mRNA expression patterns linked to key signaling pathways. Combining data analytical techniques and machine learning, we carried out a multi-step screening process aimed at identifying a set of potential mRNA biomarkers correlated with AMI.

Materials and methods

Study design and sample collection

This study was conducted in accordance with the ethical standards and approval of the research committees of the Kunming University of Science and Technology (KUST) and the People’s Hospital of Lijiang, and with the principles of the Declaration of Helsinki. Informed consent was obtained from all participants involved in this research. For participants who were unable to provide consent due to their medical condition, consent was obtained from their legal guardians.
All subjects were informed about the nature of the study, the procedures involved, potential risks, and their right to withdraw at any time without affecting their medical care. Patients presenting with AMI were recruited and treated at the People’s Hospital of Lijiang City, Yunnan Province. The inclusion criteria for participation in this study were: (1) a clinical diagnosis of AMI, meeting the standards defined in the “Guidelines for the Diagnosis and Treatment of AMI”; (2) long-term residents living in Lijiang for more than six months; (3) both male and female patients aged 18 years or older; (4) no known hematological diseases or severe liver and kidney dysfunction; (5) no family history of hereditary tumors; (6) no history of radiation therapy or chemotherapy; (7) no occupational exposure to radioactive materials, toxic gases, or other carcinogens; and (8) written informed consent signed by the patient or their legal representative. Patients were excluded from the study if they met any of the following criteria: (1) a history of diagnosed hematological diseases or severe renal or hepatic dysfunction; (2) recent (within the past six months) coronary intervention treatment (such as stent placement or coronary artery bypass surgery); (3) currently participating in other interventional clinical studies involving AMI; (4) presence of other severe comorbidities, such as active severe infections, advanced malignancies, or immunodeficiency diseases; (5) women who are pregnant or breastfeeding; and (6) any other conditions deemed unsuitable for participation by the researchers, such as poor compliance or significant confounding factors. Upon confirmation of eligibility, detailed demographic information, chronic disease history, laboratory test results, and diagnostic data were recorded for each patient, and peripheral blood samples were obtained at the time of admission. 
Peripheral blood was also collected from a subset of seven patients 7–10 days into treatment, when their condition had stabilized. The healthy control group was publicly recruited from the local population by our department and consisted of individuals aged 18 or older, excluding those with underlying cardiovascular conditions, pregnant individuals, and patients with cancer. After peripheral blood was drawn from participants, each sample was immediately mixed with a threefold volume of Trizol (Qiagen, Germany) and stored at -80 °C for subsequent use.

RNA-seq and data analysis

Total RNA was extracted from all enrolled peripheral blood samples using the TRNzol Universal Reagent kit (Tiangen, China). The extracted RNA was quantified and its quality assessed by (1) measuring sample concentration with the Qubit 4.0 Fluorometer (Invitrogen, USA) and the Equalbit RNA BR Assay Kit (Vazyme, China), and (2) evaluating RNA integrity using the Qsep1 Bio-fragment Analyzer (Guangding, China). High-quality RNA with a concentration ≥ 80 ng/µl and a total yield ≥ 1 µg was then used for library construction. A total of 3 µg of RNA was employed for the library construction process, which included mRNA enrichment with Oligo(dT) magnetic beads (Yesen, China), fragmentation, synthesis of first- and second-strand cDNA, purification with AMPure XP beads (Beckman, USA), end-repair, A-tailing, adaptor ligation, and PCR enrichment. Completed libraries were quantified and quality-assessed using Qubit 4.0, the Agilent 2100 Bioanalyzer (Agilent Technologies, USA), and the Bio-RAD CFX 96 Real-Time System (Biorad, USA), ensuring that library concentrations exceeded 10 nM for subsequent sequencing. Sequencing was performed on the NovaSeq 6000 S4 platform (Illumina, USA) using the Illumina NovaSeq 6000 S4 Reagent Kit V1.5, generating PE150 reads. Quality control and filtering of all sequencing data were conducted using FastQC software.
Human sequences were removed by mapping to the human reference genome (GRCh38.p13, https://www.ncbi.nlm.nih.gov/assembly/2334371) using BWA (Burrows-Wheeler Aligner). In this study, we conducted various analyses, including RNA-seq analysis, sPLS-DA, WGCNA, and animal experiments. For the detailed methodologies of these analyses, please refer to the Supplementary Methods.

Real-time quantitative polymerase chain reaction (RT-qPCR)

Total RNA was isolated using TRIzol (Invitrogen, USA), and 1 µg of RNA was used for reverse transcription with a PrimeScript RT Reagent Kit (TaKaRa, JPN) according to the manufacturer’s instructions. RT-qPCR was performed using TB Green Premix Ex Taq II (TaKaRa, JPN). The samples were processed using a Thermo Q6 Real-Time System. Five genes were selected for validation as potential biomarkers for AMI. Three replicates were run for each sample. The primer sequences are shown in Table S1.

Statistical analysis

Statistical analyses were performed using R (Version 4.2). Quantitative data were analyzed as appropriate based on their distribution, and categorical variables were assessed accordingly. A p value of ≤ 0.05 was considered statistically significant. For detailed methodologies related to statistical tests, including the treatment of normally and non-normally distributed data and the handling of categorical variables, please refer to the ‘Statistical Analysis’ part of the Supplementary Methods.

Results

Subject characteristics

Table 1 presents the demographic and clinical profiles of the study’s participants, comparing healthy controls, AMI patients, and those receiving AMI treatment. The study workflow is illustrated in Fig. 1. A total of 87 participants were recruited, and clinical data, including age, sex, comorbidities, and clinical AMI indicators, were collected. Peripheral blood samples were obtained from all subjects, followed by transcriptomic sequencing.
To ensure data accuracy, individuals with incomplete clinical information (4 from the control group) and those with poor sequencing quality (2 from the AMI group) were excluded. Ultimately, 58 confirmed AMI patients, 7 post-treatment AMI patients, and 16 healthy controls were included in the analysis. After sequencing, a series of analytical steps was performed, including differential expression analysis, WGCNA, sPLS-DA, and random forest analysis, culminating in LASSO regression for feature selection and validation in a mouse model. This integrated approach aimed to identify key gene expression differences across groups to uncover potential biomarkers and therapeutic targets for AMI. Cardiac biomarkers including cTnI, CKMB, and MYO were measured between 7 and 10 days post-treatment, while other data were collected at hospital admission. The AMI group (56.21 ± 12.20 years) was older than the control group (33.94 ± 5.69 years) (P < 0.001). Given the multi-ethnic composition of the Lijiang region in Yunnan, the proportions of ethnic groups were catalogued within the different cohorts, with the majority being Han, followed by Bai, Naxi, Tibetan, Yi, and other ethnicities. Hypertension prevalence was higher in the AMI group (41.38%) than in the control group (0.00%), while the prevalence of diabetes showed no significant difference across groups (P = 0.136). Smoking history was significantly more common in the AMI group (60.34%), whereas no control participant had a smoking history (0.00%) (P < 0.001). The primary diagnosis within the AMI group was inferior STEMI (55.17%). After treatment, the levels of cTnI, CKMB, and MYO decreased significantly (P < 0.001), suggesting therapeutic efficacy.

Table 1. Clinical characteristics of participants.
| Variables | AMI (n = 58) | Control (n = 16) | Treatment (n = 7) | P |
|---|---|---|---|---|
| Age, Mean ± SD | 56.21 ± 12.20 | 33.94 ± 5.69 | 57.14 ± 16.57 | < 0.001 |
| Sex, n (%) | | | | < 0.001 |
| Female | 10 (17.24) | 14 (87.50) | 2 (28.57) | |
| Male | 48 (82.76) | 2 (12.50) | 5 (71.43) | |
| Minority, n (%) | | | | 0.075 |
| Bai | 12 (20.69) | 0 (0.00) | 0 (0.00) | |
| Han Chinese | 29 (50.00) | 5 (31.25) | 4 (57.14) | |
| Naxi | 8 (13.79) | 8 (50.00) | 0 (0.00) | |
| Tibetan | 5 (8.62) | 0 (0.00) | 0 (0.00) | |
| Yi | 0 (0.00) | 0 (0.00) | 0 (0.00) | |
| Other ethnic groups | 4 (6.90) | 3 (18.75) | 3 (42.86) | |
| Hypertension, n (%) | | | | 0.002 |
| No | 34 (58.62) | 16 (100.00) | 4 (57.14) | |
| Yes | 24 (41.38) | 0 (0.00) | 3 (42.86) | |
| Diabetes, n (%) | | | | 0.136 |
| No | 47 (81.03) | 16 (100.00) | 7 (100.00) | |
| Yes | 11 (18.97) | 0 (0.00) | 0 (0.00) | |
| Smoking, n (%) | | | | < 0.001 |
| No | 23 (39.66) | 16 (100.00) | 3 (42.86) | |
| Yes | 35 (60.34) | 0 (0.00) | 4 (57.14) | |
| Alcohol drinking, n (%) | | | | 0.219 |
| No | 43 (74.14) | 16 (100) | 5 (71.43) | |
| Yes | 15 (25.86) | 0 (0.00) | 2 (28.57) | |
| STEMI type, n (%) | | | | < 0.001 |
| Anterior STEMI | 14 (24.14) | NA | 5 (71.43) | |
| Extensive anterior STEMI | 11 (18.97) | NA | 0 (0.00) | |
| Inferior STEMI | 32 (55.17) | NA | 2 (28.57) | |
| None | 0 (0.00) | 16 (100.00) | 0 (0.00) | |
| Others | 1 (1.72) | 0 (0.00) | 0 (0.00) | |
| Chest pain duration (min), M (Q₁, Q₃) | 360.00 (144.00, 592.75) | NA | 250.00 (123.00, 757.00) | < 0.001 |
| cTnI (ng/mL), M (Q₁, Q₃) | 5.79 (0.83, 48.23) | NA | 5.00 (3.15, 11.99) | < 0.001 |
| CKMB (U/L), M (Q₁, Q₃) | 84.00 (30.00, 291.00) | NA | 22.00 (19.50, 34.00) | < 0.001 |
| MYO (µg/L), M (Q₁, Q₃) | 317.30 (71.88, 991.17) | NA | 41.30 (36.05, 46.30) | < 0.001 |
| TC, M (Q₁, Q₃) | 5.11 (4.24, 6.23) | NA | NA | – |
| TG, M (Q₁, Q₃) | 2.08 (1.16, 2.78) | NA | NA | – |
| ApoA, M (Q₁, Q₃) | 1.21 (1.06, 1.35) | NA | NA | – |
| LDL-C, M (Q₁, Q₃) | 2.99 (2.48, 3.85) | NA | NA | – |
| Creatinine, M (Q₁, Q₃) | 77.46 (65.00, 89.12) | NA | NA | – |
| Urea Nitrogen, M (Q₁, Q₃) | 5.15 (4.50, 6.57) | NA | NA | – |
| SaO2, M (Q₁, Q₃) | 94.00 (93.00, 95.00) | NA | NA | – |
| RCA, M (Q₁, Q₃) | 0.50 (0.00, 0.99) | NA | NA | – |
| LAD, M (Q₁, Q₃) | 0.80 (0.43, 0.98) | NA | NA | – |
| LCX, M (Q₁, Q₃) | 0.40 (0.00, 0.70) | NA | NA | – |
| LAX, M (Q₁, Q₃) | 0.00 (0.00, 0.00) | NA | NA | – |

SD: standard deviation; M: Median; Q₁: 1st Quartile; Q₃: 3rd Quartile; STEMI: ST-Elevation Myocardial Infarction. cTnI (ng/mL): Cardiac Troponin I; CKMB (U/L): Creatine Kinase-MB; MYO (µg/L): Myoglobin; TC: Total Cholesterol; TG: Triglycerides; ApoA: Apolipoprotein A; LDL-C: Low-Density Lipoprotein Cholesterol; SaO2: Arterial Oxygen Saturation. RCA, LAD, LCX, and LAX give the degree of stenosis observed on contrast imaging, reported as a percentage. RCA: Right Coronary Artery; LAD: Left Anterior Descending artery; LCX: Left Circumflex artery; LAX: Left coronary Artery axis.

Fig. 1. Schematic diagram of the study.

Pairwise differential expression analysis reveals distinct mRNA signatures and treatment effects in AMI

In our RNA-seq data analysis, we made three pairwise comparisons of gene expression: (1) AMI patients versus healthy controls, (2) post-treatment patients versus AMI patients, and (3) post-treatment patients versus healthy controls. Initially, principal component analysis (PCA) was employed to investigate mRNA profiles among individual samples and groups. This unsupervised multivariate approach (Fig. 2A) generates principal component axes that capture the variability in the data without prior knowledge of the sample groups. Possibly due to sample heterogeneity, the three groups (control, AMI, and post-treatment) could not be distinctly separated based on their overall transcriptomic landscape. To uncover mRNAs associated with AMI, we subsequently applied a supervised analysis, sPLS-DA, a method that identifies the most discriminative mRNAs within the dataset. Through this analysis, we distinguished the groups primarily along component 1, as illustrated in the plot of component 1 versus component 2 (Fig. 2B).
Although there was no clear separation into molecular subtypes, our analysis revealed significant differences in gene expression: 1067 genes between AMI patients and controls (Fig. 2C), 477 genes between post-treatment and AMI individuals (Fig. 2D), and 567 genes between post-treatment individuals and controls (Fig. 2E). The threshold for differential gene screening among groups was set as an absolute log2 fold change greater than 2 and a p-value < 0.05. The differentially expressed genes (DEGs) identified in pairwise comparisons among the three groups are detailed in Table S2 (Original). Certain mRNAs were consistently modulated across both comparative analyses (Fig. S1). Certain mRNAs, including HBEGF, G0S2, EREG, FOSB, PLK2, NR4A2, HEY1, SFN, PER1, RNU6-415P, LINC00664, CSAD, and CNTNAP3C, were significantly upregulated in the AMI group compared to the controls. However, the mRNAs CSAD, IL32, RPL36, VAMP5, ZNF48, RPS26, UQCC3, RPL13P12, and ZNF593OS showed no significant change after treatment compared to the AMI group (Table S3). Given the significant age difference between the AMI group and the control group, we included age as a covariate and repeated the pairwise differential comparisons among the three groups. A total of 3,033 genes were identified between AMI patients and controls (Fig. S2A), 941 genes between post-treatment individuals and AMI patients (Fig. S2B), and 1,342 genes between post-treatment individuals and controls (Fig. S2C). Several differentially expressed genes exhibited consistent expression patterns regardless of age adjustment; for instance, FOSB, G0S2, and CNTNAP3C were significantly upregulated in the AMI group compared to controls and remained significantly upregulated upon re-analysis, with a notable downregulation observed in the treatment group compared to AMI patients.
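The screening rule stated above (absolute log2 fold change greater than 2 and p-value < 0.05) can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which is described in the Supplementary Methods; the `GeneResult` type, the `classify_gene` helper, and the example rows are hypothetical and only show how the two cutoffs combine.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    log2_fc: float
    p_value: float


def classify_gene(row: GeneResult, lfc_cutoff: float = 2.0,
                  p_cutoff: float = 0.05) -> str:
    """Label a gene 'up', 'down', or 'ns' (not significant).

    A gene passes only if |log2FC| > lfc_cutoff AND p < p_cutoff,
    matching the thresholds quoted in the text.
    """
    if row.p_value >= p_cutoff or abs(row.log2_fc) <= lfc_cutoff:
        return "ns"
    return "up" if row.log2_fc > 0 else "down"


# Placeholder rows for illustration only; these are not values from Table S2.
rows = [
    GeneResult("GENE_A", 3.1, 0.001),   # large positive change, significant
    GeneResult("GENE_B", -2.4, 0.020),  # large negative change, significant
    GeneResult("GENE_C", 1.5, 0.001),   # significant but below the fold-change cutoff
    GeneResult("GENE_D", 4.0, 0.300),   # large change but not significant
]
labels = {row.gene: classify_gene(row) for row in rows}
```

Both conditions must hold at once, so a gene with a small p-value but a modest fold change is excluded in the same way as a gene with a large but non-significant change.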
However, another set of genes, such as UTY, ANKRD36BP1, and KDM5D, which were significantly upregulated in the AMI group relative to controls, displayed altered expression patterns when age was considered as a covariate [Table S2 (Age as a covariate)]. In the analysis accounting for age as a covariate, we identified a new set of significantly downregulated genes, including PROCR (log2FoldChange = -2.36, p = 0.0488), LINC02987 (log2FoldChange = -2.36, p = 0.0215), and MAD2L2 (log2FoldChange = -2.52, p = 0.0012). Notably, MED18 and ASIC1 were also significantly downregulated in both analyses.

Fig. 2. Transcriptomic analysis of peripheral blood samples from the AMI, post-treatment, and healthy control groups. (A) Principal component analysis (PCA) plot based on transcriptomics data. (B) sPLS-DA showing moderate clustering between the three groups. (C–E) Volcano plots of differentially expressed genes (DEGs) between the AMI group, the post-treatment group, and the healthy control group. The black lines represent the cutoffs for the indicated significance criteria. Points with log fold-change ≥ 2 and P < 0.05 are shown in blue, points with log fold-change ≤ -2 and P < 0.05 in red, and the rest in green: (C) AMI versus healthy controls, (D) post-treatment of AMI versus AMI, and (E) post-treatment of AMI versus healthy controls. AMI: Acute Myocardial Infarction; FC: fold change; Treat: post-treatment of AMI.

Deciphering molecular pathways and gene functions through enrichment analysis

To elucidate the roles of the identified genes, we conducted Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analyses. GO analysis revealed that, compared to the control group, the upregulated DEGs in the AMI group were enriched in 795 terms across the Biological Process (BP), Cellular Component (CC), and Molecular Function (MF) categories (p < 0.05).
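Enrichment p-values of this kind are commonly obtained from a one-sided hypergeometric (over-representation) test. Whether the present analysis used exactly this test is an assumption here, since the details are given in the Supplementary Methods. The following Python sketch, with a hypothetical function name and toy numbers, shows how such a p-value is computed for a single term.

```python
from math import comb


def enrichment_p_value(n_background: int, n_in_term: int,
                       n_deg: int, n_overlap: int) -> float:
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_background: number of genes in the background set
    n_in_term:    background genes annotated to the GO/KEGG term
    n_deg:        number of DEGs tested
    n_overlap:    DEGs annotated to the term
    """
    upper = min(n_in_term, n_deg)
    total = comb(n_background, n_deg)
    # Sum the probability of observing n_overlap or more annotated DEGs.
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_deg - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example (assumed numbers): 10 background genes, 4 in the term,
# 3 DEGs, 2 of which fall in the term.
p = enrichment_p_value(n_background=10, n_in_term=4, n_deg=3, n_overlap=2)
```

In practice one such p-value is computed per term and then corrected for multiple testing, which is why adjusted p-values are reported alongside raw ones elsewhere in the text.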
The top 10 enriched terms are depicted in Fig. 3A. Notably, BP was predominantly enriched in positive regulation of calcium ion transport, CC showed enrichment in platelet alpha granule lumen, and MF was enriched for platelet-derived growth factor binding. In the comparison between post-treatment and AMI, upregulated DEGs were associated with 159 enriched terms (p < 0.05), which included cell junction disassembly in BP, blood microparticle in CC, and antigen binding in MF. Downregulated DEGs in this comparison were associated with 721 significant terms (p < 0.05), including regulation of epithelial cell proliferation in BP, semaphorin receptor complex in CC, and transmembrane receptor protein tyrosine kinase activity in MF (Fig. 3B). When comparing treated individuals to controls, the upregulated genes were related to 284 significant terms (p < 0.05), encompassing regulation of epithelial cell proliferation in BP, platelet alpha granule in CC, and type II transforming growth factor beta receptor binding in MF (Fig. 3C).

Fig. 3. GO functional enrichment and pathway enrichment analyses of DEGs. (A–C) The top 5 GO terms associated with Molecular Function (MF), Biological Process (BP), and Cellular Component (CC) in both upregulated and downregulated DEGs: (A) AMI versus healthy controls, (B) post-treatment of AMI versus AMI, and (C) post-treatment of AMI versus healthy controls. (D–F) KEGG enrichment analyses of the upregulated and downregulated DEGs, with the top 10 pathways shown: (D) AMI versus healthy controls, (E) post-treatment versus AMI, and (F) post-treatment of AMI versus healthy controls.

In the KEGG pathway enrichment analysis, the top 10 pathways enriched among DEGs between the AMI and control groups are shown in Fig. 3D.
The upregulated DEGs were predominantly enriched in pathways associated with Amoebiasis, Nitrogen metabolism, and Glycine, serine, and threonine metabolism. Notably, genes such as CXCL8 (log2FC = 3.38, p.adjust < 0.01) and FOSB (log2FC = 5.86, p < 0.01, p.adjust < 0.01), which were significantly upregulated in the AMI group, participate in essential immune-related pathways (Table S4). Downregulated DEGs were enriched in Cholinergic synapse, Relaxin signaling pathway, and Neuroactive ligand-receptor interaction. The pathway enrichment of DEGs between the Treatment and AMI groups is depicted in Fig. 3E, with upregulated genes enriched in Inositol phosphate metabolism, Leishmaniasis, and Hypertrophic cardiomyopathy, while downregulated genes were enriched in IL-17 signaling pathway, Biosynthesis of amino acids, and Leishmaniasis. The top 10 enriched pathways for DEGs in the Treatment versus Control comparison are presented in Fig. 3F, where upregulated genes were enriched in TGF-beta signaling pathway, Osteoclast differentiation, and Non-small cell lung cancer. Downregulated DEGs were enriched in Ribosome, Coronavirus disease COVID-19, and African trypanosomiasis.

Incorporating age as a covariate significantly altered the gene enrichment analysis results for AMI patients. Unlike the age-excluded analysis, both AMI and control groups showed enrichment in the ECM-receptor interaction pathway (Fig. S2C). GO analysis also revealed enrichment in the platelet-related pathway, specifically in the platelet alpha granule lumen (Fig. S2B), highlighting a notable increase in platelet-derived growth factor binding. Additionally, while the positive regulation of calcium ion transport term was previously significant, it lost significance once age was considered, shifting to regulation of calcium ion transport.
The downregulated genes in the AMI group showed significant enrichment in the Relaxin signaling pathway, with RLN2 being significantly downregulated in both analyses. When comparing the treatment group to the AMI group, upregulated genes were enriched in fatty acid metabolism. However, in the age-adjusted analysis of the treatment group, several immune-related pathways, such as Primary immunodeficiency and Autoimmune thyroid disease, were significantly upregulated, findings not observed in the conventional analysis. Both analyses revealed enrichment in arginine and proline metabolism for downregulated genes in the treatment group compared to the AMI group. The age-excluded analysis showed downregulation in the IL-17 signaling pathway and Cytokine-cytokine receptor interaction, which were upregulated in both AMI and control groups. However, after considering age as a covariate, the enrichment of the Cytokine-cytokine receptor interaction pathway decreased.

Sparse partial least squares regression discriminant analysis (sPLS-DA)

Component 1, comprising 38 mRNAs, effectively differentiates AMI patients from control subjects. Figure 4A shows the top 25 contributors to this component. Similarly, Component 2, comprising 50 mRNAs, distinguishes treatment from controls; its leading 25 mRNAs are shown in Fig. 4B. Two components were sufficient to obtain good classification performance. Analysis of receiver operating characteristic (ROC) curves for Component 1 revealed that the area under the curve (AUC) for discerning AMI from the other groups was 0.9195 (P < 0.01). The individual mRNAs contributing to Component 1 were ranked by importance, with notable contributors including CC3orf52, CTNNB1, REC114, LINC01736, and POLR3E.
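A one-versus-rest AUC such as the one reported above can be read as the probability that a randomly chosen sample from the target group receives a higher score than a randomly chosen sample from the remaining groups. The Python sketch below computes it directly from that definition; the `roc_auc` function and the example scores are hypothetical and are not taken from the study data.

```python
def roc_auc(scores, labels):
    """AUC from pairwise comparisons (equivalent to the Mann-Whitney U statistic).

    scores: predicted score per sample (higher means more likely the target group)
    labels: 1 for the target group (e.g. AMI), 0 for all other groups
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # target sample ranked above a non-target sample
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Assumed toy scores: two target samples and two non-target samples.
auc_perfect = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
auc_partial = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

For a three-group problem the calculation is repeated once per group, each time relabelling that group as 1 and the others as 0.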
The ROC curve for Component 2 demonstrated an AUC of 0.9974 (P < 0.01) for differentiating the AMI group from the other groups, and an AUC of 0.9769 (P < 0.01) for distinguishing the control group from the others. The top five contributors to Component 2 were ZDHHC3, ZNGB1, ZDHHC6, SPATA7, and SLC31A1. Notably, certain mRNAs, such as ZFP36 (log2FC = 1.76, p.adjust = 0.012), were significantly upregulated in AMI compared to control and also contributed to Component 2. The standalone contribution of the transcriptomic data to the final sPLS-DA model was demonstrated by score plots of the first two components, indicating the separation capacity of the transcriptomic data (Fig. 4C).

Fig. 4. Sparse partial least-squares discriminant analysis. (A, B) Selected features shown in a pyramid bar plot. The loading plot represents the top 25 mRNAs contributing to group separation (left). The area under the ROC curve (AUC) values, compared to the grouping level, are shown for the three-group classification (right). (A) Contribution value of each mRNA to component 1. (B) sPLS-DA contributions to component 2. The color in the bar plot represents the group in which the mRNA expression level was highest. (C) Background prediction plot of the sample prediction results.

Identification of the most significant module by WGCNA

To identify gene modules of biological significance associated with AMI, we employed WGCNA to construct a gene correlation network. Outlier samples were examined using hierarchical clustering, resulting in the exclusion of four outliers (Fig. 5A, h = 15000). To ensure a scale-free co-expression network (Fig. 5B), we selected a soft-thresholding parameter β = 14, achieving a scale-free fit index (R^2) of 0.72. Following the identification and merger of highly similar modules, 34 modules were delineated and color-coded (Fig. 5C).
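The role of the soft-thresholding power can be sketched in a few lines of Python. The code below is a simplified illustration, not the WGCNA implementation: it assumes an unsigned network with adjacency |cor|^β, bins gene connectivity into ten equal-width bins, and reports the squared correlation between log10 connectivity and log10 bin frequency, signed so that a decreasing (power-law-like) relationship is positive. The helper names and toy inputs are hypothetical, and empty bins are skipped for brevity.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_connectivity(expr, beta):
    """Connectivity per gene: sum of |cor|**beta over all other genes.

    expr: list of per-gene expression vectors measured on the same samples.
    """
    k = [0.0] * len(expr)
    for i in range(len(expr)):
        for j in range(i + 1, len(expr)):
            a = abs(pearson(expr[i], expr[j])) ** beta
            k[i] += a
            k[j] += a
    return k


def scale_free_fit(k, n_bins=10):
    """Signed R^2 of log10(p(k)) against log10(k); assumes positive, non-constant k."""
    lo, hi = min(k), max(k)
    if hi == lo:
        raise ValueError("connectivity values are all identical")
    width = (hi - lo) / n_bins
    counts, totals = [0] * n_bins, [0.0] * n_bins
    for v in k:
        b = min(int((v - lo) / width), n_bins - 1)
        counts[b] += 1
        totals[b] += v
    log_k, log_p = [], []
    for c, t in zip(counts, totals):
        if c:  # skip empty bins (a simplification)
            log_k.append(math.log10(t / c))       # mean connectivity in the bin
            log_p.append(math.log10(c / len(k)))  # fraction of genes in the bin
    r = pearson(log_k, log_p)
    return -math.copysign(r * r, r)  # positive when frequency falls as k rises


# Toy check: three genes that are perfectly (anti-)correlated, beta = 14 as in the text.
toy_k = soft_connectivity([[1, 2, 3], [2, 4, 6], [3, 2, 1]], beta=14)
# Synthetic connectivities whose frequencies follow an exact power law.
toy_fit = scale_free_fit([1.0] * 64 + [2.0] * 16 + [4.0] * 4 + [8.0])
```

Raising β suppresses weak correlations, so few genes keep high connectivity and the fit index rises; the power is usually chosen as the smallest value at which the index levels off.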
Based on the correlation analysis between modules and traits (Fig. 5D), the tan module, comprising 207 mRNAs, was identified as positively correlated with the clinical trait of chest pain duration (r = 0.73, P < 0.01). The plots of module membership and gene significance further demonstrated significant correlations within the tan module (Fig. 5E).

Fig. 5. Results of weighted gene co-expression network analysis (WGCNA). (A) Clustering dendrogram of 81 samples. (B) Analysis of the scale-free index for various soft-threshold powers (left) and of the mean connectivity for various soft-threshold powers (right). (C) WGCNA was performed to identify 28 modules by unsupervised clustering. (D) Heatmap of the correlation between the module eigengenes and clinical traits of AMI, healthy control, and post-treatment of AMI. The tan module was identified as the module positively correlated with chest pain duration (r = 0.73, P < 0.01). (E) The gene significance and module membership of the genes in the tan module exhibited a positive correlation. (F) KEGG enrichment analysis of the tan module.

To elucidate the functional roles of the genes within the tan module, we conducted a KEGG enrichment analysis. The analysis revealed that five mRNAs within the tan module (CD44, FLT3LG, CSF3R, CD36, and HLA-DPB1) were enriched in the Hematopoietic cell lineage pathway (Fig. 5F).

Identification of candidate mRNA biomarkers

Using the Random Forest (RF) importance score, we prioritized the top 200 mRNAs according to their contribution to classification performance based on disease status (Fig. S3). The top 5 genes with significant contributions under this algorithm were POLR3E, ZNF831, TRIM17, WSB1, and PABPC1 (Fig. S3). The out-of-bag (OOB) estimate of the error rate was 26.15%, and the ensemble of decision trees stabilized at approximately 400 trees (Fig. S4).
Subsequently, we intersected the top 200 mRNAs from the RF model, sPLS-DA components 1 and 2, the DEGs of AMI vs. Control, Treatment vs. Control, and Treatment vs. AMI, and the mRNAs of the tan module identified by WGCNA. The DEGs were derived from the analysis that did not include age as a covariate. The UpSet plot revealed 92 mRNAs (Table S5) shared by at least two of the significant gene sets (Fig. 6A). Using LASSO regression (Fig. 6B), we filtered candidate mRNA biomarkers from this intersection, yielding six mRNAs at the optimal lambda value (Fig. 6C). We validated the expression levels of these mRNAs in peripheral blood samples using quantitative PCR (qPCR) (Fig. 6D). In the AMI group, ANKRD52, ART1, NRP2, and PPP1R15A expression was elevated compared with the healthy control group, whereas BAIAP2L1 and CCNE1 expression was reduced. Notably, after treatment, the expression levels of these six mRNAs did not differ significantly from those in the AMI group (Fig. 6E).

Fig. 6. Through various data analysis methods, significant gene sets were obtained in the AMI, post-treatment of AMI, and healthy control groups. (A) The UpSet plot illustrates the intersection and union relationships among DEGs (top 200), sPLS-DA component 1, component 2, the tan module of WGCNA, and the random forest (RF) top 200 genes. The ribbons between the dots represent the intersections between sets. Genes shared by two or more sets were selected for further analysis. (B) The extracted features were reduced via LASSO regression. (C) LASSO coefficients of the variables. (D) The expression level of the six mRNAs in the AMI, post-treatment of AMI, and healthy control groups. (E) Validation of the expression of the six mRNAs by fluorescence quantitative PCR.

Hub gene expression in the AMI mice model

To further validate the relationship between AMI and the candidate biomarkers, we employed a mouse AMI model.
Immediately after ligation of the left anterior descending (LAD) coronary artery, ECG recordings displayed a characteristic ST-segment elevation (Fig. S7A), indicating successful induction of myocardial ischemia. Histological analysis of myocardial sections revealed significant changes in the infarct area compared with the sham group (Fig. S7B, C). Additionally, elevated serum cTnI levels quantitatively confirmed the occurrence of myocardial damage (Fig. S7D). M-mode echocardiography was performed to evaluate cardiac structure and function (Fig. S7E). Detailed measurements of interventricular septum (IVS) thickness, left ventricular posterior wall (LVPW) thickness, left ventricular internal diameter (LVID), ejection fraction (EF), and fractional shortening (FS) revealed alterations in cardiac structure and contractility (Fig. S7F-J). Notably, the mRNA levels of ANKRD52 (Fig. S7K) and NRP2 (Fig. S7L) in peripheral blood were significantly elevated in the myocardial infarction model compared with the sham group, consistent with our findings in the human data.

Discussion

AMI is a significant contributor to cardiovascular mortality, and early diagnosis is crucial for effective treatment and improved outcomes. Although various biomarkers have been explored for the diagnosis of AMI, there is a continued need for reliable non-invasive biomarkers that accurately differentiate AMI patients from healthy individuals. This study aimed to identify potential mRNA biomarkers in peripheral blood to aid in the early diagnosis and monitoring of AMI. Our integrative analysis identified several candidate genes at the intersection of the gene sets. ANKRD52, ART1, NRP2, and PPP1R15A were upregulated, whereas BAIAP2L1 and CCNE1 were downregulated in AMI.
Members of the ANKR family are implicated in a diverse array of functions, including the formation of transcriptional complexes, initiation of immune responses, biogenesis and assembly of cation channels in membranes, and regulation of the cell cycle^13–15. ANKRD1 is identified as a cardiac-specific ankyrin repeat domain-containing protein principally expressed in the heart and implicated in the morphogenesis and function of cardiomyocytes^16. Alterations in both the expression and phosphorylation levels of ANKRD2 protein are known to mediate the balance between muscle physiology and pathological inflammatory responses^17. Furthermore, ANKRD26 is associated with thrombopoiesis and the pathogenesis of autoimmune diseases. References 42–44 likely provide further substantiation for