Abstract Introduction Acute myeloid leukemia (AML) is a type of blood cancer that is identified by the unrestricted growth of immature myeloid cells within the bone marrow. Despite therapeutic advances, AML prognosis remains highly variable, and there is a lack of biomarkers for customizing treatment. RNA N6-methyladenosine (m^6A) modification is a reversible and dynamic process that plays a critical role in cancer progression and drug resistance. Methods To investigate the m^6A modification patterns in AML and their potential clinical significance, we used the AUCell method to describe the m^6A modification activity of cells in AML patients based on 23 m^6A modification enzymes and further integrated with bulk RNA-seq data. Results We found that m^6A modification was more effective in leukemic cells than in immune cells and induced significant changes in gene expression in leukemic cells rather than immune cells. Furthermore, network analysis revealed a correlation between transcription factor activation and the m^6A modification status in leukemia cells, while active m^6A-modified immune cells exhibited a higher interaction density in their gene regulatory networks. Hierarchical clustering based on m^6A-related genes identified three distinct AML subtypes. The immune dysregulation subtype, characterized by RUNX1 mutation and KMT2A copy number variation, was associated with a worse prognosis and exhibited a specific gene expression pattern with high expression level of IGF2BP3 and FMR1, and low expression level of ELAVL1 and YTHDF2. Notably, patients with the immune dysregulation subtype were sensitive to immunotherapy and chemotherapy. Discussion Collectively, our findings suggest that m^6A modification could be a potential therapeutic target for AML, and the identified subtypes could guide personalized therapy. Keywords: acute myeloid leukemia, N6-methyladenosine modification, subtype, single-cell transcriptome, tumor microenvironment 1. Introduction Acute myeloid leukemia (AML) represents an infrequent hematologic malignancy distinguished by the aberrant expansion of precursor cells in the myeloid lineage, resulting in a perturbed differentiation process. In recent decades, allogeneic hematopoietic stem cell transplantation (allo-HSCT) has emerged as a pivotal therapeutic intervention, substantially enhancing the overall survival of eligible AML patients ([39]1). However, the effectiveness of AML treatment and the issue of relapse continue to pose significant clinical challenges. Until recently, there has been a considerable gap in meeting the treatment needs of AML patients. RNA methylation, particularly N6-methyladenosine (m^6A) methylation, constitutes the prevalent intramolecular modification observed within eukaryotic mRNA molecules. Such modification is involved in regulating various aspects of RNA, including splicing, stability, localization, and translation ([40]2, [41]3). In recent times, m^6A modifications have not solely surfaced as a novel stratum of epigenetic control within the realm of cancer; they have also exhibited substantial therapeutic promise for addressing diverse forms of malignancies ([42]3–[43]6). Notably, current studies have highlighted the potential of m^6A modifications in elucidating the pathogenesis and therapeutic aspects of AML. Specifically, the m^6A methyltransferases METTL3 and METTL14 can control and/or maintain myeloid leukemia cells ([44]7–[45]9). The m^6A readers IGF2BP2 and IGF2BP3 promote AML development in an m^6A-dependent behavior by controlling the expression level of critical genes in the glutamine metabolism pathways ([46]10, [47]11). The RNA-binding protein YBX1 plays a pivotal role in sustaining myeloid leukemia cells through an m^6A-dependent mechanism, wherein it regulates the stability of BCL ([48]12). FTO plays an oncogenic role in AML as an m^6A RNA demethylase. Small molecule inhibitors that target FTO have the potential to be used in the treatment of AML ([49]13, [50]14). These findings provide valuable insights into the crucial and intricate involvement of m^6A modification and its regulators in AML, highlighting a promising avenue for AML treatment. In this study, we identified two distinct patterns of m^6A modification intensity based on the m^6A activity score. We found that high-intensity modes significantly impact the prognosis of AML. Additionally, we conducted a comprehensive analysis of the role of m^6A modification in AML subtypes, including their genomic alterations, tumor immune microenvironment, and immunotherapy implications. This examination notably enriches our comprehension of the molecular processes that contribute to m^6A modification’s role in the development of AML and its impact on the effectiveness of drug treatments. Overall, our findings provide a solid foundation for the development of m^6A modification-targeting therapies for AML and suggest that this approach could be an effective and specific strategy for cancer treatment. 2. Materials and methods 2.1. Data download and processing We got patient data for acute myeloid leukemia (AML) from the TCGA (The Cancer Genome Atlas) database, which was downloaded from the following source: [51]https://xena.ucsc.edu/. RNAseq data from a total of 151 samples were re-analyzed. Clinical survival information was obtained for 132 patients and somatic mutation information was obtained for 143 patients. To verify, we conducted searches in the GEO database using the keywords ‘acute myelogenous leukemia’ and ‘LAML’ to identify relevant datasets. Specifically, we included datasets in our analysis if they met the following criteria: (1) Adequate sample size (n > 200). (2) Using RNA-seq instead of microarrays. (3) Count matrix is provided for further analysis. [52]GSE106291 (n = 250) and [53]GSE146173 (n = 246) were selected to further analysis ([54]15, [55]16). We utilized the dataset [56]GSE178926, which comprises targeted immune gene expression profiles derived from pre-treatment bone marrow samples of acute myeloid leukemia patients undergoing treatment with pembrolizumab and the hypomethylating agent azacytidine ([57]ClinicalTrials.gov Identifier: [58]NCT02845297) as described by Rutella et al. in 2022 ([59]17), for the purpose of immunotherapy prediction. Single-cell RNA sequencing (scRNA-seq) data from 12 patients diagnosed with acute myeloid leukemia (AML) were obtained from the dataset provided by van Galen et al. ([60]18). 2.2. Re-analysis of scRNA-seq The single-cell RNA sequencing (scRNA-seq) count data underwent normalization using the ‘Seurat’ R package ([61]19), followed by log-transformation with an offset value of 1 and subsequent scaling. We discerned genes displaying significant variability through the utilization of the ‘FindVariableFeatures’ function, wherein the ‘vst.method’ parameter was configured as ‘vst’. We conducted Principal Component Analysis (PCA) on the highly variable genes, utilizing the top 30 principal components for subsequent analyses. Cell type annotation was obtained from Zeng et al. ([62]20). In order to represent the outcomes visually, we employed t-distributed stochastic neighbor embedding (t-SNE) to reduce the complexity of the dataset. The RunTSNE function was used to generate a 2-dimensional t-SNE plot based on the top 30 principal components. The resulting t-SNE plot was then used to visualize the clustering results. Subsequently, we employed the ‘FindAllMarkers’ function to identify cluster-specific markers, with the ‘method’ parameter set to ‘MAST’ ([63]21). 2.3. m^6A modification activity score We obtained 23 m^6A modification enzymes (FTO, CBLL1, FMR1, HNRNPC, HNRNPA2B, IGF2BP1, IGF2BP2, ELAVL1, IGF2BP3, ALKBH5, LRPPRC, METTL3, METTL14, RBM15, RBM15B, VIRMA, WTAP, YTHDF1, YTHDF2, YTHDC1, YTHDC2, YTHDF3, ZC3H13) and scored their m^6A modification activity using the AUCell R package ([64]22). The enzymes were used as a gene set to calculate the area under the curve (AUC) value. This value was used to rank gene expression for each cell, reflecting the proportion of highly expressed genes in the gene set for that cell. To identify the active gene set, we used the “AUCell_exploreThresholds” function to calculate the threshold value (0.048). Using the AUC scores of each cell, we colored the cell clustering UMAP embedding to show which cells were active. 2.4. Regulon analysis and signature enrichment To assess the activity of transcription factor (TF) regulons in the context of single-cell RNA sequencing (scRNA-seq) data, we conducted a regulon analysis employing the SCENIC framework. Following the guidelines outlined by Van de Sande et al., we leveraged the Docker image of pySCENIC and utilized logarithmically transformed expression counts obtained from AML cells for input data ([65]23). The identification of putative transcription factors (TFs) was carried out with default parameters and a compiled list of human TFs from Lambert et al. ([66]24). To refine potential TF-target interactions within each regulon, we incorporated CisTarget, which entailed the utilization of established human TF motifs databases annotated at intervals of 500 base pairs, 5 kilobases (kb), and 10 kb from transcriptional start sites. Additionally, a drop-out masking strategy was applied during this refinement process. Subsequently, the enrichment of these refined TF regulons was assessed using AUCell, and enrichment scores were scaled for visualization purposes. 2.5. GRN construction We generated gene regulatory networks (GRNs) for both m^6A-active and m^6A-inactive cell populations using the R package bigSCale2 ([67]25, [68]26). To do so, we first separated the m^6A-active and m^6A-inactive cells and generated a cell count matrix for each m^6A modification state using Seurat’s GetAssayData function. We then filtered the resulting matrices to remove genes with ensemble identifiers and passed them to bigSCale2 to construct the networks. The networks were generated under the “normal” clustering parameter, with an edge cutoff set to the top 0.9 quantile for correlation coefficient. We visualized the networks using the R package igraph, and each network’s layout was derived from the Fruchterman-Reingold algorithm. 2.6. Construction of cell-specificity m^6A related signature The single-cell approach enables a detailed exploration of the m^6A modification model. We employed a likelihood-ratio test to establish a m^6A-related gene signature for each cell-type-specific cluster. This entailed identifying differentially expressed genes between cells exhibiting m^6A activity and those that are m^6A inactive within each cluster. We focused on identifying m^6A related genes based on the size of effect value (log2FC). The m^6A-related genes within the various cell types were identified using the following criteria: |log2FC| > 0.25, percentage of cells (pct) < 0.1, and p-value < 0.05. We prefer to obtain m^6A-related genes specific to each cell type, we therefore used and evaluated the cell specificity of m^6A-related genes. Cell-specific genes within the different cell types were identified using the following criteria: log2FC > 0.25, percentage of cells (pct) < 0.1, and p-value < 0.01. Cell-specificity m^6A related signature emphasizes strict cell specificity and comprehensive m^6A modification profile, we thus relaxed the screening criteria for m^6A-related genes and limited the screening criteria for cell markers. 2.7. Reanalysis of bulk RNA-seq DEseq2 software was used to detect Differentially expressed genes (DEGs). Count matrix was scaled by variance-stabilizing transformation. We identified differentially expressed genes (DEGs) among the three subtypes using a threshold of |log2FC| ≥ 1 and a false discovery rate (FDR) ≤ 0.01. Subsequently, we conducted Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, visualizing the results through Metascape. Furthermore, Molecular Signatures Database V7.4 of hallmark gene sets were used for GSEA by the R package ClusterProfiler. 2.8. Clustering, ESTIMATE, and ssGSEA For TCGA-LAML, we integrated m^6A related genes from five types of blood cancer cells and further identified the genes that determine survival status by Cox regression. We conducted hierarchical clustering analysis using the ‘ConsensusClusterPlus’ package (R implementation, K = 3) based on the 40-genes (Cox regression p-value < 0.05) from cell-specificity m^6A Related Signature. We utilized ESTIMATE to assess the stromal score, immune score, ESTIMATE score, and tumor purity for each LAML sample ([69]27). The ssGSEA was conducted to determine the enrichment levels of immune-related pathways in each sample. 2.9. Comparing tumor immune microenvironment We conducted a systematic ssGSEA analysis to assess the activation levels of 17 immune pathways, as defined by ImmPort ([70]28). Additionally, we quantified the expression levels of 78 immunomodulators based on a prior study, with the aim of exploring the immune microenvironment in LAML. To discern differences in immune signatures, including immune pathways and immunomodulators, among LAML subtypes, we employed both the Kruskal-Wallis test and the Wilcoxon rank-sum test. 2.10. Survival analysis We defined overall survival (OS) as the duration between diagnosis and either death or last follow-up. We evaluated the differences in OS within subtypes in each cohort using Mantel-Cox log-rank tests, implemented using the R package ‘survival’. Kaplan-Meier plots were generated using the R package ‘survminer’ to visualize the survival curves for each cluster. We also derived univariate and pairwise hazard ratios (HRs) for each cluster using Cox proportional hazards regression. 2.11. Construction of composite machine learning model The genes that satisfy the two conditions are used to build the machine learning (ML) model: 1. m^6A related genes; 2. FDR<0.01 in the difference analysis of the three subtypes. We employed a systematic machine learning (ML)-based framework to construct subtype prediction models. The process involved several key steps: 1. Data Preprocessing: This step encompassed imputing missing values and scaling continuous features. Continuous features were standardized to have a zero mean and unit variance. 2. Data Splitting: Following preprocessing, we divided the data into two sets, a training dataset for model development, and a validation dataset for assessing model accuracy. The data split was random and maintained a ratio of 7:3. 3. Model Development: For each of the five ML algorithms (random forests, support vector machines, naïve Bayes, K-nearest neighbors, and neural networks), we developed optimal models using the training dataset. 4. Validation: To ensure robustness, we required consistent results from at least three models for each sample. All of the aforementioned modeling steps were implemented using R (version: 4.1.2). 2.12. Genomic analysis We conducted Copy Number Alteration (CNA) analysis using GISTIC2.0 on the TCGA LAML cohort. We examined variations in amplification or deletion events at the gene level across the three subtypes. To visualize the Copy Number Variation (CNV) data, we utilized the ‘ComplexHeatmap’ package in R, creating a waterfall plot. Additionally, we calculated the Tumor Mutation Burden (TMB) by determining the number of mutations per patient. 2.13. Immunotherapy response score generation Immunotherapy signature feature selection was performed using differentially expressed genes between CR (Complete Remission) and NR (Non-Responders) samples in the [71]GSE178926 dataset. Sample identity information was established based on Rutella et al. study ([72]17). We selected genes based on their association with CR compared to NR samples, utilizing Elastic Net penalized logistic regression (glmft algorithm, glmnet package) with alpha = 0.6 and lambda = 0.06 to accommodate the high degree of correlation observed among some genes. We opted for Leave-One-Out Cross-Validation (LOOCV) to maximize our training set for validation, especially given our small dataset size (n = 33), as LOOCV exhibits low bias. 2.14. Chemotherapeutic response prediction Chemotherapeutic response predictive model is based on the Cancer Genome Project (CGP) data from the Genomics of Drug Sensitivity in Cancer (GDSC) project, using gene expression and drug sensitivity data from cancer cell lines. The R package “pRRophetic” utilized ridge regression to estimate the half-maximal inhibitory concentration (IC50) for each sample ([73]29). Model training was performed using blood-type cell lines through 10-fold cross-validation based on the GDSC’s training set. A total of 45 cell lines derived from hematological cancers were utilized. To mitigate batch effects, we employed the ‘combat’ algorithm with default settings and considered only tissue types labeled as ‘blood’ Duplicate gene expression data were averaged. 2.15. Statistical analyses The statistical analyses were conducted utilizing R software, version 4.1.2. The comparison of two groups was established via Wilcoxon-rank sum test, while analysis of more than three groups was accomplished through execution of Kruskal-Wallis test. The benchmark for significance was established with p-value, wherein a level of 0.05 or lower was deemed significant, while a level of 0.01 or lower was deemed extremely significant. 3. Results 3.1. Reanalysis of scRNA-seq data identifies active and inactive m^6A methylation cells in AML we conducted a comprehensive reanalysis of scRNA-seq data obtained from 13,653 cells from 12 patients diagnosed with AML ([74]GSE116256) ([75]18, [76]20). Cell identity information was established based on prior studies encompassing five leukemic cell types (LSPC, ProMono-like, Mono-like, GMP-like and cDC-like) and seven non-leukemic immune cell types (B, T, natural killer, plasma, monocytes, CTL and cDCs) ([77] Supplementary Figure 1A ). Our study examined 23 key genes (see method) involved in the m^6A modification process in AML cells. There is no evidence of cell-specific expression for these genes ([78] Supplementary Figure 1B ). We used AUCell to score m6A sets within individual cells to further understand m^6A activity. The AUC values across all cells showed two peaks, with 8465 cells showing relatively higher AUC values when the AUC threshold was set to 0.048 ([79] Figure 1A ). Next, we categorized AML cells into two distinct groups based on their m^6A modification: m^6A active (AUC value > 0.048) vs. m^6A inactive (AUC value ≤ 0.048). We performed differential gene analysis on cells with varying m^6A activity and observed 59 significantly upregulated genes and 46 significantly downregulated genes (|log2FC| > 0.25 and p-value < 0.01, [80]Figure 1B ). Furthermore, we performed pathway enrichment analysis on these genes, revealing their impact on myeloid leukocyte cytokine production, mRNA metabolism, and immune effector functions ([81] Figure 1C ). We found that the Hallmark gene set, which is comprised of a broad range of biological processes, provided a more comprehensive representation of differences between m^6A-active and m^6A-inactive cells ([82] Figure 1D ). Finally, we found that a higher proportion of leukemic cells (χ^2 = 88.81, p-value = 4.34e−21, [83]Supplementary Figure 1C ), such as LSPCs, Mono-like, ProMono-like, GMP-like and cDC-like blasts were identified as m^6A-active cells than immune cells (χ^2 = 422.95, p-value = 8.17e−84, [84]Figure 1E ). This suggests that m^6A modification activity may play a critical role in regulating a wide range of biological processes and underscores the importance of considering the overall functional landscape of cellular processes when studying the effects of m^6A modification. Figure 1. [85]Figure 1 [86]Open in a new tab Identification of active and inactive m^6A methylation cells in AML. (A) Score of 23 m^6A modification activity. The threshold was chosen as 0.048 and the m^6A modification score of 8,465 cells exceeded the threshold value (The dotted lines overlaying each distribution represent Gaussian fits to the distribution data.). (B) The volcano plot illustrates the differential expression of genes between m^6A modification-active cells and m^6A modification-inactive cells. (C) GO gene set enrichment analysis results were shown as lollipop plot where on x-axis, -log of FDR adjusted p-value for GO terms were shown. (D) Barplot of significantly activated pathways in the m^6A-active (blue) and m^6A-inactive (green) cells. (E) Stacked barplots showing the frequencies of m^6A-active and m^6A-inactive cells in 14 cell types (p-value > 0.05: ns, p-value < 0.05: *, 0.05 < p-value < 0.01:**, 0.01 < p-value < 0.001:***, 0.001 < p-value < 0.0001:****). 3.2. Effects of m^6A modification status on gene regulatory networks and biological processes in leukemic cells We identified leukemic cells, including LSPCs, cDC-like, GMP-like, Mono-like and ProMono-like blasts, which exhibited the most significant impact on m^6A modification, as evidenced by the observation of more up- and down-regulated genes in their active and inactive states compared to other cell types ([87] Figure 2A ). To fully appreciate the complexity of m^6A modification, it is essential to complement gene expression analysis with an understanding of the underlying gene regulatory network. In this regard, we have constructed regulatory networks for leukemic and immune cells with distinct m^6A modification patterns using SCENIC, a transcription factor-based gene regulatory network. Compared with immune cells, leukemia cells showed greater differences in transcription factor activity between m^6A-active and m^6A-inactive cells ([88] Figure 2B ). Some transcription factors have higher activity in leukemic cells with active m^6A modification, such as BATF, GABPA, E2F8, E2F2, E2F3, E2F7, ELF1, YY1, HOXA9 ([89]30–[90]35) ([91] Supplementary Figure 1D , [92]Supplementary Table 1 ). The transcription factors ELF1, YY1, and E2F are key regulatory elements with a ubiquitous impact across diverse cell types, exerting a significant influence on the modulation of gene expression changes associated with M^6A modifications ([93] Figure 2C ; [94]Supplementary Figures 1D, E ). Tumor necrosis factor α-inducible protein 8 (TNFAIP8) is a novel anti-apoptotic molecule that plays a role in AML chemoresistance ([95]31, [96]36). ELF1 was reported to be responsible for the upregulation of TNFAIP8 expression in human AML patients ([97]31). YY1 has been reported to bind to the promoter region of METTL3 and promote its expression, resulting in increased AML cell proliferation ([98]34). E2F transcription factor 1 (E2F1) was reported to be involved in AML cell differentiation recently ([99]35). Figure 2. [100]Figure 2 [101]Open in a new tab Effect of m^6A modification status on transcriptional regulatory networks and pathway enrichment in four types of leukemia cells (A) Bubble plot showing the genes that were differentially expressed between m^6A-active and m^6A-inactive in each cell type. Node sizes correspond to -log10 of FDR adjusted p-value. (B) The volcano plots of differential transcription factor activity analysis of single-cell RNA sequencing data performed on each cell type comparing m^6A-active cells vs. m^6A-inactive cells. (C) Regulatory networks of transcription factors in four major types of leukemia cells (cDC-like, GMP-like, LSPCs and ProMono-like). (D) Pathway enrichment network of four major types of leukemia cells. To identify the unique m^6A-related genes in each of these cell types, we developed a three-step workflow ([102] Supplementary Figure 1F ). Firstly, we obtained the marker genes for each cell type, which enabled us to accurately classify the cells. Secondly, we performed differential expression analysis between m^6A active and inactive cells in each cell type. Finally, we obtained a list of unique m^6A-related genes for each cell type by overlapping the two sets of genes. We performed pathway enrichment analysis of the identified m^6A-related genes for each cell type, which revealed their biological significance. In cDC-like blasts, m^6A modification affected pathways related to antigen processing and presentation and 3’-UTR-mediated mRNA stabilization ([103] Figure 2D ). In GMP-like blasts, m^6A modification affected pathways related to negative regulation of mRNA metabolic process and maintenance of DNA methylation ([104] Figure 2D ). In LSPCs, m^6A modification affected pathways related to DNA topological change and mRNA methylation ([105] Figure 2D ). In ProMono-like blasts, m^6A modification impacted pathways related to negative regulation of mRNA metabolic process and mRNA modification ([106] Figure 2D ). In Mono-like blasts, m^6A modification affected pathways related to RNA metabolic process and transport ([107] Supplementary Figure 1G ). These findings provide insights into the distinct roles of m^6A modification in different cell types and highlight the importance of this epigenetic mechanism in regulating cellular function and its relationship to AML progression. 3.3. The impact of m^6A modification on gene regulatory networks in AML To better understand how leukemia cell or immune cell global regulatory networks are altered in m^6A modification pattern, a gene regulatory network (GRN) reasoning methodology called bigSCale([108]25, [109]26) was implemented to m^6A-active or m^6A-inactive cells ([110] Supplementary Table 2 ). In leukemic cells, the m^6A-active and m^6A-inactive GRN exhibited the similar densities ([111] Figure 3A ). To assess the biological relevance of the differences between m^6A-active and m^6A-inactive networks in leukemic cells, the major distinguishing variables PageRank delineating their topology were investigated. We distinguished the top 5 influencer genes ranked by PageRank, as a proxy for a gene’s influence on the network. Many of these genes have been previously linked to AML pathology, such as CDK1 and CKS1B ([112]37, [113]38) ([114] Figure 3A ). As an illustration, two genes, CDK1 and CKS1B, known for their involvement in G1/S and G2/M phase transitions of the eukaryotic cell cycle, exhibited no significant differential expression (p-value > 0.05) but concurrently showed increased centrality measures (as depicted in [115]Figure 3A ). This finding is particularly intriguing, given that leukemic cells, despite being the most deregulated cell type in the initial AML analysis, did not originally highlight the significance of CDK1 and CKS1B in this context. In each LPSCs, cell cycle distribution was determined by Zeng et al. ([116]20). Cycling LSPCs exhibited greater m^6A modification activity ([117] Supplementary Figure 1H ). However, in immune cells, the topology of the m^6A-active GRN showed a relatively higher density in active cells compared with inactive cells, as reflected by gene-gene relationships and modularity ([118] Figure 3B ). Overall, our findings suggest that m^6A modification pattern in leukemia cells or immune cells may have different effects on GRN topology, which could provide important insights into the pathological mechanisms of diseases. Figure 3. [119]Figure 3 [120]Open in a new tab Gene regulatory network of m^6A-inactive and m^6A-active cells in leukemic cells and immune cells. (A) Gene regulatory network of m^6A-inactive and m^6A-active cells in leukemic cells (inactive: Cutoff = 0.9, Node = 284, Edge = 883; active: Cutoff = 0.9, Node = 256, Edge = 873). (B) Gene regulatory network of m^6A-inactive and m^6A-active cells in immune cells (inactive: Cutoff = 0.9, Node = 146, Edge = 260; active: Cutoff = 0.9, Node = 2036, Edge = 11078). 3.4. Classification of AML subtypes using the m^6A-related genes in leukemia cells We integrated the unique m^6A-related genes of cDC-like blasts, GMP-like blasts, LSPCs, Mono-like blasts and ProMono-like blasts, and performed a consensus clustering analysis on TCGA LAML samples ([121] Supplementary Table 3 , see method). Our clustering analysis revealed three distinct subtypes with high consistency and robustness, which were identified based on the cumulative distribution function (CDF) of the consensus matrix ([122] Figure 4A ). Subsequently, we conducted survival analysis on these subtypes using Kaplan-Meier curves and found significant differences in OS between the subtypes ([123] Figure 4B ). We performed differential gene expression analysis on each subtype and identified pathways that were significantly enriched ([124] Figure 4C ). The first subtype, characterized by the poorest survival, demonstrated significant enrichment in immune system processes such as negative regulation of immune system processes, innate immune response, and adaptive immune response, in addition to pathways related to cell activation, cytokine stimulus, and inflammatory response. Based on these findings, we defined this subtype as the immune dysregulation (ID) subtype ([125] Figure 4C ). The second subtype, with intermediate survival, exhibited significant enrichment in pathways related to ovulation, response to xenobiotic stimulus, and neutrophil-mediated immunity, along with pathways related to chemotaxis, blood circulation, and hormone processing. We defined this subtype as the hormone regulation (HR) subtype ([126] Figure 4C ). The third subtype, which had the best survival, demonstrated significant enrichment in pathways related to cell morphogenesis, extracellular matrix organization, and lipid transport, as well as pathways related to cell junction organization and signaling by VEGF. Based on these findings, we have defined a new subtype, called the cellular adaptation (CA) subtype ([127] Figure 4C ; [128]Supplementary Table 4 ). Figure 4. [129]Figure 4 [130]Open in a new tab Classification of AML subtypes using the m^6A-related genes in leukemia cells by K-means analysis. (A) K = 3 was identified as the optimal value for consensus clustering. (B) Kaplan-Meier curves of overall survival (OS) among the three subtypes in the TCGA LAML cohort. (C) The enrichment statistics of ssGSEA signaling pathways of immune dysregulation subtype, hormone regulation subtype and cellular adaptation subtype. (D) The expression levels of m^6A-associated genes of immune dysregulation subtype, hormone regulation subtype and cellular adaptation subtype. (E) Sankey plot showing the correlation between the classification of AML subtypes using the m^6A-related genes and French-American-British classification of acute myeloid leukemia. (F) The survival curve demonstrates the prognostic outcomes of FAB classification from TCGA-LAML. (G) Kaplan-Meier curves of OS among the three subtypes in two other GEO datasets ([131]GSE146173 and [132]GSE106291). To comprehensively characterize the landscape of m^6A modifications across subtypes, we have examined the expression levels of m^6A-associated genes. Our analysis revealed that each subtype exhibits a distinctive m^6A expression profile, suggestive of subtype-specific regulation of m^6A modification ([133] Figure 4D ). For example, the immune dysregulation subtype was characterized by high expression of IGF2BP3, IGF2BP2, and FMR1, and low expression of ELAVL1, FTO, and METTL14 ([134] Figure 4D ). We further investigated the correlation between AML subtype classification using m^6A-related genes and the French-American-British (FAB) classification AML ([135]39). The classification obtained from m^6A-related genes, although not specifically correlated with FAB classifications, demonstrates a robust prognostic diagnostic capability ([136] Figures 4E, F ). The three subtypes are characterized from different aspects, including m^6A gene expression, m^6A modification activity of leukemic cells, and subtype distribution. To examine the robustness of our subtype identification, we have developed a composite machine-learning classifier based on m^6A-related genes and the DEGs among the three subtypes (see method, [137]Supplementary Table 5 ). We further validated the stability and reproducibility of the three subtypes in two additional GEO datasets, providing strong evidence of their statistical robustness ([138] Figure 4G ; [139]Supplementary Figure 2 ; [140]Supplementary Tables 6 , [141]7 ). 3.5. Comparison of genomic alterations of the three AML subtypes We utilized oncoplot to assess the genomic alterations in all subtypes of the TCGA LAML cohort ([142] Figures 5A, B ). Our analysis revealed that RUNX1, DNMT3A, and IDH2 mutations were most frequently observed in the immune dysregulation subtype, while being least prevalent in the other two subtypes ([143] Figure 5B ). Additionally, TTN, MUC16, KIT, ZFHX4, and FCGBP mutations were mainly observed in the cellular adaptation subtype ([144] Figure 5B ). Although the TMB of the immune dysregulation subtype was found to be higher than the remaining two subtypes, the difference was not statistically significant ([145] Figure 5C ). Similarly, the Burden of Copy Number gain was higher in the cellular adaptation subtype compared to the other two subtypes, whereas the Burden of Copy Number Loss was higher in the immune dysregulation subtype than in the other two subtypes; however, these differences failed to achieve statistical significance ([146] Figure 5D ). Notably, the copy number variation in KMT2A was specifically present in the immune dysregulation subtype ([147] Figure 5E ), highlighting the potential role of this gene in driving immune dysregulation in AML. Figure 5. [148]Figure 5 [149]Open in a new tab Comparison of genomic alterations of the three AML subtypes. (A) The genomic alterations among the three subtypes of the TCGA LAML cohort. (B) Mutation percentage of mostly mutated genes. (C) Tumor mutant burden difference among the three subtypes in the TCGA LAML cohort. (D) The Burden of Copy Number gain and the Burden of Copy Number Loss among the three subtypes in the TCGA LAML cohort. (E) Distinct Copy number alterations (CNA) profile among the three subtypes. 3.6. The three AML subtypes exhibited different immune statuses The immune differences in the three subtypes of AML were explored through a comprehensive analysis of the immune microenvironment. We employed the ESTIMATE algorithm to calculate immune scores, revealing that patients in the immune dysregulation subtype exhibited significantly higher ESTIMATE scores compared to those in the other two subtypes ([150] Figure 6A ). However, the prognosis of the cellular adaptation subtype was found to be better than that of the other two subtypes, despite its higher immune score. Further, the activation states of immune-related pathways also were measured ([151] Figure 6B ). Several pathways crucial for immune function activation, including ‘Antigen processing and presentation,’ ‘NK cell cytotoxicity,’ and the ‘TCR signaling pathway,’ consistently exhibited upregulation in both the immune dysregulation subtype and the cellular adaptation subtype. Consistent with the ESTIMATE algorithm, all immune pathways within the tumor microenvironment get the highest activation level in the immune dysregulation subtype. We conducted a more detailed examination of the expression levels of 78 immunomodulators in each subtype (as shown in [152]Figure 6C ). Notably, the majority of differentially expressed immunomodulators exhibited the highest levels in both the immune dysregulation subtype and the cellular adaptation subtype. This includes key genes related to antigen presentation (HLA-B, HLA-C, HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQA2, HLA-DQB1, HLA-DQB2, HLA-DRA, HLA-DRB1, HLA-DRB5, MICA and MICB). The total expressions of 78 immunomodulators were also the highest in the immune dysregulation subtype. However, it must be emphasized that unlike the cellular adaptation subtype, the suppressor proteins, such as CD274, CD276, PDCD1LG2, BTN3A1, BTN3A2, SLAMF7 and C10orf54 in the immune dysregulation subtype are also activated ([153] Figures 6C, D ; [154]Supplementary Figure 3A ). This is particularly remarkable because there was a significant difference in survival risk between the two subtypes ([155] Figures 4B, F ), but the immune dysregulation subtype and the cell adaptation subtype exhibits highly active immune functions. This compelled us to further analyze the key differences that determine the survival risk in these two subtypes and conducted pathway enrichment analysis on the top 100 differentially expressed genes in each subgroup. As expected, the immune dysregulation subtype not only exhibited highly activated immune responses, but also highly enriched pathways involved in negative regulation of immune responses ([156] Figure 6E ). Further GSEA confirmed our findings: in the immune dysregulation subtype, immune response and immune response processes were negatively regulated ([157] Figure 6F ), while the cell adaptation subtype exhibited enrichment in pathways related to DNA synthesis and protein synthesis ([158] Figure 6G ). Figure 6. [159]Figure 6 [160]Open in a new tab The three AML subtypes exhibited different immune statuses. (A) The ESTIMATE scores among the three subtypes in the TCGA LAML cohort. (B) The activation degree of 17 immune pathways in each tumor sample among the three subtypes. (C) The expression levels of 78 immunomodulators among the three subtypes. (D) Boxplot showing the activation status of suppressor proteins in different subtypes. (E) The pathway enrichment analysis on the top 100 differentially expressed genes in CA and ID subgroup. (F, G) GSEA results showing the activated signaling pathways in ID and CA subgroup. p-value > 0.05: ns, p-value < 0.05:*, 0.05 < p-value < 0.01: **, 0.01